[RFC] DMAEngine: Define generic transfer request api

Message ID 1311453088-5063-1-git-send-email-jaswinder.singh@linaro.org
State New, archived

Commit Message

Jassi Brar July 23, 2011, 8:31 p.m. UTC
This is an attempt to define an API for fancy data transfers such as
interleaved-to-contiguous copy and vice versa.
Traditional SG_list based transfers tend to be very inefficient
in such cases, which call for a condensed API that conveys the
pattern of the transfer. This is an attempt at that condensed API.

The api supports all 4 variants of scatter-gather and contiguous transfer.
Besides, it could easily represent common operations like
 	device_prep_dma_{cyclic, memset, memcpy}
and maybe some more that I am not sure of.

Of course, this API too can't help transfers that don't lend themselves to DMA
by nature, i.e., scattered tiny reads/writes with no periodic pattern.

Signed-off-by: Jassi Brar <jaswinder.singh@linaro.org>
---
 include/linux/dmaengine.h |   74 +++++++++++++++++++++++++++++++++++++++++++++
 1 files changed, 74 insertions(+), 0 deletions(-)
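
To make the chunk/ICG/frame wording concrete, here is a minimal userspace sketch that walks a template in software. It is not kernel code: plain pointers stand in for dma_addr_t, 'op' and 'frm_irq' are left out, sgl is a pointer rather than a trailing array, and xfer_emulate_copy() is a hypothetical helper that assumes a plain copy operation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Userspace mirror of the proposed struct data_chunk. */
struct data_chunk {
	size_t size;	/* bytes in this chunk */
	size_t icg;	/* bytes to skip after this chunk */
};

/* Simplified mirror of the proposed struct xfer_template (assumption:
 * pointers replace dma_addr_t, 'op' and 'frm_irq' are omitted). */
struct xfer_template {
	const unsigned char *src_start;
	unsigned char *dst_start;
	bool src_inc, dst_inc;
	bool src_sgl, dst_sgl;
	size_t numf;
	size_t frame_size;
	const struct data_chunk *sgl;
};

/* Hypothetical helper: do in software what the DMAC would do for a
 * plain copy. Returns the number of bytes copied. */
static size_t xfer_emulate_copy(const struct xfer_template *xt)
{
	const unsigned char *src;
	unsigned char *dst;
	size_t copied = 0;

	assert(xt && xt->sgl);
	src = xt->src_start;
	dst = xt->dst_start;

	for (size_t f = 0; f < xt->numf; f++) {
		for (size_t i = 0; i < xt->frame_size; i++) {
			const struct data_chunk *c = &xt->sgl[i];

			memcpy(dst, src, c->size);
			copied += c->size;
			/* The ICG applies only to a side flagged as scattered. */
			if (xt->src_inc)
				src += c->size + (xt->src_sgl ? c->icg : 0);
			if (xt->dst_inc)
				dst += c->size + (xt->dst_sgl ? c->icg : 0);
		}
	}
	return copied;
}
```

With a one-chunk frame of size 1 and icg 1, src_sgl set and dst_sgl clear, four frames pull every other byte out of an interleaved buffer into a contiguous one.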

Comments

Linus Walleij July 25, 2011, 10:55 a.m. UTC | #1
2011/7/23 Jassi Brar <jaswinder.singh@linaro.org>:

> This is an attempt to define an api that could be used for doing
> fancy data transfers like interleaved to contiguous copy and vice-versa.
> Traditional SG_list based transfers tend to be very inefficient
> in such cases. Such cases call for some very condensed api to convey
> pattern of the transfer. This is an attempt at that condensed api.
>
> The api supports all 4 variants of scatter-gather and contiguous transfer.
> Besides, it could easily represent common operations like
>        device_prep_dma_{cyclic, memset, memcpy}
> and maybe some more that I am not sure of.
>
> Of course, this api too can't help transfers that don't lend to DMA by
> nature, i.e, scattered tiny read/writes with no periodic pattern.
>
> Signed-off-by: Jassi Brar <jaswinder.singh@linaro.org>

I think this also looks interesting, it's the same idea as for using
control commands to set up striding, but by supporting a totally
new function call instead.

Do you think this method can be filled in with a default
implementation that converts the request to a plain sglist request
to memory or slave if interleaved transfers are not available?
That would mean that drivers can use this API on any DMA
engine, no matter whether it supports ICG/striding or not.

If we do that, this is much better than using a special control
command.
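
A rough sketch of what such a default fallback might do for one side of the transfer: unroll the frames into a flat list of address/length entries. Everything here is an assumption for illustration; struct flat_seg is only a stand-in for a real scatterlist entry, uint64_t stands in for dma_addr_t, and expand_frames() is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct data_chunk {
	size_t size;
	size_t icg;
};

/* One entry of a flat scatter list (stand-in for struct scatterlist). */
struct flat_seg {
	uint64_t addr;
	size_t len;
};

/*
 * Hypothetical fallback: expand 'numf' repetitions of a frame that
 * starts at 'start' into a flat list. Writes at most 'max' entries to
 * 'out' and returns the number of entries the full expansion needs,
 * so a caller can size the list with a first call using max == 0.
 */
static size_t expand_frames(uint64_t start, const struct data_chunk *sgl,
			    size_t frame_size, size_t numf,
			    struct flat_seg *out, size_t max)
{
	uint64_t addr = start;
	size_t n = 0;

	assert(sgl && (out || max == 0));
	for (size_t f = 0; f < numf; f++) {
		for (size_t i = 0; i < frame_size; i++) {
			if (n < max) {
				out[n].addr = addr;
				out[n].len = sgl[i].size;
			}
			n++;
			/* Next chunk starts after this chunk plus its gap. */
			addr += sgl[i].size + sgl[i].icg;
		}
	}
	return n;
}
```

The sketch does not merge neighbouring entries when a gap is zero, and it ignores the non-incrementing case, both of which a real fallback would have to handle.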

+ * @src_inc: If the source address increments after reading from it.
+ * @dst_inc: If the destination address increments after writing to it.

Currently whether source/destination increments or not is
implicit from transaction type, i.e. if it's a mem2mem transfer
it is implicit that both increment, if it's a slave transfer we can
infer from the direction (to/from peripheral) whether source or
destination is supposed to increment.

IMO this means duplication, and removes abstraction from the
API, it looks more like specific bits to be set in the DMA
hardware to control increments.

In this form it makes it possible to set up nonsensical operations like
a memory-to-memory transfer to a non-incrementing address
which is not a peripheral FIFO, which is not good IMO. The API
should be semantically designed for setting up sensible operations.

+ * @op: The operation to perform on source data before writing it on
+ *      to destination address.

It is an enum dma_transaction_type after all, so can't it just say
"type of transfer"?

I think you can infer the above increment settings from this type,
so they should be removed.
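
The inference argued for here could look roughly like the sketch below. The direction enum and infer_inc() are hypothetical and not part of the patch or of dmaengine.h; they only illustrate deriving the two flags instead of storing them.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical direction tag, for illustration only. */
enum xfer_dir {
	XFER_MEM_TO_MEM,
	XFER_MEM_TO_DEV,
	XFER_DEV_TO_MEM,
};

struct inc_flags {
	bool src_inc;
	bool dst_inc;
};

/* Derive the increment flags from the direction: memory addresses
 * increment, a peripheral FIFO address stays fixed. */
static struct inc_flags infer_inc(enum xfer_dir dir)
{
	struct inc_flags f = { .src_inc = true, .dst_inc = true };

	switch (dir) {
	case XFER_MEM_TO_MEM:
		break;
	case XFER_MEM_TO_DEV:
		f.dst_inc = false;
		break;
	case XFER_DEV_TO_MEM:
		f.src_inc = false;
		break;
	default:
		assert(0 && "unknown direction");
	}
	return f;
}
```

Under this scheme a fill-style operation that reads a fixed source and writes incrementing memory would need its own tag, since it matches none of the three directions.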

+ * @frm_irq: If the client expects DMAC driver to do callback after each frame.

I don't understand the usecase for getting this IRQ, and it deviates
from the DMAengine way of setting a flag on the descriptor.
Instead I think you should maybe provide a new flag for the
unsigned long flags you're already passing in here:

+       struct dma_async_tx_descriptor *(*device_prep_dma_xfer)(
+               struct dma_chan *chan, struct xfer_template *xt,
+               unsigned long flags);
                 ^^^^^^^^^^^^^^^^^^^^^^

Thanks,
Linus Walleij
Vinod Koul July 25, 2011, 11:30 a.m. UTC | #2
On Sun, 2011-07-24 at 02:01 +0530, Jassi Brar wrote:
> This is an attempt to define an api that could be used for doing
> fancy data transfers like interleaved to contiguous copy and vice-versa.
> Traditional SG_list based transfers tend to be very inefficient
> in such cases. Such cases call for some very condensed api to convey
> pattern of the transfer. This is an attempt at that condensed api.
> 
> The api supports all 4 variants of scatter-gather and contiguous transfer.
> Besides, it could easily represent common operations like
>  	device_prep_dma_{cyclic, memset, memcpy}
and how do you specify if the transfer is cyclic, memset or memcpy...
> and maybe some more that I am not sure of.
> 
> Of course, this api too can't help transfers that don't lend to DMA by
> nature, i.e, scattered tiny read/writes with no periodic pattern. 
For that you use current sg_list :)
> 
> Signed-off-by: Jassi Brar <jaswinder.singh@linaro.org>
> ---
>  include/linux/dmaengine.h |   74 +++++++++++++++++++++++++++++++++++++++++++++
>  1 files changed, 74 insertions(+), 0 deletions(-)
> 
> diff --git a/include/linux/dmaengine.h b/include/linux/dmaengine.h
> index eee7add..a6cdb57 100644
> --- a/include/linux/dmaengine.h
> +++ b/include/linux/dmaengine.h
> @@ -74,6 +74,76 @@ enum dma_transaction_type {
>  /* last transaction type for creation of the capabilities mask */
>  #define DMA_TX_TYPE_END (DMA_CYCLIC + 1)
>  
> +/**
> + * Generic Transfer Request
> + * ------------------------
> + * A chunk is collection of contiguous bytes to be transfered.
> + * The gap(in bytes) between two chunks is called inter-chunk-gap(ICG).
> + * ICGs may or maynot change between chunks.
> + * A FRAME is the smallest series of contiguous {chunk,icg} pairs,
> + *  that when repeated an integral number of times, specifies the transfer.
> + * A transfer template is specification of a Frame, the number of times
> + *  it is to be repeated and other per-transfer attributes.
> + *
> + * Practically, a client driver would have ready a template for each
> + *  type of transfer it is going to need during its lifetime and
> + *  set only 'src_start' and 'dst_start' before submitting the requests.
> + *
> + *
> + *  |      Frame-1        |       Frame-2       | ~ |       Frame-'numf'  |
> + *  |====....==.===...=...|====....==.===...=...| ~ |====....==.===...=...|
> + *
> + *    ==  Chunk size
> + *    ... ICG
> + */
> +
> +/**
> + * struct data_chunk - Element of scatter-gather list that makes a frame.
> + * @size: Number of bytes to read from source.
> + *	  size_dst := fn(op, size_src), so doesn't mean much for destination.
> + * @icg: Number of bytes to jump after last src/dst address of this
> + *	 chunk and before first src/dst address for next chunk.
> + *	 Ignored for dst(assumed 0), if dst_inc is true and dst_sgl is false.
> + *	 Ignored for src(assumed 0), if src_inc is true and src_sgl is false.
> + */
> +struct data_chunk {
> +	size_t size;
> +	size_t icg;
> +};
> +
> +/**
> + * struct xfer_template - Template to convey DMAC the transfer pattern
> + *	 and attributes.
> + * @op: The operation to perform on source data before writing it on
> + *	 to destination address.
example of ops pls
> + * @src_start: Absolute address of source for the first chunk.
> + * @dst_start: Absolute address of destination for the first chunk.
absolute = bus addr, right, if so bus addr would be a better term
> + * @src_inc: If the source address increments after reading from it.
> + * @dst_inc: If the destination address increments after writing to it.
Is the inc in bytes?
> + * @src_sgl: If the 'icg' of sgl[] applies to Source (scattered read).
> + *		Otherwise, source is read contiguously (icg ignored).
> + *		Ignored if src_inc is false.
> + * @dst_sgl: If the 'icg' of sgl[] applies to Destination (scattered write).
> + *		Otherwise, destination is filled contiguously (icg ignored).
> + *		Ignored if dst_inc is false.
> + * @frm_irq: If the client expects DMAC driver to do callback after each frame.
why not reuse the callback in descriptor for this?
> + * @numf: Number of frames in this template.
> + * @frame_size: Number of chunks in a frame i.e, size of sgl[].
> + * @sgl: Array of {chunk,icg} pairs that make up a frame.
> + */
> +struct xfer_template {
> +	enum dma_transaction_type op;
> +	dma_addr_t src_start;
> +	dma_addr_t dst_start;
> +	bool src_inc;
> +	bool dst_inc;
> +	bool src_sgl;
> +	bool dst_sgl;
> +	bool frm_irq;
> +	size_t numf;
> +	size_t frame_size;
> +	struct data_chunk sgl[0];
> +};
>  
>  /**
>   * enum dma_ctrl_flags - DMA flags to augment operation preparation,
> @@ -430,6 +500,7 @@ struct dma_tx_state {
>   * @device_prep_dma_cyclic: prepare a cyclic dma operation suitable for audio.
>   *	The function takes a buffer of size buf_len. The callback function will
>   *	be called after period_len bytes have been transferred.
> + * @device_prep_dma_xfer: Transfer expression in 'most' generic way.
>   * @device_control: manipulate all pending operations on a channel, returns
>   *	zero or error code
>   * @device_tx_status: poll for transaction completion, the optional
> @@ -494,6 +565,9 @@ struct dma_device {
>  	struct dma_async_tx_descriptor *(*device_prep_dma_cyclic)(
>  		struct dma_chan *chan, dma_addr_t buf_addr, size_t buf_len,
>  		size_t period_len, enum dma_data_direction direction);
> +	struct dma_async_tx_descriptor *(*device_prep_dma_xfer)(
> +		struct dma_chan *chan, struct xfer_template *xt,
> +		unsigned long flags);
>  	int (*device_control)(struct dma_chan *chan, enum dma_ctrl_cmd cmd,
>  		unsigned long arg);
>  

Is this coordinated with Sundaram? He was trying a similar thing with help
from Linus W.
Jassi Brar July 25, 2011, 1:13 p.m. UTC | #3
On 25 July 2011 16:25, Linus Walleij <linus.ml.walleij@gmail.com> wrote:
> 2011/7/23 Jassi Brar <jaswinder.singh@linaro.org>:
>
>> This is an attempt to define an api that could be used for doing
>> fancy data transfers like interleaved to contiguous copy and vice-versa.
>> Traditional SG_list based transfers tend to be very inefficient
>> in such cases. Such cases call for some very condensed api to convey
>> pattern of the transfer. This is an attempt at that condensed api.
>>
>> The api supports all 4 variants of scatter-gather and contiguous transfer.
>> Besides, it could easily represent common operations like
>>        device_prep_dma_{cyclic, memset, memcpy}
>> and maybe some more that I am not sure of.
>>
>> Of course, this api too can't help transfers that don't lend to DMA by
>> nature, i.e, scattered tiny read/writes with no periodic pattern.
>>
>> Signed-off-by: Jassi Brar <jaswinder.singh@linaro.org>
>
> I think this also looks interesting, it's the same idea as for using
> control commands to set up striding, but by supporting a totally
> new function call instead.
>
> Do you think this method can be filled in with a default
> implementation that converts the request to a plain sglist request
> to memory or slave if interleaved transfers are not available?
> That would mean that drivers can use this API on any DMA
> engine, no matter whether it supports ICG/striding or not.
At the moment, this is supposed to be used only for the fancy kind
of transfers that involve lengths and skips of a few _bytes_,
for which SGL is inefficient in the first place.

>
> If we do that, this is much better than using a special control
> command.
Maybe I should have added something like a 'GENERAL_XFER' capability
flag to the api.
Though not set in stone, I strongly believe this is how it should be for now.

>
> + * @src_inc: If the source address increments after reading from it.
> + * @dst_inc: If the destination address increments after writing to it.
>
> Currently whether source/destination increments or not is
> implicit from transaction type, i.e. if it's a mem2mem transfer
> it is implicit that both increment, if it's a slave transfer we can
> infer from the direction (to/from peripheral) whether source or
> destination is supposed to increment.
Such few things are for later.... final part of my conspiracy ;)
Now that we are renovating the DMA API, we might come to
agree that this callback can just as well be employed for
Mem<->Device transfer.

>
> IMO this means duplication, and removes abstraction from the
> API, it looks more like specific bits to be set in the DMA
> hardware to control increments.
>
> In this form it makes possible to set up nonsensical operations like
> a memory-to-memory transfer to a non-incrementing address which
> is not a peripheral FIFO, which is not good IMO.
Not really.
We can convey a 'memset' type of operation with this:
 src_inc := false
 dst_inc := true
 src_width := length of the byte-sequence to be repeated

The point is to cover as many possibilities as possible (at acceptable cost).
And of course, either the client won't make invalid requests or the
DMAC will reject them -- just as is _already_ the case.

>  The API should be semantically designed for setting up sensible operations.
No API can be fool-proof.

>
> + * @op: The operation to perform on source data before writing it on
> + *      to destination address.
>
> It is a enum dma_transaction_type after all, so can't it just say
> "type of transfer"?
It is to stress that it is possible to do complex operations as well.
XOR is not exactly a transfer, but an operation.
Theoretically a 'dma' can decode an mp3 stream and write the expanded
raw data to destination. So, it's not exactly a transfer - it's an operation.

> + * @frm_irq: If the client expects DMAC driver to do callback after each frame.
>
> I don't understand the usecase for getting this IRQ, and it deviates
> from the DMAengine way of setting a flag on the descriptor.
It doesn't take many bytes, so I would beg on my knees to please let it live.
Someday, it could be used to convey 'cyclic' transfer requests - that
is (only important members are set)
struct xfer_template {
       dma_addr_t src_start = address of ring buffer;
       bool frm_irq = true;
       size_t numf = ~0;
       size_t frame_size = number of parts of ring buffer;
};
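
As a sketch only, the cyclic case above could be filled in like this in userspace C. The struct is a trimmed mirror of the proposed one (uint64_t stands in for dma_addr_t, and a C99 flexible array member replaces sgl[0]); cyclic_template() and its parameters are hypothetical. How the source address wraps back to the start of the ring, and whether the callback fires per part or per pass, is not spelled out in the thread and is left open here too.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

struct data_chunk {
	size_t size;
	size_t icg;
};

/* Trimmed mirror of the proposed struct xfer_template. */
struct xfer_template {
	uint64_t src_start;	/* stand-in for dma_addr_t */
	bool src_inc;
	bool frm_irq;
	size_t numf;
	size_t frame_size;
	struct data_chunk sgl[];	/* C99 flexible array member */
};

/* Hypothetical helper: build a template for a ring buffer of 'buf_len'
 * bytes split into parts of 'period_len' bytes. Caller frees the result.
 * Returns NULL if the buffer is not a whole number of parts. */
static struct xfer_template *cyclic_template(uint64_t ring_addr,
					     size_t buf_len, size_t period_len)
{
	struct xfer_template *xt;
	size_t periods;

	if (period_len == 0 || buf_len == 0 || buf_len % period_len != 0)
		return NULL;
	periods = buf_len / period_len;

	xt = calloc(1, sizeof(*xt) + periods * sizeof(xt->sgl[0]));
	if (!xt)
		return NULL;

	xt->src_start = ring_addr;
	xt->src_inc = true;
	xt->frm_irq = true;		/* callback after each frame */
	xt->numf = SIZE_MAX;		/* the '~0' above: repeat indefinitely */
	xt->frame_size = periods;	/* one chunk per part of the ring */
	for (size_t i = 0; i < periods; i++) {
		xt->sgl[i].size = period_len;
		xt->sgl[i].icg = 0;	/* parts are back to back */
	}
	assert(xt->frame_size * period_len == buf_len);
	return xt;
}
```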


> Instead I think you should maybe provide a new flag for the
> unsigned long flags you're already passing in here:
>
> +       struct dma_async_tx_descriptor *(*device_prep_dma_xfer)(
> +               struct dma_chan *chan, struct xfer_template *xt,
> +               unsigned long flags);
>                 ^^^^^^^^^^^^^^^^^^^^^^
This is at your mercy.

Thanks,
Jassi
Jassi Brar July 25, 2011, 1:53 p.m. UTC | #4
On 25 July 2011 17:00, Vinod Koul <vkoul@infradead.org> wrote:
> On Sun, 2011-07-24 at 02:01 +0530, Jassi Brar wrote:
>> This is an attempt to define an api that could be used for doing
>> fancy data transfers like interleaved to contiguous copy and vice-versa.
>> Traditional SG_list based transfers tend to be very inefficient
>> in such cases. Such cases call for some very condensed api to convey
>> pattern of the transfer. This is an attempt at that condensed api.
>>
>> The api supports all 4 variants of scatter-gather and contiguous transfer.
>> Besides, it could easily represent common operations like
>>       device_prep_dma_{cyclic, memset, memcpy}
> and how do you specify if the transfer is cyclic, memset or memcpy...
>> and maybe some more that I am not sure of.
>>
>> Of course, this api too can't help transfers that don't lend to DMA by
>> nature, i.e, scattered tiny read/writes with no periodic pattern.
> For that you use current sg_list :)
_tiny_ is the keyword here :)
SGL isn't an efficient method to convey transfer patterns with lengths
and skips of only a few bytes.
And if they are big enough, we can use this api still.
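
A back-of-the-envelope sketch of that size argument, assuming a flat list needs one entry per chunk per frame while the template describes a single frame once. Both helper names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Entries a flat scatter list needs: one per chunk, for every frame. */
static size_t flat_list_entries(size_t numf, size_t frame_size)
{
	/* Guard the multiplication against overflow. */
	assert(frame_size == 0 || numf <= (size_t)-1 / frame_size);
	return numf * frame_size;
}

/* Entries the template needs: one per chunk of a single frame,
 * however many times that frame repeats. */
static size_t template_entries(size_t frame_size)
{
	return frame_size;
}
```

For a two-chunk frame repeated 1024 times, the flat list grows to 2048 entries while the template still carries two.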

>>
>> Signed-off-by: Jassi Brar <jaswinder.singh@linaro.org>
>> ---
>>  include/linux/dmaengine.h |   74 +++++++++++++++++++++++++++++++++++++++++++++
>>  1 files changed, 74 insertions(+), 0 deletions(-)
>>
>> diff --git a/include/linux/dmaengine.h b/include/linux/dmaengine.h
>> index eee7add..a6cdb57 100644
>> --- a/include/linux/dmaengine.h
>> +++ b/include/linux/dmaengine.h
>> @@ -74,6 +74,76 @@ enum dma_transaction_type {
>>  /* last transaction type for creation of the capabilities mask */
>>  #define DMA_TX_TYPE_END (DMA_CYCLIC + 1)
>>
>> +/**
>> + * Generic Transfer Request
>> + * ------------------------
>> + * A chunk is collection of contiguous bytes to be transfered.
>> + * The gap(in bytes) between two chunks is called inter-chunk-gap(ICG).
>> + * ICGs may or maynot change between chunks.
>> + * A FRAME is the smallest series of contiguous {chunk,icg} pairs,
>> + *  that when repeated an integral number of times, specifies the transfer.
>> + * A transfer template is specification of a Frame, the number of times
>> + *  it is to be repeated and other per-transfer attributes.
>> + *
>> + * Practically, a client driver would have ready a template for each
>> + *  type of transfer it is going to need during its lifetime and
>> + *  set only 'src_start' and 'dst_start' before submitting the requests.
>> + *
>> + *
>> + *  |      Frame-1        |       Frame-2       | ~ |       Frame-'numf'  |
>> + *  |====....==.===...=...|====....==.===...=...| ~ |====....==.===...=...|
>> + *
>> + *    ==  Chunk size
>> + *    ... ICG
>> + */
>> +
>> +/**
>> + * struct data_chunk - Element of scatter-gather list that makes a frame.
>> + * @size: Number of bytes to read from source.
>> + *     size_dst := fn(op, size_src), so doesn't mean much for destination.
>> + * @icg: Number of bytes to jump after last src/dst address of this
>> + *    chunk and before first src/dst address for next chunk.
>> + *    Ignored for dst(assumed 0), if dst_inc is true and dst_sgl is false.
>> + *    Ignored for src(assumed 0), if src_inc is true and src_sgl is false.
>> + */
>> +struct data_chunk {
>> +     size_t size;
>> +     size_t icg;
>> +};
>> +
>> +/**
>> + * struct xfer_template - Template to convey DMAC the transfer pattern
>> + *    and attributes.
>> + * @op: The operation to perform on source data before writing it on
>> + *    to destination address.
> example of ops pls
We can convey almost every kind of offline-'dma' operation via this generic api.

>> + * @src_start: Absolute address of source for the first chunk.
>> + * @dst_start: Absolute address of destination for the first chunk.
> absolute = bus addr, right, if so bus addr would be a better term
Yup

>> + * @src_inc: If the source address increments after reading from it.
>> + * @dst_inc: If the destination address increments after writing to it.
> Is the inc in bytes?
I mean increment by src_width bytes (I removed it and made it an attribute
of the channel rather than of the transfer). That could be conveyed via maybe
SLAVE_CONFIG ... or with a new, different name ;)

>> + * @src_sgl: If the 'icg' of sgl[] applies to Source (scattered read).
>> + *           Otherwise, source is read contiguously (icg ignored).
>> + *           Ignored if src_inc is false.
>> + * @dst_sgl: If the 'icg' of sgl[] applies to Destination (scattered write).
>> + *           Otherwise, destination is filled contiguously (icg ignored).
>> + *           Ignored if dst_inc is false.
>> + * @frm_irq: If the client expects DMAC driver to do callback after each frame.
> why not reuse the callback in descriptor for this?
At the moment, it is for at least conveying cyclic transfers in a
single descriptor.

>> + * @numf: Number of frames in this template.
>> + * @frame_size: Number of chunks in a frame i.e, size of sgl[].
>> + * @sgl: Array of {chunk,icg} pairs that make up a frame.
>> + */
>> +struct xfer_template {
>> +     enum dma_transaction_type op;
>> +     dma_addr_t src_start;
>> +     dma_addr_t dst_start;
>> +     bool src_inc;
>> +     bool dst_inc;
>> +     bool src_sgl;
>> +     bool dst_sgl;
>> +     bool frm_irq;
>> +     size_t numf;
>> +     size_t frame_size;
>> +     struct data_chunk sgl[0];
>> +};
>>
>>  /**
>>   * enum dma_ctrl_flags - DMA flags to augment operation preparation,
>> @@ -430,6 +500,7 @@ struct dma_tx_state {
>>   * @device_prep_dma_cyclic: prepare a cyclic dma operation suitable for audio.
>>   *   The function takes a buffer of size buf_len. The callback function will
>>   *   be called after period_len bytes have been transferred.
>> + * @device_prep_dma_xfer: Transfer expression in 'most' generic way.
>>   * @device_control: manipulate all pending operations on a channel, returns
>>   *   zero or error code
>>   * @device_tx_status: poll for transaction completion, the optional
>> @@ -494,6 +565,9 @@ struct dma_device {
>>       struct dma_async_tx_descriptor *(*device_prep_dma_cyclic)(
>>               struct dma_chan *chan, dma_addr_t buf_addr, size_t buf_len,
>>               size_t period_len, enum dma_data_direction direction);
>> +     struct dma_async_tx_descriptor *(*device_prep_dma_xfer)(
>> +             struct dma_chan *chan, struct xfer_template *xt,
>> +             unsigned long flags);
>>       int (*device_control)(struct dma_chan *chan, enum dma_ctrl_cmd cmd,
>>               unsigned long arg);
>>
>
> Is this coordinated with Sundaram, he was trying similar thing with help
> from Linus W?
>
Nopes.
I replied, to his proposal to add TI specific flag to DMA API, that we need
something generic enough to be usable by other DMACs as well.

For example, types of transfers possible with PL330 are almost unlimited
because PL330 is actually a processor with its own instruction set!
I kept in mind the PL330 capabilities and all the trickery I had to employ
while working on DMA client(multimedia) drivers.

Please look at this api from POV that we are to write an ideal DMA API
from scratch. Later we can decide which parts are too old to be changed
and which could be introduced for new drivers and slowly/safely modified
in extant drivers.

I don't know if it is acceptable (please let me know if it is not), but I am
introducing my bigger picture a stroke at a time, because I am not
really good at communicating, and people tend not to accept even
'independently-valid' patches if they are introduced as part of something
bigger that they probably won't like. Or so I think.

Thanks,
Jassi
Jassi Brar July 25, 2011, 2:17 p.m. UTC | #5
On 25 July 2011 17:00, Vinod Koul <vkoul@infradead.org> wrote:
> On Sun, 2011-07-24 at 02:01 +0530, Jassi Brar wrote:
>> This is an attempt to define an api that could be used for doing
>> fancy data transfers like interleaved to contiguous copy and vice-versa.
>> Traditional SG_list based transfers tend to be very inefficient
>> in such cases. Such cases call for some very condensed api to convey
>> pattern of the transfer. This is an attempt at that condensed api.
>>
>> The api supports all 4 variants of scatter-gather and contiguous transfer.
>> Besides, it could easily represent common operations like
>>       device_prep_dma_{cyclic, memset, memcpy}
> and how do you specify if the transfer is cyclic, memset or memcpy...
Sorry, I missed this in the dense quoting. Maybe it would be a good idea to
trim the context down to what a reply needs?

Cyclic - I have already explained to Linus. Please ask for clarifications.

Memset (with pattern of source bus-width size)
*********
struct xfer_template {
       bool src_inc = false;
       bool dst_inc = true;
       size_t numf = memset_length / pattern_size;
       size_t frame_size = 1;
       struct data_chunk sgl[0]   represents the pattern to be repeated
};

If we need to set memory with a pattern of arbitrary length, we can
add read-width and write-width members (actually, I removed them just
before posting).
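
A small userspace sketch of the memset template above, under stated assumptions: struct memset_template is a cut-down stand-in for the proposed xfer_template with a single chunk, and memset_template_init() and memset_template_run() are hypothetical helpers, the latter playing the role of the DMAC in software.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

struct data_chunk {
	size_t size;
	size_t icg;
};

/* Cut-down stand-in for the proposed template, one chunk per frame. */
struct memset_template {
	bool src_inc;	/* false: re-read the same pattern every frame */
	bool dst_inc;	/* true: advance through the destination */
	size_t numf;
	size_t frame_size;
	struct data_chunk sgl[1];
};

/* Fill in the template. Returns false when the length is not a whole
 * number of patterns, which this form cannot express. */
static bool memset_template_init(struct memset_template *t, size_t len,
				 size_t pattern_size)
{
	assert(t);
	if (pattern_size == 0 || len % pattern_size != 0)
		return false;
	t->src_inc = false;
	t->dst_inc = true;
	t->numf = len / pattern_size;
	t->frame_size = 1;
	t->sgl[0].size = pattern_size;
	t->sgl[0].icg = 0;
	return true;
}

/* Software stand-in for the DMAC walking the template. */
static void memset_template_run(const struct memset_template *t,
				const void *pattern, unsigned char *dst)
{
	for (size_t f = 0; f < t->numf; f++) {
		memcpy(dst, pattern, t->sgl[0].size);
		dst += t->sgl[0].size;	/* dst_inc; the source stays put */
	}
}
```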

Patch

diff --git a/include/linux/dmaengine.h b/include/linux/dmaengine.h
index eee7add..a6cdb57 100644
--- a/include/linux/dmaengine.h
+++ b/include/linux/dmaengine.h
@@ -74,6 +74,76 @@  enum dma_transaction_type {
 /* last transaction type for creation of the capabilities mask */
 #define DMA_TX_TYPE_END (DMA_CYCLIC + 1)
 
+/**
+ * Generic Transfer Request
+ * ------------------------
+ * A chunk is a collection of contiguous bytes to be transferred.
+ * The gap (in bytes) between two chunks is called inter-chunk-gap (ICG).
+ * ICGs may or may not change between chunks.
+ * A FRAME is the smallest series of contiguous {chunk,icg} pairs,
+ *  that when repeated an integral number of times, specifies the transfer.
+ * A transfer template is specification of a Frame, the number of times
+ *  it is to be repeated and other per-transfer attributes.
+ *
+ * Practically, a client driver would have ready a template for each
+ *  type of transfer it is going to need during its lifetime and
+ *  set only 'src_start' and 'dst_start' before submitting the requests.
+ *
+ *
+ *  |      Frame-1        |       Frame-2       | ~ |       Frame-'numf'  |
+ *  |====....==.===...=...|====....==.===...=...| ~ |====....==.===...=...|
+ *
+ *    ==  Chunk size
+ *    ... ICG
+ */
+
+/**
+ * struct data_chunk - Element of scatter-gather list that makes a frame.
+ * @size: Number of bytes to read from source.
+ *	  size_dst := fn(op, size_src), so doesn't mean much for destination.
+ * @icg: Number of bytes to jump after last src/dst address of this
+ *	 chunk and before first src/dst address for next chunk.
+ *	 Ignored for dst(assumed 0), if dst_inc is true and dst_sgl is false.
+ *	 Ignored for src(assumed 0), if src_inc is true and src_sgl is false.
+ */
+struct data_chunk {
+	size_t size;
+	size_t icg;
+};
+
+/**
+ * struct xfer_template - Template to convey DMAC the transfer pattern
+ *	 and attributes.
+ * @op: The operation to perform on source data before writing it on
+ *	 to destination address.
+ * @src_start: Absolute address of source for the first chunk.
+ * @dst_start: Absolute address of destination for the first chunk.
+ * @src_inc: If the source address increments after reading from it.
+ * @dst_inc: If the destination address increments after writing to it.
+ * @src_sgl: If the 'icg' of sgl[] applies to Source (scattered read).
+ *		Otherwise, source is read contiguously (icg ignored).
+ *		Ignored if src_inc is false.
+ * @dst_sgl: If the 'icg' of sgl[] applies to Destination (scattered write).
+ *		Otherwise, destination is filled contiguously (icg ignored).
+ *		Ignored if dst_inc is false.
+ * @frm_irq: If the client expects DMAC driver to do callback after each frame.
+ * @numf: Number of frames in this template.
+ * @frame_size: Number of chunks in a frame i.e, size of sgl[].
+ * @sgl: Array of {chunk,icg} pairs that make up a frame.
+ */
+struct xfer_template {
+	enum dma_transaction_type op;
+	dma_addr_t src_start;
+	dma_addr_t dst_start;
+	bool src_inc;
+	bool dst_inc;
+	bool src_sgl;
+	bool dst_sgl;
+	bool frm_irq;
+	size_t numf;
+	size_t frame_size;
+	struct data_chunk sgl[0];
+};
 
 /**
  * enum dma_ctrl_flags - DMA flags to augment operation preparation,
@@ -430,6 +500,7 @@  struct dma_tx_state {
  * @device_prep_dma_cyclic: prepare a cyclic dma operation suitable for audio.
  *	The function takes a buffer of size buf_len. The callback function will
  *	be called after period_len bytes have been transferred.
+ * @device_prep_dma_xfer: Transfer expression in 'most' generic way.
  * @device_control: manipulate all pending operations on a channel, returns
  *	zero or error code
  * @device_tx_status: poll for transaction completion, the optional
@@ -494,6 +565,9 @@  struct dma_device {
 	struct dma_async_tx_descriptor *(*device_prep_dma_cyclic)(
 		struct dma_chan *chan, dma_addr_t buf_addr, size_t buf_len,
 		size_t period_len, enum dma_data_direction direction);
+	struct dma_async_tx_descriptor *(*device_prep_dma_xfer)(
+		struct dma_chan *chan, struct xfer_template *xt,
+		unsigned long flags);
 	int (*device_control)(struct dma_chan *chan, enum dma_ctrl_cmd cmd,
 		unsigned long arg);