
[v2,1/5] Documentation: dmaengine: pxa-dma design

Message ID 1428781236-25806-2-git-send-email-robert.jarzmik@free.fr (mailing list archive)
State Changes Requested

Commit Message

Robert Jarzmik April 11, 2015, 7:40 p.m. UTC
Document the new design of the pxa dma driver.

Signed-off-by: Robert Jarzmik <robert.jarzmik@free.fr>
---
 Documentation/dmaengine/pxa_dma.txt | 157 ++++++++++++++++++++++++++++++++++++
 1 file changed, 157 insertions(+)
 create mode 100644 Documentation/dmaengine/pxa_dma.txt

Comments

Vinod Koul May 8, 2015, 4:36 a.m. UTC | #1
On Sat, Apr 11, 2015 at 09:40:32PM +0200, Robert Jarzmik wrote:
> Document the new design of the pxa dma driver.
> 
> Signed-off-by: Robert Jarzmik <robert.jarzmik@free.fr>
> ---
>  Documentation/dmaengine/pxa_dma.txt | 157 ++++++++++++++++++++++++++++++++++++
>  1 file changed, 157 insertions(+)
>  create mode 100644 Documentation/dmaengine/pxa_dma.txt
> 
> diff --git a/Documentation/dmaengine/pxa_dma.txt b/Documentation/dmaengine/pxa_dma.txt
> new file mode 100644
> index 0000000..63db9fe
> --- /dev/null
> +++ b/Documentation/dmaengine/pxa_dma.txt
> @@ -0,0 +1,157 @@
> +PXA/MMP - DMA Slave controller
> +==============================
> +
> +Constraints
> +-----------
> +  a) Transfers hot queuing
> +     A driver submitting a transfer and issuing it should be granted the transfer
> +     is queued even on a running DMA channel.
this is bit confusing, esp latter part.. do you mean "A driver submitting a
transfer and issuing it should be granted the transfer queue even on a
running DMA channel" ??

> +     This implies that the queuing doesn't wait for the previous transfer end,
> +     and that the descriptor chaining is not only done in the irq/tasklet code
> +     triggered by the end of the transfer.
how is it differenat than current dmaengine semantics where you say
issue_pending() is invoked when current transfer finished? Here is you have
to do descriptor chaining so bit it.
> +
> +  b) All transfers having asked for confirmation should be signaled
> +     Any issued transfer with DMA_PREP_INTERRUPT should trigger a callback call.
> +     This implies that even if an irq/tasklet is triggered by end of tx1, but
> +     at the time of irq/dma tx2 is already finished, tx1->complete() and
> +     tx2->complete() should be called.
> +
> +  c) Channel residue calculation
> +     A channel should be able to report how much advanced is a transfer. The
in a							    ^^^^
> +     granularity is still descriptor based.
This is not pxa specific

> +
> +  d) Channel running state
> +     A driver should be able to query if a channel is running or not. For the
> +     multimedia case, such as video capture, if a transfer is submitted and then
> +     a check of the DMA channel reports a "stopped channel", the transfer should
> +     not be issued until the next "start of frame interrupt", hence the need to
> +     know if a channel is in running or stopped state.
How do you query that?

> +
> +  e) Bandwidth guarantee
> +     The PXA architecture has 4 levels of DMAs priorities : high, normal, low.
> +     The high prorities get twice as much bandwidth as the normal, which get twice
> +     as much as the low priorities.
> +     A driver should be able to request a priority, especially the real-time
> +     ones such as pxa_camera with (big) throughputs.
and how..?

> +
> +  f) Transfer reusability
> +     An issued and finished transfer should be "reusable". The choice of
> +     "DMA_CTRL_ACK" should be left to the client, not the dma driver.
again how is this pxa specific, if not documented we should move this to
dmaengine documentation
Robert Jarzmik May 8, 2015, 12:52 p.m. UTC | #2
Vinod Koul <vinod.koul@intel.com> writes:

> On Sat, Apr 11, 2015 at 09:40:32PM +0200, Robert Jarzmik wrote:
>> Document the new design of the pxa dma driver.
>> +  a) Transfers hot queuing
>> +     A driver submitting a transfer and issuing it should be granted the transfer
>> +     is queued even on a running DMA channel.
> this is bit confusing, esp latter part.. do you mean "A driver submitting a
> transfer and issuing it should be granted the transfer queue even on a
> running DMA channel" ??

Euh no, I meant that a transfer which is submitted and issued on a _phy_
doesn't wait for a _phy_ to stop and restart, but is submitted on a "running
channel". The other drivers, especially mmp_pdma waited for the phy to stop
before relaunching a new transfer.

I don't have a clear idea on a better wording yet ...

>> +     This implies that the queuing doesn't wait for the previous transfer end,
>> +     and that the descriptor chaining is not only done in the irq/tasklet code
>> +     triggered by the end of the transfer.
> how is it differenat than current dmaengine semantics where you say
> issue_pending() is invoked when current transfer finished? Here is you have
> to do descriptor chaining so bit it.
Your sentence is a bit difficult for me to understand.

>> +  c) Channel residue calculation
>> +     A channel should be able to report how much advanced is a transfer. The
> in a							    ^^^^
For v3.

>> +     granularity is still descriptor based.
> This is not pxa specfic
True. Do you want me to remove the (c) from the document ?

>> +
>> +  d) Channel running state
>> +     A driver should be able to query if a channel is running or not. For the
>> +     multimedia case, such as video capture, if a transfer is submitted and then
>> +     a check of the DMA channel reports a "stopped channel", the transfer should
>> +     not be issued until the next "start of frame interrupt", hence the need to
>> +     know if a channel is in running or stopped state.
> How do you query that?
With dma_async_is_tx_complete() giving :
 - dma_cookie_t last_submitted
 - dma_cookie_t last_issued

The channel is still running if (last_submitted < last_issued).

>
>> +
>> +  e) Bandwidth guarantee
>> +     The PXA architecture has 4 levels of DMAs priorities : high, normal, low.
>> +     The high prorities get twice as much bandwidth as the normal, which get twice
>> +     as much as the low priorities.
>> +     A driver should be able to request a priority, especially the real-time
>> +     ones such as pxa_camera with (big) throughputs.
> and how..?
By passing this information :
 - in a devicetree environment, check pxad_dma_xlate()
 - in a platform device environment, check pxad_filter_fn()

>> +  f) Transfer reusability
>> +     An issued and finished transfer should be "reusable". The choice of
>> +     "DMA_CTRL_ACK" should be left to the client, not the dma driver.
> again how is this pxa specfic, if not documented we should move this to
> dmaengine documentation

Yes, I agree. I should move this to dmaengine slave documentation, in
Documentation/dmaengine/provider.txt (in the Misc notes section). Do you want me
to submit a patch to change the "Undocumented feature" into a properly
documented feature ?

Cheers.
Vinod Koul May 12, 2015, 10:12 a.m. UTC | #3
On Fri, May 08, 2015 at 02:52:46PM +0200, Robert Jarzmik wrote:
> Vinod Koul <vinod.koul@intel.com> writes:
> 
> > On Sat, Apr 11, 2015 at 09:40:32PM +0200, Robert Jarzmik wrote:
> >> Document the new design of the pxa dma driver.
> >> +  a) Transfers hot queuing
> >> +     A driver submitting a transfer and issuing it should be granted the transfer
> >> +     is queued even on a running DMA channel.
> > this is bit confusing, esp latter part.. do you mean "A driver submitting a
> > transfer and issuing it should be granted the transfer queue even on a
> > running DMA channel" ??
> 
> Euh no, I meant that a transfer which is submitted and issued on a _phy_
> doesn't wait for a _phy_ to stop and restart, but is submitted on a "running
> channel". The other drivers, especially mmp_pdma waited for the phy to stop
> before relaunching a new transfer.
> 
> I don't have a clear idea on a better wording yet ...
Ah okay, with that explanation it helps, can you add that to
comments/documentation

> 
> >> +     This implies that the queuing doesn't wait for the previous transfer end,
> >> +     and that the descriptor chaining is not only done in the irq/tasklet code
> >> +     triggered by the end of the transfer.
> > how is it differenat than current dmaengine semantics where you say
> > issue_pending() is invoked when current transfer finished? Here is you have
> > to do descriptor chaining so bit it.
> Your sentence is a bit difficult for me to understand.
Sorry for typo, meant:
how is it different than current dmaengine semantics where you say
issue_pending() is invoked when current transfer finishes? Here you are
doing descriptor chaining, so be it.
Ideally dmaengine driver should keep submitting txns and opportunistically
based on HW optimize it. All this is transparent to clients, they submit and
wait for callback.

> >> +     granularity is still descriptor based.
> > This is not pxa specfic
> True. Do you want me to remove the (c) from the document ?
yes

> >> +  f) Transfer reusability
> >> +     An issued and finished transfer should be "reusable". The choice of
> >> +     "DMA_CTRL_ACK" should be left to the client, not the dma driver.
> > again how is this pxa specfic, if not documented we should move this to
> > dmaengine documentation
> 
> Yes, I agree. I should move this to dmaengine slave documentation, in
> Documentation/dmaengine/provider.txt (in the Misc notes section). Do you want me
> to submit a patch to change the "Undocumented feature" into a properly
> documented feature ?
That would be great
Robert Jarzmik May 12, 2015, 7:13 p.m. UTC | #4
Vinod Koul <vinod.koul@intel.com> writes:

> On Fri, May 08, 2015 at 02:52:46PM +0200, Robert Jarzmik wrote:
>> Vinod Koul <vinod.koul@intel.com> writes:
>> 
>> Euh no, I meant that a transfer which is submitted and issued on a _phy_
>> doesn't wait for a _phy_ to stop and restart, but is submitted on a "running
>> channel". The other drivers, especially mmp_pdma waited for the phy to stop
>> before relaunching a new transfer.
>> 
>> I don't have a clear idea on a better wording yet ...
> Ah okay, with that explanation it helps, can you add that to
> comments/documentation
Sure, for v3.

>> >> +     This implies that the queuing doesn't wait for the previous transfer end,
>> >> +     and that the descriptor chaining is not only done in the irq/tasklet code
>> >> +     triggered by the end of the transfer.
>> > how is it differenat than current dmaengine semantics where you say
>> > issue_pending() is invoked when current transfer finished? Here is you have
>> > to do descriptor chaining so bit it.
>> Your sentence is a bit difficult for me to understand.
> Sorry for typo, meant:
> how is it different than current dmaengine semantics where you say
> issue_pending() is invoked when current transfer finishes? Here you are
> doing descriptor chaining, so be it.

It is not "different" from dmaengine semantics. It's an implementation choice
which is not strictly required by dmaengine, and therefore a requirement on top
of what dmaengine offers.
Dmaengine requires a transfer to be submitted, and provides issue_pending() to
guarantee the transfer will be executed. The dmaengine drivers can choose to
either queue the transfer when the previous one's completion is notified
(interrupt), or hot-queue the transfer while the channel is running.

This constraint documents the fact that this specific dmaengine driver's
implementation chose to hot-chain transfers whenever possible.
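
A minimal user-space sketch of the hot-queue option, for illustration only. It
is not the pxa_dma code: every chain_* name and structure below is
hypothetical, and the real driver rewrites hardware descriptors rather than C
pointers. It only shows that submitting turns the current finisher into a
linker, while starting the channel stays with the issue_pending step.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model: a transfer ends with a finisher (stop) or a linker (jump). */
enum chain_tail { TAIL_FINISHER, TAIL_LINKER };

struct chain_tx {
	int id;
	enum chain_tail tail;
	struct chain_tx *next;	/* meaningful only when tail == TAIL_LINKER */
};

struct chain_chan {
	struct chain_tx *first;	/* head of the chain loaded on the channel */
	struct chain_tx *last;	/* transfer currently terminating the chain */
	bool running;		/* only changed by chain_issue_pending() */
};

/*
 * Chain a new transfer behind the current tail, whether or not the channel
 * is running. The channel state is left untouched, so a stopped channel
 * stays stopped.
 */
static void chain_submit(struct chain_chan *c, struct chain_tx *tx)
{
	tx->tail = TAIL_FINISHER;
	tx->next = NULL;
	if (c->last) {
		c->last->next = tx;
		c->last->tail = TAIL_LINKER;
	} else {
		c->first = tx;
	}
	c->last = tx;
}

/* Starting the channel remains the job of the issue_pending step. */
static void chain_issue_pending(struct chain_chan *c)
{
	if (c->first)
		c->running = true;
}

/* Count the transfers the hardware would walk before reaching a finisher. */
static int chain_length(const struct chain_chan *c)
{
	const struct chain_tx *tx;
	int n = 0;

	for (tx = c->first; tx; tx = tx->next) {
		n++;
		if (tx->tail == TAIL_FINISHER)
			break;
	}
	return n;
}
```

Submitting a third buffer after chain_issue_pending() extends the running
chain to three transfers without any stop/restart of the modelled channel.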

> Ideally dmaengine driver should keep submitting txns and opportunistically
> based on HW optimize it. All this is transparent to clients, they submit and
> wait for callback.
True. Yet this is not a requirement, it's a "good design" behavior. I wonder how
many dmaengine drivers are behaving in an optimized way ...

>> >> +     granularity is still descriptor based.
>> > This is not pxa specfic
>> True. Do you want me to remove the (c) from the document ?
> yes
Ok, for v3.

>> >> +  f) Transfer reusability
>> >> +     An issued and finished transfer should be "reusable". The choice of
>> >> +     "DMA_CTRL_ACK" should be left to the client, not the dma driver.
>> > again how is this pxa specfic, if not documented we should move this to
>> > dmaengine documentation
>> 
>> Yes, I agree. I should move this to dmaengine slave documentation, in
>> Documentation/dmaengine/provider.txt (in the Misc notes section). Do you want me
>> to submit a patch to change the "Undocumented feature" into a properly
>> documented feature ?
> That would be great
On my way, for v3.

Cheers.

Patch

diff --git a/Documentation/dmaengine/pxa_dma.txt b/Documentation/dmaengine/pxa_dma.txt
new file mode 100644
index 0000000..63db9fe
--- /dev/null
+++ b/Documentation/dmaengine/pxa_dma.txt
@@ -0,0 +1,157 @@ 
+PXA/MMP - DMA Slave controller
+==============================
+
+Constraints
+-----------
+  a) Transfers hot queuing
+     A driver submitting a transfer and issuing it should be guaranteed that
+     the transfer is queued even on a running DMA channel.
+     This implies that the queuing doesn't wait for the end of the previous
+     transfer, and that the descriptor chaining is not only done in the
+     irq/tasklet code triggered by the end of the transfer.
+
+  b) All transfers having asked for confirmation should be signaled
+     Any issued transfer with DMA_PREP_INTERRUPT should trigger a callback call.
+     This implies that even if an irq/tasklet is triggered by the end of tx1,
+     but tx2 has already finished by the time the irq/tasklet runs, both
+     tx1->complete() and tx2->complete() should be called.
+
+  c) Channel residue calculation
+     A channel should be able to report how far a transfer has advanced. The
+     granularity is still descriptor based.
+
+  d) Channel running state
+     A driver should be able to query if a channel is running or not. For the
+     multimedia case, such as video capture, if a transfer is submitted and then
+     a check of the DMA channel reports a "stopped channel", the transfer should
+     not be issued until the next "start of frame interrupt", hence the need to
+     know if a channel is in running or stopped state.
+
+  e) Bandwidth guarantee
+     The PXA architecture has 4 levels of DMA priorities : high, normal, low.
+     The high priorities get twice as much bandwidth as the normal, which get twice
+     as much as the low priorities.
+     A driver should be able to request a priority, especially the real-time
+     ones such as pxa_camera with (big) throughputs.
+
+  f) Transfer reusability
+     An issued and finished transfer should be "reusable". The choice of
+     "DMA_CTRL_ACK" should be left to the client, not the dma driver.
+
+Design
+------
+  a) Virtual channels
+     Same concept as in the sa11x0 driver, i.e. a driver is assigned a "virtual
+     channel" linked to the requestor line, and the physical DMA channel is
+     assigned on the fly when the transfer is issued.
+
+  b) Transfer anatomy for a scatter-gather transfer
+     +------------+-----+---------------+----------------+-----------------+
+     | desc-sg[0] | ... | desc-sg[last] | status updater | finisher/linker |
+     +------------+-----+---------------+----------------+-----------------+
+
+     This structure is pointed to by dma->sg_cpu.
+     The descriptors are used as follows :
+      - desc-sg[i]: i-th descriptor, transferring the i-th sg
+        element to the video buffer scatter gather
+      - status updater
+        Transfers a single u32 to a well-known dma coherent memory location
+        to leave a trace that this transfer is done. This location is unique
+        per physical channel, meaning that a read of this value will tell
+        which is the last finished transfer at that point in time.
+      - finisher: has ddadr=DADDR_STOP, dcmd=ENDIRQEN
+      - linker: has ddadr= desc-sg[0] of next transfer, dcmd=0
+
+  b) Transfers hot-chaining
+     Suppose the running chain is :
+         Buffer 1         Buffer 2
+     +---------+----+---+  +----+----+----+---+
+     | d0 | .. | dN | l |  | d0 | .. | dN | f |
+     +---------+----+-|-+  ^----+----+----+---+
+                      |    |
+                      +----+
+
+     After a call to dmaengine_submit(b3), the chain will look like :
+          Buffer 1              Buffer 2             Buffer 3
+     +---------+----+---+  +----+----+----+---+  +----+----+----+---+
+     | d0 | .. | dN | l |  | d0 | .. | dN | l |  | d0 | .. | dN | f |
+     +---------+----+-|-+  ^----+----+----+-|-+  ^----+----+----+---+
+                      |    |                |    |
+                      +----+                +----+
+                                           new_link
+
+     If while new_link was created the DMA channel stopped, it is _not_
+     restarted. Hot-chaining doesn't break the assumption that
+     dma_async_issue_pending() is to be used to ensure the transfer is actually started.
+
+     One exception to this rule :
+       - if Buffer1 and Buffer2 had all their addresses 8 bytes aligned
+       - and if Buffer3 has at least one address not 4 bytes aligned
+       - then hot-chaining cannot happen, as the channel must be stopped, the
+         "align bit" must be set, and the channel restarted. As a consequence,
+         tx_submit() will queue such a transfer on the submitted queue; this
+         case only arises if the DMA is already running in aligned mode.
+
+  c) Transfers completion updater
+     Each time a transfer is completed on a channel, an interrupt might be
+     generated or not, up to the client's request. But in each case, the last
+     descriptor of a transfer, the "status updater", will write the latest
+     transfer being completed into the physical channel's completion mark.
+
+     This will speed up residue calculation, for large transfers such as video
+     buffers which hold around 6k descriptors or more. It also makes it
+     possible to find the latest completed transfer in a running DMA chain
+     without taking any lock.
+
+  d) Transfers completion, irq and tasklet
+     When a transfer flagged as "DMA_PREP_INTERRUPT" is finished, the dma irq
+     is raised. Upon this interrupt, a tasklet is scheduled for the physical
+     channel.
+     The tasklet is responsible for :
+      - reading the physical channel last updater mark
+      - calling all the transfer callbacks of finished transfers, based on
+        that mark, and each transfer's flags.
+     If a transfer is completed while this handling is done, a dma irq will
+     be raised, and the tasklet will be scheduled once again, having a new
+     updater mark.
+
+  e) Residue
+     Residue granularity will be descriptor based. The issued but not completed
+     transfers will be scanned for all of their descriptors against the
+     currently running descriptor.
+
+  f) Most complicated case of driver's tx queues
+     The trickiest situation is when :
+       - there are transfers not yet "acked" (tx0)
+       - a driver submitted an aligned tx1, not chained
+       - a driver submitted an aligned tx2 => tx2 is cold chained to tx1
+       - a driver issued tx1+tx2 => channel is running in aligned mode
+       - a driver submitted an aligned tx3 => tx3 is hot-chained
+       - a driver submitted an unaligned tx4 => tx4 is put in submitted queue,
+         not chained
+       - a driver issued tx4 => tx4 is put in issued queue, not chained
+       - a driver submitted an aligned tx5 => tx5 is put in submitted queue, not
+         chained
+       - a driver submitted an aligned tx6 => tx6 is put in submitted queue,
+         cold chained to tx5
+
+     This translates into (after tx4 is issued) :
+       - issued queue
+     +-----+ +-----+ +-----+ +-----+
+     | tx1 | | tx2 | | tx3 | | tx4 |
+     +---|-+ ^---|-+ ^-----+ +-----+
+         |   |   |   |
+         +---+   +---+
+       - submitted queue
+     +-----+ +-----+
+     | tx5 | | tx6 |
+     +---|-+ ^-----+
+         |   |
+         +---+
+       - completed queue : empty
+       - allocated queue : tx0
+
+     It should be noted that after tx3 is completed, the channel is stopped, and
+     restarted in "unaligned mode" to handle tx4.
+
+Author: Robert Jarzmik <robert.jarzmik@free.fr>
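
As an illustration of design points c) and d) in the patch above (completion
updater, then irq and tasklet), here is a minimal user-space sketch. It is not
the driver's implementation: the mark_* names, the MARK_PREP_INTERRUPT flag
(a stand-in for DMA_PREP_INTERRUPT) and the plain integer cookies are all
assumptions made for the example, and cookie wrap-around is ignored.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MARK_PREP_INTERRUPT 0x1u	/* hypothetical stand-in for DMA_PREP_INTERRUPT */

struct mark_tx {
	int32_t cookie;			/* increases with submission order */
	unsigned int flags;
	bool done;
	void (*callback)(void *arg);
	void *arg;
};

/* One completion mark per physical channel, written by the "status updater". */
struct mark_phy {
	volatile int32_t updater_mark;
};

/* Example callback: counts how many completions were signalled. */
static void mark_count_cb(void *arg)
{
	++*(int *)arg;
}

/*
 * Tasklet body: read the mark once, without a lock, then complete every
 * issued transfer whose cookie is not beyond it. Callbacks run only for
 * transfers that asked for an interrupt. Returns the number of callbacks.
 */
static int mark_tasklet(struct mark_phy *phy, struct mark_tx *issued, size_t n)
{
	int32_t mark = phy->updater_mark;
	int called = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		struct mark_tx *tx = &issued[i];

		if (tx->done || tx->cookie > mark)
			continue;
		tx->done = true;
		if ((tx->flags & MARK_PREP_INTERRUPT) && tx->callback) {
			tx->callback(tx->arg);
			called++;
		}
	}
	return called;
}
```

If tx1 and tx2 have both finished by the time the tasklet runs, a single pass
signals both, which is the behaviour that constraint b) asks for.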