diff mbox

[v2] dma: Add Xilinx AXI Video Direct Memory Access Engine driver support

Message ID 1390409565-4200-2-git-send-email-sthokal@xilinx.com (mailing list archive)
State Rejected
Delegated to: Vinod Koul
Headers show

Commit Message

Srikanth Thokala Jan. 22, 2014, 4:52 p.m. UTC
This is the driver for the AXI Video Direct Memory Access (AXI
VDMA) core, which is a soft Xilinx IP core that provides high-
bandwidth direct memory access between memory and AXI4-Stream
type video target peripherals. The core provides efficient two
dimensional DMA operations with independent asynchronous read
and write channel operation.

This module works on Zynq (ARM Based SoC) and Microblaze platforms.

Signed-off-by: Srikanth Thokala <sthokal@xilinx.com>
---
NOTE:
1. Created a separate directory 'dma/xilinx' as Xilinx has two more
   DMA IPs and we are also planning to upstream these drivers soon.
2. Rebased on v3.13.0-rc8

Changes in v2:
- Removed DMA Test client module from the patchset as suggested
  by Andy Shevchenko
- Removed device-id DT property, as suggested by Arnd Bergmann
- Properly documented DT bindings as suggested by Arnd Bergmann
- Returning with error, if registration of DMA to node fails
- Fixed typo errors
- Used BIT() macro at applicable places
- Added missing header file to the patchset
- Changed copyright year to include 2014
---
 .../devicetree/bindings/dma/xilinx/xilinx_vdma.txt |   75 +
 drivers/dma/Kconfig                                |   14 +
 drivers/dma/Makefile                               |    1 +
 drivers/dma/xilinx/Makefile                        |    1 +
 drivers/dma/xilinx/xilinx_vdma.c                   | 1486 ++++++++++++++++++++
 include/linux/amba/xilinx_dma.h                    |   50 +
 6 files changed, 1627 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/dma/xilinx/xilinx_vdma.txt
 create mode 100644 drivers/dma/xilinx/Makefile
 create mode 100644 drivers/dma/xilinx/xilinx_vdma.c
 create mode 100644 include/linux/amba/xilinx_dma.h

Comments

Levente Kurusa Jan. 22, 2014, 9:30 p.m. UTC | #1
Hello,

2014/1/22 Srikanth Thokala <sthokal@xilinx.com>:
> This is the driver for the AXI Video Direct Memory Access (AXI
> VDMA) core, which is a soft Xilinx IP core that provides high-
> bandwidth direct memory access between memory and AXI4-Stream
> type video target peripherals. The core provides efficient two
> dimensional DMA operations with independent asynchronous read
> and write channel operation.
>
> This module works on Zynq (ARM Based SoC) and Microblaze platforms.
>
> Signed-off-by: Srikanth Thokala <sthokal@xilinx.com>
> ---

Another two remarks, after you fixed them ( or not :-) )
you can have my:

Reviewed-by: Levente Kurusa <levex@linux.com>

Oh, and next time please if you post a patch that fixes something I pointed out,
CC me as I had a hard time finding this patch, thanks. :-)

> NOTE:
> 1. Created a separate directory 'dma/xilinx' as Xilinx has two more
>    DMA IPs and we are also planning to upstream these drivers soon.
> 2. Rebased on v3.13.0-rc8
>
> Changes in v2:
> - Removed DMA Test client module from the patchset as suggested
>   by Andy Shevchenko
> - Removed device-id DT property, as suggested by Arnd Bergmann
> - Properly documented DT bindings as suggested by Arnd Bergmann
> - Returning with error, if registration of DMA to node fails
> - Fixed typo errors
> - Used BIT() macro at applicable places
> - Added missing header file to the patchset
> - Changed copyright year to include 2014
> ---
>  .../devicetree/bindings/dma/xilinx/xilinx_vdma.txt |   75 +
>  drivers/dma/Kconfig                                |   14 +
>  drivers/dma/Makefile                               |    1 +
>  drivers/dma/xilinx/Makefile                        |    1 +
>  drivers/dma/xilinx/xilinx_vdma.c                   | 1486 ++++++++++++++++++++
>  include/linux/amba/xilinx_dma.h                    |   50 +
>  6 files changed, 1627 insertions(+)
>  create mode 100644 Documentation/devicetree/bindings/dma/xilinx/xilinx_vdma.txt
>  create mode 100644 drivers/dma/xilinx/Makefile
>  create mode 100644 drivers/dma/xilinx/xilinx_vdma.c
>  create mode 100644 include/linux/amba/xilinx_dma.h
>
> diff --git a/Documentation/devicetree/bindings/dma/xilinx/xilinx_vdma.txt b/Documentation/devicetree/bindings/dma/xilinx/xilinx_vdma.txt
> new file mode 100644
> index 0000000..ab8be1a
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/dma/xilinx/xilinx_vdma.txt
> @@ -0,0 +1,75 @@
> +Xilinx AXI VDMA engine, it does transfers between memory and video devices.
> +It can be configured to have one channel or two channels. If configured
> +as two channels, one is to transmit to the video device and another is
> +to receive from the video device.
> +
> +Required properties:
> +- compatible: Should be "xlnx,axi-vdma-1.00.a"
> +- #dma-cells: Should be <1>, see "dmas" property below
> +- reg: Should contain VDMA registers location and length.
> +- xlnx,num-fstores: Should be the number of framebuffers as configured in h/w.
> +- dma-channel child node: Should have atleast one channel and can have upto
> +       two channels per device. This node specifies the properties of each
> +       DMA channel (see child node properties below).
> +
> +Optional properties:
> +- xlnx,include-sg: Tells whether configured for Scatter-mode in
> +       the hardware.
> [...]
> +
> +/**
> + * xilinx_vdma_is_running - Check if VDMA channel is running
> + * @chan: Driver specific VDMA channel
> + *
> + * Return: '1' if running, '0' if not.
> + */
> +static int xilinx_vdma_is_running(struct xilinx_vdma_chan *chan)
> +{
> +       return !(vdma_ctrl_read(chan, XILINX_VDMA_REG_DMASR) &
> +                XILINX_VDMA_DMASR_HALTED) &&
> +               (vdma_ctrl_read(chan, XILINX_VDMA_REG_DMACR) &
> +                XILINX_VDMA_DMACR_RUNSTOP);
> +}
> +
> +/**
> + * xilinx_vdma_is_idle - Check if VDMA channel is idle
> + * @chan: Driver specific VDMA channel
> + *
> + * Return: '1' if idle, '0' if not.
> + */
> +static int xilinx_vdma_is_idle(struct xilinx_vdma_chan *chan)
> +{
> +       return vdma_ctrl_read(chan, XILINX_VDMA_REG_DMASR) &
> +               XILINX_VDMA_DMASR_IDLE;
> +}
> +
> +/**
> + * xilinx_vdma_halt - Halt VDMA channel
> + * @chan: Driver specific VDMA channel
> + */
> +static void xilinx_vdma_halt(struct xilinx_vdma_chan *chan)
> +{
> +       int loop = XILINX_VDMA_LOOP_COUNT + 1;
> +
> +       vdma_ctrl_clr(chan, XILINX_VDMA_REG_DMACR, XILINX_VDMA_DMACR_RUNSTOP);
> +
> +       /* Wait for the hardware to halt */
> +       while (loop--)
> +               if (vdma_ctrl_read(chan, XILINX_VDMA_REG_DMASR) &
> +                   XILINX_VDMA_DMASR_HALTED)
> +                       break;
> +
> +       if (!loop) {
> +               dev_err(chan->dev, "Cannot stop channel %p: %x\n",
> +                       chan, vdma_ctrl_read(chan, XILINX_VDMA_REG_DMASR));
> +               chan->err = true;
> +       }
> +
> +       return;
> +}
> +
> +/**
> + * xilinx_vdma_start - Start VDMA channel
> + * @chan: Driver specific VDMA channel
> + */
> +static void xilinx_vdma_start(struct xilinx_vdma_chan *chan)
> +{
> +       int loop = XILINX_VDMA_LOOP_COUNT + 1;
> +
> +       vdma_ctrl_set(chan, XILINX_VDMA_REG_DMACR, XILINX_VDMA_DMACR_RUNSTOP);
> +
> +       /* Wait for the hardware to start */
> +       while (loop--)
> +               if (!(vdma_ctrl_read(chan, XILINX_VDMA_REG_DMASR) &
> +                     XILINX_VDMA_DMASR_HALTED))
> +                       break;
> +
> +       if (!loop) {
> +               dev_err(chan->dev, "Cannot start channel %p: %x\n",
> +                       chan, vdma_ctrl_read(chan, XILINX_VDMA_REG_DMASR));
> +
> +               chan->err = true;
> +       }
> +
> +       return;
> +}
> +
> +/**
> + * xilinx_vdma_start_transfer - Starts VDMA transfer
> + * @chan: Driver specific channel struct pointer
> + */
> +static void xilinx_vdma_start_transfer(struct xilinx_vdma_chan *chan)
> +{
> +       struct xilinx_vdma_config *config = &chan->config;
> +       struct xilinx_vdma_tx_descriptor *desc;
> +       unsigned long flags;
> +       u32 reg;
> +       struct xilinx_vdma_tx_segment *head, *tail = NULL;
> +
> +       if (chan->err)
> +               return;
> +
> +       spin_lock_irqsave(&chan->lock, flags);
> +
> +       /* There's already an active descriptor, bail out. */
> +       if (chan->active_desc)
> +               goto out_unlock;
> +
> +       if (list_empty(&chan->pending_list))
> +               goto out_unlock;
> +
> +       desc = list_first_entry(&chan->pending_list,
> +                               struct xilinx_vdma_tx_descriptor, node);
> +
> +       /* If it is SG mode and hardware is busy, cannot submit */
> +       if (chan->has_sg && xilinx_vdma_is_running(chan) &&
> +           !xilinx_vdma_is_idle(chan)) {
> +               dev_dbg(chan->dev, "DMA controller still busy\n");
> +               goto out_unlock;
> +       }
> +
> +       if (chan->err)
> +               goto out_unlock;
> +
> +       /*
> +        * If hardware is idle, then all descriptors on the running lists are
> +        * done, start new transfers
> +        */
> +       if (chan->has_sg) {
> +               head = list_first_entry(&desc->segments,
> +                                       struct xilinx_vdma_tx_segment, node);
> +               tail = list_entry(desc->segments.prev,
> +                                 struct xilinx_vdma_tx_segment, node);
> +
> +               vdma_ctrl_write(chan, XILINX_VDMA_REG_CURDESC, head->phys);
> +       }
> +
> +       /* Configure the hardware using info in the config structure */
> +       reg = vdma_ctrl_read(chan, XILINX_VDMA_REG_DMACR);
> +
> +       if (config->frm_cnt_en)
> +               reg |= XILINX_VDMA_DMACR_FRAMECNT_EN;
> +       else
> +               reg &= ~XILINX_VDMA_DMACR_FRAMECNT_EN;
> +
> +       /*
> +        * With SG, start with circular mode, so that BDs can be fetched.
> +        * In direct register mode, if not parking, enable circular mode
> +        */
> +       if (chan->has_sg || !config->park)
> +               reg |= XILINX_VDMA_DMACR_CIRC_EN;
> +
> +       if (config->park)
> +               reg &= ~XILINX_VDMA_DMACR_CIRC_EN;
> +
> +       vdma_ctrl_write(chan, XILINX_VDMA_REG_DMACR, reg);
> +
> +       if (config->park && (config->park_frm >= 0) &&
> +                       (config->park_frm < chan->num_frms)) {
> +               if (chan->direction == DMA_MEM_TO_DEV)
> +                       vdma_write(chan, XILINX_VDMA_REG_PARK_PTR,
> +                               config->park_frm <<
> +                                       XILINX_VDMA_PARK_PTR_RD_REF_SHIFT);
> +               else
> +                       vdma_write(chan, XILINX_VDMA_REG_PARK_PTR,
> +                               config->park_frm <<
> +                                       XILINX_VDMA_PARK_PTR_WR_REF_SHIFT);
> +       }
> +
> +       /* Start the hardware */
> +       xilinx_vdma_start(chan);
> +
> +       if (chan->err)
> +               goto out_unlock;
> +
> +       /* Start the transfer */
> +       if (chan->has_sg) {
> +               vdma_ctrl_write(chan, XILINX_VDMA_REG_TAILDESC, tail->phys);
> +       } else {
> +               struct xilinx_vdma_tx_segment *segment;
> +               int i = 0;
> +
> +               list_for_each_entry(segment, &desc->segments, node)
> +                       vdma_desc_write(chan,
> +                                       XILINX_VDMA_REG_START_ADDRESS(i++),
> +                                       segment->hw.buf_addr);
> +
> +               vdma_desc_write(chan, XILINX_VDMA_REG_HSIZE, config->hsize);
> +               vdma_desc_write(chan, XILINX_VDMA_REG_FRMDLY_STRIDE,
> +                               (config->frm_dly <<
> +                                XILINX_VDMA_FRMDLY_STRIDE_FRMDLY_SHIFT) |
> +                               (config->stride <<
> +                                XILINX_VDMA_FRMDLY_STRIDE_STRIDE_SHIFT));
> +               vdma_desc_write(chan, XILINX_VDMA_REG_VSIZE, config->vsize);
> +       }
> +
> +       list_del(&desc->node);
> +       chan->active_desc = desc;
> +
> +out_unlock:
> +       spin_unlock_irqrestore(&chan->lock, flags);
> +}
> +
> +/**
> + * xilinx_vdma_issue_pending - Issue pending transactions
> + * @dchan: DMA channel
> + */
> +static void xilinx_vdma_issue_pending(struct dma_chan *dchan)
> +{
> +       struct xilinx_vdma_chan *chan = to_xilinx_chan(dchan);
> +
> +       xilinx_vdma_start_transfer(chan);
> +}
> +
> +/**
> + * xilinx_vdma_complete_descriptor - Mark the active descriptor as complete
> + * @chan : xilinx DMA channel
> + *
> + * CONTEXT: hardirq
> + */
> +static void xilinx_vdma_complete_descriptor(struct xilinx_vdma_chan *chan)
> +{
> +       struct xilinx_vdma_tx_descriptor *desc;
> +       unsigned long flags;
> +
> +       spin_lock_irqsave(&chan->lock, flags);
> +
> +       desc = chan->active_desc;
> +       if (!desc) {
> +               dev_dbg(chan->dev, "no running descriptors\n");
> +               goto out_unlock;
> +       }
> +
> +       list_add_tail(&desc->node, &chan->done_list);
> +
> +       /* Update the completed cookie and reset the active descriptor. */
> +       chan->completed_cookie = desc->async_tx.cookie;
> +       chan->active_desc = NULL;
> +
> +out_unlock:
> +       spin_unlock_irqrestore(&chan->lock, flags);
> +}
> +
> +/**
> + * xilinx_vdma_reset - Reset VDMA channel
> + * @chan: Driver specific VDMA channel
> + *
> + * Return: '0' on success and failure value on error
> + */
> +static int xilinx_vdma_reset(struct xilinx_vdma_chan *chan)
> +{
> +       int loop = XILINX_VDMA_LOOP_COUNT + 1;
> +       u32 tmp;
> +
> +       vdma_ctrl_set(chan, XILINX_VDMA_REG_DMACR, XILINX_VDMA_DMACR_RESET);
> +
> +       tmp = vdma_ctrl_read(chan, XILINX_VDMA_REG_DMACR) &
> +               XILINX_VDMA_DMACR_RESET;
> +
> +       /* Wait for the hardware to finish reset */
> +       while (loop-- && tmp)
> +               tmp = vdma_ctrl_read(chan, XILINX_VDMA_REG_DMACR) &
> +                       XILINX_VDMA_DMACR_RESET;
> +
> +       if (!loop) {
> +               dev_err(chan->dev, "reset timeout, cr %x, sr %x\n",
> +                       vdma_ctrl_read(chan, XILINX_VDMA_REG_DMACR),
> +                       vdma_ctrl_read(chan, XILINX_VDMA_REG_DMASR));
> +               return -ETIMEDOUT;
> +       }
> +
> +       chan->err = false;
> +
> +       return 0;
> +}
> +
> +/**
> + * xilinx_vdma_chan_reset - Reset VDMA channel and enable interrupts
> + * @chan: Driver specific VDMA channel
> + *
> + * Return: '0' on success and failure value on error
> + */
> +static int xilinx_vdma_chan_reset(struct xilinx_vdma_chan *chan)
> +{
> +       int err;
> +
> +       /* Reset VDMA */
> +       err = xilinx_vdma_reset(chan);
> +       if (err)
> +               return err;
> +
> +       /* Enable interrupts */
> +       vdma_ctrl_set(chan, XILINX_VDMA_REG_DMACR,
> +                     XILINX_VDMA_DMAXR_ALL_IRQ_MASK);
> +
> +       return 0;
> +}
> +
> +/**
> + * xilinx_vdma_irq_handler - VDMA Interrupt handler
> + * @irq: IRQ number
> + * @data: Pointer to the Xilinx VDMA channel structure
> + *
> + * Return: IRQ_HANDLED/IRQ_NONE
> + */
> +static irqreturn_t xilinx_vdma_irq_handler(int irq, void *data)
> +{
> +       struct xilinx_vdma_chan *chan = data;
> +       u32 status;
> +
> +       /* Read the status and ack the interrupts. */
> +       status = vdma_ctrl_read(chan, XILINX_VDMA_REG_DMASR);
> +       if (!(status & XILINX_VDMA_DMAXR_ALL_IRQ_MASK))
> +               return IRQ_NONE;
> +
> +       vdma_ctrl_write(chan, XILINX_VDMA_REG_DMASR,
> +                       status & XILINX_VDMA_DMAXR_ALL_IRQ_MASK);
> +
> +       if (status & XILINX_VDMA_DMASR_ERR_IRQ) {
> +               /*
> +                * An error occurred. If C_FLUSH_ON_FSYNC is enabled and the
> +                * error is recoverable, ignore it. Otherwise flag the error.
> +                *
> +                * Only recoverable errors can be cleared in the DMASR register,
> +                * make sure not to write to other error bits to 1.
> +                */
> +               u32 errors = status & XILINX_VDMA_DMASR_ALL_ERR_MASK;
> +               vdma_ctrl_write(chan, XILINX_VDMA_REG_DMASR,
> +                               errors & XILINX_VDMA_DMASR_ERR_RECOVER_MASK);
> +
> +               if (!chan->flush_on_fsync ||
> +                   (errors & ~XILINX_VDMA_DMASR_ERR_RECOVER_MASK)) {
> +                       dev_err(chan->dev,
> +                               "Channel %p has errors %x, cdr %x tdr %x\n",
> +                               chan, errors,
> +                               vdma_ctrl_read(chan, XILINX_VDMA_REG_CURDESC),
> +                               vdma_ctrl_read(chan, XILINX_VDMA_REG_TAILDESC));
> +                       chan->err = true;
> +               }
> +       }
> +
> +       if (status & XILINX_VDMA_DMASR_DLY_CNT_IRQ) {
> +               /*
> +                * Device takes too long to do the transfer when user requires
> +                * responsiveness.
> +                */
> +               dev_dbg(chan->dev, "Inter-packet latency too long\n");
> +       }
> +
> +       if (status & XILINX_VDMA_DMASR_FRM_CNT_IRQ) {
> +               xilinx_vdma_complete_descriptor(chan);
> +               xilinx_vdma_start_transfer(chan);
> +       }
> +
> +       tasklet_schedule(&chan->tasklet);
> +       return IRQ_HANDLED;
> +}
> +
> +/**
> + * xilinx_vdma_tx_submit - Submit DMA transaction
> + * @tx: Async transaction descriptor
> + *
> + * Return: cookie value on success and failure value on error
> + */
> +static dma_cookie_t xilinx_vdma_tx_submit(struct dma_async_tx_descriptor *tx)
> +{
> +       struct xilinx_vdma_tx_descriptor *desc = to_vdma_tx_descriptor(tx);
> +       struct xilinx_vdma_chan *chan = to_xilinx_chan(tx->chan);
> +       struct xilinx_vdma_tx_segment *segment;
> +       dma_cookie_t cookie;
> +       unsigned long flags;
> +       int err;
> +
> +       if (chan->err) {
> +               /*
> +                * If reset fails, need to hard reset the system.
> +                * Channel is no longer functional
> +                */
> +               err = xilinx_vdma_chan_reset(chan);
> +               if (err < 0)
> +                       return err;
> +       }
> +
> +       spin_lock_irqsave(&chan->lock, flags);
> +
> +       /* Assign cookies to all of the segments that make up this transaction.
> +        * Use the cookie of the last segment as the transaction cookie.
> +        */
> +       cookie = chan->cookie;
> +
> +       list_for_each_entry(segment, &desc->segments, node) {
> +               if (cookie < DMA_MAX_COOKIE)
> +                       cookie++;
> +               else
> +                       cookie = DMA_MIN_COOKIE;
> +
> +               segment->cookie = cookie;
> +       }
> +
> +       tx->cookie = cookie;
> +       chan->cookie = cookie;
> +
> +       /* Append the transaction to the pending transactions queue. */
> +       list_add_tail(&desc->node, &chan->pending_list);
> +
> +       spin_unlock_irqrestore(&chan->lock, flags);
> +
> +       return cookie;
> +}
> +
> +/**
> + * xilinx_vdma_prep_slave_sg - prepare a descriptor for a DMA_SLAVE transaction
> + * @dchan: DMA channel
> + * @sgl: scatterlist to transfer to/from
> + * @sg_len: number of entries in @sgl
> + * @dir: DMA direction
> + * @flags: transfer ack flags
> + * @context: unused
> + *
> + * Return: Async transaction descriptor on success and NULL on failure
> + */
> +static struct dma_async_tx_descriptor *
> +xilinx_vdma_prep_slave_sg(struct dma_chan *dchan, struct scatterlist *sgl,
> +                         unsigned int sg_len, enum dma_transfer_direction dir,
> +                         unsigned long flags, void *context)
> +{
> +       struct xilinx_vdma_chan *chan = to_xilinx_chan(dchan);
> +       struct xilinx_vdma_tx_descriptor *desc;
> +       struct xilinx_vdma_tx_segment *segment;
> +       struct xilinx_vdma_tx_segment *prev = NULL;
> +       struct scatterlist *sg;
> +       int i;
> +
> +       if (chan->direction != dir || sg_len == 0)
> +               return NULL;
> +
> +       /* Enforce one sg entry for one frame. */
> +       if (sg_len != chan->num_frms) {
> +               dev_err(chan->dev,
> +               "number of entries %d not the same as num stores %d\n",
> +                       sg_len, chan->num_frms);
> +               return NULL;
> +       }
> +
> +       /* Allocate a transaction descriptor. */
> +       desc = xilinx_vdma_alloc_tx_descriptor(chan);
> +       if (!desc)
> +               return NULL;
> +
> +       dma_async_tx_descriptor_init(&desc->async_tx, &chan->common);
> +       desc->async_tx.tx_submit = xilinx_vdma_tx_submit;
> +       desc->async_tx.cookie = 0;
> +       async_tx_ack(&desc->async_tx);
> +
> +       /* Build the list of transaction segments. */
> +       for_each_sg(sgl, sg, sg_len, i) {
> +               struct xilinx_vdma_desc_hw *hw;
> +
> +               /* Allocate the link descriptor from DMA pool */
> +               segment = xilinx_vdma_alloc_tx_segment(chan);
> +               if (!segment)
> +                       goto error;
> +
> +               /* Fill in the hardware descriptor */
> +               hw = &segment->hw;
> +               hw->buf_addr = sg_dma_address(sg);
> +               hw->vsize = chan->config.vsize;
> +               hw->hsize = chan->config.hsize;
> +               hw->stride = (chan->config.frm_dly <<
> +                             XILINX_VDMA_FRMDLY_STRIDE_FRMDLY_SHIFT) |
> +                            (chan->config.stride <<
> +                             XILINX_VDMA_FRMDLY_STRIDE_STRIDE_SHIFT);
> +               if (prev)
> +                       prev->hw.next_desc = segment->phys;
> +
> +               /* Insert the segment into the descriptor segments list. */
> +               list_add_tail(&segment->node, &desc->segments);
> +
> +               prev = segment;
> +       }
> +
> +       /* Link the last hardware descriptor with the first. */
> +       segment = list_first_entry(&desc->segments,
> +                                  struct xilinx_vdma_tx_segment, node);
> +       prev->hw.next_desc = segment->phys;
> +
> +       return &desc->async_tx;
> +
> +error:
> +       xilinx_vdma_free_tx_descriptor(chan, desc);
> +       return NULL;
> +}
> +
> +/**
> + * xilinx_vdma_terminate_all - Halt the channel and free descriptors
> + * @chan: Driver specific VDMA Channel pointer
> + */
> +static void xilinx_vdma_terminate_all(struct xilinx_vdma_chan *chan)
> +{
> +       /* Halt the DMA engine */
> +       xilinx_vdma_halt(chan);
> +
> +       /* Remove and free all of the descriptors in the lists */
> +       xilinx_vdma_free_descriptors(chan);
> +}
> +
> +/**
> + * xilinx_vdma_slave_config - Configure VDMA channel
> + * Run-time configuration for Axi VDMA, supports:
> + * . halt the channel
> + * . configure interrupt coalescing and inter-packet delay threshold
> + * . start/stop parking
> + * . enable genlock
> + * . set transfer information using config struct
> + *
> + * @chan: Driver specific VDMA Channel pointer
> + * @cfg: Channel configuration pointer
> + *
> + * Return: '0' on success and failure value on error
> + */
> +static int xilinx_vdma_slave_config(struct xilinx_vdma_chan *chan,
> +                                   struct xilinx_vdma_config *cfg)
> +{
> +       u32 dmacr;
> +
> +       if (cfg->reset)
> +               return xilinx_vdma_chan_reset(chan);
> +
> +       dmacr = vdma_ctrl_read(chan, XILINX_VDMA_REG_DMACR);
> +
> +       /* If vsize is -1, it is park-related operations */
> +       if (cfg->vsize == -1) {
> +               if (cfg->park)
> +                       dmacr &= ~XILINX_VDMA_DMACR_CIRC_EN;
> +               else
> +                       dmacr |= XILINX_VDMA_DMACR_CIRC_EN;
> +
> +               vdma_ctrl_write(chan, XILINX_VDMA_REG_DMACR, dmacr);
> +               return 0;
> +       }
> +
> +       /* If hsize is -1, it is interrupt threshold settings */
> +       if (cfg->hsize == -1) {
> +               if (cfg->coalesc <= XILINX_VDMA_DMACR_FRAME_COUNT_MAX) {
> +                       dmacr &= ~XILINX_VDMA_DMACR_FRAME_COUNT_MASK;
> +                       dmacr |= cfg->coalesc <<
> +                                XILINX_VDMA_DMACR_FRAME_COUNT_SHIFT;
> +                       chan->config.coalesc = cfg->coalesc;
> +               }
> +
> +               if (cfg->delay <= XILINX_VDMA_DMACR_DELAY_MAX) {
> +                       dmacr &= ~XILINX_VDMA_DMACR_DELAY_MASK;
> +                       dmacr |= cfg->delay << XILINX_VDMA_DMACR_DELAY_SHIFT;
> +                       chan->config.delay = cfg->delay;
> +               }
> +
> +               vdma_ctrl_write(chan, XILINX_VDMA_REG_DMACR, dmacr);
> +               return 0;
> +       }
> +
> +       /* Transfer information */
> +       chan->config.vsize = cfg->vsize;
> +       chan->config.hsize = cfg->hsize;
> +       chan->config.stride = cfg->stride;
> +       chan->config.frm_dly = cfg->frm_dly;
> +       chan->config.park = cfg->park;
> +
> +       /* genlock settings */
> +       chan->config.gen_lock = cfg->gen_lock;
> +       chan->config.master = cfg->master;
> +
> +       if (cfg->gen_lock && chan->genlock) {
> +               dmacr |= XILINX_VDMA_DMACR_GENLOCK_EN;
> +               dmacr |= cfg->master << XILINX_VDMA_DMACR_MASTER_SHIFT;
> +       }
> +
> +       chan->config.frm_cnt_en = cfg->frm_cnt_en;
> +       if (cfg->park)
> +               chan->config.park_frm = cfg->park_frm;
> +       else
> +               chan->config.park_frm = -1;
> +
> +       chan->config.coalesc = cfg->coalesc;
> +       chan->config.delay = cfg->delay;
> +       if (cfg->coalesc <= XILINX_VDMA_DMACR_FRAME_COUNT_MAX) {
> +               dmacr |= cfg->coalesc << XILINX_VDMA_DMACR_FRAME_COUNT_SHIFT;
> +               chan->config.coalesc = cfg->coalesc;
> +       }
> +
> +       if (cfg->delay <= XILINX_VDMA_DMACR_DELAY_MAX) {
> +               dmacr |= cfg->delay << XILINX_VDMA_DMACR_DELAY_SHIFT;
> +               chan->config.delay = cfg->delay;
> +       }
> +
> +       /* FSync Source selection */
> +       dmacr &= ~XILINX_VDMA_DMACR_FSYNCSRC_MASK;
> +       dmacr |= cfg->ext_fsync << XILINX_VDMA_DMACR_FSYNCSRC_SHIFT;
> +
> +       vdma_ctrl_write(chan, XILINX_VDMA_REG_DMACR, dmacr);
> +       return 0;
> +}
> +
> +/**
> + * xilinx_vdma_device_control - Configure DMA channel of the device
> + * @dchan: DMA Channel pointer
> + * @cmd: DMA control command
> + * @arg: Channel configuration
> + *
> + * Return: '0' on success and failure value on error
> + */
> +static int xilinx_vdma_device_control(struct dma_chan *dchan,
> +                                     enum dma_ctrl_cmd cmd, unsigned long arg)
> +{
> +       struct xilinx_vdma_chan *chan = to_xilinx_chan(dchan);
> +
> +       switch (cmd) {
> +       case DMA_TERMINATE_ALL:
> +               xilinx_vdma_terminate_all(chan);
> +               return 0;
> +       case DMA_SLAVE_CONFIG:
> +               return xilinx_vdma_slave_config(chan,
> +                                       (struct xilinx_vdma_config *)arg);
> +       default:
> +               return -ENXIO;
> +       }
> +}
> +
> +/* -----------------------------------------------------------------------------
> + * Probe and remove
> + */
> +
> +/**
> + * xilinx_vdma_chan_remove - Per Channel remove function
> + * @chan: Driver specific VDMA channel
> + */
> +static void xilinx_vdma_chan_remove(struct xilinx_vdma_chan *chan)
> +{
> +       /* Disable all interrupts */
> +       vdma_ctrl_clr(chan, XILINX_VDMA_REG_DMACR,
> +                     XILINX_VDMA_DMAXR_ALL_IRQ_MASK);
> +
> +       list_del(&chan->common.device_node);
> +}
> +
> +/**
> + * xilinx_vdma_chan_probe - Per Channel Probing
> + * It get channel features from the device tree entry and
> + * initialize special channel handling routines
> + *
> + * @xdev: Driver specific device structure
> + * @node: Device node
> + *
> + * Return: '0' on success and failure value on error
> + */
> +static int xilinx_vdma_chan_probe(struct xilinx_vdma_device *xdev,
> +                                 struct device_node *node)
> +{
> +       struct xilinx_vdma_chan *chan;
> +       bool has_dre = false;
> +       u32 value;
> +       int err;
> +
> +       /* Allocate and initialize the channel structure */
> +       chan = devm_kzalloc(xdev->dev, sizeof(*chan), GFP_KERNEL);
> +       if (!chan)
> +               return -ENOMEM;
> +
> +       chan->dev = xdev->dev;
> +       chan->xdev = xdev;
> +       chan->has_sg = xdev->has_sg;
> +
> +       spin_lock_init(&chan->lock);
> +       INIT_LIST_HEAD(&chan->pending_list);
> +       INIT_LIST_HEAD(&chan->done_list);
> +
> +       /* Retrieve the channel properties from the device tree */
> +       has_dre = of_property_read_bool(node, "xlnx,include-dre");
> +
> +       chan->genlock = of_property_read_bool(node, "xlnx,genlock-mode");
> +
> +       err = of_property_read_u32(node, "xlnx,datawidth", &value);
> +       if (!err) {
> +               u32 width = value >> 3; /* Convert bits to bytes */
> +
> +               /* If data width is greater than 8 bytes, DRE is not in hw */
> +               if (width > 8)
> +                       has_dre = false;
> +
> +               if (!has_dre)
> +                       xdev->common.copy_align = fls(width - 1);
> +       } else {
> +               dev_err(xdev->dev, "missing xlnx,datawidth property\n");
> +               return err;
> +       }

Can you please convert this to:
if (err) {
 dev_err(...);
 return err;
}

That way we can avoid the else clause.
> +
> +       if (of_device_is_compatible(node, "xlnx,axi-vdma-mm2s-channel")) {
> +               chan->direction = DMA_MEM_TO_DEV;
> +               chan->id = 0;
> +
> +               chan->ctrl_offset = XILINX_VDMA_MM2S_CTRL_OFFSET;
> +               chan->desc_offset = XILINX_VDMA_MM2S_DESC_OFFSET;
> +
> +               if (xdev->flush_on_fsync == XILINX_VDMA_FLUSH_BOTH ||
> +                   xdev->flush_on_fsync == XILINX_VDMA_FLUSH_MM2S)
> +                       chan->flush_on_fsync = true;
> +       } else if (of_device_is_compatible(node,
> +                                           "xlnx,axi-vdma-s2mm-channel")) {
> +               chan->direction = DMA_DEV_TO_MEM;
> +               chan->id = 1;
> +
> +               chan->ctrl_offset = XILINX_VDMA_S2MM_CTRL_OFFSET;
> +               chan->desc_offset = XILINX_VDMA_S2MM_DESC_OFFSET;
> +
> +               if (xdev->flush_on_fsync == XILINX_VDMA_FLUSH_BOTH ||
> +                   xdev->flush_on_fsync == XILINX_VDMA_FLUSH_S2MM)
> +                       chan->flush_on_fsync = true;
> +       } else {
> +               dev_err(xdev->dev, "Invalid channel compatible node\n");
> +               return -EINVAL;
> +       }
> +
> +       /* Request the interrupt */
> +       chan->irq = irq_of_parse_and_map(node, 0);
> +       err = devm_request_irq(xdev->dev, chan->irq, xilinx_vdma_irq_handler,
> +                              IRQF_SHARED, "xilinx-vdma-controller", chan);
> +       if (err) {
> +               dev_err(xdev->dev, "unable to request IRQ\n");

It might be worth to also tell the IRQ number that failed
to register.

> +               return err;
> +       }
> +
> +       /* Initialize the DMA channel and add it to the DMA engine channels
> +        * list.
> +        */
> +       chan->common.device = &xdev->common;
> +
> +       list_add_tail(&chan->common.device_node, &xdev->common.channels);
> +       xdev->chan[chan->id] = chan;
> +
> +       /* Reset the channel */
> +       err = xilinx_vdma_chan_reset(chan);
> +       if (err < 0) {
> +               dev_err(xdev->dev, "Reset channel failed\n");
> +               return err;
> +       }
> +
> +       return 0;
> +}
> +
> +/**
> + * struct of_dma_filter_xilinx_args - Channel filter args
> + * @dev: DMA device structure
> + * @chan_id: Channel id
> + */
> +struct of_dma_filter_xilinx_args {
> +       struct dma_device *dev;
> +       u32 chan_id;
> +};
> +
> +/**
> + * xilinx_vdma_dt_filter - VDMA channel filter function
> + * @chan: DMA channel pointer
> + * @param: Filter match value
> + *
> + * Return: true/false based on the result
> + */
> +static bool xilinx_vdma_dt_filter(struct dma_chan *chan, void *param)
> +{
> +       struct of_dma_filter_xilinx_args *args = param;
> +
> +       return chan->device == args->dev && chan->chan_id == args->chan_id;
> +}
> +
> +/**
> + * of_dma_xilinx_xlate - Translation function
> + * @dma_spec: Pointer to DMA specifier as found in the device tree
> + * @ofdma: Pointer to DMA controller data
> + *
> + * Return: DMA channel pointer on success and NULL on error
> + */
> +static struct dma_chan *of_dma_xilinx_xlate(struct of_phandle_args *dma_spec,
> +                                               struct of_dma *ofdma)
> +{
> +       struct of_dma_filter_xilinx_args args;
> +       dma_cap_mask_t cap;
> +
> +       args.dev = ofdma->of_dma_data;
> +       if (!args.dev)
> +               return NULL;
> +
> +       if (dma_spec->args_count != 1)
> +               return NULL;
> +
> +       dma_cap_zero(cap);
> +       dma_cap_set(DMA_SLAVE, cap);
> +
> +       args.chan_id = dma_spec->args[0];
> +
> +       return dma_request_channel(cap, xilinx_vdma_dt_filter, &args);
> +}
> +
> +/**
> + * xilinx_vdma_probe - Driver probe function
> + * @pdev: Pointer to the platform_device structure
> + *
> + * Return: '0' on success and failure value on error
> + */
> +static int xilinx_vdma_probe(struct platform_device *pdev)
> +{
> +       struct device_node *node = pdev->dev.of_node;
> +       struct xilinx_vdma_device *xdev;
> +       struct device_node *child;
> +       struct resource *io;
> +       u32 num_frames;
> +       int i, err;
> +
> +       dev_info(&pdev->dev, "Probing xilinx axi vdma engine\n");
> +
> +       /* Allocate and initialize the DMA engine structure */
> +       xdev = devm_kzalloc(&pdev->dev, sizeof(*xdev), GFP_KERNEL);
> +       if (!xdev)
> +               return -ENOMEM;
> +
> +       xdev->dev = &pdev->dev;
> +
> +       /* Request and map I/O memory */
> +       io = platform_get_resource(pdev, IORESOURCE_MEM, 0);
> +       xdev->regs = devm_ioremap_resource(&pdev->dev, io);
> +       if (IS_ERR(xdev->regs))
> +               return PTR_ERR(xdev->regs);
> +
> +       /* Retrieve the DMA engine properties from the device tree */
> +       xdev->has_sg = of_property_read_bool(node, "xlnx,include-sg");
> +
> +       err = of_property_read_u32(node, "xlnx,num-fstores", &num_frames);
> +       if (err < 0) {
> +               dev_err(xdev->dev, "missing xlnx,num-fstores property\n");
> +               return err;
> +       }
> +
> +       of_property_read_u32(node, "xlnx,flush-fsync", &xdev->flush_on_fsync);

Error check?
> +
> +       /* Initialize the DMA engine */
> +       xdev->common.dev = &pdev->dev;
> +
> +       INIT_LIST_HEAD(&xdev->common.channels);
> +       dma_cap_set(DMA_SLAVE, xdev->common.cap_mask);
> +       dma_cap_set(DMA_PRIVATE, xdev->common.cap_mask);
> +
> +       xdev->common.device_alloc_chan_resources =
> +                               xilinx_vdma_alloc_chan_resources;
> +       xdev->common.device_free_chan_resources =
> +                               xilinx_vdma_free_chan_resources;
> +       xdev->common.device_prep_slave_sg = xilinx_vdma_prep_slave_sg;
> +       xdev->common.device_control = xilinx_vdma_device_control;
> +       xdev->common.device_tx_status = xilinx_vdma_tx_status;
> +       xdev->common.device_issue_pending = xilinx_vdma_issue_pending;
> +
> +       platform_set_drvdata(pdev, xdev);
> +
> +       /* Initialize the channels */
> +       for_each_child_of_node(node, child) {
> +               err = xilinx_vdma_chan_probe(xdev, child);
> +               if (err < 0)
> +                       goto error;
> +       }
> +
> +       for (i = 0; i < XILINX_VDMA_MAX_CHANS_PER_DEVICE; i++) {
> +               if (xdev->chan[i])
> +                       xdev->chan[i]->num_frms = num_frames;
> +       }
> +
> +       /* Register the DMA engine with the core */
> +       dma_async_device_register(&xdev->common);
> +
> +       err = of_dma_controller_register(node, of_dma_xilinx_xlate,
> +                                        &xdev->common);
> +       if (err < 0) {
> +               dev_err(&pdev->dev, "Unable to register DMA to DT\n");
> +               dma_async_device_unregister(&xdev->common);
> +               goto error;
> +       }
> +
> +       return 0;
> +
> +error:
> +       for (i = 0; i < XILINX_VDMA_MAX_CHANS_PER_DEVICE; i++) {
> +               if (xdev->chan[i])
> +                       xilinx_vdma_chan_remove(xdev->chan[i]);
> +       }
> +
> +       return err;
> +}
> +
> +/**
> + * xilinx_vdma_remove - Driver remove function
> + * @pdev: Pointer to the platform_device structure
> + *
> + * Return: Always '0'
> + */
> +static int xilinx_vdma_remove(struct platform_device *pdev)
> +{
> +       struct xilinx_vdma_device *xdev;
> +       int i;
> +
> +       of_dma_controller_free(pdev->dev.of_node);
> +
> +       xdev = platform_get_drvdata(pdev);
> +       dma_async_device_unregister(&xdev->common);
> +
> +       for (i = 0; i < XILINX_VDMA_MAX_CHANS_PER_DEVICE; i++) {
> +               if (xdev->chan[i])
> +                       xilinx_vdma_chan_remove(xdev->chan[i]);
> +       }
> +
> +       return 0;
> +}
> +
> +static const struct of_device_id xilinx_vdma_of_ids[] = {
> +       { .compatible = "xlnx,axi-vdma-1.00.a",},
> +       {}
> +};
> +
> +static struct platform_driver xilinx_vdma_driver = {
> +       .driver = {
> +               .name = "xilinx-vdma",
> +               .owner = THIS_MODULE,
> +               .of_match_table = xilinx_vdma_of_ids,
> +       },
> +       .probe = xilinx_vdma_probe,
> +       .remove = xilinx_vdma_remove,
> +};
> +
> +module_platform_driver(xilinx_vdma_driver);
> +
> +MODULE_AUTHOR("Xilinx, Inc.");
> +MODULE_DESCRIPTION("Xilinx VDMA driver");
> +MODULE_LICENSE("GPL v2");
> diff --git a/include/linux/amba/xilinx_dma.h b/include/linux/amba/xilinx_dma.h
> new file mode 100644
> index 0000000..48a8c8b
> --- /dev/null
> +++ b/include/linux/amba/xilinx_dma.h
> @@ -0,0 +1,50 @@
> +/*
> + * Xilinx DMA Engine drivers support header file
> + *
> + * Copyright (C) 2010-2014 Xilinx, Inc. All rights reserved.
> + *
> + * This is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License as published by
> + * the Free Software Foundation; either version 2 of the License, or
> + * (at your option) any later version.
> + */
> +
> +#ifndef __DMA_XILINX_DMA_H
> +#define __DMA_XILINX_DMA_H
> +
> +#include <linux/dma-mapping.h>
> +#include <linux/dmaengine.h>
> +
> +/**
> + * struct xilinx_vdma_config - VDMA Configuration structure
> + * @vsize: Vertical size
> + * @hsize: Horizontal size
> + * @stride: Stride
> + * @frm_dly: Frame delay
> + * @gen_lock: Whether in gen-lock mode
> + * @master: Master that it syncs to
> + * @frm_cnt_en: Enable frame count enable
> + * @park: Whether wants to park
> + * @park_frm: Frame to park on
> + * @coalesc: Interrupt coalescing threshold
> + * @delay: Delay counter
> + * @reset: Reset Channel
> + * @ext_fsync: External Frame Sync source
> + */
> +struct xilinx_vdma_config {
> +       int vsize;
> +       int hsize;
> +       int stride;
> +       int frm_dly;
> +       int gen_lock;
> +       int master;
> +       int frm_cnt_en;
> +       int park;
> +       int park_frm;
> +       int coalesc;
> +       int delay;
> +       int reset;
> +       int ext_fsync;
> +};
> +
> +#endif
> --

--
Regards,
Levente Kurusa
--
To unsubscribe from this list: send the line "unsubscribe dmaengine" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Lars-Peter Clausen Jan. 23, 2014, 11:25 a.m. UTC | #2
On 01/22/2014 05:52 PM, Srikanth Thokala wrote:
[...]
> +/**
> + * xilinx_vdma_device_control - Configure DMA channel of the device
> + * @dchan: DMA Channel pointer
> + * @cmd: DMA control command
> + * @arg: Channel configuration
> + *
> + * Return: '0' on success and failure value on error
> + */
> +static int xilinx_vdma_device_control(struct dma_chan *dchan,
> +				      enum dma_ctrl_cmd cmd, unsigned long arg)
> +{
> +	struct xilinx_vdma_chan *chan = to_xilinx_chan(dchan);
> +
> +	switch (cmd) {
> +	case DMA_TERMINATE_ALL:
> +		xilinx_vdma_terminate_all(chan);
> +		return 0;
> +	case DMA_SLAVE_CONFIG:
> +		return xilinx_vdma_slave_config(chan,
> +					(struct xilinx_vdma_config *)arg);

You really shouldn't be overloading the generic API with your own semantics.
DMA_SLAVE_CONFIG should take a dma_slave_config and nothing else.

> +	default:
> +		return -ENXIO;
> +	}
> +}
> +
[...]
> +
> +	/* Request the interrupt */
> +	chan->irq = irq_of_parse_and_map(node, 0);
> +	err = devm_request_irq(xdev->dev, chan->irq, xilinx_vdma_irq_handler,
> +			       IRQF_SHARED, "xilinx-vdma-controller", chan);

This is a clasic example of where to not use devm_request_irq. 'chan' is
accessed in the interrupt handler, but if you use devm_request_irq 'chan'
will be freed before the interrupt handler has been released, which means
there is now a race condition where the interrupt handler can access already
freed memory.

> +	if (err) {
> +		dev_err(xdev->dev, "unable to request IRQ\n");
> +		return err;
> +	}
> +
> +	/* Initialize the DMA channel and add it to the DMA engine channels
> +	 * list.
> +	 */
> +	chan->common.device = &xdev->common;
> +
> +	list_add_tail(&chan->common.device_node, &xdev->common.channels);
> +	xdev->chan[chan->id] = chan;
> +
> +	/* Reset the channel */
> +	err = xilinx_vdma_chan_reset(chan);
> +	if (err < 0) {
> +		dev_err(xdev->dev, "Reset channel failed\n");
> +		return err;
> +	}
> +
> +	return 0;
> +}
> +
> +/**
> + * struct of_dma_filter_xilinx_args - Channel filter args
> + * @dev: DMA device structure
> + * @chan_id: Channel id
> + */
> +struct of_dma_filter_xilinx_args {
> +	struct dma_device *dev;
> +	u32 chan_id;
> +};
> +
> +/**
> + * xilinx_vdma_dt_filter - VDMA channel filter function
> + * @chan: DMA channel pointer
> + * @param: Filter match value
> + *
> + * Return: true/false based on the result
> + */
> +static bool xilinx_vdma_dt_filter(struct dma_chan *chan, void *param)
> +{
> +	struct of_dma_filter_xilinx_args *args = param;
> +
> +	return chan->device == args->dev && chan->chan_id == args->chan_id;
> +}
> +
> +/**
> + * of_dma_xilinx_xlate - Translation function
> + * @dma_spec: Pointer to DMA specifier as found in the device tree
> + * @ofdma: Pointer to DMA controller data
> + *
> + * Return: DMA channel pointer on success and NULL on error
> + */
> +static struct dma_chan *of_dma_xilinx_xlate(struct of_phandle_args *dma_spec,
> +						struct of_dma *ofdma)
> +{
> +	struct of_dma_filter_xilinx_args args;
> +	dma_cap_mask_t cap;
> +
> +	args.dev = ofdma->of_dma_data;
> +	if (!args.dev)
> +		return NULL;
> +
> +	if (dma_spec->args_count != 1)
> +		return NULL;
> +
> +	dma_cap_zero(cap);
> +	dma_cap_set(DMA_SLAVE, cap);
> +
> +	args.chan_id = dma_spec->args[0];
> +
> +	return dma_request_channel(cap, xilinx_vdma_dt_filter, &args);

There is a new helper function called dma_get_slave_channel, which makes
this much easier. Take a look at the k3dma.c driver for an example.
> +}

--
To unsubscribe from this list: send the line "unsubscribe dmaengine" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Andy Shevchenko Jan. 23, 2014, 1:32 p.m. UTC | #3
On Wed, 2014-01-22 at 22:22 +0530, Srikanth Thokala wrote:
> This is the driver for the AXI Video Direct Memory Access (AXI
> VDMA) core, which is a soft Xilinx IP core that provides high-
> bandwidth direct memory access between memory and AXI4-Stream
> type video target peripherals. The core provides efficient two
> dimensional DMA operations with independent asynchronous read
> and write channel operation.
> 
> This module works on Zynq (ARM Based SoC) and Microblaze platforms.

Few comments below.

> 
> Signed-off-by: Srikanth Thokala <sthokal@xilinx.com>
> ---
> NOTE:
> 1. Created a separate directory 'dma/xilinx' as Xilinx has two more
>    DMA IPs and we are also planning to upstream these drivers soon.
> 2. Rebased on v3.13.0-rc8
> 
> Changes in v2:
> - Removed DMA Test client module from the patchset as suggested
>   by Andy Shevchenko
> - Removed device-id DT property, as suggested by Arnd Bergmann
> - Properly documented DT bindings as suggested by Arnd Bergmann
> - Returning with error, if registration of DMA to node fails
> - Fixed typo errors
> - Used BIT() macro at applicable places
> - Added missing header file to the patchset
> - Changed copyright year to include 2014
> ---
>  .../devicetree/bindings/dma/xilinx/xilinx_vdma.txt |   75 +
>  drivers/dma/Kconfig                                |   14 +
>  drivers/dma/Makefile                               |    1 +
>  drivers/dma/xilinx/Makefile                        |    1 +
>  drivers/dma/xilinx/xilinx_vdma.c                   | 1486 ++++++++++++++++++++
>  include/linux/amba/xilinx_dma.h                    |   50 +
>  6 files changed, 1627 insertions(+)
>  create mode 100644 Documentation/devicetree/bindings/dma/xilinx/xilinx_vdma.txt
>  create mode 100644 drivers/dma/xilinx/Makefile
>  create mode 100644 drivers/dma/xilinx/xilinx_vdma.c
>  create mode 100644 include/linux/amba/xilinx_dma.h
> 
> diff --git a/Documentation/devicetree/bindings/dma/xilinx/xilinx_vdma.txt b/Documentation/devicetree/bindings/dma/xilinx/xilinx_vdma.txt
> new file mode 100644
> index 0000000..ab8be1a
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/dma/xilinx/xilinx_vdma.txt
> @@ -0,0 +1,75 @@
> +Xilinx AXI VDMA engine, it does transfers between memory and video devices.
> +It can be configured to have one channel or two channels. If configured
> +as two channels, one is to transmit to the video device and another is
> +to receive from the video device.
> +
> +Required properties:
> +- compatible: Should be "xlnx,axi-vdma-1.00.a"
> +- #dma-cells: Should be <1>, see "dmas" property below
> +- reg: Should contain VDMA registers location and length.
> +- xlnx,num-fstores: Should be the number of framebuffers as configured in h/w.
> +- dma-channel child node: Should have atleast one channel and can have upto
> +	two channels per device. This node specifies the properties of each
> +	DMA channel (see child node properties below).
> +
> +Optional properties:
> +- xlnx,include-sg: Tells whether configured for Scatter-mode in
> +	the hardware.
> +- xlnx,flush-fsync: Tells whether which channel to Flush on Frame sync.
> +	It takes following values:
> +	{1}, flush both channels
> +	{2}, flush mm2s channel
> +	{3}, flush s2mm channel
> +
> +Required child node properties:
> +- compatible: It should be either "xlnx,axi-vdma-mm2s-channel" or
> +	"xlnx,axi-vdma-s2mm-channel".
> +- interrupts: Should contain per channel VDMA interrupts.
> +- xlnx,data-width: Should contain the stream data width, take values
> +	{32,64...1024}.
> +
> +Option child node properties:
> +- xlnx,include-dre: Tells whether hardware is configured for Data
> +	Realignment Engine.
> +- xlnx,genlock-mode: Tells whether Genlock synchronization is
> +	enabled/disabled in hardware.
> +
> +Example:
> +++++++++
> +
> +axi_vdma_0: axivdma@40030000 {
> +	compatible = "xlnx,axi-vdma-1.00.a";
> +	#dma_cells = <1>;
> +	reg = < 0x40030000 0x10000 >;
> +	xlnx,num-fstores = <0x8>;
> +	xlnx,flush-fsync = <0x1>;
> +	dma-channel@40030000 {
> +		compatible = "xlnx,axi-vdma-mm2s-channel";
> +		interrupts = < 0 54 4 >;
> +		xlnx,datawidth = <0x40>;
> +	} ;
> +	dma-channel@40030030 {
> +		compatible = "xlnx,axi-vdma-s2mm-channel";
> +		interrupts = < 0 53 4 >;
> +		xlnx,datawidth = <0x40>;
> +	} ;
> +} ;
> +
> +
> +* DMA client
> +
> +Required properties:
> +- dmas: a list of <[Video DMA device phandle] [Channel ID]> pairs,
> +	where Channel ID is '0' for write/tx and '1' for read/rx
> +	channel.
> +- dma-names: a list of DMA channel names, one per "dmas" entry
> +
> +Example:
> +++++++++
> +
> +vdmatest_0: vdmatest@0 {
> +	compatible ="xlnx,axi-vdma-test-1.00.a";
> +	dmas = <&axi_vdma_0 0
> +		&axi_vdma_0 1>;
> +	dma-names = "vdma0", "vdma1";
> +} ;
> diff --git a/drivers/dma/Kconfig b/drivers/dma/Kconfig
> index c823daa..2a74651 100644
> --- a/drivers/dma/Kconfig
> +++ b/drivers/dma/Kconfig
> @@ -334,6 +334,20 @@ config K3_DMA
>  	  Support the DMA engine for Hisilicon K3 platform
>  	  devices.
>  
> +config XILINX_VDMA
> +	tristate "Xilinx AXI VDMA Engine"
> +	depends on (ARCH_ZYNQ || MICROBLAZE)
> +	select DMA_ENGINE
> +	help
> +	  Enable support for Xilinx AXI VDMA Soft IP.
> +
> +	  This engine provides high-bandwidth direct memory access
> +	  between memory and AXI4-Stream video type target
> +	  peripherals including peripherals which support AXI4-
> +	  Stream Video Protocol.  It has two stream interfaces/
> +	  channels, Memory Mapped to Stream (MM2S) and Stream to
> +	  Memory Mapped (S2MM) for the data transfers.
> +
>  config DMA_ENGINE
>  	bool
>  
> diff --git a/drivers/dma/Makefile b/drivers/dma/Makefile
> index 0ce2da9..d84130b 100644
> --- a/drivers/dma/Makefile
> +++ b/drivers/dma/Makefile
> @@ -42,3 +42,4 @@ obj-$(CONFIG_MMP_PDMA) += mmp_pdma.o
>  obj-$(CONFIG_DMA_JZ4740) += dma-jz4740.o
>  obj-$(CONFIG_TI_CPPI41) += cppi41.o
>  obj-$(CONFIG_K3_DMA) += k3dma.o
> +obj-y += xilinx/
> diff --git a/drivers/dma/xilinx/Makefile b/drivers/dma/xilinx/Makefile
> new file mode 100644
> index 0000000..3c4e9f2
> --- /dev/null
> +++ b/drivers/dma/xilinx/Makefile
> @@ -0,0 +1 @@
> +obj-$(CONFIG_XILINX_VDMA) += xilinx_vdma.o
> diff --git a/drivers/dma/xilinx/xilinx_vdma.c b/drivers/dma/xilinx/xilinx_vdma.c
> new file mode 100644
> index 0000000..4c0d04c
> --- /dev/null
> +++ b/drivers/dma/xilinx/xilinx_vdma.c
> @@ -0,0 +1,1486 @@
> +/*
> + * DMA driver for Xilinx Video DMA Engine
> + *
> + * Copyright (C) 2010-2014 Xilinx, Inc. All rights reserved.
> + *
> + * Based on the Freescale DMA driver.
> + *
> + * Description:
> + * The AXI Video Direct Memory Access (AXI VDMA) core is a soft Xilinx IP
> + * core that provides high-bandwidth direct memory access between memory
> + * and AXI4-Stream type video target peripherals. The core provides efficient
> + * two dimensional DMA operations with independent asynchronous read (S2MM)
> + * and write (MM2S) channel operation. It can be configured to have either
> + * one channel or two channels. If configured as two channels, one is to
> + * transmit to the video device (MM2S) and another is to receive from the
> + * video device (S2MM). Initialization, status, interrupt and management
> + * registers are accessed through an AXI4-Lite slave interface.
> + *
> + * This program is free software: you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License as published by
> + * the Free Software Foundation, either version 2 of the License, or
> + * (at your option) any later version.
> + */
> +
> +#include <linux/amba/xilinx_dma.h>
> +#include <linux/bitops.h>
> +#include <linux/dmapool.h>
> +#include <linux/init.h>
> +#include <linux/interrupt.h>
> +#include <linux/io.h>
> +#include <linux/module.h>
> +#include <linux/of_address.h>
> +#include <linux/of_dma.h>
> +#include <linux/of_platform.h>
> +#include <linux/of_irq.h>
> +#include <linux/slab.h>
> +
> +/* Register/Descriptor Offsets */
> +#define XILINX_VDMA_MM2S_CTRL_OFFSET		0x0000
> +#define XILINX_VDMA_S2MM_CTRL_OFFSET		0x0030
> +#define XILINX_VDMA_MM2S_DESC_OFFSET		0x0050
> +#define XILINX_VDMA_S2MM_DESC_OFFSET		0x00a0
> +
> +/* Control Registers */
> +#define XILINX_VDMA_REG_DMACR			0x0000
> +#define XILINX_VDMA_DMACR_DELAY_MAX		0xff
> +#define XILINX_VDMA_DMACR_DELAY_SHIFT		24
> +#define XILINX_VDMA_DMACR_FRAME_COUNT_MAX	0xff
> +#define XILINX_VDMA_DMACR_FRAME_COUNT_SHIFT	16
> +#define XILINX_VDMA_DMACR_ERR_IRQ		BIT(14)
> +#define XILINX_VDMA_DMACR_DLY_CNT_IRQ		BIT(13)
> +#define XILINX_VDMA_DMACR_FRM_CNT_IRQ		BIT(12)
> +#define XILINX_VDMA_DMACR_MASTER_SHIFT		8
> +#define XILINX_VDMA_DMACR_FSYNCSRC_SHIFT	5
> +#define XILINX_VDMA_DMACR_FRAMECNT_EN		BIT(4)
> +#define XILINX_VDMA_DMACR_GENLOCK_EN		BIT(3)
> +#define XILINX_VDMA_DMACR_RESET			BIT(2)
> +#define XILINX_VDMA_DMACR_CIRC_EN		BIT(1)
> +#define XILINX_VDMA_DMACR_RUNSTOP		BIT(0)
> +#define XILINX_VDMA_DMACR_DELAY_MASK		\
> +				(XILINX_VDMA_DMACR_DELAY_MAX << \
> +				XILINX_VDMA_DMACR_DELAY_SHIFT)
> +#define XILINX_VDMA_DMACR_FRAME_COUNT_MASK	\
> +				(XILINX_VDMA_DMACR_FRAME_COUNT_MAX << \
> +				XILINX_VDMA_DMACR_FRAME_COUNT_SHIFT)
> +#define XILINX_VDMA_DMACR_MASTER_MASK		\
> +				(0xf << XILINX_VDMA_DMACR_MASTER_SHIFT)
> +#define XILINX_VDMA_DMACR_FSYNCSRC_MASK		\
> +				(3 << XILINX_VDMA_DMACR_FSYNCSRC_SHIFT)
> +
> +#define XILINX_VDMA_REG_DMASR			0x0004
> +#define XILINX_VDMA_DMASR_DELAY_SHIFT		24
> +#define XILINX_VDMA_DMASR_FRAME_COUNT_SHIFT	16
> +#define XILINX_VDMA_DMASR_EOL_LATE_ERR		BIT(15)
> +#define XILINX_VDMA_DMASR_ERR_IRQ		BIT(14)
> +#define XILINX_VDMA_DMASR_DLY_CNT_IRQ		BIT(13)
> +#define XILINX_VDMA_DMASR_FRM_CNT_IRQ		BIT(12)
> +#define XILINX_VDMA_DMASR_SOF_LATE_ERR		BIT(11)
> +#define XILINX_VDMA_DMASR_SG_DEC_ERR		BIT(10)
> +#define XILINX_VDMA_DMASR_SG_SLV_ERR		BIT(9)
> +#define XILINX_VDMA_DMASR_EOF_EARLY_ERR		BIT(8)
> +#define XILINX_VDMA_DMASR_SOF_EARLY_ERR		BIT(7)
> +#define XILINX_VDMA_DMASR_DMA_DEC_ERR		BIT(6)
> +#define XILINX_VDMA_DMASR_DMA_SLAVE_ERR		BIT(5)
> +#define XILINX_VDMA_DMASR_DMA_INT_ERR		BIT(4)
> +#define XILINX_VDMA_DMASR_IDLE			BIT(1)
> +#define XILINX_VDMA_DMASR_HALTED		BIT(0)
> +
> +#define XILINX_VDMA_DMASR_DELAY_MASK		\
> +				(0xff << XILINX_VDMA_DMASR_DELAY_SHIFT)
> +#define XILINX_VDMA_DMASR_FRAME_COUNT_MASK	\
> +				(0xff << XILINX_VDMA_DMASR_FRAME_COUNT_SHIFT)

Does 0xff require to be a separate definition in both above cases?

> +
> +#define XILINX_VDMA_REG_CURDESC			0x0008
> +#define XILINX_VDMA_REG_TAILDESC		0x0010
> +#define XILINX_VDMA_REG_REG_INDEX		0x0014
> +#define XILINX_VDMA_REG_FRMSTORE		0x0018
> +#define XILINX_VDMA_REG_THRESHOLD		0x001c
> +#define XILINX_VDMA_REG_FRMPTR_STS		0x0024
> +#define XILINX_VDMA_REG_PARK_PTR		0x0028
> +#define XILINX_VDMA_PARK_PTR_WR_REF_SHIFT	8
> +#define XILINX_VDMA_PARK_PTR_RD_REF_SHIFT	0
> +#define XILINX_VDMA_REG_VDMA_VERSION		0x002c
> +
> +/* Register Direct Mode Registers */
> +#define XILINX_VDMA_REG_VSIZE			0x0000
> +#define XILINX_VDMA_REG_HSIZE			0x0004
> +
> +#define XILINX_VDMA_REG_FRMDLY_STRIDE		0x0008
> +#define XILINX_VDMA_FRMDLY_STRIDE_FRMDLY_SHIFT	24
> +#define XILINX_VDMA_FRMDLY_STRIDE_STRIDE_SHIFT	0
> +#define XILINX_VDMA_FRMDLY_STRIDE_FRMDLY_MASK	\
> +				(0x1f <<	\
> +				XILINX_VDMA_FRMDLY_STRIDE_FRMDLY_SHIFT)
> +#define XILINX_VDMA_FRMDLY_STRIDE_STRIDE_MASK	\
> +				(0xffff <<	\
> +				XILINX_VDMA_FRMDLY_STRIDE_STRIDE_MASK)
> +
> +#define XILINX_VDMA_REG_START_ADDRESS(n)	(0x000c + 4 * (n))
> +
> +/* Hw specific definitions */

HW or Hardware

> +#define XILINX_VDMA_MAX_CHANS_PER_DEVICE	0x2
> +
> +#define XILINX_VDMA_DMAXR_ALL_IRQ_MASK	(XILINX_VDMA_DMASR_FRM_CNT_IRQ | \
> +					 XILINX_VDMA_DMASR_DLY_CNT_IRQ | \
> +					 XILINX_VDMA_DMASR_ERR_IRQ)
> +
> +#define XILINX_VDMA_DMASR_ALL_ERR_MASK	(XILINX_VDMA_DMASR_EOL_LATE_ERR | \
> +					 XILINX_VDMA_DMASR_SOF_LATE_ERR | \
> +					 XILINX_VDMA_DMASR_SG_DEC_ERR | \
> +					 XILINX_VDMA_DMASR_SG_SLV_ERR | \
> +					 XILINX_VDMA_DMASR_EOF_EARLY_ERR | \
> +					 XILINX_VDMA_DMASR_SOF_EARLY_ERR | \
> +					 XILINX_VDMA_DMASR_DMA_DEC_ERR | \
> +					 XILINX_VDMA_DMASR_DMA_SLAVE_ERR | \
> +					 XILINX_VDMA_DMASR_DMA_INT_ERR)
> +
> +/*
> + * Recoverable errors are DMA Internal error, SOF Early, EOF Early and SOF Late.
> + * They are only recoverable when C_FLUSH_ON_FSYNC is enabled in the h/w system.
> + */
> +#define XILINX_VDMA_DMASR_ERR_RECOVER_MASK	\
> +					(XILINX_VDMA_DMASR_SOF_LATE_ERR | \

Do you need so many tabs for an indentation here and in other places?
May be better to keep some style here (I mean on which line you start
the value of the definition).

> +					 XILINX_VDMA_DMASR_EOF_EARLY_ERR | \
> +					 XILINX_VDMA_DMASR_SOF_EARLY_ERR | \
> +					 XILINX_VDMA_DMASR_DMA_INT_ERR)
> +
> +/* Axi VDMA Flush on Fsync bits */
> +#define XILINX_VDMA_FLUSH_S2MM			3
> +#define XILINX_VDMA_FLUSH_MM2S			2
> +#define XILINX_VDMA_FLUSH_BOTH			1
> +
> +/* Delay loop counter to prevent hardware failure */
> +#define XILINX_VDMA_LOOP_COUNT			1000000
> +
> +/**
> + * struct xilinx_vdma_desc_hw - Hardware Descriptor
> + * @next_desc: Next Descriptor Pointer @0x00
> + * @pad1: Reserved @0x04
> + * @buf_addr: Buffer address @0x08
> + * @pad2: Reserved @0x0C
> + * @vsize: Vertical Size @0x10
> + * @hsize: Horizontal Size @0x14
> + * @stride: Number of bytes between the first
> + *	    pixels of each horizontal line @0x18
> + */
> +struct xilinx_vdma_desc_hw {
> +	u32 next_desc;
> +	u32 pad1;
> +	u32 buf_addr;
> +	u32 pad2;
> +	u32 vsize;
> +	u32 hsize;
> +	u32 stride;
> +} __aligned(64);
> +
> +/**
> + * struct xilinx_vdma_tx_segment - Descriptor segment
> + * @hw: Hardware descriptor
> + * @node: Node in the descriptor segments list
> + * @cookie: Segment cookie
> + * @phys: Physical address of segment
> + */
> +struct xilinx_vdma_tx_segment {
> +	struct xilinx_vdma_desc_hw hw;
> +	struct list_head node;
> +	dma_cookie_t cookie;
> +	dma_addr_t phys;
> +} __aligned(64);
> +
> +/**
> + * struct xilinx_vdma_tx_descriptor - Per Transaction structure
> + * @async_tx: Async transaction descriptor
> + * @segments: TX segments list
> + * @node: Node in the channel descriptors list
> + */
> +struct xilinx_vdma_tx_descriptor {
> +	struct dma_async_tx_descriptor async_tx;
> +	struct list_head segments;
> +	struct list_head node;
> +};
> +
> +#define to_vdma_tx_descriptor(tx) \
> +	container_of(tx, struct xilinx_vdma_tx_descriptor, async_tx)
> +
> +/**
> + * struct xilinx_vdma_chan - Driver specific VDMA channel structure
> + * @xdev: Driver specific device structure
> + * @ctrl_offset: Control registers offset
> + * @desc_offset: TX descriptor registers offset
> + * @completed_cookie: Maximum cookie completed
> + * @cookie: The current cookie
> + * @lock: Descriptor operation lock
> + * @pending_list: Descriptors waiting
> + * @active_desc: Active descriptor
> + * @done_list: Complete descriptors
> + * @common: DMA common channel
> + * @desc_pool: Descriptors pool
> + * @dev: The dma device
> + * @irq: Channel IRQ
> + * @id: Channel ID
> + * @direction: Transfer direction
> + * @num_frms: Number of frames
> + * @has_sg: Support scatter transfers
> + * @genlock: Support genlock mode
> + * @err: Channel has errors
> + * @tasklet: Cleanup work after irq
> + * @config: Device configuration info
> + * @flush_on_fsync: Flush on Frame sync
> + */
> +struct xilinx_vdma_chan {
> +	struct xilinx_vdma_device *xdev;
> +	u32 ctrl_offset;
> +	u32 desc_offset;
> +	dma_cookie_t completed_cookie;
> +	dma_cookie_t cookie;
> +	spinlock_t lock;
> +	struct list_head pending_list;
> +	struct xilinx_vdma_tx_descriptor *active_desc;
> +	struct list_head done_list;
> +	struct dma_chan common;
> +	struct dma_pool *desc_pool;
> +	struct device *dev;
> +	int irq;
> +	int id;
> +	enum dma_transfer_direction direction;
> +	int num_frms;
> +	bool has_sg;
> +	bool genlock;
> +	bool err;
> +	struct tasklet_struct tasklet;
> +	struct xilinx_vdma_config config;
> +	bool flush_on_fsync;
> +};
> +
> +/**
> + * struct xilinx_vdma_device - VDMA device structure
> + * @regs: I/O mapped base address
> + * @dev: Device Structure
> + * @common: DMA device structure
> + * @chan: Driver specific VDMA channel
> + * @has_sg: Specifies whether Scatter-Gather is present or not
> + * @flush_on_fsync: Flush on frame sync
> + */
> +struct xilinx_vdma_device {
> +	void __iomem *regs;
> +	struct device *dev;
> +	struct dma_device common;
> +	struct xilinx_vdma_chan *chan[XILINX_VDMA_MAX_CHANS_PER_DEVICE];
> +	bool has_sg;
> +	u32 flush_on_fsync;
> +};
> +
> +#define to_xilinx_chan(chan) \
> +			container_of(chan, struct xilinx_vdma_chan, common)
> +
> +/* IO accessors */
> +static inline u32 vdma_read(struct xilinx_vdma_chan *chan, u32 reg)
> +{
> +	return ioread32(chan->xdev->regs + reg);
> +}
> +
> +static inline void vdma_write(struct xilinx_vdma_chan *chan, u32 reg, u32 value)
> +{
> +	iowrite32(value, chan->xdev->regs + reg);
> +}
> +
> +static inline void vdma_desc_write(struct xilinx_vdma_chan *chan, u32 reg,
> +				   u32 value)
> +{
> +	vdma_write(chan, chan->desc_offset + reg, value);
> +}
> +
> +static inline u32 vdma_ctrl_read(struct xilinx_vdma_chan *chan, u32 reg)
> +{
> +	return vdma_read(chan, chan->ctrl_offset + reg);
> +}
> +
> +static inline void vdma_ctrl_write(struct xilinx_vdma_chan *chan, u32 reg,
> +				   u32 value)
> +{
> +	vdma_write(chan, chan->ctrl_offset + reg, value);
> +}
> +
> +static inline void vdma_ctrl_clr(struct xilinx_vdma_chan *chan, u32 reg,
> +				 u32 clr)
> +{
> +	vdma_ctrl_write(chan, reg, vdma_ctrl_read(chan, reg) & ~clr);
> +}
> +
> +static inline void vdma_ctrl_set(struct xilinx_vdma_chan *chan, u32 reg,
> +				 u32 set)
> +{
> +	vdma_ctrl_write(chan, reg, vdma_ctrl_read(chan, reg) | set);
> +}
> +
> +/* -----------------------------------------------------------------------------
> + * Descriptors and segments alloc and free
> + */
> +
> +/**
> + * xilinx_vdma_alloc_tx_segment - Allocate transaction segment
> + * @chan: Driver specific VDMA channel
> + *
> + * Return: The allocated segment on success and NULL on failure.
> + */
> +static struct xilinx_vdma_tx_segment *
> +xilinx_vdma_alloc_tx_segment(struct xilinx_vdma_chan *chan)
> +{
> +	struct xilinx_vdma_tx_segment *segment;
> +	dma_addr_t phys;
> +
> +	segment = dma_pool_alloc(chan->desc_pool, GFP_ATOMIC, &phys);
> +	if (!segment)
> +		return NULL;
> +
> +	memset(segment, 0, sizeof(*segment));
> +	segment->phys = phys;
> +
> +	return segment;
> +}
> +
> +/**
> + * xilinx_vdma_free_tx_segment - Free transaction segment
> + * @chan: Driver specific VDMA channel
> + * @segment: VDMA transaction segment
> + */
> +static void xilinx_vdma_free_tx_segment(struct xilinx_vdma_chan *chan,
> +					struct xilinx_vdma_tx_segment *segment)
> +{
> +	dma_pool_free(chan->desc_pool, segment, segment->phys);
> +}
> +
> +/**
> + * xilinx_vdma_tx_descriptor - Allocate transaction descriptor
> + * @chan: Driver specific VDMA channel
> + *
> + * Return: The allocated descriptor on success and NULL on failure.
> + */
> +static struct xilinx_vdma_tx_descriptor *
> +xilinx_vdma_alloc_tx_descriptor(struct xilinx_vdma_chan *chan)
> +{
> +	struct xilinx_vdma_tx_descriptor *desc;
> +
> +	desc = kzalloc(sizeof(*desc), GFP_KERNEL);
> +	if (!desc)
> +		return NULL;
> +
> +	INIT_LIST_HEAD(&desc->segments);
> +
> +	return desc;
> +}
> +
> +/**
> + * xilinx_vdma_free_tx_descriptor - Free transaction descriptor
> + * @chan: Driver specific VDMA channel
> + * @desc: VDMA transaction descriptor
> + */
> +static void
> +xilinx_vdma_free_tx_descriptor(struct xilinx_vdma_chan *chan,
> +			       struct xilinx_vdma_tx_descriptor *desc)
> +{
> +	struct xilinx_vdma_tx_segment *segment, *next;
> +
> +	if (!desc)
> +		return;
> +
> +	list_for_each_entry_safe(segment, next, &desc->segments, node) {
> +		list_del(&segment->node);
> +		xilinx_vdma_free_tx_segment(chan, segment);
> +	}
> +
> +	kfree(desc);
> +}
> +
> +/* Required functions */
> +
> +/**
> + * xilinx_vdma_free_descriptors - Free descriptors list
> + * @chan: Driver specific VDMA channel
> + * @list: List to parse and delete the descriptor
> + */
> +static void xilinx_vdma_free_desc_list(struct xilinx_vdma_chan *chan,
> +					struct list_head *list)
> +{
> +	struct xilinx_vdma_tx_descriptor *desc, *next;
> +
> +	list_for_each_entry_safe(desc, next, list, node) {
> +		list_del(&desc->node);
> +		xilinx_vdma_free_tx_descriptor(chan, desc);
> +	}
> +}
> +
> +/**
> + * xilinx_vdma_free_descriptors - Free channel descriptors
> + * @chan: Driver specific VDMA channel
> + */
> +static void xilinx_vdma_free_descriptors(struct xilinx_vdma_chan *chan)
> +{
> +	unsigned long flags;
> +
> +	spin_lock_irqsave(&chan->lock, flags);
> +
> +	xilinx_vdma_free_desc_list(chan, &chan->pending_list);
> +	xilinx_vdma_free_desc_list(chan, &chan->done_list);
> +
> +	xilinx_vdma_free_tx_descriptor(chan, chan->active_desc);
> +	chan->active_desc = NULL;
> +
> +	spin_unlock_irqrestore(&chan->lock, flags);
> +}
> +
> +/**
> + * xilinx_vdma_free_chan_resources - Free channel resources
> + * @dchan: DMA channel
> + */
> +static void xilinx_vdma_free_chan_resources(struct dma_chan *dchan)
> +{
> +	struct xilinx_vdma_chan *chan = to_xilinx_chan(dchan);
> +
> +	dev_dbg(chan->dev, "Free all channel resources.\n");
> +
> +	tasklet_kill(&chan->tasklet);
> +	xilinx_vdma_free_descriptors(chan);
> +	dma_pool_destroy(chan->desc_pool);
> +	chan->desc_pool = NULL;
> +}
> +
> +/**
> + * xilinx_vdma_chan_desc_cleanup - Clean channel descriptors
> + * @chan: Driver specific VDMA channel
> + */
> +static void xilinx_vdma_chan_desc_cleanup(struct xilinx_vdma_chan *chan)
> +{
> +	struct xilinx_vdma_tx_descriptor *desc, *next;
> +	unsigned long flags;
> +
> +	spin_lock_irqsave(&chan->lock, flags);
> +
> +	list_for_each_entry_safe(desc, next, &chan->done_list, node) {
> +		dma_async_tx_callback callback;
> +		void *callback_param;
> +
> +		/* Remove from the list of running transactions */
> +		list_del(&desc->node);
> +
> +		/* Run the link descriptor callback function */
> +		callback = desc->async_tx.callback;
> +		callback_param = desc->async_tx.callback_param;
> +		if (callback) {
> +			spin_unlock_irqrestore(&chan->lock, flags);
> +			callback(callback_param);
> +			spin_lock_irqsave(&chan->lock, flags);
> +		}
> +
> +		/* Run any dependencies, then free the descriptor */
> +		dma_run_dependencies(&desc->async_tx);
> +		xilinx_vdma_free_tx_descriptor(chan, desc);
> +	}
> +
> +	spin_unlock_irqrestore(&chan->lock, flags);
> +}
> +
> +/**
> + * xilinx_vdma_do_tasklet - Schedule completion tasklet
> + * @data: Pointer to the Xilinx VDMA channel structure
> + */
> +static void xilinx_vdma_do_tasklet(unsigned long data)
> +{
> +	struct xilinx_vdma_chan *chan = (struct xilinx_vdma_chan *)data;
> +
> +	xilinx_vdma_chan_desc_cleanup(chan);
> +}
> +
> +/**
> + * xilinx_vdma_alloc_chan_resources - Allocate channel resources
> + * @dchan: DMA channel
> + *
> + * Return: '1' on success and failure value on error

May be return 0 on success as it usual practice? Here and in the other
places as well.

> + */
> +static int xilinx_vdma_alloc_chan_resources(struct dma_chan *dchan)
> +{
> +	struct xilinx_vdma_chan *chan = to_xilinx_chan(dchan);
> +
> +	/* Has this channel already been allocated? */
> +	if (chan->desc_pool)
> +		return 1;
> +
> +	/*
> +	 * We need the descriptor to be aligned to 64bytes
> +	 * for meeting Xilinx VDMA specification requirement.
> +	 */
> +	chan->desc_pool = dma_pool_create("xilinx_vdma_desc_pool",
> +				chan->dev,
> +				sizeof(struct xilinx_vdma_tx_segment),
> +				__alignof__(struct xilinx_vdma_tx_segment), 0);
> +	if (!chan->desc_pool) {
> +		dev_err(chan->dev,
> +			"unable to allocate channel %d descriptor pool\n",
> +			chan->id);
> +		return -ENOMEM;
> +	}
> +
> +	tasklet_init(&chan->tasklet, xilinx_vdma_do_tasklet,
> +			(unsigned long)chan);
> +
> +	chan->completed_cookie = DMA_MIN_COOKIE;
> +	chan->cookie = DMA_MIN_COOKIE;
> +
> +	/* There is at least one descriptor free to be allocated */
> +	return 1;
> +}
> +
> +/**
> + * xilinx_vdma_tx_status - Get VDMA transaction status
> + * @dchan: DMA channel
> + * @cookie: Transaction identifier
> + * @txstate: Transaction state
> + *
> + * Return: DMA transaction status
> + */
> +static enum dma_status xilinx_vdma_tx_status(struct dma_chan *dchan,
> +					dma_cookie_t cookie,
> +					struct dma_tx_state *txstate)
> +{
> +	struct xilinx_vdma_chan *chan = to_xilinx_chan(dchan);
> +	dma_cookie_t last_used;
> +	dma_cookie_t last_complete;
> +
> +	xilinx_vdma_chan_desc_cleanup(chan);
> +
> +	last_used = dchan->cookie;
> +	last_complete = chan->completed_cookie;
> +
> +	dma_set_tx_state(txstate, last_complete, last_used, 0);
> +
> +	return dma_async_is_complete(cookie, last_complete, last_used);
> +}
> +
> +/**
> + * xilinx_vdma_is_running - Check if VDMA channel is running
> + * @chan: Driver specific VDMA channel
> + *
> + * Return: '1' if running, '0' if not.
> + */
> +static int xilinx_vdma_is_running(struct xilinx_vdma_chan *chan)
> +{
> +	return !(vdma_ctrl_read(chan, XILINX_VDMA_REG_DMASR) &
> +		 XILINX_VDMA_DMASR_HALTED) &&
> +		(vdma_ctrl_read(chan, XILINX_VDMA_REG_DMACR) &
> +		 XILINX_VDMA_DMACR_RUNSTOP);
> +}
> +
> +/**
> + * xilinx_vdma_is_idle - Check if VDMA channel is idle
> + * @chan: Driver specific VDMA channel
> + *
> + * Return: '1' if idle, '0' if not.
> + */
> +static int xilinx_vdma_is_idle(struct xilinx_vdma_chan *chan)
> +{
> +	return vdma_ctrl_read(chan, XILINX_VDMA_REG_DMASR) &
> +		XILINX_VDMA_DMASR_IDLE;
> +}
> +
> +/**
> + * xilinx_vdma_halt - Halt VDMA channel
> + * @chan: Driver specific VDMA channel
> + */
> +static void xilinx_vdma_halt(struct xilinx_vdma_chan *chan)
> +{
> +	int loop = XILINX_VDMA_LOOP_COUNT + 1;
> +
> +	vdma_ctrl_clr(chan, XILINX_VDMA_REG_DMACR, XILINX_VDMA_DMACR_RUNSTOP);
> +
> +	/* Wait for the hardware to halt */
> +	while (loop--)
> +		if (vdma_ctrl_read(chan, XILINX_VDMA_REG_DMASR) &
> +		    XILINX_VDMA_DMASR_HALTED)
> +			break;
> +
> +	if (!loop) {
> +		dev_err(chan->dev, "Cannot stop channel %p: %x\n",
> +			chan, vdma_ctrl_read(chan, XILINX_VDMA_REG_DMASR));
> +		chan->err = true;
> +	}
> +
> +	return;
> +}
> +
> +/**
> + * xilinx_vdma_start - Start VDMA channel
> + * @chan: Driver specific VDMA channel
> + */
> +static void xilinx_vdma_start(struct xilinx_vdma_chan *chan)
> +{
> +	int loop = XILINX_VDMA_LOOP_COUNT + 1;
> +
> +	vdma_ctrl_set(chan, XILINX_VDMA_REG_DMACR, XILINX_VDMA_DMACR_RUNSTOP);
> +
> +	/* Wait for the hardware to start */
> +	while (loop--)
> +		if (!(vdma_ctrl_read(chan, XILINX_VDMA_REG_DMASR) &
> +		      XILINX_VDMA_DMASR_HALTED))
> +			break;
> +
> +	if (!loop) {
> +		dev_err(chan->dev, "Cannot start channel %p: %x\n",
> +			chan, vdma_ctrl_read(chan, XILINX_VDMA_REG_DMASR));
> +
> +		chan->err = true;
> +	}
> +
> +	return;
> +}
> +
> +/**
> + * xilinx_vdma_start_transfer - Starts VDMA transfer
> + * @chan: Driver specific channel struct pointer
> + */
> +static void xilinx_vdma_start_transfer(struct xilinx_vdma_chan *chan)
> +{
> +	struct xilinx_vdma_config *config = &chan->config;
> +	struct xilinx_vdma_tx_descriptor *desc;
> +	unsigned long flags;
> +	u32 reg;
> +	struct xilinx_vdma_tx_segment *head, *tail = NULL;
> +
> +	if (chan->err)
> +		return;
> +
> +	spin_lock_irqsave(&chan->lock, flags);
> +
> +	/* There's already an active descriptor, bail out. */
> +	if (chan->active_desc)
> +		goto out_unlock;
> +
> +	if (list_empty(&chan->pending_list))
> +		goto out_unlock;
> +
> +	desc = list_first_entry(&chan->pending_list,
> +				struct xilinx_vdma_tx_descriptor, node);
> +
> +	/* If it is SG mode and hardware is busy, cannot submit */
> +	if (chan->has_sg && xilinx_vdma_is_running(chan) &&
> +	    !xilinx_vdma_is_idle(chan)) {
> +		dev_dbg(chan->dev, "DMA controller still busy\n");
> +		goto out_unlock;
> +	}
> +
> +	if (chan->err)
> +		goto out_unlock;
> +
> +	/*
> +	 * If hardware is idle, then all descriptors on the running lists are
> +	 * done, start new transfers
> +	 */
> +	if (chan->has_sg) {
> +		head = list_first_entry(&desc->segments,
> +					struct xilinx_vdma_tx_segment, node);
> +		tail = list_entry(desc->segments.prev,
> +				  struct xilinx_vdma_tx_segment, node);
> +
> +		vdma_ctrl_write(chan, XILINX_VDMA_REG_CURDESC, head->phys);
> +	}
> +
> +	/* Configure the hardware using info in the config structure */
> +	reg = vdma_ctrl_read(chan, XILINX_VDMA_REG_DMACR);
> +
> +	if (config->frm_cnt_en)
> +		reg |= XILINX_VDMA_DMACR_FRAMECNT_EN;
> +	else
> +		reg &= ~XILINX_VDMA_DMACR_FRAMECNT_EN;
> +
> +	/*
> +	 * With SG, start with circular mode, so that BDs can be fetched.
> +	 * In direct register mode, if not parking, enable circular mode
> +	 */
> +	if (chan->has_sg || !config->park)
> +		reg |= XILINX_VDMA_DMACR_CIRC_EN;
> +
> +	if (config->park)
> +		reg &= ~XILINX_VDMA_DMACR_CIRC_EN;
> +
> +	vdma_ctrl_write(chan, XILINX_VDMA_REG_DMACR, reg);
> +
> +	if (config->park && (config->park_frm >= 0) &&
> +			(config->park_frm < chan->num_frms)) {
> +		if (chan->direction == DMA_MEM_TO_DEV)
> +			vdma_write(chan, XILINX_VDMA_REG_PARK_PTR,
> +				config->park_frm <<
> +					XILINX_VDMA_PARK_PTR_RD_REF_SHIFT);
> +		else
> +			vdma_write(chan, XILINX_VDMA_REG_PARK_PTR,
> +				config->park_frm <<
> +					XILINX_VDMA_PARK_PTR_WR_REF_SHIFT);
> +	}
> +
> +	/* Start the hardware */
> +	xilinx_vdma_start(chan);
> +
> +	if (chan->err)
> +		goto out_unlock;
> +
> +	/* Start the transfer */
> +	if (chan->has_sg) {
> +		vdma_ctrl_write(chan, XILINX_VDMA_REG_TAILDESC, tail->phys);
> +	} else {
> +		struct xilinx_vdma_tx_segment *segment;
> +		int i = 0;
> +
> +		list_for_each_entry(segment, &desc->segments, node)
> +			vdma_desc_write(chan,
> +					XILINX_VDMA_REG_START_ADDRESS(i++),
> +					segment->hw.buf_addr);
> +
> +		vdma_desc_write(chan, XILINX_VDMA_REG_HSIZE, config->hsize);
> +		vdma_desc_write(chan, XILINX_VDMA_REG_FRMDLY_STRIDE,
> +				(config->frm_dly <<
> +				 XILINX_VDMA_FRMDLY_STRIDE_FRMDLY_SHIFT) |
> +				(config->stride <<
> +				 XILINX_VDMA_FRMDLY_STRIDE_STRIDE_SHIFT));
> +		vdma_desc_write(chan, XILINX_VDMA_REG_VSIZE, config->vsize);
> +	}
> +
> +	list_del(&desc->node);
> +	chan->active_desc = desc;
> +
> +out_unlock:
> +	spin_unlock_irqrestore(&chan->lock, flags);
> +}
> +
> +/**
> + * xilinx_vdma_issue_pending - Issue pending transactions
> + * @dchan: DMA channel
> + */
> +static void xilinx_vdma_issue_pending(struct dma_chan *dchan)
> +{
> +	struct xilinx_vdma_chan *chan = to_xilinx_chan(dchan);
> +
> +	xilinx_vdma_start_transfer(chan);
> +}
> +
> +/**
> + * xilinx_vdma_complete_descriptor - Mark the active descriptor as complete
> + * @chan : xilinx DMA channel
> + *
> + * CONTEXT: hardirq
> + */
> +static void xilinx_vdma_complete_descriptor(struct xilinx_vdma_chan *chan)
> +{
> +	struct xilinx_vdma_tx_descriptor *desc;
> +	unsigned long flags;
> +
> +	spin_lock_irqsave(&chan->lock, flags);
> +
> +	desc = chan->active_desc;
> +	if (!desc) {
> +		dev_dbg(chan->dev, "no running descriptors\n");
> +		goto out_unlock;
> +	}
> +
> +	list_add_tail(&desc->node, &chan->done_list);
> +
> +	/* Update the completed cookie and reset the active descriptor. */
> +	chan->completed_cookie = desc->async_tx.cookie;
> +	chan->active_desc = NULL;
> +
> +out_unlock:
> +	spin_unlock_irqrestore(&chan->lock, flags);
> +}
> +
> +/**
> + * xilinx_vdma_reset - Reset VDMA channel
> + * @chan: Driver specific VDMA channel
> + *
> + * Return: '0' on success and failure value on error
> + */
> +static int xilinx_vdma_reset(struct xilinx_vdma_chan *chan)
> +{
> +	int loop = XILINX_VDMA_LOOP_COUNT + 1;
> +	u32 tmp;
> +
> +	vdma_ctrl_set(chan, XILINX_VDMA_REG_DMACR, XILINX_VDMA_DMACR_RESET);
> +
> +	tmp = vdma_ctrl_read(chan, XILINX_VDMA_REG_DMACR) &
> +		XILINX_VDMA_DMACR_RESET;
> +
> +	/* Wait for the hardware to finish reset */
> +	while (loop-- && tmp)
> +		tmp = vdma_ctrl_read(chan, XILINX_VDMA_REG_DMACR) &
> +			XILINX_VDMA_DMACR_RESET;
> +
> +	if (!loop) {
> +		dev_err(chan->dev, "reset timeout, cr %x, sr %x\n",
> +			vdma_ctrl_read(chan, XILINX_VDMA_REG_DMACR),
> +			vdma_ctrl_read(chan, XILINX_VDMA_REG_DMASR));
> +		return -ETIMEDOUT;
> +	}
> +
> +	chan->err = false;
> +
> +	return 0;
> +}
> +
> +/**
> + * xilinx_vdma_chan_reset - Reset VDMA channel and enable interrupts
> + * @chan: Driver specific VDMA channel
> + *
> + * Return: '0' on success and failure value on error
> + */
> +static int xilinx_vdma_chan_reset(struct xilinx_vdma_chan *chan)
> +{
> +	int err;
> +
> +	/* Reset VDMA */
> +	err = xilinx_vdma_reset(chan);
> +	if (err)
> +		return err;
> +
> +	/* Enable interrupts */
> +	vdma_ctrl_set(chan, XILINX_VDMA_REG_DMACR,
> +		      XILINX_VDMA_DMAXR_ALL_IRQ_MASK);
> +
> +	return 0;
> +}
> +
> +/**
> + * xilinx_vdma_irq_handler - VDMA Interrupt handler
> + * @irq: IRQ number
> + * @data: Pointer to the Xilinx VDMA channel structure
> + *
> + * Return: IRQ_HANDLED/IRQ_NONE
> + */
> +static irqreturn_t xilinx_vdma_irq_handler(int irq, void *data)
> +{
> +	struct xilinx_vdma_chan *chan = data;
> +	u32 status;
> +
> +	/* Read the status and ack the interrupts. */
> +	status = vdma_ctrl_read(chan, XILINX_VDMA_REG_DMASR);
> +	if (!(status & XILINX_VDMA_DMAXR_ALL_IRQ_MASK))
> +		return IRQ_NONE;
> +
> +	vdma_ctrl_write(chan, XILINX_VDMA_REG_DMASR,
> +			status & XILINX_VDMA_DMAXR_ALL_IRQ_MASK);
> +
> +	if (status & XILINX_VDMA_DMASR_ERR_IRQ) {
> +		/*
> +		 * An error occurred. If C_FLUSH_ON_FSYNC is enabled and the
> +		 * error is recoverable, ignore it. Otherwise flag the error.
> +		 *
> +		 * Only recoverable errors can be cleared in the DMASR register,
> +		 * make sure not to write to other error bits to 1.
> +		 */
> +		u32 errors = status & XILINX_VDMA_DMASR_ALL_ERR_MASK;
> +		vdma_ctrl_write(chan, XILINX_VDMA_REG_DMASR,
> +				errors & XILINX_VDMA_DMASR_ERR_RECOVER_MASK);
> +
> +		if (!chan->flush_on_fsync ||
> +		    (errors & ~XILINX_VDMA_DMASR_ERR_RECOVER_MASK)) {
> +			dev_err(chan->dev,
> +				"Channel %p has errors %x, cdr %x tdr %x\n",
> +				chan, errors,
> +				vdma_ctrl_read(chan, XILINX_VDMA_REG_CURDESC),
> +				vdma_ctrl_read(chan, XILINX_VDMA_REG_TAILDESC));
> +			chan->err = true;
> +		}
> +	}
> +
> +	if (status & XILINX_VDMA_DMASR_DLY_CNT_IRQ) {
> +		/*
> +		 * Device takes too long to do the transfer when user requires
> +		 * responsiveness.
> +		 */
> +		dev_dbg(chan->dev, "Inter-packet latency too long\n");
> +	}
> +
> +	if (status & XILINX_VDMA_DMASR_FRM_CNT_IRQ) {
> +		xilinx_vdma_complete_descriptor(chan);
> +		xilinx_vdma_start_transfer(chan);
> +	}
> +
> +	tasklet_schedule(&chan->tasklet);
> +	return IRQ_HANDLED;
> +}
> +
> +/**
> + * xilinx_vdma_tx_submit - Submit DMA transaction
> + * @tx: Async transaction descriptor
> + *
> + * Return: cookie value on success and failure value on error
> + */
> +static dma_cookie_t xilinx_vdma_tx_submit(struct dma_async_tx_descriptor *tx)
> +{
> +	struct xilinx_vdma_tx_descriptor *desc = to_vdma_tx_descriptor(tx);
> +	struct xilinx_vdma_chan *chan = to_xilinx_chan(tx->chan);
> +	struct xilinx_vdma_tx_segment *segment;
> +	dma_cookie_t cookie;
> +	unsigned long flags;
> +	int err;
> +
> +	if (chan->err) {
> +		/*
> +		 * If reset fails, need to hard reset the system.
> +		 * Channel is no longer functional
> +		 */
> +		err = xilinx_vdma_chan_reset(chan);
> +		if (err < 0)
> +			return err;
> +	}
> +
> +	spin_lock_irqsave(&chan->lock, flags);
> +
> +	/* Assign cookies to all of the segments that make up this transaction.
> +	 * Use the cookie of the last segment as the transaction cookie.
> +	 */

Keep style of multiline comment the same over the code.

> +	cookie = chan->cookie;
> +
> +	list_for_each_entry(segment, &desc->segments, node) {
> +		if (cookie < DMA_MAX_COOKIE)
> +			cookie++;
> +		else
> +			cookie = DMA_MIN_COOKIE;
> +
> +		segment->cookie = cookie;
> +	}
> +
> +	tx->cookie = cookie;
> +	chan->cookie = cookie;
> +
> +	/* Append the transaction to the pending transactions queue. */
> +	list_add_tail(&desc->node, &chan->pending_list);
> +
> +	spin_unlock_irqrestore(&chan->lock, flags);
> +
> +	return cookie;
> +}
> +
> +/**
> + * xilinx_vdma_prep_slave_sg - prepare a descriptor for a DMA_SLAVE transaction
> + * @dchan: DMA channel
> + * @sgl: scatterlist to transfer to/from
> + * @sg_len: number of entries in @sgl
> + * @dir: DMA direction
> + * @flags: transfer ack flags
> + * @context: unused
> + *
> + * Return: Async transaction descriptor on success and NULL on failure
> + */
> +static struct dma_async_tx_descriptor *
> +xilinx_vdma_prep_slave_sg(struct dma_chan *dchan, struct scatterlist *sgl,
> +			  unsigned int sg_len, enum dma_transfer_direction dir,
> +			  unsigned long flags, void *context)
> +{
> +	struct xilinx_vdma_chan *chan = to_xilinx_chan(dchan);
> +	struct xilinx_vdma_tx_descriptor *desc;
> +	struct xilinx_vdma_tx_segment *segment;
> +	struct xilinx_vdma_tx_segment *prev = NULL;
> +	struct scatterlist *sg;
> +	int i;
> +
> +	if (chan->direction != dir || sg_len == 0)
> +		return NULL;
> +
> +	/* Enforce one sg entry for one frame. */
> +	if (sg_len != chan->num_frms) {
> +		dev_err(chan->dev,
> +		"number of entries %d not the same as num stores %d\n",
> +			sg_len, chan->num_frms);
> +		return NULL;
> +	}
> +
> +	/* Allocate a transaction descriptor. */
> +	desc = xilinx_vdma_alloc_tx_descriptor(chan);
> +	if (!desc)
> +		return NULL;
> +
> +	dma_async_tx_descriptor_init(&desc->async_tx, &chan->common);
> +	desc->async_tx.tx_submit = xilinx_vdma_tx_submit;
> +	desc->async_tx.cookie = 0;
> +	async_tx_ack(&desc->async_tx);
> +
> +	/* Build the list of transaction segments. */
> +	for_each_sg(sgl, sg, sg_len, i) {
> +		struct xilinx_vdma_desc_hw *hw;
> +
> +		/* Allocate the link descriptor from DMA pool */
> +		segment = xilinx_vdma_alloc_tx_segment(chan);
> +		if (!segment)
> +			goto error;
> +
> +		/* Fill in the hardware descriptor */
> +		hw = &segment->hw;
> +		hw->buf_addr = sg_dma_address(sg);
> +		hw->vsize = chan->config.vsize;
> +		hw->hsize = chan->config.hsize;
> +		hw->stride = (chan->config.frm_dly <<
> +			      XILINX_VDMA_FRMDLY_STRIDE_FRMDLY_SHIFT) |
> +			     (chan->config.stride <<
> +			      XILINX_VDMA_FRMDLY_STRIDE_STRIDE_SHIFT);
> +		if (prev)
> +			prev->hw.next_desc = segment->phys;
> +
> +		/* Insert the segment into the descriptor segments list. */
> +		list_add_tail(&segment->node, &desc->segments);
> +
> +		prev = segment;
> +	}
> +
> +	/* Link the last hardware descriptor with the first. */
> +	segment = list_first_entry(&desc->segments,
> +				   struct xilinx_vdma_tx_segment, node);
> +	prev->hw.next_desc = segment->phys;
> +
> +	return &desc->async_tx;
> +
> +error:
> +	xilinx_vdma_free_tx_descriptor(chan, desc);
> +	return NULL;
> +}
> +
> +/**
> + * xilinx_vdma_terminate_all - Halt the channel and free descriptors
> + * @chan: Driver specific VDMA Channel pointer
> + */
> +static void xilinx_vdma_terminate_all(struct xilinx_vdma_chan *chan)
> +{
> +	/* Halt the DMA engine */
> +	xilinx_vdma_halt(chan);
> +
> +	/* Remove and free all of the descriptors in the lists */
> +	xilinx_vdma_free_descriptors(chan);
> +}
> +
> +/**
> + * xilinx_vdma_slave_config - Configure VDMA channel
> + * Run-time configuration for Axi VDMA, supports:
> + * . halt the channel
> + * . configure interrupt coalescing and inter-packet delay threshold
> + * . start/stop parking
> + * . enable genlock
> + * . set transfer information using config struct
> + *
> + * @chan: Driver specific VDMA Channel pointer
> + * @cfg: Channel configuration pointer
> + *
> + * Return: '0' on success and failure value on error
> + */
> +static int xilinx_vdma_slave_config(struct xilinx_vdma_chan *chan,
> +				    struct xilinx_vdma_config *cfg)
> +{
> +	u32 dmacr;
> +
> +	if (cfg->reset)
> +		return xilinx_vdma_chan_reset(chan);
> +
> +	dmacr = vdma_ctrl_read(chan, XILINX_VDMA_REG_DMACR);
> +
> +	/* If vsize is -1, it is park-related operations */
> +	if (cfg->vsize == -1) {
> +		if (cfg->park)
> +			dmacr &= ~XILINX_VDMA_DMACR_CIRC_EN;
> +		else
> +			dmacr |= XILINX_VDMA_DMACR_CIRC_EN;
> +
> +		vdma_ctrl_write(chan, XILINX_VDMA_REG_DMACR, dmacr);
> +		return 0;
> +	}
> +
> +	/* If hsize is -1, it is interrupt threshold settings */
> +	if (cfg->hsize == -1) {
> +		if (cfg->coalesc <= XILINX_VDMA_DMACR_FRAME_COUNT_MAX) {
> +			dmacr &= ~XILINX_VDMA_DMACR_FRAME_COUNT_MASK;
> +			dmacr |= cfg->coalesc <<
> +				 XILINX_VDMA_DMACR_FRAME_COUNT_SHIFT;
> +			chan->config.coalesc = cfg->coalesc;
> +		}
> +
> +		if (cfg->delay <= XILINX_VDMA_DMACR_DELAY_MAX) {
> +			dmacr &= ~XILINX_VDMA_DMACR_DELAY_MASK;
> +			dmacr |= cfg->delay << XILINX_VDMA_DMACR_DELAY_SHIFT;
> +			chan->config.delay = cfg->delay;
> +		}
> +
> +		vdma_ctrl_write(chan, XILINX_VDMA_REG_DMACR, dmacr);
> +		return 0;
> +	}
> +
> +	/* Transfer information */
> +	chan->config.vsize = cfg->vsize;
> +	chan->config.hsize = cfg->hsize;
> +	chan->config.stride = cfg->stride;
> +	chan->config.frm_dly = cfg->frm_dly;
> +	chan->config.park = cfg->park;
> +
> +	/* genlock settings */
> +	chan->config.gen_lock = cfg->gen_lock;
> +	chan->config.master = cfg->master;
> +
> +	if (cfg->gen_lock && chan->genlock) {
> +		dmacr |= XILINX_VDMA_DMACR_GENLOCK_EN;
> +		dmacr |= cfg->master << XILINX_VDMA_DMACR_MASTER_SHIFT;
> +	}
> +
> +	chan->config.frm_cnt_en = cfg->frm_cnt_en;
> +	if (cfg->park)
> +		chan->config.park_frm = cfg->park_frm;
> +	else
> +		chan->config.park_frm = -1;
> +
> +	chan->config.coalesc = cfg->coalesc;
> +	chan->config.delay = cfg->delay;
> +	if (cfg->coalesc <= XILINX_VDMA_DMACR_FRAME_COUNT_MAX) {
> +		dmacr |= cfg->coalesc << XILINX_VDMA_DMACR_FRAME_COUNT_SHIFT;
> +		chan->config.coalesc = cfg->coalesc;
> +	}
> +
> +	if (cfg->delay <= XILINX_VDMA_DMACR_DELAY_MAX) {
> +		dmacr |= cfg->delay << XILINX_VDMA_DMACR_DELAY_SHIFT;
> +		chan->config.delay = cfg->delay;
> +	}
> +
> +	/* FSync Source selection */
> +	dmacr &= ~XILINX_VDMA_DMACR_FSYNCSRC_MASK;
> +	dmacr |= cfg->ext_fsync << XILINX_VDMA_DMACR_FSYNCSRC_SHIFT;
> +
> +	vdma_ctrl_write(chan, XILINX_VDMA_REG_DMACR, dmacr);
> +	return 0;
> +}
> +
> +/**
> + * xilinx_vdma_device_control - Configure DMA channel of the device
> + * @dchan: DMA Channel pointer
> + * @cmd: DMA control command
> + * @arg: Channel configuration
> + *
> + * Return: '0' on success and failure value on error
> + */
> +static int xilinx_vdma_device_control(struct dma_chan *dchan,
> +				      enum dma_ctrl_cmd cmd, unsigned long arg)
> +{
> +	struct xilinx_vdma_chan *chan = to_xilinx_chan(dchan);
> +
> +	switch (cmd) {
> +	case DMA_TERMINATE_ALL:
> +		xilinx_vdma_terminate_all(chan);
> +		return 0;
> +	case DMA_SLAVE_CONFIG:
> +		return xilinx_vdma_slave_config(chan,
> +					(struct xilinx_vdma_config *)arg);
> +	default:
> +		return -ENXIO;
> +	}
> +}
> +
> +/* -----------------------------------------------------------------------------
> + * Probe and remove
> + */
> +
> +/**
> + * xilinx_vdma_chan_remove - Per Channel remove function
> + * @chan: Driver specific VDMA channel
> + */
> +static void xilinx_vdma_chan_remove(struct xilinx_vdma_chan *chan)
> +{
> +	/* Disable all interrupts */
> +	vdma_ctrl_clr(chan, XILINX_VDMA_REG_DMACR,
> +		      XILINX_VDMA_DMAXR_ALL_IRQ_MASK);
> +
> +	list_del(&chan->common.device_node);
> +}
> +
> +/**
> + * xilinx_vdma_chan_probe - Per Channel Probing
> + * It get channel features from the device tree entry and
> + * initialize special channel handling routines
> + *
> + * @xdev: Driver specific device structure
> + * @node: Device node
> + *
> + * Return: '0' on success and failure value on error
> + */
> +static int xilinx_vdma_chan_probe(struct xilinx_vdma_device *xdev,
> +				  struct device_node *node)
> +{
> +	struct xilinx_vdma_chan *chan;
> +	bool has_dre = false;
> +	u32 value;
> +	int err;
> +
> +	/* Allocate and initialize the channel structure */
> +	chan = devm_kzalloc(xdev->dev, sizeof(*chan), GFP_KERNEL);
> +	if (!chan)
> +		return -ENOMEM;
> +
> +	chan->dev = xdev->dev;
> +	chan->xdev = xdev;
> +	chan->has_sg = xdev->has_sg;
> +
> +	spin_lock_init(&chan->lock);
> +	INIT_LIST_HEAD(&chan->pending_list);
> +	INIT_LIST_HEAD(&chan->done_list);
> +
> +	/* Retrieve the channel properties from the device tree */
> +	has_dre = of_property_read_bool(node, "xlnx,include-dre");
> +
> +	chan->genlock = of_property_read_bool(node, "xlnx,genlock-mode");
> +
> +	err = of_property_read_u32(node, "xlnx,datawidth", &value);
> +	if (!err) {
> +		u32 width = value >> 3; /* Convert bits to bytes */
> +
> +		/* If data width is greater than 8 bytes, DRE is not in hw */
> +		if (width > 8)
> +			has_dre = false;
> +
> +		if (!has_dre)
> +			xdev->common.copy_align = fls(width - 1);
> +	} else {
> +		dev_err(xdev->dev, "missing xlnx,datawidth property\n");
> +		return err;
> +	}
> +
> +	if (of_device_is_compatible(node, "xlnx,axi-vdma-mm2s-channel")) {
> +		chan->direction = DMA_MEM_TO_DEV;
> +		chan->id = 0;
> +
> +		chan->ctrl_offset = XILINX_VDMA_MM2S_CTRL_OFFSET;
> +		chan->desc_offset = XILINX_VDMA_MM2S_DESC_OFFSET;
> +
> +		if (xdev->flush_on_fsync == XILINX_VDMA_FLUSH_BOTH ||
> +		    xdev->flush_on_fsync == XILINX_VDMA_FLUSH_MM2S)
> +			chan->flush_on_fsync = true;
> +	} else if (of_device_is_compatible(node,
> +					    "xlnx,axi-vdma-s2mm-channel")) {
> +		chan->direction = DMA_DEV_TO_MEM;
> +		chan->id = 1;
> +
> +		chan->ctrl_offset = XILINX_VDMA_S2MM_CTRL_OFFSET;
> +		chan->desc_offset = XILINX_VDMA_S2MM_DESC_OFFSET;
> +
> +		if (xdev->flush_on_fsync == XILINX_VDMA_FLUSH_BOTH ||
> +		    xdev->flush_on_fsync == XILINX_VDMA_FLUSH_S2MM)
> +			chan->flush_on_fsync = true;
> +	} else {
> +		dev_err(xdev->dev, "Invalid channel compatible node\n");
> +		return -EINVAL;
> +	}
> +
> +	/* Request the interrupt */
> +	chan->irq = irq_of_parse_and_map(node, 0);
> +	err = devm_request_irq(xdev->dev, chan->irq, xilinx_vdma_irq_handler,
> +			       IRQF_SHARED, "xilinx-vdma-controller", chan);
> +	if (err) {
> +		dev_err(xdev->dev, "unable to request IRQ\n");
> +		return err;
> +	}
> +
> +	/* Initialize the DMA channel and add it to the DMA engine channels
> +	 * list.
> +	 */
> +	chan->common.device = &xdev->common;
> +
> +	list_add_tail(&chan->common.device_node, &xdev->common.channels);
> +	xdev->chan[chan->id] = chan;
> +
> +	/* Reset the channel */
> +	err = xilinx_vdma_chan_reset(chan);
> +	if (err < 0) {
> +		dev_err(xdev->dev, "Reset channel failed\n");
> +		return err;
> +	}
> +
> +	return 0;
> +}
> +
> +/**
> + * struct of_dma_filter_xilinx_args - Channel filter args
> + * @dev: DMA device structure
> + * @chan_id: Channel id
> + */
> +struct of_dma_filter_xilinx_args {
> +	struct dma_device *dev;
> +	u32 chan_id;
> +};
> +
> +/**
> + * xilinx_vdma_dt_filter - VDMA channel filter function
> + * @chan: DMA channel pointer
> + * @param: Filter match value
> + *
> + * Return: true/false based on the result
> + */
> +static bool xilinx_vdma_dt_filter(struct dma_chan *chan, void *param)
> +{
> +	struct of_dma_filter_xilinx_args *args = param;
> +
> +	return chan->device == args->dev && chan->chan_id == args->chan_id;
> +}
> +
> +/**
> + * of_dma_xilinx_xlate - Translation function
> + * @dma_spec: Pointer to DMA specifier as found in the device tree
> + * @ofdma: Pointer to DMA controller data
> + *
> + * Return: DMA channel pointer on success and NULL on error
> + */
> +static struct dma_chan *of_dma_xilinx_xlate(struct of_phandle_args *dma_spec,
> +						struct of_dma *ofdma)
> +{
> +	struct of_dma_filter_xilinx_args args;
> +	dma_cap_mask_t cap;
> +
> +	args.dev = ofdma->of_dma_data;
> +	if (!args.dev)
> +		return NULL;
> +
> +	if (dma_spec->args_count != 1)
> +		return NULL;
> +
> +	dma_cap_zero(cap);
> +	dma_cap_set(DMA_SLAVE, cap);
> +
> +	args.chan_id = dma_spec->args[0];
> +
> +	return dma_request_channel(cap, xilinx_vdma_dt_filter, &args);
> +}
> +
> +/**
> + * xilinx_vdma_probe - Driver probe function
> + * @pdev: Pointer to the platform_device structure
> + *
> + * Return: '0' on success and failure value on error
> + */
> +static int xilinx_vdma_probe(struct platform_device *pdev)
> +{
> +	struct device_node *node = pdev->dev.of_node;
> +	struct xilinx_vdma_device *xdev;
> +	struct device_node *child;
> +	struct resource *io;
> +	u32 num_frames;
> +	int i, err;
> +
> +	dev_info(&pdev->dev, "Probing xilinx axi vdma engine\n");
> +
> +	/* Allocate and initialize the DMA engine structure */
> +	xdev = devm_kzalloc(&pdev->dev, sizeof(*xdev), GFP_KERNEL);
> +	if (!xdev)
> +		return -ENOMEM;
> +
> +	xdev->dev = &pdev->dev;
> +
> +	/* Request and map I/O memory */
> +	io = platform_get_resource(pdev, IORESOURCE_MEM, 0);
> +	xdev->regs = devm_ioremap_resource(&pdev->dev, io);
> +	if (IS_ERR(xdev->regs))
> +		return PTR_ERR(xdev->regs);
> +
> +	/* Retrieve the DMA engine properties from the device tree */
> +	xdev->has_sg = of_property_read_bool(node, "xlnx,include-sg");
> +
> +	err = of_property_read_u32(node, "xlnx,num-fstores", &num_frames);
> +	if (err < 0) {
> +		dev_err(xdev->dev, "missing xlnx,num-fstores property\n");
> +		return err;
> +	}
> +
> +	of_property_read_u32(node, "xlnx,flush-fsync", &xdev->flush_on_fsync);
> +
> +	/* Initialize the DMA engine */
> +	xdev->common.dev = &pdev->dev;
> +
> +	INIT_LIST_HEAD(&xdev->common.channels);
> +	dma_cap_set(DMA_SLAVE, xdev->common.cap_mask);
> +	dma_cap_set(DMA_PRIVATE, xdev->common.cap_mask);
> +
> +	xdev->common.device_alloc_chan_resources =
> +				xilinx_vdma_alloc_chan_resources;
> +	xdev->common.device_free_chan_resources =
> +				xilinx_vdma_free_chan_resources;
> +	xdev->common.device_prep_slave_sg = xilinx_vdma_prep_slave_sg;
> +	xdev->common.device_control = xilinx_vdma_device_control;
> +	xdev->common.device_tx_status = xilinx_vdma_tx_status;
> +	xdev->common.device_issue_pending = xilinx_vdma_issue_pending;
> +
> +	platform_set_drvdata(pdev, xdev);
> +
> +	/* Initialize the channels */
> +	for_each_child_of_node(node, child) {
> +		err = xilinx_vdma_chan_probe(xdev, child);
> +		if (err < 0)
> +			goto error;
> +	}
> +
> +	for (i = 0; i < XILINX_VDMA_MAX_CHANS_PER_DEVICE; i++) {
> +		if (xdev->chan[i])
> +			xdev->chan[i]->num_frms = num_frames;
> +	}
> +
> +	/* Register the DMA engine with the core */
> +	dma_async_device_register(&xdev->common);
> +
> +	err = of_dma_controller_register(node, of_dma_xilinx_xlate,
> +					 &xdev->common);
> +	if (err < 0) {
> +		dev_err(&pdev->dev, "Unable to register DMA to DT\n");
> +		dma_async_device_unregister(&xdev->common);
> +		goto error;
> +	}
> +
> +	return 0;
> +
> +error:
> +	for (i = 0; i < XILINX_VDMA_MAX_CHANS_PER_DEVICE; i++) {
> +		if (xdev->chan[i])
> +			xilinx_vdma_chan_remove(xdev->chan[i]);
> +	}
> +
> +	return err;
> +}
> +
> +/**
> + * xilinx_vdma_remove - Driver remove function
> + * @pdev: Pointer to the platform_device structure
> + *
> + * Return: Always '0'
> + */
> +static int xilinx_vdma_remove(struct platform_device *pdev)
> +{
> +	struct xilinx_vdma_device *xdev;
> +	int i;
> +
> +	of_dma_controller_free(pdev->dev.of_node);
> +
> +	xdev = platform_get_drvdata(pdev);

You could move this assignment to a variables block, it's normal
practice.

> +	dma_async_device_unregister(&xdev->common);
> +
> +	for (i = 0; i < XILINX_VDMA_MAX_CHANS_PER_DEVICE; i++) {
> +		if (xdev->chan[i])
> +			xilinx_vdma_chan_remove(xdev->chan[i]);
> +	}
> +
> +	return 0;
> +}
> +
> +static const struct of_device_id xilinx_vdma_of_ids[] = {
> +	{ .compatible = "xlnx,axi-vdma-1.00.a",},
> +	{}
> +};
> +
> +static struct platform_driver xilinx_vdma_driver = {
> +	.driver = {
> +		.name = "xilinx-vdma",
> +		.owner = THIS_MODULE,
> +		.of_match_table = xilinx_vdma_of_ids,
> +	},
> +	.probe = xilinx_vdma_probe,
> +	.remove = xilinx_vdma_remove,
> +};
> +
> +module_platform_driver(xilinx_vdma_driver);
> +
> +MODULE_AUTHOR("Xilinx, Inc.");
> +MODULE_DESCRIPTION("Xilinx VDMA driver");
> +MODULE_LICENSE("GPL v2");
> diff --git a/include/linux/amba/xilinx_dma.h b/include/linux/amba/xilinx_dma.h
> new file mode 100644
> index 0000000..48a8c8b
> --- /dev/null
> +++ b/include/linux/amba/xilinx_dma.h
> @@ -0,0 +1,50 @@
> +/*
> + * Xilinx DMA Engine drivers support header file
> + *
> + * Copyright (C) 2010-2014 Xilinx, Inc. All rights reserved.
> + *
> + * This is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License as published by
> + * the Free Software Foundation; either version 2 of the License, or
> + * (at your option) any later version.
> + */
> +
> +#ifndef __DMA_XILINX_DMA_H
> +#define __DMA_XILINX_DMA_H
> +
> +#include <linux/dma-mapping.h>
> +#include <linux/dmaengine.h>
> +
> +/**
> + * struct xilinx_vdma_config - VDMA Configuration structure
> + * @vsize: Vertical size
> + * @hsize: Horizontal size
> + * @stride: Stride
> + * @frm_dly: Frame delay
> + * @gen_lock: Whether in gen-lock mode
> + * @master: Master that it syncs to
> + * @frm_cnt_en: Enable frame count enable
> + * @park: Whether wants to park
> + * @park_frm: Frame to park on
> + * @coalesc: Interrupt coalescing threshold
> + * @delay: Delay counter
> + * @reset: Reset Channel
> + * @ext_fsync: External Frame Sync source
> + */
> +struct xilinx_vdma_config {
> +	int vsize;
> +	int hsize;
> +	int stride;
> +	int frm_dly;
> +	int gen_lock;
> +	int master;
> +	int frm_cnt_en;
> +	int park;
> +	int park_frm;
> +	int coalesc;
> +	int delay;
> +	int reset;
> +	int ext_fsync;
> +};
> +
> +#endif
Andy Shevchenko Jan. 23, 2014, 1:38 p.m. UTC | #4
On Thu, 2014-01-23 at 12:25 +0100, Lars-Peter Clausen wrote:
> On 01/22/2014 05:52 PM, Srikanth Thokala wrote:


[...]

> > +	/* Request the interrupt */

> > +	chan->irq = irq_of_parse_and_map(node, 0);

> > +	err = devm_request_irq(xdev->dev, chan->irq, xilinx_vdma_irq_handler,

> > +			       IRQF_SHARED, "xilinx-vdma-controller", chan);

> 

> This is a clasic example of where to not use devm_request_irq. 'chan' is

> accessed in the interrupt handler, but if you use devm_request_irq 'chan'

> will be freed before the interrupt handler has been released, which means

> there is now a race condition where the interrupt handler can access already

> freed memory.


Could you elaborate this case? As far as I understood managed resources
are a kind of stack pile. In this case you have no such condition. Where
am I wrong?


-- 
Andy Shevchenko <andriy.shevchenko@intel.com>
Intel Finland Oy
---------------------------------------------------------------------
Intel Finland Oy
Registered Address: PL 281, 00181 Helsinki 
Business Identity Code: 0357606 - 4 
Domiciled in Helsinki 

This e-mail and any attachments may contain confidential material for
the sole use of the intended recipient(s). Any review or distribution
by others is strictly prohibited. If you are not the intended
recipient, please contact the sender and delete all copies.
Lars-Peter Clausen Jan. 23, 2014, 1:50 p.m. UTC | #5
On 01/23/2014 02:38 PM, Shevchenko, Andriy wrote:
> On Thu, 2014-01-23 at 12:25 +0100, Lars-Peter Clausen wrote:
>> On 01/22/2014 05:52 PM, Srikanth Thokala wrote:
> 
> [...]
> 
>>> +	/* Request the interrupt */
>>> +	chan->irq = irq_of_parse_and_map(node, 0);
>>> +	err = devm_request_irq(xdev->dev, chan->irq, xilinx_vdma_irq_handler,
>>> +			       IRQF_SHARED, "xilinx-vdma-controller", chan);
>>
>> This is a clasic example of where to not use devm_request_irq. 'chan' is
>> accessed in the interrupt handler, but if you use devm_request_irq 'chan'
>> will be freed before the interrupt handler has been released, which means
>> there is now a race condition where the interrupt handler can access already
>> freed memory.ta
> 
> Could you elaborate this case? As far as I understood managed resources
> are a kind of stack pile. In this case you have no such condition. Where
> am I wrong?

The stacked stuff is only ran after the remove() function. Which means that
you call dma_async_device_unregister() before the interrupt handler is
freed. Another issue with the interrupt handler is a bit hidden. The driver
does not call tasklet_kill in the remove function. Which it should though to
make sure that the tasklet does not race against the freeing of the memory.
And in order to make sure that the tasklet is not rescheduled you need to
free the irq before killing the tasklet, since the interrupt handler
schedules the tasklet.

- Lars

--
To unsubscribe from this list: send the line "unsubscribe dmaengine" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Andy Shevchenko Jan. 23, 2014, 2 p.m. UTC | #6
On Thu, 2014-01-23 at 14:50 +0100, Lars-Peter Clausen wrote:
> On 01/23/2014 02:38 PM, Shevchenko, Andriy wrote:
> > On Thu, 2014-01-23 at 12:25 +0100, Lars-Peter Clausen wrote:
> >> On 01/22/2014 05:52 PM, Srikanth Thokala wrote:
> > 
> > [...]
> > 
> >>> +	/* Request the interrupt */
> >>> +	chan->irq = irq_of_parse_and_map(node, 0);
> >>> +	err = devm_request_irq(xdev->dev, chan->irq, xilinx_vdma_irq_handler,
> >>> +			       IRQF_SHARED, "xilinx-vdma-controller", chan);
> >>
> >> This is a clasic example of where to not use devm_request_irq. 'chan' is
> >> accessed in the interrupt handler, but if you use devm_request_irq 'chan'
> >> will be freed before the interrupt handler has been released, which means
> >> there is now a race condition where the interrupt handler can access already
> >> freed memory.ta
> > 
> > Could you elaborate this case? As far as I understood managed resources
> > are a kind of stack pile. In this case you have no such condition. Where
> > am I wrong?
> 
> The stacked stuff is only ran after the remove() function. Which means that
> you call dma_async_device_unregister() before the interrupt handler is
> freed. Another issue with the interrupt handler is a bit hidden. The driver
> does not call tasklet_kill in the remove function. Which it should though to
> make sure that the tasklet does not race against the freeing of the memory.
> And in order to make sure that the tasklet is not rescheduled you need to
> free the irq before killing the tasklet, since the interrupt handler
> schedules the tasklet.

So, you mean devm_request_irq() will race in any DMA driver?

I think the proper solution is to disable all device work in
the .remove() and devm will care about resources.
> majordomo-info.html
Lars-Peter Clausen Jan. 23, 2014, 2:07 p.m. UTC | #7
On 01/23/2014 03:00 PM, Andy Shevchenko wrote:
> On Thu, 2014-01-23 at 14:50 +0100, Lars-Peter Clausen wrote:
>> On 01/23/2014 02:38 PM, Shevchenko, Andriy wrote:
>>> On Thu, 2014-01-23 at 12:25 +0100, Lars-Peter Clausen wrote:
>>>> On 01/22/2014 05:52 PM, Srikanth Thokala wrote:
>>>
>>> [...]
>>>
>>>>> +	/* Request the interrupt */
>>>>> +	chan->irq = irq_of_parse_and_map(node, 0);
>>>>> +	err = devm_request_irq(xdev->dev, chan->irq, xilinx_vdma_irq_handler,
>>>>> +			       IRQF_SHARED, "xilinx-vdma-controller", chan);
>>>>
>>>> This is a clasic example of where to not use devm_request_irq. 'chan' is
>>>> accessed in the interrupt handler, but if you use devm_request_irq 'chan'
>>>> will be freed before the interrupt handler has been released, which means
>>>> there is now a race condition where the interrupt handler can access already
>>>> freed memory.ta
>>>
>>> Could you elaborate this case? As far as I understood managed resources
>>> are a kind of stack pile. In this case you have no such condition. Where
>>> am I wrong?
>>
>> The stacked stuff is only ran after the remove() function. Which means that
>> you call dma_async_device_unregister() before the interrupt handler is
>> freed. Another issue with the interrupt handler is a bit hidden. The driver
>> does not call tasklet_kill in the remove function. Which it should though to
>> make sure that the tasklet does not race against the freeing of the memory.
>> And in order to make sure that the tasklet is not rescheduled you need to
>> free the irq before killing the tasklet, since the interrupt handler
>> schedules the tasklet.
> 
> So, you mean devm_request_irq() will race in any DMA driver?

Most likely yes. devm_request_irq() is race condition prone for the majority
of device driver. You have to be really careful if you want to use it.

> 
> I think the proper solution is to disable all device work in
> the .remove() and devm will care about resources.

As long as the interrupt handler is registered it can be called, the only
proper solution is to make sure that the order in which resources are torn
down is correct.

- Lars
--
To unsubscribe from this list: send the line "unsubscribe dmaengine" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Srikanth Thokala Jan. 23, 2014, 5:35 p.m. UTC | #8
Hi Levente,

On Thu, Jan 23, 2014 at 3:00 AM, Levente Kurusa <levex@linux.com> wrote:
> Hello,
>
> 2014/1/22 Srikanth Thokala <sthokal@xilinx.com>:
>> This is the driver for the AXI Video Direct Memory Access (AXI
>> VDMA) core, which is a soft Xilinx IP core that provides high-
>> bandwidth direct memory access between memory and AXI4-Stream
>> type video target peripherals. The core provides efficient two
>> dimensional DMA operations with independent asynchronous read
>> and write channel operation.
>>
>> This module works on Zynq (ARM Based SoC) and Microblaze platforms.
>>
>> Signed-off-by: Srikanth Thokala <sthokal@xilinx.com>
>> ---
>
> Another two remarks, after you fixed them ( or not :-) )
> you can have my:
>
> Reviewed-by: Levente Kurusa <levex@linux.com>
>
> Oh, and next time please if you post a patch that fixes something I pointed out,
> CC me as I had a hard time finding this patch, thanks. :-)

Sure. Thanks

>
>> NOTE:
>> 1. Created a separate directory 'dma/xilinx' as Xilinx has two more
>>    DMA IPs and we are also planning to upstream these drivers soon.
>> 2. Rebased on v3.13.0-rc8
>>
>> Changes in v2:
>> - Removed DMA Test client module from the patchset as suggested
>>   by Andy Shevchenko
>> - Removed device-id DT property, as suggested by Arnd Bergmann
>> - Properly documented DT bindings as suggested by Arnd Bergmann
>> - Returning with error, if registration of DMA to node fails
>> - Fixed typo errors
>> - Used BIT() macro at applicable places
>> - Added missing header file to the patchset
>> - Changed copyright year to include 2014
>> ---
>>  .../devicetree/bindings/dma/xilinx/xilinx_vdma.txt |   75 +
>>  drivers/dma/Kconfig                                |   14 +
>>  drivers/dma/Makefile                               |    1 +
>>  drivers/dma/xilinx/Makefile                        |    1 +
>>  drivers/dma/xilinx/xilinx_vdma.c                   | 1486 ++++++++++++++++++++
>>  include/linux/amba/xilinx_dma.h                    |   50 +
>>  6 files changed, 1627 insertions(+)
>>  create mode 100644 Documentation/devicetree/bindings/dma/xilinx/xilinx_vdma.txt
>>  create mode 100644 drivers/dma/xilinx/Makefile
>>  create mode 100644 drivers/dma/xilinx/xilinx_vdma.c
>>  create mode 100644 include/linux/amba/xilinx_dma.h
>>
>> diff --git a/Documentation/devicetree/bindings/dma/xilinx/xilinx_vdma.txt b/Documentation/devicetree/bindings/dma/xilinx/xilinx_vdma.txt
>> new file mode 100644
>> index 0000000..ab8be1a
>> --- /dev/null
>> +++ b/Documentation/devicetree/bindings/dma/xilinx/xilinx_vdma.txt
>> @@ -0,0 +1,75 @@
>> +Xilinx AXI VDMA engine, it does transfers between memory and video devices.
>> +It can be configured to have one channel or two channels. If configured
>> +as two channels, one is to transmit to the video device and another is
>> +to receive from the video device.
>> +
>> +Required properties:
>> +- compatible: Should be "xlnx,axi-vdma-1.00.a"
>> +- #dma-cells: Should be <1>, see "dmas" property below
>> +- reg: Should contain VDMA registers location and length.
>> +- xlnx,num-fstores: Should be the number of framebuffers as configured in h/w.
>> +- dma-channel child node: Should have atleast one channel and can have upto
>> +       two channels per device. This node specifies the properties of each
>> +       DMA channel (see child node properties below).
>> +
>> +Optional properties:
>> +- xlnx,include-sg: Tells whether configured for Scatter-mode in
>> +       the hardware.
>> [...]
>> +
>> +/**
>> + * xilinx_vdma_is_running - Check if VDMA channel is running
>> + * @chan: Driver specific VDMA channel
>> + *
>> + * Return: '1' if running, '0' if not.
>> + */
>> +static int xilinx_vdma_is_running(struct xilinx_vdma_chan *chan)
>> +{
>> +       return !(vdma_ctrl_read(chan, XILINX_VDMA_REG_DMASR) &
>> +                XILINX_VDMA_DMASR_HALTED) &&
>> +               (vdma_ctrl_read(chan, XILINX_VDMA_REG_DMACR) &
>> +                XILINX_VDMA_DMACR_RUNSTOP);
>> +}
>> +

[...]

>> +       /* Retrieve the channel properties from the device tree */
>> +       has_dre = of_property_read_bool(node, "xlnx,include-dre");
>> +
>> +       chan->genlock = of_property_read_bool(node, "xlnx,genlock-mode");
>> +
>> +       err = of_property_read_u32(node, "xlnx,datawidth", &value);
>> +       if (!err) {
>> +               u32 width = value >> 3; /* Convert bits to bytes */
>> +
>> +               /* If data width is greater than 8 bytes, DRE is not in hw */
>> +               if (width > 8)
>> +                       has_dre = false;
>> +
>> +               if (!has_dre)
>> +                       xdev->common.copy_align = fls(width - 1);
>> +       } else {
>> +               dev_err(xdev->dev, "missing xlnx,datawidth property\n");
>> +               return err;
>> +       }
>
> Can you please convert this to:
> if (err) {
>  dev_err(...);
>  return err;
> }
>
> That way we can avoid the else clause.

Ok. I will fix it in v3.

>> +
>> +       if (of_device_is_compatible(node, "xlnx,axi-vdma-mm2s-channel")) {
>> +               chan->direction = DMA_MEM_TO_DEV;
>> +               chan->id = 0;
>> +
>> +               chan->ctrl_offset = XILINX_VDMA_MM2S_CTRL_OFFSET;
>> +               chan->desc_offset = XILINX_VDMA_MM2S_DESC_OFFSET;
>> +
>> +               if (xdev->flush_on_fsync == XILINX_VDMA_FLUSH_BOTH ||
>> +                   xdev->flush_on_fsync == XILINX_VDMA_FLUSH_MM2S)
>> +                       chan->flush_on_fsync = true;
>> +       } else if (of_device_is_compatible(node,
>> +                                           "xlnx,axi-vdma-s2mm-channel")) {
>> +               chan->direction = DMA_DEV_TO_MEM;
>> +               chan->id = 1;
>> +
>> +               chan->ctrl_offset = XILINX_VDMA_S2MM_CTRL_OFFSET;
>> +               chan->desc_offset = XILINX_VDMA_S2MM_DESC_OFFSET;
>> +
>> +               if (xdev->flush_on_fsync == XILINX_VDMA_FLUSH_BOTH ||
>> +                   xdev->flush_on_fsync == XILINX_VDMA_FLUSH_S2MM)
>> +                       chan->flush_on_fsync = true;
>> +       } else {
>> +               dev_err(xdev->dev, "Invalid channel compatible node\n");
>> +               return -EINVAL;
>> +       }
>> +
>> +       /* Request the interrupt */
>> +       chan->irq = irq_of_parse_and_map(node, 0);
>> +       err = devm_request_irq(xdev->dev, chan->irq, xilinx_vdma_irq_handler,
>> +                              IRQF_SHARED, "xilinx-vdma-controller", chan);
>> +       if (err) {
>> +               dev_err(xdev->dev, "unable to request IRQ\n");
>
> It might be worth to also tell the IRQ number that failed
> to register.

Ok.

>
>> +               return err;
>> +       }
>> +
>> +       /* Initialize the DMA channel and add it to the DMA engine channels
>> +        * list.
>> +        */
>> +       chan->common.device = &xdev->common;
>> +
>> +       list_add_tail(&chan->common.device_node, &xdev->common.channels);

[...]

>> +       err = of_property_read_u32(node, "xlnx,num-fstores", &num_frames);
>> +       if (err < 0) {
>> +               dev_err(xdev->dev, "missing xlnx,num-fstores property\n");
>> +               return err;
>> +       }
>> +
>> +       of_property_read_u32(node, "xlnx,flush-fsync", &xdev->flush_on_fsync);
>
> Error check?

Sure, with a warning message as it is optional DT property.  I will
fix it in v3.

Srikanth

[...]
>> --
>
> --
> Regards,
> Levente Kurusa
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/
--
To unsubscribe from this list: send the line "unsubscribe dmaengine" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Srikanth Thokala Jan. 23, 2014, 5:52 p.m. UTC | #9
Hi Andy,

On Thu, Jan 23, 2014 at 7:02 PM, Andy Shevchenko
<andriy.shevchenko@linux.intel.com> wrote:
> On Wed, 2014-01-22 at 22:22 +0530, Srikanth Thokala wrote:
>> This is the driver for the AXI Video Direct Memory Access (AXI
>> VDMA) core, which is a soft Xilinx IP core that provides high-
>> bandwidth direct memory access between memory and AXI4-Stream
>> type video target peripherals. The core provides efficient two
>> dimensional DMA operations with independent asynchronous read
>> and write channel operation.
>>
>> This module works on Zynq (ARM Based SoC) and Microblaze platforms.
>
> Few comments below.

Ok,

>
>>
>> Signed-off-by: Srikanth Thokala <sthokal@xilinx.com>
>> ---
>> NOTE:
>> 1. Created a separate directory 'dma/xilinx' as Xilinx has two more
>>    DMA IPs and we are also planning to upstream these drivers soon.
>> 2. Rebased on v3.13.0-rc8
>>
>> Changes in v2:
>> - Removed DMA Test client module from the patchset as suggested
>>   by Andy Shevchenko
>> - Removed device-id DT property, as suggested by Arnd Bergmann
>> - Properly documented DT bindings as suggested by Arnd Bergmann
>> - Returning with error, if registration of DMA to node fails
>> - Fixed typo errors
>> - Used BIT() macro at applicable places
>> - Added missing header file to the patchset
>> - Changed copyright year to include 2014
>> ---
>>  .../devicetree/bindings/dma/xilinx/xilinx_vdma.txt |   75 +
>>  drivers/dma/Kconfig                                |   14 +
>>  drivers/dma/Makefile                               |    1 +
>>  drivers/dma/xilinx/Makefile                        |    1 +
>>  drivers/dma/xilinx/xilinx_vdma.c                   | 1486 ++++++++++++++++++++
>>  include/linux/amba/xilinx_dma.h                    |   50 +
>>  6 files changed, 1627 insertions(+)
>>  create mode 100644 Documentation/devicetree/bindings/dma/xilinx/xilinx_vdma.txt
>>  create mode 100644 drivers/dma/xilinx/Makefile
>>  create mode 100644 drivers/dma/xilinx/xilinx_vdma.c
>>  create mode 100644 include/linux/amba/xilinx_dma.h
>>
>> diff --git a/Documentation/devicetree/bindings/dma/xilinx/xilinx_vdma.txt b/Documentation/devicetree/bindings/dma/xilinx/xilinx_vdma.txt
>> new file mode 100644
>> index 0000000..ab8be1a
>> --- /dev/null
>> +++ b/Documentation/devicetree/bindings/dma/xilinx/xilinx_vdma.txt
>> @@ -0,0 +1,75 @@
>> +Xilinx AXI VDMA engine, it does transfers between memory and video devices.
>> +It can be configured to have one channel or two channels. If configured
>> +as two channels, one is to transmit to the video device and another is
>> +to receive from the video device.
>> +
>> +Required properties:
>> +- compatible: Should be "xlnx,axi-vdma-1.00.a"
>> +- #dma-cells: Should be <1>, see "dmas" property below
>> +- reg: Should contain VDMA registers location and length.
>> +- xlnx,num-fstores: Should be the number of framebuffers as configured in h/w.
>> +- dma-channel child node: Should have atleast one channel and can have upto
>> +     two channels per device. This node specifies the properties of each
>> +     DMA channel (see child node properties below).
>> +
>> +Optional properties:
>> +- xlnx,include-sg: Tells whether configured for Scatter-mode in
>> +     the hardware.
>> +- xlnx,flush-fsync: Tells whether which channel to Flush on Frame sync.
>> +     It takes following values:
>> +     {1}, flush both channels
>> +     {2}, flush mm2s channel
>> +     {3}, flush s2mm channel
>> +
>> +Required child node properties:
>> +- compatible: It should be either "xlnx,axi-vdma-mm2s-channel" or
>> +     "xlnx,axi-vdma-s2mm-channel".
>> +- interrupts: Should contain per channel VDMA interrupts.
>> +- xlnx,data-width: Should contain the stream data width, take values
>> +     {32,64...1024}.
>> +
>> +Option child node properties:
>> +- xlnx,include-dre: Tells whether hardware is configured for Data
>> +     Realignment Engine.
>> +- xlnx,genlock-mode: Tells whether Genlock synchronization is
>> +     enabled/disabled in hardware.
>> +
>> +Example:
>> +++++++++
>> +
>> +axi_vdma_0: axivdma@40030000 {
>> +     compatible = "xlnx,axi-vdma-1.00.a";
>> +     #dma_cells = <1>;
>> +     reg = < 0x40030000 0x10000 >;
>> +     xlnx,num-fstores = <0x8>;
>> +     xlnx,flush-fsync = <0x1>;
>> +     dma-channel@40030000 {
>> +             compatible = "xlnx,axi-vdma-mm2s-channel";
>> +             interrupts = < 0 54 4 >;
>> +             xlnx,datawidth = <0x40>;
>> +     } ;
>> +     dma-channel@40030030 {
>> +             compatible = "xlnx,axi-vdma-s2mm-channel";
>> +             interrupts = < 0 53 4 >;
>> +             xlnx,datawidth = <0x40>;
>> +     } ;
>> +} ;
>> +
>> +
>> +* DMA client
>> +
>> +Required properties:
>> +- dmas: a list of <[Video DMA device phandle] [Channel ID]> pairs,
>> +     where Channel ID is '0' for write/tx and '1' for read/rx
>> +     channel.
>> +- dma-names: a list of DMA channel names, one per "dmas" entry
>> +
>> +Example:
>> +++++++++
>> +
>> +vdmatest_0: vdmatest@0 {
>> +     compatible ="xlnx,axi-vdma-test-1.00.a";
>> +     dmas = <&axi_vdma_0 0
>> +             &axi_vdma_0 1>;
>> +     dma-names = "vdma0", "vdma1";
>> +} ;
>> diff --git a/drivers/dma/Kconfig b/drivers/dma/Kconfig
>> index c823daa..2a74651 100644
>> --- a/drivers/dma/Kconfig
>> +++ b/drivers/dma/Kconfig
>> @@ -334,6 +334,20 @@ config K3_DMA
>>         Support the DMA engine for Hisilicon K3 platform
>>         devices.
>>
>> +config XILINX_VDMA
>> +     tristate "Xilinx AXI VDMA Engine"
>> +     depends on (ARCH_ZYNQ || MICROBLAZE)
>> +     select DMA_ENGINE
>> +     help
>> +       Enable support for Xilinx AXI VDMA Soft IP.
>> +
>> +       This engine provides high-bandwidth direct memory access
>> +       between memory and AXI4-Stream video type target
>> +       peripherals including peripherals which support AXI4-
>> +       Stream Video Protocol.  It has two stream interfaces/
>> +       channels, Memory Mapped to Stream (MM2S) and Stream to
>> +       Memory Mapped (S2MM) for the data transfers.
>> +
>>  config DMA_ENGINE
>>       bool
>>
>> diff --git a/drivers/dma/Makefile b/drivers/dma/Makefile
>> index 0ce2da9..d84130b 100644
>> --- a/drivers/dma/Makefile
>> +++ b/drivers/dma/Makefile
>> @@ -42,3 +42,4 @@ obj-$(CONFIG_MMP_PDMA) += mmp_pdma.o
>>  obj-$(CONFIG_DMA_JZ4740) += dma-jz4740.o
>>  obj-$(CONFIG_TI_CPPI41) += cppi41.o
>>  obj-$(CONFIG_K3_DMA) += k3dma.o
>> +obj-y += xilinx/
>> diff --git a/drivers/dma/xilinx/Makefile b/drivers/dma/xilinx/Makefile
>> new file mode 100644
>> index 0000000..3c4e9f2
>> --- /dev/null
>> +++ b/drivers/dma/xilinx/Makefile
>> @@ -0,0 +1 @@
>> +obj-$(CONFIG_XILINX_VDMA) += xilinx_vdma.o
>> diff --git a/drivers/dma/xilinx/xilinx_vdma.c b/drivers/dma/xilinx/xilinx_vdma.c
>> new file mode 100644
>> index 0000000..4c0d04c
>> --- /dev/null
>> +++ b/drivers/dma/xilinx/xilinx_vdma.c
>> @@ -0,0 +1,1486 @@
>> +/*
>> + * DMA driver for Xilinx Video DMA Engine
>> + *
>> + * Copyright (C) 2010-2014 Xilinx, Inc. All rights reserved.
>> + *
>> + * Based on the Freescale DMA driver.
>> + *
>> + * Description:
>> + * The AXI Video Direct Memory Access (AXI VDMA) core is a soft Xilinx IP
>> + * core that provides high-bandwidth direct memory access between memory
>> + * and AXI4-Stream type video target peripherals. The core provides efficient
>> + * two dimensional DMA operations with independent asynchronous read (S2MM)
>> + * and write (MM2S) channel operation. It can be configured to have either
>> + * one channel or two channels. If configured as two channels, one is to
>> + * transmit to the video device (MM2S) and another is to receive from the
>> + * video device (S2MM). Initialization, status, interrupt and management
>> + * registers are accessed through an AXI4-Lite slave interface.
>> + *
>> + * This program is free software: you can redistribute it and/or modify
>> + * it under the terms of the GNU General Public License as published by
>> + * the Free Software Foundation, either version 2 of the License, or
>> + * (at your option) any later version.
>> + */
>> +
>> +#include <linux/amba/xilinx_dma.h>
>> +#include <linux/bitops.h>
>> +#include <linux/dmapool.h>
>> +#include <linux/init.h>
>> +#include <linux/interrupt.h>
>> +#include <linux/io.h>
>> +#include <linux/module.h>
>> +#include <linux/of_address.h>
>> +#include <linux/of_dma.h>
>> +#include <linux/of_platform.h>
>> +#include <linux/of_irq.h>
>> +#include <linux/slab.h>
>> +
>> +/* Register/Descriptor Offsets */
>> +#define XILINX_VDMA_MM2S_CTRL_OFFSET         0x0000
>> +#define XILINX_VDMA_S2MM_CTRL_OFFSET         0x0030
>> +#define XILINX_VDMA_MM2S_DESC_OFFSET         0x0050
>> +#define XILINX_VDMA_S2MM_DESC_OFFSET         0x00a0
>> +
>> +/* Control Registers */
>> +#define XILINX_VDMA_REG_DMACR                        0x0000
>> +#define XILINX_VDMA_DMACR_DELAY_MAX          0xff
>> +#define XILINX_VDMA_DMACR_DELAY_SHIFT                24
>> +#define XILINX_VDMA_DMACR_FRAME_COUNT_MAX    0xff
>> +#define XILINX_VDMA_DMACR_FRAME_COUNT_SHIFT  16
>> +#define XILINX_VDMA_DMACR_ERR_IRQ            BIT(14)
>> +#define XILINX_VDMA_DMACR_DLY_CNT_IRQ                BIT(13)
>> +#define XILINX_VDMA_DMACR_FRM_CNT_IRQ                BIT(12)
>> +#define XILINX_VDMA_DMACR_MASTER_SHIFT               8
>> +#define XILINX_VDMA_DMACR_FSYNCSRC_SHIFT     5
>> +#define XILINX_VDMA_DMACR_FRAMECNT_EN                BIT(4)
>> +#define XILINX_VDMA_DMACR_GENLOCK_EN         BIT(3)
>> +#define XILINX_VDMA_DMACR_RESET                      BIT(2)
>> +#define XILINX_VDMA_DMACR_CIRC_EN            BIT(1)
>> +#define XILINX_VDMA_DMACR_RUNSTOP            BIT(0)
>> +#define XILINX_VDMA_DMACR_DELAY_MASK         \
>> +                             (XILINX_VDMA_DMACR_DELAY_MAX << \
>> +                             XILINX_VDMA_DMACR_DELAY_SHIFT)
>> +#define XILINX_VDMA_DMACR_FRAME_COUNT_MASK   \
>> +                             (XILINX_VDMA_DMACR_FRAME_COUNT_MAX << \
>> +                             XILINX_VDMA_DMACR_FRAME_COUNT_SHIFT)
>> +#define XILINX_VDMA_DMACR_MASTER_MASK                \
>> +                             (0xf << XILINX_VDMA_DMACR_MASTER_SHIFT)
>> +#define XILINX_VDMA_DMACR_FSYNCSRC_MASK              \
>> +                             (3 << XILINX_VDMA_DMACR_FSYNCSRC_SHIFT)
>> +
>> +#define XILINX_VDMA_REG_DMASR                        0x0004
>> +#define XILINX_VDMA_DMASR_DELAY_SHIFT                24
>> +#define XILINX_VDMA_DMASR_FRAME_COUNT_SHIFT  16
>> +#define XILINX_VDMA_DMASR_EOL_LATE_ERR               BIT(15)
>> +#define XILINX_VDMA_DMASR_ERR_IRQ            BIT(14)
>> +#define XILINX_VDMA_DMASR_DLY_CNT_IRQ                BIT(13)
>> +#define XILINX_VDMA_DMASR_FRM_CNT_IRQ                BIT(12)
>> +#define XILINX_VDMA_DMASR_SOF_LATE_ERR               BIT(11)
>> +#define XILINX_VDMA_DMASR_SG_DEC_ERR         BIT(10)
>> +#define XILINX_VDMA_DMASR_SG_SLV_ERR         BIT(9)
>> +#define XILINX_VDMA_DMASR_EOF_EARLY_ERR              BIT(8)
>> +#define XILINX_VDMA_DMASR_SOF_EARLY_ERR              BIT(7)
>> +#define XILINX_VDMA_DMASR_DMA_DEC_ERR                BIT(6)
>> +#define XILINX_VDMA_DMASR_DMA_SLAVE_ERR              BIT(5)
>> +#define XILINX_VDMA_DMASR_DMA_INT_ERR                BIT(4)
>> +#define XILINX_VDMA_DMASR_IDLE                       BIT(1)
>> +#define XILINX_VDMA_DMASR_HALTED             BIT(0)
>> +
>> +#define XILINX_VDMA_DMASR_DELAY_MASK         \
>> +                             (0xff << XILINX_VDMA_DMASR_DELAY_SHIFT)
>> +#define XILINX_VDMA_DMASR_FRAME_COUNT_MASK   \
>> +                             (0xff << XILINX_VDMA_DMASR_FRAME_COUNT_SHIFT)
>
> Does 0xff require to be a separate definition in both above cases?

Ok, will reuse DELAY/FRAME_MAX masks.

>
>> +
>> +#define XILINX_VDMA_REG_CURDESC                      0x0008
>> +#define XILINX_VDMA_REG_TAILDESC             0x0010
>> +#define XILINX_VDMA_REG_REG_INDEX            0x0014
>> +#define XILINX_VDMA_REG_FRMSTORE             0x0018
>> +#define XILINX_VDMA_REG_THRESHOLD            0x001c
>> +#define XILINX_VDMA_REG_FRMPTR_STS           0x0024
>> +#define XILINX_VDMA_REG_PARK_PTR             0x0028
>> +#define XILINX_VDMA_PARK_PTR_WR_REF_SHIFT    8
>> +#define XILINX_VDMA_PARK_PTR_RD_REF_SHIFT    0
>> +#define XILINX_VDMA_REG_VDMA_VERSION         0x002c
>> +
>> +/* Register Direct Mode Registers */
>> +#define XILINX_VDMA_REG_VSIZE                        0x0000
>> +#define XILINX_VDMA_REG_HSIZE                        0x0004
>> +
>> +#define XILINX_VDMA_REG_FRMDLY_STRIDE                0x0008
>> +#define XILINX_VDMA_FRMDLY_STRIDE_FRMDLY_SHIFT       24
>> +#define XILINX_VDMA_FRMDLY_STRIDE_STRIDE_SHIFT       0
>> +#define XILINX_VDMA_FRMDLY_STRIDE_FRMDLY_MASK        \
>> +                             (0x1f <<        \
>> +                             XILINX_VDMA_FRMDLY_STRIDE_FRMDLY_SHIFT)
>> +#define XILINX_VDMA_FRMDLY_STRIDE_STRIDE_MASK        \
>> +                             (0xffff <<      \
>> +                             XILINX_VDMA_FRMDLY_STRIDE_STRIDE_MASK)
>> +
>> +#define XILINX_VDMA_REG_START_ADDRESS(n)     (0x000c + 4 * (n))
>> +
>> +/* Hw specific definitions */
>
> HW or Hardware

Ok.

>
>> +#define XILINX_VDMA_MAX_CHANS_PER_DEVICE     0x2
>> +
>> +#define XILINX_VDMA_DMAXR_ALL_IRQ_MASK       (XILINX_VDMA_DMASR_FRM_CNT_IRQ | \
>> +                                      XILINX_VDMA_DMASR_DLY_CNT_IRQ | \
>> +                                      XILINX_VDMA_DMASR_ERR_IRQ)
>> +
>> +#define XILINX_VDMA_DMASR_ALL_ERR_MASK       (XILINX_VDMA_DMASR_EOL_LATE_ERR | \
>> +                                      XILINX_VDMA_DMASR_SOF_LATE_ERR | \
>> +                                      XILINX_VDMA_DMASR_SG_DEC_ERR | \
>> +                                      XILINX_VDMA_DMASR_SG_SLV_ERR | \
>> +                                      XILINX_VDMA_DMASR_EOF_EARLY_ERR | \
>> +                                      XILINX_VDMA_DMASR_SOF_EARLY_ERR | \
>> +                                      XILINX_VDMA_DMASR_DMA_DEC_ERR | \
>> +                                      XILINX_VDMA_DMASR_DMA_SLAVE_ERR | \
>> +                                      XILINX_VDMA_DMASR_DMA_INT_ERR)
>> +
>> +/*
>> + * Recoverable errors are DMA Internal error, SOF Early, EOF Early and SOF Late.
>> + * They are only recoverable when C_FLUSH_ON_FSYNC is enabled in the h/w system.
>> + */
>> +#define XILINX_VDMA_DMASR_ERR_RECOVER_MASK   \
>> +                                     (XILINX_VDMA_DMASR_SOF_LATE_ERR | \
>
> Do you need so many tabs for an indentation here and in other places?
> May be better to keep some style here (I mean on which line you start
> the value of the definition).

Ok.

>
>> +                                      XILINX_VDMA_DMASR_EOF_EARLY_ERR | \
>> +                                      XILINX_VDMA_DMASR_SOF_EARLY_ERR | \
>> +                                      XILINX_VDMA_DMASR_DMA_INT_ERR)
>> +
>> +/* Axi VDMA Flush on Fsync bits */
>> +#define XILINX_VDMA_FLUSH_S2MM                       3
>> +#define XILINX_VDMA_FLUSH_MM2S                       2
>> +#define XILINX_VDMA_FLUSH_BOTH                       1
>> +
>> +/* Delay loop counter to prevent hardware failure */
>> +#define XILINX_VDMA_LOOP_COUNT                       1000000
>> +
>> +/**
>> + * struct xilinx_vdma_desc_hw - Hardware Descriptor
>> + * @next_desc: Next Descriptor Pointer @0x00
>> + * @pad1: Reserved @0x04
>> + * @buf_addr: Buffer address @0x08
>> + * @pad2: Reserved @0x0C
>> + * @vsize: Vertical Size @0x10
>> + * @hsize: Horizontal Size @0x14
>> + * @stride: Number of bytes between the first
>> + *       pixels of each horizontal line @0x18
>> + */
>> +struct xilinx_vdma_desc_hw {
>> +     u32 next_desc;
>> +     u32 pad1;
>> +     u32 buf_addr;
>> +     u32 pad2;
>> +     u32 vsize;
>> +     u32 hsize;
>> +     u32 stride;
>> +} __aligned(64);
>> +
>> +/**
>> + * struct xilinx_vdma_tx_segment - Descriptor segment
>> + * @hw: Hardware descriptor
>> + * @node: Node in the descriptor segments list
>> + * @cookie: Segment cookie
>> + * @phys: Physical address of segment
>> + */
>> +struct xilinx_vdma_tx_segment {
>> +     struct xilinx_vdma_desc_hw hw;
>> +     struct list_head node;
>> +     dma_cookie_t cookie;
>> +     dma_addr_t phys;
>> +} __aligned(64);
>> +
>> +/**
>> + * struct xilinx_vdma_tx_descriptor - Per Transaction structure
>> + * @async_tx: Async transaction descriptor
>> + * @segments: TX segments list
>> + * @node: Node in the channel descriptors list
>> + */
>> +struct xilinx_vdma_tx_descriptor {
>> +     struct dma_async_tx_descriptor async_tx;
>> +     struct list_head segments;
>> +     struct list_head node;
>> +};
>> +
>> +#define to_vdma_tx_descriptor(tx) \
>> +     container_of(tx, struct xilinx_vdma_tx_descriptor, async_tx)
>> +
>> +/**
>> + * struct xilinx_vdma_chan - Driver specific VDMA channel structure
>> + * @xdev: Driver specific device structure
>> + * @ctrl_offset: Control registers offset
>> + * @desc_offset: TX descriptor registers offset
>> + * @completed_cookie: Maximum cookie completed
>> + * @cookie: The current cookie
>> + * @lock: Descriptor operation lock
>> + * @pending_list: Descriptors waiting
>> + * @active_desc: Active descriptor
>> + * @done_list: Complete descriptors
>> + * @common: DMA common channel
>> + * @desc_pool: Descriptors pool
>> + * @dev: The dma device
>> + * @irq: Channel IRQ
>> + * @id: Channel ID
>> + * @direction: Transfer direction
>> + * @num_frms: Number of frames
>> + * @has_sg: Support scatter transfers
>> + * @genlock: Support genlock mode
>> + * @err: Channel has errors
>> + * @tasklet: Cleanup work after irq
>> + * @config: Device configuration info
>> + * @flush_on_fsync: Flush on Frame sync
>> + */
>> +struct xilinx_vdma_chan {
>> +     struct xilinx_vdma_device *xdev;
>> +     u32 ctrl_offset;
>> +     u32 desc_offset;
>> +     dma_cookie_t completed_cookie;
>> +     dma_cookie_t cookie;
>> +     spinlock_t lock;
>> +     struct list_head pending_list;
>> +     struct xilinx_vdma_tx_descriptor *active_desc;
>> +     struct list_head done_list;
>> +     struct dma_chan common;
>> +     struct dma_pool *desc_pool;
>> +     struct device *dev;
>> +     int irq;
>> +     int id;
>> +     enum dma_transfer_direction direction;
>> +     int num_frms;
>> +     bool has_sg;
>> +     bool genlock;
>> +     bool err;
>> +     struct tasklet_struct tasklet;
>> +     struct xilinx_vdma_config config;
>> +     bool flush_on_fsync;
>> +};
>> +
>> +/**
>> + * struct xilinx_vdma_device - VDMA device structure
>> + * @regs: I/O mapped base address
>> + * @dev: Device Structure
>> + * @common: DMA device structure
>> + * @chan: Driver specific VDMA channel
>> + * @has_sg: Specifies whether Scatter-Gather is present or not
>> + * @flush_on_fsync: Flush on frame sync
>> + */
>> +struct xilinx_vdma_device {
>> +     void __iomem *regs;
>> +     struct device *dev;
>> +     struct dma_device common;
>> +     struct xilinx_vdma_chan *chan[XILINX_VDMA_MAX_CHANS_PER_DEVICE];
>> +     bool has_sg;
>> +     u32 flush_on_fsync;
>> +};
>> +
>> +#define to_xilinx_chan(chan) \
>> +                     container_of(chan, struct xilinx_vdma_chan, common)
>> +
>> +/* IO accessors */
>> +static inline u32 vdma_read(struct xilinx_vdma_chan *chan, u32 reg)
>> +{
>> +     return ioread32(chan->xdev->regs + reg);
>> +}
>> +
>> +static inline void vdma_write(struct xilinx_vdma_chan *chan, u32 reg, u32 value)
>> +{
>> +     iowrite32(value, chan->xdev->regs + reg);
>> +}
>> +
>> +static inline void vdma_desc_write(struct xilinx_vdma_chan *chan, u32 reg,
>> +                                u32 value)
>> +{
>> +     vdma_write(chan, chan->desc_offset + reg, value);
>> +}
>> +
>> +static inline u32 vdma_ctrl_read(struct xilinx_vdma_chan *chan, u32 reg)
>> +{
>> +     return vdma_read(chan, chan->ctrl_offset + reg);
>> +}
>> +
>> +static inline void vdma_ctrl_write(struct xilinx_vdma_chan *chan, u32 reg,
>> +                                u32 value)
>> +{
>> +     vdma_write(chan, chan->ctrl_offset + reg, value);
>> +}
>> +
>> +static inline void vdma_ctrl_clr(struct xilinx_vdma_chan *chan, u32 reg,
>> +                              u32 clr)
>> +{
>> +     vdma_ctrl_write(chan, reg, vdma_ctrl_read(chan, reg) & ~clr);
>> +}
>> +
>> +static inline void vdma_ctrl_set(struct xilinx_vdma_chan *chan, u32 reg,
>> +                              u32 set)
>> +{
>> +     vdma_ctrl_write(chan, reg, vdma_ctrl_read(chan, reg) | set);
>> +}
>> +
>> +/* -----------------------------------------------------------------------------
>> + * Descriptors and segments alloc and free
>> + */
>> +
>> +/**
>> + * xilinx_vdma_alloc_tx_segment - Allocate transaction segment
>> + * @chan: Driver specific VDMA channel
>> + *
>> + * Return: The allocated segment on success and NULL on failure.
>> + */
>> +static struct xilinx_vdma_tx_segment *
>> +xilinx_vdma_alloc_tx_segment(struct xilinx_vdma_chan *chan)
>> +{
>> +     struct xilinx_vdma_tx_segment *segment;
>> +     dma_addr_t phys;
>> +
>> +     segment = dma_pool_alloc(chan->desc_pool, GFP_ATOMIC, &phys);
>> +     if (!segment)
>> +             return NULL;
>> +
>> +     memset(segment, 0, sizeof(*segment));
>> +     segment->phys = phys;
>> +
>> +     return segment;
>> +}
>> +
>> +/**
>> + * xilinx_vdma_free_tx_segment - Free transaction segment
>> + * @chan: Driver specific VDMA channel
>> + * @segment: VDMA transaction segment
>> + */
>> +static void xilinx_vdma_free_tx_segment(struct xilinx_vdma_chan *chan,
>> +                                     struct xilinx_vdma_tx_segment *segment)
>> +{
>> +     dma_pool_free(chan->desc_pool, segment, segment->phys);
>> +}
>> +
>> +/**
>> + * xilinx_vdma_tx_descriptor - Allocate transaction descriptor
>> + * @chan: Driver specific VDMA channel
>> + *
>> + * Return: The allocated descriptor on success and NULL on failure.
>> + */
>> +static struct xilinx_vdma_tx_descriptor *
>> +xilinx_vdma_alloc_tx_descriptor(struct xilinx_vdma_chan *chan)
>> +{
>> +     struct xilinx_vdma_tx_descriptor *desc;
>> +
>> +     desc = kzalloc(sizeof(*desc), GFP_KERNEL);
>> +     if (!desc)
>> +             return NULL;
>> +
>> +     INIT_LIST_HEAD(&desc->segments);
>> +
>> +     return desc;
>> +}
>> +
>> +/**
>> + * xilinx_vdma_free_tx_descriptor - Free transaction descriptor
>> + * @chan: Driver specific VDMA channel
>> + * @desc: VDMA transaction descriptor
>> + */
>> +static void
>> +xilinx_vdma_free_tx_descriptor(struct xilinx_vdma_chan *chan,
>> +                            struct xilinx_vdma_tx_descriptor *desc)
>> +{
>> +     struct xilinx_vdma_tx_segment *segment, *next;
>> +
>> +     if (!desc)
>> +             return;
>> +
>> +     list_for_each_entry_safe(segment, next, &desc->segments, node) {
>> +             list_del(&segment->node);
>> +             xilinx_vdma_free_tx_segment(chan, segment);
>> +     }
>> +
>> +     kfree(desc);
>> +}
>> +
>> +/* Required functions */
>> +
>> +/**
>> + * xilinx_vdma_free_descriptors - Free descriptors list
>> + * @chan: Driver specific VDMA channel
>> + * @list: List to parse and delete the descriptor
>> + */
>> +static void xilinx_vdma_free_desc_list(struct xilinx_vdma_chan *chan,
>> +                                     struct list_head *list)
>> +{
>> +     struct xilinx_vdma_tx_descriptor *desc, *next;
>> +
>> +     list_for_each_entry_safe(desc, next, list, node) {
>> +             list_del(&desc->node);
>> +             xilinx_vdma_free_tx_descriptor(chan, desc);
>> +     }
>> +}
>> +
>> +/**
>> + * xilinx_vdma_free_descriptors - Free channel descriptors
>> + * @chan: Driver specific VDMA channel
>> + */
>> +static void xilinx_vdma_free_descriptors(struct xilinx_vdma_chan *chan)
>> +{
>> +     unsigned long flags;
>> +
>> +     spin_lock_irqsave(&chan->lock, flags);
>> +
>> +     xilinx_vdma_free_desc_list(chan, &chan->pending_list);
>> +     xilinx_vdma_free_desc_list(chan, &chan->done_list);
>> +
>> +     xilinx_vdma_free_tx_descriptor(chan, chan->active_desc);
>> +     chan->active_desc = NULL;
>> +
>> +     spin_unlock_irqrestore(&chan->lock, flags);
>> +}
>> +
>> +/**
>> + * xilinx_vdma_free_chan_resources - Free channel resources
>> + * @dchan: DMA channel
>> + */
>> +static void xilinx_vdma_free_chan_resources(struct dma_chan *dchan)
>> +{
>> +     struct xilinx_vdma_chan *chan = to_xilinx_chan(dchan);
>> +
>> +     dev_dbg(chan->dev, "Free all channel resources.\n");
>> +
>> +     tasklet_kill(&chan->tasklet);
>> +     xilinx_vdma_free_descriptors(chan);
>> +     dma_pool_destroy(chan->desc_pool);
>> +     chan->desc_pool = NULL;
>> +}
>> +
>> +/**
>> + * xilinx_vdma_chan_desc_cleanup - Clean channel descriptors
>> + * @chan: Driver specific VDMA channel
>> + */
>> +static void xilinx_vdma_chan_desc_cleanup(struct xilinx_vdma_chan *chan)
>> +{
>> +     struct xilinx_vdma_tx_descriptor *desc, *next;
>> +     unsigned long flags;
>> +
>> +     spin_lock_irqsave(&chan->lock, flags);
>> +
>> +     list_for_each_entry_safe(desc, next, &chan->done_list, node) {
>> +             dma_async_tx_callback callback;
>> +             void *callback_param;
>> +
>> +             /* Remove from the list of running transactions */
>> +             list_del(&desc->node);
>> +
>> +             /* Run the link descriptor callback function */
>> +             callback = desc->async_tx.callback;
>> +             callback_param = desc->async_tx.callback_param;
>> +             if (callback) {
>> +                     spin_unlock_irqrestore(&chan->lock, flags);
>> +                     callback(callback_param);
>> +                     spin_lock_irqsave(&chan->lock, flags);
>> +             }
>> +
>> +             /* Run any dependencies, then free the descriptor */
>> +             dma_run_dependencies(&desc->async_tx);
>> +             xilinx_vdma_free_tx_descriptor(chan, desc);
>> +     }
>> +
>> +     spin_unlock_irqrestore(&chan->lock, flags);
>> +}
>> +
>> +/**
>> + * xilinx_vdma_do_tasklet - Schedule completion tasklet
>> + * @data: Pointer to the Xilinx VDMA channel structure
>> + */
>> +static void xilinx_vdma_do_tasklet(unsigned long data)
>> +{
>> +     struct xilinx_vdma_chan *chan = (struct xilinx_vdma_chan *)data;
>> +
>> +     xilinx_vdma_chan_desc_cleanup(chan);
>> +}
>> +
>> +/**
>> + * xilinx_vdma_alloc_chan_resources - Allocate channel resources
>> + * @dchan: DMA channel
>> + *
>> + * Return: '1' on success and failure value on error
>
> May be return 0 on success as it usual practice? Here and in the other
> places as well.

Ok.

>
>> + */
>> +static int xilinx_vdma_alloc_chan_resources(struct dma_chan *dchan)
>> +{
>> +     struct xilinx_vdma_chan *chan = to_xilinx_chan(dchan);
>> +
>> +     /* Has this channel already been allocated? */
>> +     if (chan->desc_pool)
>> +             return 1;
>> +
>> +     /*
>> +      * We need the descriptor to be aligned to 64bytes
>> +      * for meeting Xilinx VDMA specification requirement.
>> +      */
>> +     chan->desc_pool = dma_pool_create("xilinx_vdma_desc_pool",
>> +                             chan->dev,
>> +                             sizeof(struct xilinx_vdma_tx_segment),
>> +                             __alignof__(struct xilinx_vdma_tx_segment), 0);
>> +     if (!chan->desc_pool) {
>> +             dev_err(chan->dev,
>> +                     "unable to allocate channel %d descriptor pool\n",
>> +                     chan->id);
>> +             return -ENOMEM;
>> +     }
>> +
>> +     tasklet_init(&chan->tasklet, xilinx_vdma_do_tasklet,
>> +                     (unsigned long)chan);
>> +
>> +     chan->completed_cookie = DMA_MIN_COOKIE;
>> +     chan->cookie = DMA_MIN_COOKIE;
>> +
>> +     /* There is at least one descriptor free to be allocated */
>> +     return 1;
>> +}
>> +
>> +/**
>> + * xilinx_vdma_tx_status - Get VDMA transaction status
>> + * @dchan: DMA channel
>> + * @cookie: Transaction identifier
>> + * @txstate: Transaction state
>> + *
>> + * Return: DMA transaction status
>> + */
>> +static enum dma_status xilinx_vdma_tx_status(struct dma_chan *dchan,
>> +                                     dma_cookie_t cookie,
>> +                                     struct dma_tx_state *txstate)
>> +{
>> +     struct xilinx_vdma_chan *chan = to_xilinx_chan(dchan);
>> +     dma_cookie_t last_used;
>> +     dma_cookie_t last_complete;
>> +
>> +     xilinx_vdma_chan_desc_cleanup(chan);
>> +
>> +     last_used = dchan->cookie;
>> +     last_complete = chan->completed_cookie;
>> +
>> +     dma_set_tx_state(txstate, last_complete, last_used, 0);
>> +
>> +     return dma_async_is_complete(cookie, last_complete, last_used);
>> +}
>> +
>> +/**
>> + * xilinx_vdma_is_running - Check if VDMA channel is running
>> + * @chan: Driver specific VDMA channel
>> + *
>> + * Return: '1' if running, '0' if not.
>> + */
>> +static int xilinx_vdma_is_running(struct xilinx_vdma_chan *chan)
>> +{
>> +     return !(vdma_ctrl_read(chan, XILINX_VDMA_REG_DMASR) &
>> +              XILINX_VDMA_DMASR_HALTED) &&
>> +             (vdma_ctrl_read(chan, XILINX_VDMA_REG_DMACR) &
>> +              XILINX_VDMA_DMACR_RUNSTOP);
>> +}
>> +
>> +/**
>> + * xilinx_vdma_is_idle - Check if VDMA channel is idle
>> + * @chan: Driver specific VDMA channel
>> + *
>> + * Return: '1' if idle, '0' if not.
>> + */
>> +static int xilinx_vdma_is_idle(struct xilinx_vdma_chan *chan)
>> +{
>> +     return vdma_ctrl_read(chan, XILINX_VDMA_REG_DMASR) &
>> +             XILINX_VDMA_DMASR_IDLE;
>> +}
>> +
>> +/**
>> + * xilinx_vdma_halt - Halt VDMA channel
>> + * @chan: Driver specific VDMA channel
>> + */
>> +static void xilinx_vdma_halt(struct xilinx_vdma_chan *chan)
>> +{
>> +     int loop = XILINX_VDMA_LOOP_COUNT + 1;
>> +
>> +     vdma_ctrl_clr(chan, XILINX_VDMA_REG_DMACR, XILINX_VDMA_DMACR_RUNSTOP);
>> +
>> +     /* Wait for the hardware to halt */
>> +     while (loop--)
>> +             if (vdma_ctrl_read(chan, XILINX_VDMA_REG_DMASR) &
>> +                 XILINX_VDMA_DMASR_HALTED)
>> +                     break;
>> +
>> +     if (!loop) {
>> +             dev_err(chan->dev, "Cannot stop channel %p: %x\n",
>> +                     chan, vdma_ctrl_read(chan, XILINX_VDMA_REG_DMASR));
>> +             chan->err = true;
>> +     }
>> +
>> +     return;
>> +}
>> +
>> +/**
>> + * xilinx_vdma_start - Start VDMA channel
>> + * @chan: Driver specific VDMA channel
>> + */
>> +static void xilinx_vdma_start(struct xilinx_vdma_chan *chan)
>> +{
>> +     int loop = XILINX_VDMA_LOOP_COUNT + 1;
>> +
>> +     vdma_ctrl_set(chan, XILINX_VDMA_REG_DMACR, XILINX_VDMA_DMACR_RUNSTOP);
>> +
>> +     /* Wait for the hardware to start */
>> +     while (loop--)
>> +             if (!(vdma_ctrl_read(chan, XILINX_VDMA_REG_DMASR) &
>> +                   XILINX_VDMA_DMASR_HALTED))
>> +                     break;
>> +
>> +     if (!loop) {
>> +             dev_err(chan->dev, "Cannot start channel %p: %x\n",
>> +                     chan, vdma_ctrl_read(chan, XILINX_VDMA_REG_DMASR));
>> +
>> +             chan->err = true;
>> +     }
>> +
>> +     return;
>> +}
>> +
>> +/**
>> + * xilinx_vdma_start_transfer - Starts VDMA transfer
>> + * @chan: Driver specific channel struct pointer
>> + */
>> +static void xilinx_vdma_start_transfer(struct xilinx_vdma_chan *chan)
>> +{
>> +     struct xilinx_vdma_config *config = &chan->config;
>> +     struct xilinx_vdma_tx_descriptor *desc;
>> +     unsigned long flags;
>> +     u32 reg;
>> +     struct xilinx_vdma_tx_segment *head, *tail = NULL;
>> +
>> +     if (chan->err)
>> +             return;
>> +
>> +     spin_lock_irqsave(&chan->lock, flags);
>> +
>> +     /* There's already an active descriptor, bail out. */
>> +     if (chan->active_desc)
>> +             goto out_unlock;
>> +
>> +     if (list_empty(&chan->pending_list))
>> +             goto out_unlock;
>> +
>> +     desc = list_first_entry(&chan->pending_list,
>> +                             struct xilinx_vdma_tx_descriptor, node);
>> +
>> +     /* If it is SG mode and hardware is busy, cannot submit */
>> +     if (chan->has_sg && xilinx_vdma_is_running(chan) &&
>> +         !xilinx_vdma_is_idle(chan)) {
>> +             dev_dbg(chan->dev, "DMA controller still busy\n");
>> +             goto out_unlock;
>> +     }
>> +
>> +     if (chan->err)
>> +             goto out_unlock;
>> +
>> +     /*
>> +      * If hardware is idle, then all descriptors on the running lists are
>> +      * done, start new transfers
>> +      */
>> +     if (chan->has_sg) {
>> +             head = list_first_entry(&desc->segments,
>> +                                     struct xilinx_vdma_tx_segment, node);
>> +             tail = list_entry(desc->segments.prev,
>> +                               struct xilinx_vdma_tx_segment, node);
>> +
>> +             vdma_ctrl_write(chan, XILINX_VDMA_REG_CURDESC, head->phys);
>> +     }
>> +
>> +     /* Configure the hardware using info in the config structure */
>> +     reg = vdma_ctrl_read(chan, XILINX_VDMA_REG_DMACR);
>> +
>> +     if (config->frm_cnt_en)
>> +             reg |= XILINX_VDMA_DMACR_FRAMECNT_EN;
>> +     else
>> +             reg &= ~XILINX_VDMA_DMACR_FRAMECNT_EN;
>> +
>> +     /*
>> +      * With SG, start with circular mode, so that BDs can be fetched.
>> +      * In direct register mode, if not parking, enable circular mode
>> +      */
>> +     if (chan->has_sg || !config->park)
>> +             reg |= XILINX_VDMA_DMACR_CIRC_EN;
>> +
>> +     if (config->park)
>> +             reg &= ~XILINX_VDMA_DMACR_CIRC_EN;
>> +
>> +     vdma_ctrl_write(chan, XILINX_VDMA_REG_DMACR, reg);
>> +
>> +     if (config->park && (config->park_frm >= 0) &&
>> +                     (config->park_frm < chan->num_frms)) {
>> +             if (chan->direction == DMA_MEM_TO_DEV)
>> +                     vdma_write(chan, XILINX_VDMA_REG_PARK_PTR,
>> +                             config->park_frm <<
>> +                                     XILINX_VDMA_PARK_PTR_RD_REF_SHIFT);
>> +             else
>> +                     vdma_write(chan, XILINX_VDMA_REG_PARK_PTR,
>> +                             config->park_frm <<
>> +                                     XILINX_VDMA_PARK_PTR_WR_REF_SHIFT);
>> +     }
>> +
>> +     /* Start the hardware */
>> +     xilinx_vdma_start(chan);
>> +
>> +     if (chan->err)
>> +             goto out_unlock;
>> +
>> +     /* Start the transfer */
>> +     if (chan->has_sg) {
>> +             vdma_ctrl_write(chan, XILINX_VDMA_REG_TAILDESC, tail->phys);
>> +     } else {
>> +             struct xilinx_vdma_tx_segment *segment;
>> +             int i = 0;
>> +
>> +             list_for_each_entry(segment, &desc->segments, node)
>> +                     vdma_desc_write(chan,
>> +                                     XILINX_VDMA_REG_START_ADDRESS(i++),
>> +                                     segment->hw.buf_addr);
>> +
>> +             vdma_desc_write(chan, XILINX_VDMA_REG_HSIZE, config->hsize);
>> +             vdma_desc_write(chan, XILINX_VDMA_REG_FRMDLY_STRIDE,
>> +                             (config->frm_dly <<
>> +                              XILINX_VDMA_FRMDLY_STRIDE_FRMDLY_SHIFT) |
>> +                             (config->stride <<
>> +                              XILINX_VDMA_FRMDLY_STRIDE_STRIDE_SHIFT));
>> +             vdma_desc_write(chan, XILINX_VDMA_REG_VSIZE, config->vsize);
>> +     }
>> +
>> +     list_del(&desc->node);
>> +     chan->active_desc = desc;
>> +
>> +out_unlock:
>> +     spin_unlock_irqrestore(&chan->lock, flags);
>> +}
>> +
>> +/**
>> + * xilinx_vdma_issue_pending - Issue pending transactions
>> + * @dchan: DMA channel
>> + */
>> +static void xilinx_vdma_issue_pending(struct dma_chan *dchan)
>> +{
>> +     struct xilinx_vdma_chan *chan = to_xilinx_chan(dchan);
>> +
>> +     xilinx_vdma_start_transfer(chan);
>> +}
>> +
>> +/**
>> + * xilinx_vdma_complete_descriptor - Mark the active descriptor as complete
>> + * @chan : xilinx DMA channel
>> + *
>> + * CONTEXT: hardirq
>> + */
>> +static void xilinx_vdma_complete_descriptor(struct xilinx_vdma_chan *chan)
>> +{
>> +     struct xilinx_vdma_tx_descriptor *desc;
>> +     unsigned long flags;
>> +
>> +     spin_lock_irqsave(&chan->lock, flags);
>> +
>> +     desc = chan->active_desc;
>> +     if (!desc) {
>> +             dev_dbg(chan->dev, "no running descriptors\n");
>> +             goto out_unlock;
>> +     }
>> +
>> +     list_add_tail(&desc->node, &chan->done_list);
>> +
>> +     /* Update the completed cookie and reset the active descriptor. */
>> +     chan->completed_cookie = desc->async_tx.cookie;
>> +     chan->active_desc = NULL;
>> +
>> +out_unlock:
>> +     spin_unlock_irqrestore(&chan->lock, flags);
>> +}
>> +
>> +/**
>> + * xilinx_vdma_reset - Reset VDMA channel
>> + * @chan: Driver specific VDMA channel
>> + *
>> + * Return: '0' on success and failure value on error
>> + */
>> +static int xilinx_vdma_reset(struct xilinx_vdma_chan *chan)
>> +{
>> +     int loop = XILINX_VDMA_LOOP_COUNT + 1;
>> +     u32 tmp;
>> +
>> +     vdma_ctrl_set(chan, XILINX_VDMA_REG_DMACR, XILINX_VDMA_DMACR_RESET);
>> +
>> +     tmp = vdma_ctrl_read(chan, XILINX_VDMA_REG_DMACR) &
>> +             XILINX_VDMA_DMACR_RESET;
>> +
>> +     /* Wait for the hardware to finish reset */
>> +     while (loop-- && tmp)
>> +             tmp = vdma_ctrl_read(chan, XILINX_VDMA_REG_DMACR) &
>> +                     XILINX_VDMA_DMACR_RESET;
>> +
>> +     if (!loop) {
>> +             dev_err(chan->dev, "reset timeout, cr %x, sr %x\n",
>> +                     vdma_ctrl_read(chan, XILINX_VDMA_REG_DMACR),
>> +                     vdma_ctrl_read(chan, XILINX_VDMA_REG_DMASR));
>> +             return -ETIMEDOUT;
>> +     }
>> +
>> +     chan->err = false;
>> +
>> +     return 0;
>> +}
>> +
>> +/**
>> + * xilinx_vdma_chan_reset - Reset VDMA channel and enable interrupts
>> + * @chan: Driver specific VDMA channel
>> + *
>> + * Return: '0' on success and failure value on error
>> + */
>> +static int xilinx_vdma_chan_reset(struct xilinx_vdma_chan *chan)
>> +{
>> +     int err;
>> +
>> +     /* Reset VDMA */
>> +     err = xilinx_vdma_reset(chan);
>> +     if (err)
>> +             return err;
>> +
>> +     /* Enable interrupts */
>> +     vdma_ctrl_set(chan, XILINX_VDMA_REG_DMACR,
>> +                   XILINX_VDMA_DMAXR_ALL_IRQ_MASK);
>> +
>> +     return 0;
>> +}
>> +
>> +/**
>> + * xilinx_vdma_irq_handler - VDMA Interrupt handler
>> + * @irq: IRQ number
>> + * @data: Pointer to the Xilinx VDMA channel structure
>> + *
>> + * Return: IRQ_HANDLED/IRQ_NONE
>> + */
>> +static irqreturn_t xilinx_vdma_irq_handler(int irq, void *data)
>> +{
>> +     struct xilinx_vdma_chan *chan = data;
>> +     u32 status;
>> +
>> +     /* Read the status and ack the interrupts. */
>> +     status = vdma_ctrl_read(chan, XILINX_VDMA_REG_DMASR);
>> +     if (!(status & XILINX_VDMA_DMAXR_ALL_IRQ_MASK))
>> +             return IRQ_NONE;
>> +
>> +     vdma_ctrl_write(chan, XILINX_VDMA_REG_DMASR,
>> +                     status & XILINX_VDMA_DMAXR_ALL_IRQ_MASK);
>> +
>> +     if (status & XILINX_VDMA_DMASR_ERR_IRQ) {
>> +             /*
>> +              * An error occurred. If C_FLUSH_ON_FSYNC is enabled and the
>> +              * error is recoverable, ignore it. Otherwise flag the error.
>> +              *
>> +              * Only recoverable errors can be cleared in the DMASR register,
>> +              * make sure not to write to other error bits to 1.
>> +              */
>> +             u32 errors = status & XILINX_VDMA_DMASR_ALL_ERR_MASK;
>> +             vdma_ctrl_write(chan, XILINX_VDMA_REG_DMASR,
>> +                             errors & XILINX_VDMA_DMASR_ERR_RECOVER_MASK);
>> +
>> +             if (!chan->flush_on_fsync ||
>> +                 (errors & ~XILINX_VDMA_DMASR_ERR_RECOVER_MASK)) {
>> +                     dev_err(chan->dev,
>> +                             "Channel %p has errors %x, cdr %x tdr %x\n",
>> +                             chan, errors,
>> +                             vdma_ctrl_read(chan, XILINX_VDMA_REG_CURDESC),
>> +                             vdma_ctrl_read(chan, XILINX_VDMA_REG_TAILDESC));
>> +                     chan->err = true;
>> +             }
>> +     }
>> +
>> +     if (status & XILINX_VDMA_DMASR_DLY_CNT_IRQ) {
>> +             /*
>> +              * Device takes too long to do the transfer when user requires
>> +              * responsiveness.
>> +              */
>> +             dev_dbg(chan->dev, "Inter-packet latency too long\n");
>> +     }
>> +
>> +     if (status & XILINX_VDMA_DMASR_FRM_CNT_IRQ) {
>> +             xilinx_vdma_complete_descriptor(chan);
>> +             xilinx_vdma_start_transfer(chan);
>> +     }
>> +
>> +     tasklet_schedule(&chan->tasklet);
>> +     return IRQ_HANDLED;
>> +}
>> +
>> +/**
>> + * xilinx_vdma_tx_submit - Submit DMA transaction
>> + * @tx: Async transaction descriptor
>> + *
>> + * Return: cookie value on success and failure value on error
>> + */
>> +static dma_cookie_t xilinx_vdma_tx_submit(struct dma_async_tx_descriptor *tx)
>> +{
>> +     struct xilinx_vdma_tx_descriptor *desc = to_vdma_tx_descriptor(tx);
>> +     struct xilinx_vdma_chan *chan = to_xilinx_chan(tx->chan);
>> +     struct xilinx_vdma_tx_segment *segment;
>> +     dma_cookie_t cookie;
>> +     unsigned long flags;
>> +     int err;
>> +
>> +     if (chan->err) {
>> +             /*
>> +              * If reset fails, need to hard reset the system.
>> +              * Channel is no longer functional
>> +              */
>> +             err = xilinx_vdma_chan_reset(chan);
>> +             if (err < 0)
>> +                     return err;
>> +     }
>> +
>> +     spin_lock_irqsave(&chan->lock, flags);
>> +
>> +     /* Assign cookies to all of the segments that make up this transaction.
>> +      * Use the cookie of the last segment as the transaction cookie.
>> +      */
>
> Keep style of multiline comment the same over the code.

Sure.

>
>> +     cookie = chan->cookie;
>> +
>> +     list_for_each_entry(segment, &desc->segments, node) {
>> +             if (cookie < DMA_MAX_COOKIE)
>> +                     cookie++;
>> +             else
>> +                     cookie = DMA_MIN_COOKIE;
>> +
>> +             segment->cookie = cookie;
>> +     }
>> +
>> +     tx->cookie = cookie;
>> +     chan->cookie = cookie;
>> +
>> +     /* Append the transaction to the pending transactions queue. */
>> +     list_add_tail(&desc->node, &chan->pending_list);
>> +
>> +     spin_unlock_irqrestore(&chan->lock, flags);
>> +
>> +     return cookie;
>> +}
>> +
>> +/**
>> + * xilinx_vdma_prep_slave_sg - prepare a descriptor for a DMA_SLAVE transaction
>> + * @dchan: DMA channel
>> + * @sgl: scatterlist to transfer to/from
>> + * @sg_len: number of entries in @sgl
>> + * @dir: DMA direction
>> + * @flags: transfer ack flags
>> + * @context: unused
>> + *
>> + * Return: Async transaction descriptor on success and NULL on failure
>> + */
>> +static struct dma_async_tx_descriptor *
>> +xilinx_vdma_prep_slave_sg(struct dma_chan *dchan, struct scatterlist *sgl,
>> +                       unsigned int sg_len, enum dma_transfer_direction dir,
>> +                       unsigned long flags, void *context)
>> +{
>> +     struct xilinx_vdma_chan *chan = to_xilinx_chan(dchan);
>> +     struct xilinx_vdma_tx_descriptor *desc;
>> +     struct xilinx_vdma_tx_segment *segment;
>> +     struct xilinx_vdma_tx_segment *prev = NULL;
>> +     struct scatterlist *sg;
>> +     int i;
>> +
>> +     if (chan->direction != dir || sg_len == 0)
>> +             return NULL;
>> +
>> +     /* Enforce one sg entry for one frame. */
>> +     if (sg_len != chan->num_frms) {
>> +             dev_err(chan->dev,
>> +             "number of entries %d not the same as num stores %d\n",
>> +                     sg_len, chan->num_frms);
>> +             return NULL;
>> +     }
>> +
>> +     /* Allocate a transaction descriptor. */
>> +     desc = xilinx_vdma_alloc_tx_descriptor(chan);
>> +     if (!desc)
>> +             return NULL;
>> +
>> +     dma_async_tx_descriptor_init(&desc->async_tx, &chan->common);
>> +     desc->async_tx.tx_submit = xilinx_vdma_tx_submit;
>> +     desc->async_tx.cookie = 0;
>> +     async_tx_ack(&desc->async_tx);
>> +
>> +     /* Build the list of transaction segments. */
>> +     for_each_sg(sgl, sg, sg_len, i) {
>> +             struct xilinx_vdma_desc_hw *hw;
>> +
>> +             /* Allocate the link descriptor from DMA pool */
>> +             segment = xilinx_vdma_alloc_tx_segment(chan);
>> +             if (!segment)
>> +                     goto error;
>> +
>> +             /* Fill in the hardware descriptor */
>> +             hw = &segment->hw;
>> +             hw->buf_addr = sg_dma_address(sg);
>> +             hw->vsize = chan->config.vsize;
>> +             hw->hsize = chan->config.hsize;
>> +             hw->stride = (chan->config.frm_dly <<
>> +                           XILINX_VDMA_FRMDLY_STRIDE_FRMDLY_SHIFT) |
>> +                          (chan->config.stride <<
>> +                           XILINX_VDMA_FRMDLY_STRIDE_STRIDE_SHIFT);
>> +             if (prev)
>> +                     prev->hw.next_desc = segment->phys;
>> +
>> +             /* Insert the segment into the descriptor segments list. */
>> +             list_add_tail(&segment->node, &desc->segments);
>> +
>> +             prev = segment;
>> +     }
>> +
>> +     /* Link the last hardware descriptor with the first. */
>> +     segment = list_first_entry(&desc->segments,
>> +                                struct xilinx_vdma_tx_segment, node);
>> +     prev->hw.next_desc = segment->phys;
>> +
>> +     return &desc->async_tx;
>> +
>> +error:
>> +     xilinx_vdma_free_tx_descriptor(chan, desc);
>> +     return NULL;
>> +}
>> +
>> +/**
>> + * xilinx_vdma_terminate_all - Halt the channel and free descriptors
>> + * @chan: Driver specific VDMA Channel pointer
>> + */
>> +static void xilinx_vdma_terminate_all(struct xilinx_vdma_chan *chan)
>> +{
>> +     /* Halt the DMA engine */
>> +     xilinx_vdma_halt(chan);
>> +
>> +     /* Remove and free all of the descriptors in the lists */
>> +     xilinx_vdma_free_descriptors(chan);
>> +}
>> +
>> +/**
>> + * xilinx_vdma_slave_config - Configure VDMA channel
>> + * Run-time configuration for Axi VDMA, supports:
>> + * . halt the channel
>> + * . configure interrupt coalescing and inter-packet delay threshold
>> + * . start/stop parking
>> + * . enable genlock
>> + * . set transfer information using config struct
>> + *
>> + * @chan: Driver specific VDMA Channel pointer
>> + * @cfg: Channel configuration pointer
>> + *
>> + * Return: '0' on success and failure value on error
>> + */
>> +static int xilinx_vdma_slave_config(struct xilinx_vdma_chan *chan,
>> +                                 struct xilinx_vdma_config *cfg)
>> +{
>> +     u32 dmacr;
>> +
>> +     if (cfg->reset)
>> +             return xilinx_vdma_chan_reset(chan);
>> +
>> +     dmacr = vdma_ctrl_read(chan, XILINX_VDMA_REG_DMACR);
>> +
>> +     /* If vsize is -1, it is park-related operations */
>> +     if (cfg->vsize == -1) {
>> +             if (cfg->park)
>> +                     dmacr &= ~XILINX_VDMA_DMACR_CIRC_EN;
>> +             else
>> +                     dmacr |= XILINX_VDMA_DMACR_CIRC_EN;
>> +
>> +             vdma_ctrl_write(chan, XILINX_VDMA_REG_DMACR, dmacr);
>> +             return 0;
>> +     }
>> +
>> +     /* If hsize is -1, it is interrupt threshold settings */
>> +     if (cfg->hsize == -1) {
>> +             if (cfg->coalesc <= XILINX_VDMA_DMACR_FRAME_COUNT_MAX) {
>> +                     dmacr &= ~XILINX_VDMA_DMACR_FRAME_COUNT_MASK;
>> +                     dmacr |= cfg->coalesc <<
>> +                              XILINX_VDMA_DMACR_FRAME_COUNT_SHIFT;
>> +                     chan->config.coalesc = cfg->coalesc;
>> +             }
>> +
>> +             if (cfg->delay <= XILINX_VDMA_DMACR_DELAY_MAX) {
>> +                     dmacr &= ~XILINX_VDMA_DMACR_DELAY_MASK;
>> +                     dmacr |= cfg->delay << XILINX_VDMA_DMACR_DELAY_SHIFT;
>> +                     chan->config.delay = cfg->delay;
>> +             }
>> +
>> +             vdma_ctrl_write(chan, XILINX_VDMA_REG_DMACR, dmacr);
>> +             return 0;
>> +     }
>> +
>> +     /* Transfer information */
>> +     chan->config.vsize = cfg->vsize;
>> +     chan->config.hsize = cfg->hsize;
>> +     chan->config.stride = cfg->stride;
>> +     chan->config.frm_dly = cfg->frm_dly;
>> +     chan->config.park = cfg->park;
>> +
>> +     /* genlock settings */
>> +     chan->config.gen_lock = cfg->gen_lock;
>> +     chan->config.master = cfg->master;
>> +
>> +     if (cfg->gen_lock && chan->genlock) {
>> +             dmacr |= XILINX_VDMA_DMACR_GENLOCK_EN;
>> +             dmacr |= cfg->master << XILINX_VDMA_DMACR_MASTER_SHIFT;
>> +     }
>> +
>> +     chan->config.frm_cnt_en = cfg->frm_cnt_en;
>> +     if (cfg->park)
>> +             chan->config.park_frm = cfg->park_frm;
>> +     else
>> +             chan->config.park_frm = -1;
>> +
>> +     chan->config.coalesc = cfg->coalesc;
>> +     chan->config.delay = cfg->delay;
>> +     if (cfg->coalesc <= XILINX_VDMA_DMACR_FRAME_COUNT_MAX) {
>> +             dmacr |= cfg->coalesc << XILINX_VDMA_DMACR_FRAME_COUNT_SHIFT;
>> +             chan->config.coalesc = cfg->coalesc;
>> +     }
>> +
>> +     if (cfg->delay <= XILINX_VDMA_DMACR_DELAY_MAX) {
>> +             dmacr |= cfg->delay << XILINX_VDMA_DMACR_DELAY_SHIFT;
>> +             chan->config.delay = cfg->delay;
>> +     }
>> +
>> +     /* FSync Source selection */
>> +     dmacr &= ~XILINX_VDMA_DMACR_FSYNCSRC_MASK;
>> +     dmacr |= cfg->ext_fsync << XILINX_VDMA_DMACR_FSYNCSRC_SHIFT;
>> +
>> +     vdma_ctrl_write(chan, XILINX_VDMA_REG_DMACR, dmacr);
>> +     return 0;
>> +}
>> +
>> +/**
>> + * xilinx_vdma_device_control - Configure DMA channel of the device
>> + * @dchan: DMA Channel pointer
>> + * @cmd: DMA control command
>> + * @arg: Channel configuration
>> + *
>> + * Return: '0' on success and failure value on error
>> + */
>> +static int xilinx_vdma_device_control(struct dma_chan *dchan,
>> +                                   enum dma_ctrl_cmd cmd, unsigned long arg)
>> +{
>> +     struct xilinx_vdma_chan *chan = to_xilinx_chan(dchan);
>> +
>> +     switch (cmd) {
>> +     case DMA_TERMINATE_ALL:
>> +             xilinx_vdma_terminate_all(chan);
>> +             return 0;
>> +     case DMA_SLAVE_CONFIG:
>> +             return xilinx_vdma_slave_config(chan,
>> +                                     (struct xilinx_vdma_config *)arg);
>> +     default:
>> +             return -ENXIO;
>> +     }
>> +}
>> +
>> +/* -----------------------------------------------------------------------------
>> + * Probe and remove
>> + */
>> +
>> +/**
>> + * xilinx_vdma_chan_remove - Per Channel remove function
>> + * @chan: Driver specific VDMA channel
>> + */
>> +static void xilinx_vdma_chan_remove(struct xilinx_vdma_chan *chan)
>> +{
>> +     /* Disable all interrupts */
>> +     vdma_ctrl_clr(chan, XILINX_VDMA_REG_DMACR,
>> +                   XILINX_VDMA_DMAXR_ALL_IRQ_MASK);
>> +
>> +     list_del(&chan->common.device_node);
>> +}
>> +
>> +/**
>> + * xilinx_vdma_chan_probe - Per Channel Probing
>> + * It get channel features from the device tree entry and
>> + * initialize special channel handling routines
>> + *
>> + * @xdev: Driver specific device structure
>> + * @node: Device node
>> + *
>> + * Return: '0' on success and failure value on error
>> + */
>> +static int xilinx_vdma_chan_probe(struct xilinx_vdma_device *xdev,
>> +                               struct device_node *node)
>> +{
>> +     struct xilinx_vdma_chan *chan;
>> +     bool has_dre = false;
>> +     u32 value;
>> +     int err;
>> +
>> +     /* Allocate and initialize the channel structure */
>> +     chan = devm_kzalloc(xdev->dev, sizeof(*chan), GFP_KERNEL);
>> +     if (!chan)
>> +             return -ENOMEM;
>> +
>> +     chan->dev = xdev->dev;
>> +     chan->xdev = xdev;
>> +     chan->has_sg = xdev->has_sg;
>> +
>> +     spin_lock_init(&chan->lock);
>> +     INIT_LIST_HEAD(&chan->pending_list);
>> +     INIT_LIST_HEAD(&chan->done_list);
>> +
>> +     /* Retrieve the channel properties from the device tree */
>> +     has_dre = of_property_read_bool(node, "xlnx,include-dre");
>> +
>> +     chan->genlock = of_property_read_bool(node, "xlnx,genlock-mode");
>> +
>> +     err = of_property_read_u32(node, "xlnx,datawidth", &value);
>> +     if (!err) {
>> +             u32 width = value >> 3; /* Convert bits to bytes */
>> +
>> +             /* If data width is greater than 8 bytes, DRE is not in hw */
>> +             if (width > 8)
>> +                     has_dre = false;
>> +
>> +             if (!has_dre)
>> +                     xdev->common.copy_align = fls(width - 1);
>> +     } else {
>> +             dev_err(xdev->dev, "missing xlnx,datawidth property\n");
>> +             return err;
>> +     }
>> +
>> +     if (of_device_is_compatible(node, "xlnx,axi-vdma-mm2s-channel")) {
>> +             chan->direction = DMA_MEM_TO_DEV;
>> +             chan->id = 0;
>> +
>> +             chan->ctrl_offset = XILINX_VDMA_MM2S_CTRL_OFFSET;
>> +             chan->desc_offset = XILINX_VDMA_MM2S_DESC_OFFSET;
>> +
>> +             if (xdev->flush_on_fsync == XILINX_VDMA_FLUSH_BOTH ||
>> +                 xdev->flush_on_fsync == XILINX_VDMA_FLUSH_MM2S)
>> +                     chan->flush_on_fsync = true;
>> +     } else if (of_device_is_compatible(node,
>> +                                         "xlnx,axi-vdma-s2mm-channel")) {
>> +             chan->direction = DMA_DEV_TO_MEM;
>> +             chan->id = 1;
>> +
>> +             chan->ctrl_offset = XILINX_VDMA_S2MM_CTRL_OFFSET;
>> +             chan->desc_offset = XILINX_VDMA_S2MM_DESC_OFFSET;
>> +
>> +             if (xdev->flush_on_fsync == XILINX_VDMA_FLUSH_BOTH ||
>> +                 xdev->flush_on_fsync == XILINX_VDMA_FLUSH_S2MM)
>> +                     chan->flush_on_fsync = true;
>> +     } else {
>> +             dev_err(xdev->dev, "Invalid channel compatible node\n");
>> +             return -EINVAL;
>> +     }
>> +
>> +     /* Request the interrupt */
>> +     chan->irq = irq_of_parse_and_map(node, 0);
>> +     err = devm_request_irq(xdev->dev, chan->irq, xilinx_vdma_irq_handler,
>> +                            IRQF_SHARED, "xilinx-vdma-controller", chan);
>> +     if (err) {
>> +             dev_err(xdev->dev, "unable to request IRQ\n");
>> +             return err;
>> +     }
>> +
>> +     /* Initialize the DMA channel and add it to the DMA engine channels
>> +      * list.
>> +      */
>> +     chan->common.device = &xdev->common;
>> +
>> +     list_add_tail(&chan->common.device_node, &xdev->common.channels);
>> +     xdev->chan[chan->id] = chan;
>> +
>> +     /* Reset the channel */
>> +     err = xilinx_vdma_chan_reset(chan);
>> +     if (err < 0) {
>> +             dev_err(xdev->dev, "Reset channel failed\n");
>> +             return err;
>> +     }
>> +
>> +     return 0;
>> +}
>> +
>> +/**
>> + * struct of_dma_filter_xilinx_args - Channel filter args
>> + * @dev: DMA device structure
>> + * @chan_id: Channel id
>> + */
>> +struct of_dma_filter_xilinx_args {
>> +     struct dma_device *dev;
>> +     u32 chan_id;
>> +};
>> +
>> +/**
>> + * xilinx_vdma_dt_filter - VDMA channel filter function
>> + * @chan: DMA channel pointer
>> + * @param: Filter match value
>> + *
>> + * Return: true/false based on the result
>> + */
>> +static bool xilinx_vdma_dt_filter(struct dma_chan *chan, void *param)
>> +{
>> +     struct of_dma_filter_xilinx_args *args = param;
>> +
>> +     return chan->device == args->dev && chan->chan_id == args->chan_id;
>> +}
>> +
>> +/**
>> + * of_dma_xilinx_xlate - Translation function
>> + * @dma_spec: Pointer to DMA specifier as found in the device tree
>> + * @ofdma: Pointer to DMA controller data
>> + *
>> + * Return: DMA channel pointer on success and NULL on error
>> + */
>> +static struct dma_chan *of_dma_xilinx_xlate(struct of_phandle_args *dma_spec,
>> +                                             struct of_dma *ofdma)
>> +{
>> +     struct of_dma_filter_xilinx_args args;
>> +     dma_cap_mask_t cap;
>> +
>> +     args.dev = ofdma->of_dma_data;
>> +     if (!args.dev)
>> +             return NULL;
>> +
>> +     if (dma_spec->args_count != 1)
>> +             return NULL;
>> +
>> +     dma_cap_zero(cap);
>> +     dma_cap_set(DMA_SLAVE, cap);
>> +
>> +     args.chan_id = dma_spec->args[0];
>> +
>> +     return dma_request_channel(cap, xilinx_vdma_dt_filter, &args);
>> +}
>> +
>> +/**
>> + * xilinx_vdma_probe - Driver probe function
>> + * @pdev: Pointer to the platform_device structure
>> + *
>> + * Return: '0' on success and failure value on error
>> + */
>> +static int xilinx_vdma_probe(struct platform_device *pdev)
>> +{
>> +     struct device_node *node = pdev->dev.of_node;
>> +     struct xilinx_vdma_device *xdev;
>> +     struct device_node *child;
>> +     struct resource *io;
>> +     u32 num_frames;
>> +     int i, err;
>> +
>> +     dev_info(&pdev->dev, "Probing xilinx axi vdma engine\n");
>> +
>> +     /* Allocate and initialize the DMA engine structure */
>> +     xdev = devm_kzalloc(&pdev->dev, sizeof(*xdev), GFP_KERNEL);
>> +     if (!xdev)
>> +             return -ENOMEM;
>> +
>> +     xdev->dev = &pdev->dev;
>> +
>> +     /* Request and map I/O memory */
>> +     io = platform_get_resource(pdev, IORESOURCE_MEM, 0);
>> +     xdev->regs = devm_ioremap_resource(&pdev->dev, io);
>> +     if (IS_ERR(xdev->regs))
>> +             return PTR_ERR(xdev->regs);
>> +
>> +     /* Retrieve the DMA engine properties from the device tree */
>> +     xdev->has_sg = of_property_read_bool(node, "xlnx,include-sg");
>> +
>> +     err = of_property_read_u32(node, "xlnx,num-fstores", &num_frames);
>> +     if (err < 0) {
>> +             dev_err(xdev->dev, "missing xlnx,num-fstores property\n");
>> +             return err;
>> +     }
>> +
>> +     of_property_read_u32(node, "xlnx,flush-fsync", &xdev->flush_on_fsync);
>> +
>> +     /* Initialize the DMA engine */
>> +     xdev->common.dev = &pdev->dev;
>> +
>> +     INIT_LIST_HEAD(&xdev->common.channels);
>> +     dma_cap_set(DMA_SLAVE, xdev->common.cap_mask);
>> +     dma_cap_set(DMA_PRIVATE, xdev->common.cap_mask);
>> +
>> +     xdev->common.device_alloc_chan_resources =
>> +                             xilinx_vdma_alloc_chan_resources;
>> +     xdev->common.device_free_chan_resources =
>> +                             xilinx_vdma_free_chan_resources;
>> +     xdev->common.device_prep_slave_sg = xilinx_vdma_prep_slave_sg;
>> +     xdev->common.device_control = xilinx_vdma_device_control;
>> +     xdev->common.device_tx_status = xilinx_vdma_tx_status;
>> +     xdev->common.device_issue_pending = xilinx_vdma_issue_pending;
>> +
>> +     platform_set_drvdata(pdev, xdev);
>> +
>> +     /* Initialize the channels */
>> +     for_each_child_of_node(node, child) {
>> +             err = xilinx_vdma_chan_probe(xdev, child);
>> +             if (err < 0)
>> +                     goto error;
>> +     }
>> +
>> +     for (i = 0; i < XILINX_VDMA_MAX_CHANS_PER_DEVICE; i++) {
>> +             if (xdev->chan[i])
>> +                     xdev->chan[i]->num_frms = num_frames;
>> +     }
>> +
>> +     /* Register the DMA engine with the core */
>> +     dma_async_device_register(&xdev->common);
>> +
>> +     err = of_dma_controller_register(node, of_dma_xilinx_xlate,
>> +                                      &xdev->common);
>> +     if (err < 0) {
>> +             dev_err(&pdev->dev, "Unable to register DMA to DT\n");
>> +             dma_async_device_unregister(&xdev->common);
>> +             goto error;
>> +     }
>> +
>> +     return 0;
>> +
>> +error:
>> +     for (i = 0; i < XILINX_VDMA_MAX_CHANS_PER_DEVICE; i++) {
>> +             if (xdev->chan[i])
>> +                     xilinx_vdma_chan_remove(xdev->chan[i]);
>> +     }
>> +
>> +     return err;
>> +}
>> +
>> +/**
>> + * xilinx_vdma_remove - Driver remove function
>> + * @pdev: Pointer to the platform_device structure
>> + *
>> + * Return: Always '0'
>> + */
>> +static int xilinx_vdma_remove(struct platform_device *pdev)
>> +{
>> +     struct xilinx_vdma_device *xdev;
>> +     int i;
>> +
>> +     of_dma_controller_free(pdev->dev.of_node);
>> +
>> +     xdev = platform_get_drvdata(pdev);
>
> You could move this assignment to a variables block, it's normal
> practice.

Ok.  I will send v3 fixing the comments.

Srikanth

>
>> +     dma_async_device_unregister(&xdev->common);
>> +
>> +     for (i = 0; i < XILINX_VDMA_MAX_CHANS_PER_DEVICE; i++) {
>> +             if (xdev->chan[i])
>> +                     xilinx_vdma_chan_remove(xdev->chan[i]);
>> +     }
>> +
>> +     return 0;
>> +}
>> +
>> +static const struct of_device_id xilinx_vdma_of_ids[] = {
>> +     { .compatible = "xlnx,axi-vdma-1.00.a",},
>> +     {}
>> +};
>> +
>> +static struct platform_driver xilinx_vdma_driver = {
>> +     .driver = {
>> +             .name = "xilinx-vdma",
>> +             .owner = THIS_MODULE,
>> +             .of_match_table = xilinx_vdma_of_ids,
>> +     },
>> +     .probe = xilinx_vdma_probe,
>> +     .remove = xilinx_vdma_remove,
>> +};
>> +
>> +module_platform_driver(xilinx_vdma_driver);
>> +
>> +MODULE_AUTHOR("Xilinx, Inc.");
>> +MODULE_DESCRIPTION("Xilinx VDMA driver");
>> +MODULE_LICENSE("GPL v2");
>> diff --git a/include/linux/amba/xilinx_dma.h b/include/linux/amba/xilinx_dma.h
>> new file mode 100644
>> index 0000000..48a8c8b
>> --- /dev/null
>> +++ b/include/linux/amba/xilinx_dma.h
>> @@ -0,0 +1,50 @@
>> +/*
>> + * Xilinx DMA Engine drivers support header file
>> + *
>> + * Copyright (C) 2010-2014 Xilinx, Inc. All rights reserved.
>> + *
>> + * This is free software; you can redistribute it and/or modify
>> + * it under the terms of the GNU General Public License as published by
>> + * the Free Software Foundation; either version 2 of the License, or
>> + * (at your option) any later version.
>> + */
>> +
>> +#ifndef __DMA_XILINX_DMA_H
>> +#define __DMA_XILINX_DMA_H
>> +
>> +#include <linux/dma-mapping.h>
>> +#include <linux/dmaengine.h>
>> +
>> +/**
>> + * struct xilinx_vdma_config - VDMA Configuration structure
>> + * @vsize: Vertical size
>> + * @hsize: Horizontal size
>> + * @stride: Stride
>> + * @frm_dly: Frame delay
>> + * @gen_lock: Whether in gen-lock mode
>> + * @master: Master that it syncs to
>> + * @frm_cnt_en: Enable frame count enable
>> + * @park: Whether wants to park
>> + * @park_frm: Frame to park on
>> + * @coalesc: Interrupt coalescing threshold
>> + * @delay: Delay counter
>> + * @reset: Reset Channel
>> + * @ext_fsync: External Frame Sync source
>> + */
>> +struct xilinx_vdma_config {
>> +     int vsize;
>> +     int hsize;
>> +     int stride;
>> +     int frm_dly;
>> +     int gen_lock;
>> +     int master;
>> +     int frm_cnt_en;
>> +     int park;
>> +     int park_frm;
>> +     int coalesc;
>> +     int delay;
>> +     int reset;
>> +     int ext_fsync;
>> +};
>> +
>> +#endif
>
> --
> Andy Shevchenko <andriy.shevchenko@linux.intel.com>
> Intel Finland Oy
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/
--
To unsubscribe from this list: send the line "unsubscribe dmaengine" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Srikanth Thokala Jan. 24, 2014, 11:16 a.m. UTC | #10
Hi Lars,

On Thu, Jan 23, 2014 at 4:55 PM, Lars-Peter Clausen <lars@metafoo.de> wrote:
> On 01/22/2014 05:52 PM, Srikanth Thokala wrote:
> [...]
>> +/**
>> + * xilinx_vdma_device_control - Configure DMA channel of the device
>> + * @dchan: DMA Channel pointer
>> + * @cmd: DMA control command
>> + * @arg: Channel configuration
>> + *
>> + * Return: '0' on success and failure value on error
>> + */
>> +static int xilinx_vdma_device_control(struct dma_chan *dchan,
>> +                                   enum dma_ctrl_cmd cmd, unsigned long arg)
>> +{
>> +     struct xilinx_vdma_chan *chan = to_xilinx_chan(dchan);
>> +
>> +     switch (cmd) {
>> +     case DMA_TERMINATE_ALL:
>> +             xilinx_vdma_terminate_all(chan);
>> +             return 0;
>> +     case DMA_SLAVE_CONFIG:
>> +             return xilinx_vdma_slave_config(chan,
>> +                                     (struct xilinx_vdma_config *)arg);
>
> You really shouldn't be overloading the generic API with your own semantics.
> DMA_SLAVE_CONFIG should take a dma_slave_config and nothing else.

Ok.  The driver needs few additional configuration from the slave
device like Vertical
Size, Horizontal Size,  Stride etc., for the DMA transfers, in that case do you
suggest me to define a separate dma_ctrl_cmd like the one FSLDMA_EXTERNAL_START
defined for Freescale drivers?

>
>> +     default:
>> +             return -ENXIO;
>> +     }
>> +}
>> +
> [...]
>> +
>> +     /* Request the interrupt */
>> +     chan->irq = irq_of_parse_and_map(node, 0);
>> +     err = devm_request_irq(xdev->dev, chan->irq, xilinx_vdma_irq_handler,
>> +                            IRQF_SHARED, "xilinx-vdma-controller", chan);
>
> This is a clasic example of where to not use devm_request_irq. 'chan' is
> accessed in the interrupt handler, but if you use devm_request_irq 'chan'
> will be freed before the interrupt handler has been released, which means
> there is now a race condition where the interrupt handler can access already
> freed memory.

Ok, thank you for the clarification on this thread.  I will fix it in v3.

>
>> +     if (err) {
>> +             dev_err(xdev->dev, "unable to request IRQ\n");
>> +             return err;
>> +     }
>> +
>> +     /* Initialize the DMA channel and add it to the DMA engine channels
>> +      * list.
>> +      */
>> +     chan->common.device = &xdev->common;
>> +
>> +     list_add_tail(&chan->common.device_node, &xdev->common.channels);
>> +     xdev->chan[chan->id] = chan;
>> +
>> +     /* Reset the channel */
>> +     err = xilinx_vdma_chan_reset(chan);
>> +     if (err < 0) {
>> +             dev_err(xdev->dev, "Reset channel failed\n");
>> +             return err;
>> +     }
>> +
>> +     return 0;
>> +}
>> +
>> +/**
>> + * struct of_dma_filter_xilinx_args - Channel filter args
>> + * @dev: DMA device structure
>> + * @chan_id: Channel id
>> + */
>> +struct of_dma_filter_xilinx_args {
>> +     struct dma_device *dev;
>> +     u32 chan_id;
>> +};
>> +
>> +/**
>> + * xilinx_vdma_dt_filter - VDMA channel filter function
>> + * @chan: DMA channel pointer
>> + * @param: Filter match value
>> + *
>> + * Return: true/false based on the result
>> + */
>> +static bool xilinx_vdma_dt_filter(struct dma_chan *chan, void *param)
>> +{
>> +     struct of_dma_filter_xilinx_args *args = param;
>> +
>> +     return chan->device == args->dev && chan->chan_id == args->chan_id;
>> +}
>> +
>> +/**
>> + * of_dma_xilinx_xlate - Translation function
>> + * @dma_spec: Pointer to DMA specifier as found in the device tree
>> + * @ofdma: Pointer to DMA controller data
>> + *
>> + * Return: DMA channel pointer on success and NULL on error
>> + */
>> +static struct dma_chan *of_dma_xilinx_xlate(struct of_phandle_args *dma_spec,
>> +                                             struct of_dma *ofdma)
>> +{
>> +     struct of_dma_filter_xilinx_args args;
>> +     dma_cap_mask_t cap;
>> +
>> +     args.dev = ofdma->of_dma_data;
>> +     if (!args.dev)
>> +             return NULL;
>> +
>> +     if (dma_spec->args_count != 1)
>> +             return NULL;
>> +
>> +     dma_cap_zero(cap);
>> +     dma_cap_set(DMA_SLAVE, cap);
>> +
>> +     args.chan_id = dma_spec->args[0];
>> +
>> +     return dma_request_channel(cap, xilinx_vdma_dt_filter, &args);
>
> There is a new helper function called dma_get_slave_channel, which makes
> this much easier. Take a look at the k3dma.c driver for an example.

Ok.  I will check and fix it in v3.

Srikanth

>> +}
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/
--
To unsubscribe from this list: send the line "unsubscribe dmaengine" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Lars-Peter Clausen Jan. 24, 2014, 1:24 p.m. UTC | #11
On 01/24/2014 12:16 PM, Srikanth Thokala wrote:
> Hi Lars,
> 
> On Thu, Jan 23, 2014 at 4:55 PM, Lars-Peter Clausen <lars@metafoo.de> wrote:
>> On 01/22/2014 05:52 PM, Srikanth Thokala wrote:
>> [...]
>>> +/**
>>> + * xilinx_vdma_device_control - Configure DMA channel of the device
>>> + * @dchan: DMA Channel pointer
>>> + * @cmd: DMA control command
>>> + * @arg: Channel configuration
>>> + *
>>> + * Return: '0' on success and failure value on error
>>> + */
>>> +static int xilinx_vdma_device_control(struct dma_chan *dchan,
>>> +                                   enum dma_ctrl_cmd cmd, unsigned long arg)
>>> +{
>>> +     struct xilinx_vdma_chan *chan = to_xilinx_chan(dchan);
>>> +
>>> +     switch (cmd) {
>>> +     case DMA_TERMINATE_ALL:
>>> +             xilinx_vdma_terminate_all(chan);
>>> +             return 0;
>>> +     case DMA_SLAVE_CONFIG:
>>> +             return xilinx_vdma_slave_config(chan,
>>> +                                     (struct xilinx_vdma_config *)arg);
>>
>> You really shouldn't be overloading the generic API with your own semantics.
>> DMA_SLAVE_CONFIG should take a dma_slave_config and nothing else.
> 
> Ok.  The driver needs few additional configuration from the slave
> device like Vertical
> Size, Horizontal Size,  Stride etc., for the DMA transfers, in that case do you
> suggest me to define a separate dma_ctrl_cmd like the one FSLDMA_EXTERNAL_START
> defined for Freescale drivers?

In my opinion it is not a good idea to have driver implement a generic API,
but at the same time let the driver have custom semantics for those API
calls. It's a bit like having a gpio driver that expects 23 and 42 as the
values passed to gpio_set_value instead of 0 and 1. It completely defeats
the purpose of a generic API, namely that you are able to write generic code
that makes use of the API without having to know about which implementation
API it is talking to. The dmaengine framework provides the
dmaengine_prep_interleaved_dma() function to setup two dimensional
transfers, e.g. take a look at sirf-dma.c or imx-dma.c.

- Lars

--
To unsubscribe from this list: send the line "unsubscribe dmaengine" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Vinod Koul Jan. 26, 2014, 1:59 p.m. UTC | #12
On Fri, Jan 24, 2014 at 02:24:27PM +0100, Lars-Peter Clausen wrote:
> On 01/24/2014 12:16 PM, Srikanth Thokala wrote:
> > Hi Lars,
> > 
> > On Thu, Jan 23, 2014 at 4:55 PM, Lars-Peter Clausen <lars@metafoo.de> wrote:
> >> On 01/22/2014 05:52 PM, Srikanth Thokala wrote:
> >> [...]
> >>> +/**
> >>> + * xilinx_vdma_device_control - Configure DMA channel of the device
> >>> + * @dchan: DMA Channel pointer
> >>> + * @cmd: DMA control command
> >>> + * @arg: Channel configuration
> >>> + *
> >>> + * Return: '0' on success and failure value on error
> >>> + */
> >>> +static int xilinx_vdma_device_control(struct dma_chan *dchan,
> >>> +                                   enum dma_ctrl_cmd cmd, unsigned long arg)
> >>> +{
> >>> +     struct xilinx_vdma_chan *chan = to_xilinx_chan(dchan);
> >>> +
> >>> +     switch (cmd) {
> >>> +     case DMA_TERMINATE_ALL:
> >>> +             xilinx_vdma_terminate_all(chan);
> >>> +             return 0;
> >>> +     case DMA_SLAVE_CONFIG:
> >>> +             return xilinx_vdma_slave_config(chan,
> >>> +                                     (struct xilinx_vdma_config *)arg);
> >>
> >> You really shouldn't be overloading the generic API with your own semantics.
> >> DMA_SLAVE_CONFIG should take a dma_slave_config and nothing else.
> > 
> > Ok.  The driver needs few additional configuration from the slave
> > device like Vertical
> > Size, Horizontal Size,  Stride etc., for the DMA transfers, in that case do you
> > suggest me to define a separate dma_ctrl_cmd like the one FSLDMA_EXTERNAL_START
> > defined for Freescale drivers?
> 
> In my opinion it is not a good idea to have driver implement a generic API,
> but at the same time let the driver have custom semantics for those API
> calls. It's a bit like having a gpio driver that expects 23 and 42 as the
> values passed to gpio_set_value instead of 0 and 1. It completely defeats
> the purpose of a generic API, namely that you are able to write generic code
> that makes use of the API without having to know about which implementation
> API it is talking to. The dmaengine framework provides the
> dmaengine_prep_interleaved_dma() function to setup two dimensional
> transfers, e.g. take a look at sirf-dma.c or imx-dma.c.

The question here i think would be waht this device supports? Is the hardware
capable of doing interleaved transfers, then would make sense.

While we do try to get users use dma_slave_config, but there will always be
someone who have specfic params. If we can generalize then we might want to add
to the dma_slave_config as well

--
~Vinod
--
To unsubscribe from this list: send the line "unsubscribe dmaengine" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Vinod Koul Jan. 26, 2014, 2:03 p.m. UTC | #13
On Thu, Jan 23, 2014 at 03:07:32PM +0100, Lars-Peter Clausen wrote:
> On 01/23/2014 03:00 PM, Andy Shevchenko wrote:
> > On Thu, 2014-01-23 at 14:50 +0100, Lars-Peter Clausen wrote:
> >> On 01/23/2014 02:38 PM, Shevchenko, Andriy wrote:
> >>> On Thu, 2014-01-23 at 12:25 +0100, Lars-Peter Clausen wrote:
> >>>> On 01/22/2014 05:52 PM, Srikanth Thokala wrote:
> >>>
> >>> [...]
> >>>
> >>>>> +	/* Request the interrupt */
> >>>>> +	chan->irq = irq_of_parse_and_map(node, 0);
> >>>>> +	err = devm_request_irq(xdev->dev, chan->irq, xilinx_vdma_irq_handler,
> >>>>> +			       IRQF_SHARED, "xilinx-vdma-controller", chan);
> >>>>
> >>>> This is a clasic example of where to not use devm_request_irq. 'chan' is
> >>>> accessed in the interrupt handler, but if you use devm_request_irq 'chan'
> >>>> will be freed before the interrupt handler has been released, which means
> >>>> there is now a race condition where the interrupt handler can access already
> >>>> freed memory.ta
> >>>
> >>> Could you elaborate this case? As far as I understood managed resources
> >>> are a kind of stack pile. In this case you have no such condition. Where
> >>> am I wrong?
> >>
> >> The stacked stuff is only ran after the remove() function. Which means that
> >> you call dma_async_device_unregister() before the interrupt handler is
> >> freed. Another issue with the interrupt handler is a bit hidden. The driver
> >> does not call tasklet_kill in the remove function. Which it should though to
> >> make sure that the tasklet does not race against the freeing of the memory.
> >> And in order to make sure that the tasklet is not rescheduled you need to
> >> free the irq before killing the tasklet, since the interrupt handler
> >> schedules the tasklet.
> > 
> > So, you mean devm_request_irq() will race in any DMA driver?
> 
> Most likely yes. devm_request_irq() is race condition prone for the majority
> of device driver. You have to be really careful if you want to use it.
> 
> > 
> > I think the proper solution is to disable all device work in
> > the .remove() and devm will care about resources.
> 
> As long as the interrupt handler is registered it can be called, the only
> proper solution is to make sure that the order in which resources are torn
> down is correct.
Wouldn't it work if we register the irq last in the probe. That wil ensure on
success the channel is always valid.

Also the tasklet is required to be killed not just in your .remove but also in
drivers .suspend handler, you dont want handler to be invoked after you returned
from your suspend

--
~Vinod
--
To unsubscribe from this list: send the line "unsubscribe dmaengine" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Vinod Koul Jan. 26, 2014, 2:24 p.m. UTC | #14
On Wed, Jan 22, 2014 at 10:22:45PM +0530, Srikanth Thokala wrote:
> This is the driver for the AXI Video Direct Memory Access (AXI
> VDMA) core, which is a soft Xilinx IP core that provides high-
> bandwidth direct memory access between memory and AXI4-Stream
> type video target peripherals. The core provides efficient two
> dimensional DMA operations with independent asynchronous read
ok here is tha catch, do you want to support interleaved API rather?

> +* DMA client + +Required properties: +- dmas: a list of <[Video DMA device
> phandle] [Channel ID]> pairs, +	where Channel ID is '0' for write/tx and
> '1' for read/rx +	channel.  +- dma-names: a list of DMA channel names, one
> per "dmas" entry + +Example: +++++++++ + +vdmatest_0: vdmatest@0 { +
> compatible ="xlnx,axi-vdma-test-1.00.a"; +	dmas = <&axi_vdma_0 0 +
> &axi_vdma_0 1>; +	dma-names = "vdma0", "vdma1"; +} ;
Need ack from DT folks. ALso would be better to split the binding to a separate
patch


> +/**
> + * struct xilinx_vdma_chan - Driver specific VDMA channel structure
> + * @xdev: Driver specific device structure
> + * @ctrl_offset: Control registers offset
> + * @desc_offset: TX descriptor registers offset
> + * @completed_cookie: Maximum cookie completed
> + * @cookie: The current cookie
> + * @lock: Descriptor operation lock
> + * @pending_list: Descriptors waiting
> + * @active_desc: Active descriptor
> + * @done_list: Complete descriptors
> + * @common: DMA common channel
> + * @desc_pool: Descriptors pool
> + * @dev: The dma device
> + * @irq: Channel IRQ
> + * @id: Channel ID
> + * @direction: Transfer direction
> + * @num_frms: Number of frames
> + * @has_sg: Support scatter transfers
> + * @genlock: Support genlock mode
> + * @err: Channel has errors
> + * @tasklet: Cleanup work after irq
> + * @config: Device configuration info
> + * @flush_on_fsync: Flush on Frame sync
> + */
> +struct xilinx_vdma_chan {
> +	struct xilinx_vdma_device *xdev;
> +	u32 ctrl_offset;
> +	u32 desc_offset;
> +	dma_cookie_t completed_cookie;
> +	dma_cookie_t cookie;
> +	spinlock_t lock;
> +	struct list_head pending_list;
> +	struct xilinx_vdma_tx_descriptor *active_desc;
> +	struct list_head done_list;
> +	struct dma_chan common;
> +	struct dma_pool *desc_pool;
> +	struct device *dev;
> +	int irq;
> +	int id;
> +	enum dma_transfer_direction direction;
why should channel have a direction... descriptor should have direction and not
the channel!

> +/**
> + * xilinx_vdma_free_tx_descriptor - Free transaction descriptor
> + * @chan: Driver specific VDMA channel
> + * @desc: VDMA transaction descriptor
> + */
> +static void
> +xilinx_vdma_free_tx_descriptor(struct xilinx_vdma_chan *chan,
> +			       struct xilinx_vdma_tx_descriptor *desc)
> +{
> +	struct xilinx_vdma_tx_segment *segment, *next;
> +
> +	if (!desc)
> +		return;
> +
> +	list_for_each_entry_safe(segment, next, &desc->segments, node) {
do you want to use _safe. Isee that this is called for cleanup while lock held,
and in other case within another _safe iterator!

> +		list_del(&segment->node);
> +		xilinx_vdma_free_tx_segment(chan, segment);
> +	}
> +
> +	kfree(desc);
> +}
> +
> +/* Required functions */
> +

> + * xilinx_vdma_do_tasklet - Schedule completion tasklet
> + * @data: Pointer to the Xilinx VDMA channel structure
> + */
> +static void xilinx_vdma_do_tasklet(unsigned long data)
> +{
> +	struct xilinx_vdma_chan *chan = (struct xilinx_vdma_chan *)data;
> +
> +	xilinx_vdma_chan_desc_cleanup(chan);
> +}
> +
> +/**
> + * xilinx_vdma_alloc_chan_resources - Allocate channel resources
> + * @dchan: DMA channel
> + *
> + * Return: '1' on success and failure value on error
naaah, we dont do that, pls use standard notation of 0 on success
Also API wants you to return descriptors allocated here!

> + */
> +static int xilinx_vdma_alloc_chan_resources(struct dma_chan *dchan)
> +{
> +	struct xilinx_vdma_chan *chan = to_xilinx_chan(dchan);
> +
> +	/* Has this channel already been allocated? */
> +	if (chan->desc_pool)
> +		return 1;
> +
> +	/*
> +	 * We need the descriptor to be aligned to 64bytes
> +	 * for meeting Xilinx VDMA specification requirement.
> +	 */
> +	chan->desc_pool = dma_pool_create("xilinx_vdma_desc_pool",
> +				chan->dev,
> +				sizeof(struct xilinx_vdma_tx_segment),
> +				__alignof__(struct xilinx_vdma_tx_segment), 0);
> +	if (!chan->desc_pool) {
> +		dev_err(chan->dev,
> +			"unable to allocate channel %d descriptor pool\n",
> +			chan->id);
> +		return -ENOMEM;
> +	}
> +
> +	tasklet_init(&chan->tasklet, xilinx_vdma_do_tasklet,
> +			(unsigned long)chan);
> +
> +	chan->completed_cookie = DMA_MIN_COOKIE;
> +	chan->cookie = DMA_MIN_COOKIE;
Can you use virtual dma implementation to simplfy your implemenattion of lists,
cookies (driver/dma/virt-dma.c)

> +	/* There is at least one descriptor free to be allocated */
???

> +	return 1;
> +}
> +

> + * xilinx_vdma_tx_status - Get VDMA transaction status
> + * @dchan: DMA channel
> + * @cookie: Transaction identifier
> + * @txstate: Transaction state
> + *
> + * Return: DMA transaction status
> + */
> +static enum dma_status xilinx_vdma_tx_status(struct dma_chan *dchan,
> +					dma_cookie_t cookie,
> +					struct dma_tx_state *txstate)
> +{
> +	struct xilinx_vdma_chan *chan = to_xilinx_chan(dchan);
> +	dma_cookie_t last_used;
> +	dma_cookie_t last_complete;
> +
> +	xilinx_vdma_chan_desc_cleanup(chan);
> +
> +	last_used = dchan->cookie;
> +	last_complete = chan->completed_cookie;
> +
> +	dma_set_tx_state(txstate, last_complete, last_used, 0);
> +
> +	return dma_async_is_complete(cookie, last_complete, last_used);
no residue calculation?

> +}
> +
> + * xilinx_vdma_start - Start VDMA channel
> + * @chan: Driver specific VDMA channel
> + */
> +static void xilinx_vdma_start(struct xilinx_vdma_chan *chan)
> +{
> +	int loop = XILINX_VDMA_LOOP_COUNT + 1;
> +
> +	vdma_ctrl_set(chan, XILINX_VDMA_REG_DMACR, XILINX_VDMA_DMACR_RUNSTOP);
> +
> +	/* Wait for the hardware to start */
> +	while (loop--)
> +		if (!(vdma_ctrl_read(chan, XILINX_VDMA_REG_DMASR) &
> +		      XILINX_VDMA_DMASR_HALTED))
> +			break;
wouldnt do while be better than doing than increamenting loop by 1 above and
using in while!
> +
> +	if (!loop) {
> +		dev_err(chan->dev, "Cannot start channel %p: %x\n",
> +			chan, vdma_ctrl_read(chan, XILINX_VDMA_REG_DMASR));
> +
> +		chan->err = true;
> +	}
> +
> +	return;
> +}
> +

> +/**
> + * xilinx_vdma_prep_slave_sg - prepare a descriptor for a DMA_SLAVE transaction
> + * @dchan: DMA channel
> + * @sgl: scatterlist to transfer to/from
> + * @sg_len: number of entries in @sgl
> + * @dir: DMA direction
> + * @flags: transfer ack flags
> + * @context: unused
> + *
> + * Return: Async transaction descriptor on success and NULL on failure
> + */
> +static struct dma_async_tx_descriptor *
> +xilinx_vdma_prep_slave_sg(struct dma_chan *dchan, struct scatterlist *sgl,
> +			  unsigned int sg_len, enum dma_transfer_direction dir,
> +			  unsigned long flags, void *context)
okay now am worried, this is supposed to memcpy DMA so why slave-sg??

Looking at the driver overall, IMHO we need to do:
- use the virt-dma to simplfy the cookie handling and perhpasn the descrptors
  too!
- Perhpas use interleaved API..?
- I dont think we should use the slave API as this seems memcpy case!
Lars-Peter Clausen Jan. 26, 2014, 5:39 p.m. UTC | #15
On 01/26/2014 02:59 PM, Vinod Koul wrote:
> On Fri, Jan 24, 2014 at 02:24:27PM +0100, Lars-Peter Clausen wrote:
>> On 01/24/2014 12:16 PM, Srikanth Thokala wrote:
>>> Hi Lars,
>>>
>>> On Thu, Jan 23, 2014 at 4:55 PM, Lars-Peter Clausen <lars@metafoo.de> wrote:
>>>> On 01/22/2014 05:52 PM, Srikanth Thokala wrote:
>>>> [...]
>>>>> +/**
>>>>> + * xilinx_vdma_device_control - Configure DMA channel of the device
>>>>> + * @dchan: DMA Channel pointer
>>>>> + * @cmd: DMA control command
>>>>> + * @arg: Channel configuration
>>>>> + *
>>>>> + * Return: '0' on success and failure value on error
>>>>> + */
>>>>> +static int xilinx_vdma_device_control(struct dma_chan *dchan,
>>>>> +                                   enum dma_ctrl_cmd cmd, unsigned long arg)
>>>>> +{
>>>>> +     struct xilinx_vdma_chan *chan = to_xilinx_chan(dchan);
>>>>> +
>>>>> +     switch (cmd) {
>>>>> +     case DMA_TERMINATE_ALL:
>>>>> +             xilinx_vdma_terminate_all(chan);
>>>>> +             return 0;
>>>>> +     case DMA_SLAVE_CONFIG:
>>>>> +             return xilinx_vdma_slave_config(chan,
>>>>> +                                     (struct xilinx_vdma_config *)arg);
>>>>
>>>> You really shouldn't be overloading the generic API with your own semantics.
>>>> DMA_SLAVE_CONFIG should take a dma_slave_config and nothing else.
>>>
>>> Ok.  The driver needs few additional configuration from the slave
>>> device like Vertical
>>> Size, Horizontal Size,  Stride etc., for the DMA transfers, in that case do you
>>> suggest me to define a separate dma_ctrl_cmd like the one FSLDMA_EXTERNAL_START
>>> defined for Freescale drivers?
>>
>> In my opinion it is not a good idea to have driver implement a generic API,
>> but at the same time let the driver have custom semantics for those API
>> calls. It's a bit like having a gpio driver that expects 23 and 42 as the
>> values passed to gpio_set_value instead of 0 and 1. It completely defeats
>> the purpose of a generic API, namely that you are able to write generic code
>> that makes use of the API without having to know about which implementation
>> API it is talking to. The dmaengine framework provides the
>> dmaengine_prep_interleaved_dma() function to setup two dimensional
>> transfers, e.g. take a look at sirf-dma.c or imx-dma.c.
> 
> The question here i think would be waht this device supports? Is the hardware
> capable of doing interleaved transfers, then would make sense.

The hardware does 2D transfers. The parameters for a transfer are height,
width and stride. That's only a subset of what interleaved transfers can be
(xt->num_frames must be one for 2d transfers). But if I remember correctly
there has been some discussion on this in the past and the result of that
discussion was that using interleaved transfers for 2D transfers is
preferred over adding a custom API for 2D transfers.

- Lars

--
To unsubscribe from this list: send the line "unsubscribe dmaengine" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Lars-Peter Clausen Jan. 26, 2014, 5:41 p.m. UTC | #16
On 01/26/2014 03:03 PM, Vinod Koul wrote:
> On Thu, Jan 23, 2014 at 03:07:32PM +0100, Lars-Peter Clausen wrote:
>> On 01/23/2014 03:00 PM, Andy Shevchenko wrote:
>>> On Thu, 2014-01-23 at 14:50 +0100, Lars-Peter Clausen wrote:
>>>> On 01/23/2014 02:38 PM, Shevchenko, Andriy wrote:
>>>>> On Thu, 2014-01-23 at 12:25 +0100, Lars-Peter Clausen wrote:
>>>>>> On 01/22/2014 05:52 PM, Srikanth Thokala wrote:
>>>>>
>>>>> [...]
>>>>>
>>>>>>> +	/* Request the interrupt */
>>>>>>> +	chan->irq = irq_of_parse_and_map(node, 0);
>>>>>>> +	err = devm_request_irq(xdev->dev, chan->irq, xilinx_vdma_irq_handler,
>>>>>>> +			       IRQF_SHARED, "xilinx-vdma-controller", chan);
>>>>>>
>>>>>> This is a clasic example of where to not use devm_request_irq. 'chan' is
>>>>>> accessed in the interrupt handler, but if you use devm_request_irq 'chan'
>>>>>> will be freed before the interrupt handler has been released, which means
>>>>>> there is now a race condition where the interrupt handler can access already
>>>>>> freed memory.ta
>>>>>
>>>>> Could you elaborate this case? As far as I understood managed resources
>>>>> are a kind of stack pile. In this case you have no such condition. Where
>>>>> am I wrong?
>>>>
>>>> The stacked stuff is only ran after the remove() function. Which means that
>>>> you call dma_async_device_unregister() before the interrupt handler is
>>>> freed. Another issue with the interrupt handler is a bit hidden. The driver
>>>> does not call tasklet_kill in the remove function. Which it should though to
>>>> make sure that the tasklet does not race against the freeing of the memory.
>>>> And in order to make sure that the tasklet is not rescheduled you need to
>>>> free the irq before killing the tasklet, since the interrupt handler
>>>> schedules the tasklet.
>>>
>>> So, you mean devm_request_irq() will race in any DMA driver?
>>
>> Most likely yes. devm_request_irq() is race condition prone for the majority
>> of device driver. You have to be really careful if you want to use it.
>>
>>>
>>> I think the proper solution is to disable all device work in
>>> the .remove() and devm will care about resources.
>>
>> As long as the interrupt handler is registered it can be called, the only
>> proper solution is to make sure that the order in which resources are torn
>> down is correct.
> Wouldn't it work if we register the irq last in the probe. That wil ensure on
> success the channel is always valid.

Yes, but only if the irq is not device managed. All device managed resources
will be freed after the remove function has been called. Which is to late in
our case since we make sure that the tasklet is not running in the remove
function.

- Lars

--
To unsubscribe from this list: send the line "unsubscribe dmaengine" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Lars-Peter Clausen Jan. 26, 2014, 5:46 p.m. UTC | #17
On 01/26/2014 03:24 PM, Vinod Koul wrote:
> On Wed, Jan 22, 2014 at 10:22:45PM +0530, Srikanth Thokala wrote:
>> This is the driver for the AXI Video Direct Memory Access (AXI
>> VDMA) core, which is a soft Xilinx IP core that provides high-
>> bandwidth direct memory access between memory and AXI4-Stream
>> type video target peripherals. The core provides efficient two
>> dimensional DMA operations with independent asynchronous read
> ok here is tha catch, do you want to support interleaved API rather?
> 
>> +* DMA client + +Required properties: +- dmas: a list of <[Video DMA device
>> phandle] [Channel ID]> pairs, +	where Channel ID is '0' for write/tx and
>> '1' for read/rx +	channel.  +- dma-names: a list of DMA channel names, one
>> per "dmas" entry + +Example: +++++++++ + +vdmatest_0: vdmatest@0 { +
>> compatible ="xlnx,axi-vdma-test-1.00.a"; +	dmas = <&axi_vdma_0 0 +
>> &axi_vdma_0 1>; +	dma-names = "vdma0", "vdma1"; +} ;
> Need ack from DT folks. ALso would be better to split the binding to a separate
> patch
> 
> 
>> +/**
>> + * struct xilinx_vdma_chan - Driver specific VDMA channel structure
>> + * @xdev: Driver specific device structure
>> + * @ctrl_offset: Control registers offset
>> + * @desc_offset: TX descriptor registers offset
>> + * @completed_cookie: Maximum cookie completed
>> + * @cookie: The current cookie
>> + * @lock: Descriptor operation lock
>> + * @pending_list: Descriptors waiting
>> + * @active_desc: Active descriptor
>> + * @done_list: Complete descriptors
>> + * @common: DMA common channel
>> + * @desc_pool: Descriptors pool
>> + * @dev: The dma device
>> + * @irq: Channel IRQ
>> + * @id: Channel ID
>> + * @direction: Transfer direction
>> + * @num_frms: Number of frames
>> + * @has_sg: Support scatter transfers
>> + * @genlock: Support genlock mode
>> + * @err: Channel has errors
>> + * @tasklet: Cleanup work after irq
>> + * @config: Device configuration info
>> + * @flush_on_fsync: Flush on Frame sync
>> + */
>> +struct xilinx_vdma_chan {
>> +	struct xilinx_vdma_device *xdev;
>> +	u32 ctrl_offset;
>> +	u32 desc_offset;
>> +	dma_cookie_t completed_cookie;
>> +	dma_cookie_t cookie;
>> +	spinlock_t lock;
>> +	struct list_head pending_list;
>> +	struct xilinx_vdma_tx_descriptor *active_desc;
>> +	struct list_head done_list;
>> +	struct dma_chan common;
>> +	struct dma_pool *desc_pool;
>> +	struct device *dev;
>> +	int irq;
>> +	int id;
>> +	enum dma_transfer_direction direction;
> why should channel have a direction... descriptor should have direction and not
> the channel!

The channel only supports transfers in one direction. Either from memory to
peripheral or from peripheral to memory, that's fixed and can't be changed
at runtime. The driver needs to know which direction the channel supports so
it can reject transfers with the wrong direction.

[...]
>> +
> 
>> + * xilinx_vdma_tx_status - Get VDMA transaction status
>> + * @dchan: DMA channel
>> + * @cookie: Transaction identifier
>> + * @txstate: Transaction state
>> + *
>> + * Return: DMA transaction status
>> + */
>> +static enum dma_status xilinx_vdma_tx_status(struct dma_chan *dchan,
>> +					dma_cookie_t cookie,
>> +					struct dma_tx_state *txstate)
>> +{
>> +	struct xilinx_vdma_chan *chan = to_xilinx_chan(dchan);
>> +	dma_cookie_t last_used;
>> +	dma_cookie_t last_complete;
>> +
>> +	xilinx_vdma_chan_desc_cleanup(chan);
>> +
>> +	last_used = dchan->cookie;
>> +	last_complete = chan->completed_cookie;
>> +
>> +	dma_set_tx_state(txstate, last_complete, last_used, 0);
>> +
>> +	return dma_async_is_complete(cookie, last_complete, last_used);
> no residue calculation?
> 

The hardware doesn't support that.

>> +/**
>> + * xilinx_vdma_prep_slave_sg - prepare a descriptor for a DMA_SLAVE transaction
>> + * @dchan: DMA channel
>> + * @sgl: scatterlist to transfer to/from
>> + * @sg_len: number of entries in @sgl
>> + * @dir: DMA direction
>> + * @flags: transfer ack flags
>> + * @context: unused
>> + *
>> + * Return: Async transaction descriptor on success and NULL on failure
>> + */
>> +static struct dma_async_tx_descriptor *
>> +xilinx_vdma_prep_slave_sg(struct dma_chan *dchan, struct scatterlist *sgl,
>> +			  unsigned int sg_len, enum dma_transfer_direction dir,
>> +			  unsigned long flags, void *context)
> okay now am worried, this is supposed to memcpy DMA so why slave-sg??

The DMA is either from memory to peripheral or from peripheral to memory
depending on the direction. So slave sg should be fine.

> 
> Looking at the driver overall, IMHO we need to do:
> - use the virt-dma to simplfy the cookie handling and perhpasn the descrptors
>   too!
> - Perhpas use interleaved API..?
> - I dont think we should use the slave API as this seems memcpy case!
> 

--
To unsubscribe from this list: send the line "unsubscribe dmaengine" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Srikanth Thokala Jan. 27, 2014, 11:06 a.m. UTC | #18
Hi Vinod,

On Sun, Jan 26, 2014 at 7:29 PM, Vinod Koul <vinod.koul@intel.com> wrote:
> On Fri, Jan 24, 2014 at 02:24:27PM +0100, Lars-Peter Clausen wrote:
>> On 01/24/2014 12:16 PM, Srikanth Thokala wrote:
>> > Hi Lars,
>> >
>> > On Thu, Jan 23, 2014 at 4:55 PM, Lars-Peter Clausen <lars@metafoo.de> wrote:
>> >> On 01/22/2014 05:52 PM, Srikanth Thokala wrote:
>> >> [...]
>> >>> +/**
>> >>> + * xilinx_vdma_device_control - Configure DMA channel of the device
>> >>> + * @dchan: DMA Channel pointer
>> >>> + * @cmd: DMA control command
>> >>> + * @arg: Channel configuration
>> >>> + *
>> >>> + * Return: '0' on success and failure value on error
>> >>> + */
>> >>> +static int xilinx_vdma_device_control(struct dma_chan *dchan,
>> >>> +                                   enum dma_ctrl_cmd cmd, unsigned long arg)
>> >>> +{
>> >>> +     struct xilinx_vdma_chan *chan = to_xilinx_chan(dchan);
>> >>> +
>> >>> +     switch (cmd) {
>> >>> +     case DMA_TERMINATE_ALL:
>> >>> +             xilinx_vdma_terminate_all(chan);
>> >>> +             return 0;
>> >>> +     case DMA_SLAVE_CONFIG:
>> >>> +             return xilinx_vdma_slave_config(chan,
>> >>> +                                     (struct xilinx_vdma_config *)arg);
>> >>
>> >> You really shouldn't be overloading the generic API with your own semantics.
>> >> DMA_SLAVE_CONFIG should take a dma_slave_config and nothing else.
>> >
>> > Ok.  The driver needs few additional configuration from the slave
>> > device like Vertical
>> > Size, Horizontal Size,  Stride etc., for the DMA transfers, in that case do you
>> > suggest me to define a separate dma_ctrl_cmd like the one FSLDMA_EXTERNAL_START
>> > defined for Freescale drivers?
>>
>> In my opinion it is not a good idea to have driver implement a generic API,
>> but at the same time let the driver have custom semantics for those API
>> calls. It's a bit like having a gpio driver that expects 23 and 42 as the
>> values passed to gpio_set_value instead of 0 and 1. It completely defeats
>> the purpose of a generic API, namely that you are able to write generic code
>> that makes use of the API without having to know about which implementation
>> API it is talking to. The dmaengine framework provides the
>> dmaengine_prep_interleaved_dma() function to setup two dimensional
>> transfers, e.g. take a look at sirf-dma.c or imx-dma.c.
>
> The question here i think would be waht this device supports? Is the hardware
> capable of doing interleaved transfers, then would make sense.
>
> While we do try to get users use dma_slave_config, but there will always be
> someone who have specfic params. If we can generalize then we might want to add
> to the dma_slave_config as well

There are many configuration parameters which are specific to IP and I
would like to
give an overview of some of parameteres here:

1) Park Mode ('cfg->park'): In Park mode, engine will park on frame
referenced by
    'cfg->park_frm', so user will have control on each frame in this mode.

2) Interrupt Coalesce ('cfg->coalesce'):  Used for setting interrupt
threshold. This value
   determines the number of frame buffers to process. To use this feature,
   'cfg->frm_cnt_en' should be set.

3) Frame Synchronization Source ('cfg->ext_fsync'):  Can be an
external/internal frame
    synchronization source. Used to synchronize one channel (MM2S/S2MM) with
    another (S2MM/MM2S) channel.

4) Genlock Synchronization ('cfg->genlock'): Used to avoid mismatch rate between
    master and slave.  In master mode (cfg->master), frames are not dropped and
    slave can drop frames to adjust to master frame rate.

And in future, this Engine being a soft IP, we could expect some more additional
parameters.  Isn't a good idea to have a private member in dma_slave_config for
sharing additional configuration between slave device and dma engine? Or a new
dma_ctrl_cmd like FSLDMA_EXTERNAL_START?

Srikanth

>
> --
> ~Vinod
> --/EX
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/
--
To unsubscribe from this list: send the line "unsubscribe dmaengine" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Srikanth Thokala Jan. 27, 2014, 1:12 p.m. UTC | #19
Hi Lars/Vinod,

On Sun, Jan 26, 2014 at 11:09 PM, Lars-Peter Clausen <lars@metafoo.de> wrote:
> On 01/26/2014 02:59 PM, Vinod Koul wrote:
>> On Fri, Jan 24, 2014 at 02:24:27PM +0100, Lars-Peter Clausen wrote:
>>> On 01/24/2014 12:16 PM, Srikanth Thokala wrote:
>>>> Hi Lars,
>>>>
>>>> On Thu, Jan 23, 2014 at 4:55 PM, Lars-Peter Clausen <lars@metafoo.de> wrote:
>>>>> On 01/22/2014 05:52 PM, Srikanth Thokala wrote:
>>>>> [...]
>>>>>> +/**
>>>>>> + * xilinx_vdma_device_control - Configure DMA channel of the device
>>>>>> + * @dchan: DMA Channel pointer
>>>>>> + * @cmd: DMA control command
>>>>>> + * @arg: Channel configuration
>>>>>> + *
>>>>>> + * Return: '0' on success and failure value on error
>>>>>> + */
>>>>>> +static int xilinx_vdma_device_control(struct dma_chan *dchan,
>>>>>> +                                   enum dma_ctrl_cmd cmd, unsigned long arg)
>>>>>> +{
>>>>>> +     struct xilinx_vdma_chan *chan = to_xilinx_chan(dchan);
>>>>>> +
>>>>>> +     switch (cmd) {
>>>>>> +     case DMA_TERMINATE_ALL:
>>>>>> +             xilinx_vdma_terminate_all(chan);
>>>>>> +             return 0;
>>>>>> +     case DMA_SLAVE_CONFIG:
>>>>>> +             return xilinx_vdma_slave_config(chan,
>>>>>> +                                     (struct xilinx_vdma_config *)arg);
>>>>>
>>>>> You really shouldn't be overloading the generic API with your own semantics.
>>>>> DMA_SLAVE_CONFIG should take a dma_slave_config and nothing else.
>>>>
>>>> Ok.  The driver needs few additional configuration from the slave
>>>> device like Vertical
>>>> Size, Horizontal Size,  Stride etc., for the DMA transfers, in that case do you
>>>> suggest me to define a separate dma_ctrl_cmd like the one FSLDMA_EXTERNAL_START
>>>> defined for Freescale drivers?
>>>
>>> In my opinion it is not a good idea to have driver implement a generic API,
>>> but at the same time let the driver have custom semantics for those API
>>> calls. It's a bit like having a gpio driver that expects 23 and 42 as the
>>> values passed to gpio_set_value instead of 0 and 1. It completely defeats
>>> the purpose of a generic API, namely that you are able to write generic code
>>> that makes use of the API without having to know about which implementation
>>> API it is talking to. The dmaengine framework provides the
>>> dmaengine_prep_interleaved_dma() function to setup two dimensional
>>> transfers, e.g. take a look at sirf-dma.c or imx-dma.c.
>>
>> The question here i think would be waht this device supports? Is the hardware
>> capable of doing interleaved transfers, then would make sense.
>
> The hardware does 2D transfers. The parameters for a transfer are height,
> width and stride. That's only a subset of what interleaved transfers can be
> (xt->num_frames must be one for 2d transfers). But if I remember correctly
> there has been some discussion on this in the past and the result of that
> discussion was that using interleaved transfers for 2D transfers is
> preferred over adding a custom API for 2D transfers.

I went through the prep_interleaved_dma API and I see only one descriptor
is prepared per API call (i.e. per frame).  As our IP supports upto 16 frame
buffers (can be more in future), isn't it less efficient compared to the
prep_slave_sg where we get a single sg list and can prepare all the descriptors
(of non-contiguous buffers) in one go?  Correct me, if am wrong and let me
know your opinions.

Srikanth

>
> - Lars
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/
--
To unsubscribe from this list: send the line "unsubscribe dmaengine" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Vinod Koul Jan. 28, 2014, 3:09 a.m. UTC | #20
On Sun, Jan 26, 2014 at 06:39:21PM +0100, Lars-Peter Clausen wrote:
> On 01/26/2014 02:59 PM, Vinod Koul wrote:
> > On Fri, Jan 24, 2014 at 02:24:27PM +0100, Lars-Peter Clausen wrote:
> >> On 01/24/2014 12:16 PM, Srikanth Thokala wrote:
> >>> Hi Lars,
> >>>
> >>> On Thu, Jan 23, 2014 at 4:55 PM, Lars-Peter Clausen <lars@metafoo.de> wrote:
> >>>> On 01/22/2014 05:52 PM, Srikanth Thokala wrote:
> >>>> [...]
> >>>>> +/**
> >>>>> + * xilinx_vdma_device_control - Configure DMA channel of the device
> >>>>> + * @dchan: DMA Channel pointer
> >>>>> + * @cmd: DMA control command
> >>>>> + * @arg: Channel configuration
> >>>>> + *
> >>>>> + * Return: '0' on success and failure value on error
> >>>>> + */
> >>>>> +static int xilinx_vdma_device_control(struct dma_chan *dchan,
> >>>>> +                                   enum dma_ctrl_cmd cmd, unsigned long arg)
> >>>>> +{
> >>>>> +     struct xilinx_vdma_chan *chan = to_xilinx_chan(dchan);
> >>>>> +
> >>>>> +     switch (cmd) {
> >>>>> +     case DMA_TERMINATE_ALL:
> >>>>> +             xilinx_vdma_terminate_all(chan);
> >>>>> +             return 0;
> >>>>> +     case DMA_SLAVE_CONFIG:
> >>>>> +             return xilinx_vdma_slave_config(chan,
> >>>>> +                                     (struct xilinx_vdma_config *)arg);
> >>>>
> >>>> You really shouldn't be overloading the generic API with your own semantics.
> >>>> DMA_SLAVE_CONFIG should take a dma_slave_config and nothing else.
> >>>
> >>> Ok.  The driver needs few additional configuration from the slave
> >>> device like Vertical
> >>> Size, Horizontal Size,  Stride etc., for the DMA transfers, in that case do you
> >>> suggest me to define a separate dma_ctrl_cmd like the one FSLDMA_EXTERNAL_START
> >>> defined for Freescale drivers?
> >>
> >> In my opinion it is not a good idea to have driver implement a generic API,
> >> but at the same time let the driver have custom semantics for those API
> >> calls. It's a bit like having a gpio driver that expects 23 and 42 as the
> >> values passed to gpio_set_value instead of 0 and 1. It completely defeats
> >> the purpose of a generic API, namely that you are able to write generic code
> >> that makes use of the API without having to know about which implementation
> >> API it is talking to. The dmaengine framework provides the
> >> dmaengine_prep_interleaved_dma() function to setup two dimensional
> >> transfers, e.g. take a look at sirf-dma.c or imx-dma.c.
> > 
> > The question here i think would be waht this device supports? Is the hardware
> > capable of doing interleaved transfers, then would make sense.
> 
> The hardware does 2D transfers. The parameters for a transfer are height,
> width and stride. That's only a subset of what interleaved transfers can be
> (xt->num_frames must be one for 2d transfers). But if I remember correctly
> there has been some discussion on this in the past and the result of that
> discussion was that using interleaved transfers for 2D transfers is
> preferred over adding a custom API for 2D transfers.
Yup that would be my recomendation. Moving this driver to interleaved API seems
right to me
Vinod Koul Jan. 28, 2014, 3:13 a.m. UTC | #21
On Mon, Jan 27, 2014 at 06:42:36PM +0530, Srikanth Thokala wrote:
> Hi Lars/Vinod,
> >> The question here i think would be waht this device supports? Is the hardware
> >> capable of doing interleaved transfers, then would make sense.
> >
> > The hardware does 2D transfers. The parameters for a transfer are height,
> > width and stride. That's only a subset of what interleaved transfers can be
> > (xt->num_frames must be one for 2d transfers). But if I remember correctly
> > there has been some discussion on this in the past and the result of that
> > discussion was that using interleaved transfers for 2D transfers is
> > preferred over adding a custom API for 2D transfers.
> 
> I went through the prep_interleaved_dma API and I see only one descriptor
> is prepared per API call (i.e. per frame).  As our IP supports upto 16 frame
> buffers (can be more in future), isn't it less efficient compared to the
> prep_slave_sg where we get a single sg list and can prepare all the descriptors
> (of non-contiguous buffers) in one go?  Correct me, if am wrong and let me
> know your opinions.
Well the descriptor maybe one, but that can represent multiple frames, for
example 16 as in your case. Can you read up the documentation of how multiple
frames are passed. Pls see include/linux/dmaengine.h 

/**
 * Interleaved Transfer Request
 * ----------------------------
 * A chunk is collection of contiguous bytes to be transfered.
 * The gap(in bytes) between two chunks is called inter-chunk-gap(ICG).
 * ICGs may or maynot change between chunks.
 * A FRAME is the smallest series of contiguous {chunk,icg} pairs,
 *  that when repeated an integral number of times, specifies the transfer.
 * A transfer template is specification of a Frame, the number of times
 *  it is to be repeated and other per-transfer attributes.
 *
 * Practically, a client driver would have ready a template for each
 *  type of transfer it is going to need during its lifetime and
 *  set only 'src_start' and 'dst_start' before submitting the requests.
 *
 *
 *  |      Frame-1        |       Frame-2       | ~ |       Frame-'numf'  |
 *  |====....==.===...=...|====....==.===...=...| ~ |====....==.===...=...|
 *
 *    ==  Chunk size
 *    ... ICG
 */
Srikanth Thokala Jan. 31, 2014, 6:51 a.m. UTC | #22
Hi Vinod,

On Tue, Jan 28, 2014 at 8:43 AM, Vinod Koul <vinod.koul@intel.com> wrote:
> On Mon, Jan 27, 2014 at 06:42:36PM +0530, Srikanth Thokala wrote:
>> Hi Lars/Vinod,
>> >> The question here i think would be waht this device supports? Is the hardware
>> >> capable of doing interleaved transfers, then would make sense.
>> >
>> > The hardware does 2D transfers. The parameters for a transfer are height,
>> > width and stride. That's only a subset of what interleaved transfers can be
>> > (xt->num_frames must be one for 2d transfers). But if I remember correctly
>> > there has been some discussion on this in the past and the result of that
>> > discussion was that using interleaved transfers for 2D transfers is
>> > preferred over adding a custom API for 2D transfers.
>>
>> I went through the prep_interleaved_dma API and I see only one descriptor
>> is prepared per API call (i.e. per frame).  As our IP supports upto 16 frame
>> buffers (can be more in future), isn't it less efficient compared to the
>> prep_slave_sg where we get a single sg list and can prepare all the descriptors
>> (of non-contiguous buffers) in one go?  Correct me, if am wrong and let me
>> know your opinions.
> Well the descriptor maybe one, but that can represent multiple frames, for
> example 16 as in your case. Can you read up the documentation of how multiple
> frames are passed. Pls see include/linux/dmaengine.h
>
> /**
>  * Interleaved Transfer Request
>  * ----------------------------
>  * A chunk is collection of contiguous bytes to be transfered.
>  * The gap(in bytes) between two chunks is called inter-chunk-gap(ICG).
>  * ICGs may or maynot change between chunks.
>  * A FRAME is the smallest series of contiguous {chunk,icg} pairs,
>  *  that when repeated an integral number of times, specifies the transfer.
>  * A transfer template is specification of a Frame, the number of times
>  *  it is to be repeated and other per-transfer attributes.
>  *
>  * Practically, a client driver would have ready a template for each
>  *  type of transfer it is going to need during its lifetime and
>  *  set only 'src_start' and 'dst_start' before submitting the requests.
>  *
>  *
>  *  |      Frame-1        |       Frame-2       | ~ |       Frame-'numf'  |
>  *  |====....==.===...=...|====....==.===...=...| ~ |====....==.===...=...|
>  *
>  *    ==  Chunk size
>  *    ... ICG
>  */

Yes, it can handle multiple frames specified by 'numf' each of size
'frame_size * sgl[0].size'.
But, I see it only works if all the frames' memory is contiguous and
in this case we
can just increment 'src_start' by the total frame size 'numf' number
of times to fill in
for each HW descriptor (each frame is one HW descriptor).  So, there
is no issue when the
memory is contiguous.  If the frames are non contiguous, we have to
call this API for each
frame (hence for each descriptor), as the src_start for each frame is
different.  Is it correct?

FYI: This hardware has an inbuilt Scatter-Gather engine.

Srikanth

>
> --
> ~Vinod
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/
--
To unsubscribe from this list: send the line "unsubscribe dmaengine" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Srikanth Thokala Jan. 31, 2014, 6:52 a.m. UTC | #23
Hi Vinod,

On Mon, Jan 27, 2014 at 4:36 PM, Srikanth Thokala <sthokal@xilinx.com> wrote:
> Hi Vinod,
>
> On Sun, Jan 26, 2014 at 7:29 PM, Vinod Koul <vinod.koul@intel.com> wrote:
>> On Fri, Jan 24, 2014 at 02:24:27PM +0100, Lars-Peter Clausen wrote:
>>> On 01/24/2014 12:16 PM, Srikanth Thokala wrote:
>>> > Hi Lars,
>>> >
>>> > On Thu, Jan 23, 2014 at 4:55 PM, Lars-Peter Clausen <lars@metafoo.de> wrote:
>>> >> On 01/22/2014 05:52 PM, Srikanth Thokala wrote:
>>> >> [...]
>>> >>> +/**
>>> >>> + * xilinx_vdma_device_control - Configure DMA channel of the device
>>> >>> + * @dchan: DMA Channel pointer
>>> >>> + * @cmd: DMA control command
>>> >>> + * @arg: Channel configuration
>>> >>> + *
>>> >>> + * Return: '0' on success and failure value on error
>>> >>> + */
>>> >>> +static int xilinx_vdma_device_control(struct dma_chan *dchan,
>>> >>> +                                   enum dma_ctrl_cmd cmd, unsigned long arg)
>>> >>> +{
>>> >>> +     struct xilinx_vdma_chan *chan = to_xilinx_chan(dchan);
>>> >>> +
>>> >>> +     switch (cmd) {
>>> >>> +     case DMA_TERMINATE_ALL:
>>> >>> +             xilinx_vdma_terminate_all(chan);
>>> >>> +             return 0;
>>> >>> +     case DMA_SLAVE_CONFIG:
>>> >>> +             return xilinx_vdma_slave_config(chan,
>>> >>> +                                     (struct xilinx_vdma_config *)arg);
>>> >>
>>> >> You really shouldn't be overloading the generic API with your own semantics.
>>> >> DMA_SLAVE_CONFIG should take a dma_slave_config and nothing else.
>>> >
>>> > Ok.  The driver needs few additional configuration from the slave
>>> > device like Vertical
>>> > Size, Horizontal Size,  Stride etc., for the DMA transfers, in that case do you
>>> > suggest me to define a separate dma_ctrl_cmd like the one FSLDMA_EXTERNAL_START
>>> > defined for Freescale drivers?
>>>
>>> In my opinion it is not a good idea to have driver implement a generic API,
>>> but at the same time let the driver have custom semantics for those API
>>> calls. It's a bit like having a gpio driver that expects 23 and 42 as the
>>> values passed to gpio_set_value instead of 0 and 1. It completely defeats
>>> the purpose of a generic API, namely that you are able to write generic code
>>> that makes use of the API without having to know about which implementation
>>> API it is talking to. The dmaengine framework provides the
>>> dmaengine_prep_interleaved_dma() function to setup two dimensional
>>> transfers, e.g. take a look at sirf-dma.c or imx-dma.c.
>>
>> The question here i think would be waht this device supports? Is the hardware
>> capable of doing interleaved transfers, then would make sense.
>>
>> While we do try to get users use dma_slave_config, but there will always be
>> someone who have specfic params. If we can generalize then we might want to add
>> to the dma_slave_config as well
>
> There are many configuration parameters which are specific to IP and I
> would like to
> give an overview of some of parameteres here:
>
> 1) Park Mode ('cfg->park'): In Park mode, engine will park on frame
> referenced by
>     'cfg->park_frm', so user will have control on each frame in this mode.
>
> 2) Interrupt Coalesce ('cfg->coalesce'):  Used for setting interrupt
> threshold. This value
>    determines the number of frame buffers to process. To use this feature,
>    'cfg->frm_cnt_en' should be set.
>
> 3) Frame Synchronization Source ('cfg->ext_fsync'):  Can be an
> external/internal frame
>     synchronization source. Used to synchronize one channel (MM2S/S2MM) with
>     another (S2MM/MM2S) channel.
>
> 4) Genlock Synchronization ('cfg->genlock'): Used to avoid mismatch rate between
>     master and slave.  In master mode (cfg->master), frames are not dropped and
>     slave can drop frames to adjust to master frame rate.
>
> And in future, this Engine being a soft IP, we could expect some more additional
> parameters.  Isn't a good idea to have a private member in dma_slave_config for
> sharing additional configuration between slave device and dma engine? Or a new
> dma_ctrl_cmd like FSLDMA_EXTERNAL_START?


Ping?

>
> Srikanth
>
>>
>> --
>> ~Vinod
>> --/EX
>> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>> Please read the FAQ at  http://www.tux.org/lkml/
--
To unsubscribe from this list: send the line "unsubscribe dmaengine" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Andy Gross Jan. 31, 2014, 5:44 p.m. UTC | #24
On Fri, Jan 24, 2014 at 02:24:27PM +0100, Lars-Peter Clausen wrote:
> On 01/24/2014 12:16 PM, Srikanth Thokala wrote:
> > Hi Lars,
> > 
> > On Thu, Jan 23, 2014 at 4:55 PM, Lars-Peter Clausen <lars@metafoo.de> wrote:
> >> On 01/22/2014 05:52 PM, Srikanth Thokala wrote:
> >> [...]
> >>> +/**
> >>> + * xilinx_vdma_device_control - Configure DMA channel of the device
> >>> + * @dchan: DMA Channel pointer
> >>> + * @cmd: DMA control command
> >>> + * @arg: Channel configuration
> >>> + *
> >>> + * Return: '0' on success and failure value on error
> >>> + */
> >>> +static int xilinx_vdma_device_control(struct dma_chan *dchan,
> >>> +                                   enum dma_ctrl_cmd cmd, unsigned long arg)
> >>> +{
> >>> +     struct xilinx_vdma_chan *chan = to_xilinx_chan(dchan);
> >>> +
> >>> +     switch (cmd) {
> >>> +     case DMA_TERMINATE_ALL:
> >>> +             xilinx_vdma_terminate_all(chan);
> >>> +             return 0;
> >>> +     case DMA_SLAVE_CONFIG:
> >>> +             return xilinx_vdma_slave_config(chan,
> >>> +                                     (struct xilinx_vdma_config *)arg);
> >>
> >> You really shouldn't be overloading the generic API with your own semantics.
> >> DMA_SLAVE_CONFIG should take a dma_slave_config and nothing else.
> > 
> > Ok.  The driver needs few additional configuration from the slave
> > device like Vertical
> > Size, Horizontal Size,  Stride etc., for the DMA transfers, in that case do you
> > suggest me to define a separate dma_ctrl_cmd like the one FSLDMA_EXTERNAL_START
> > defined for Freescale drivers?
> 
> In my opinion it is not a good idea to have driver implement a generic API,
> but at the same time let the driver have custom semantics for those API
> calls. It's a bit like having a gpio driver that expects 23 and 42 as the
> values passed to gpio_set_value instead of 0 and 1. It completely defeats
> the purpose of a generic API, namely that you are able to write generic code
> that makes use of the API without having to know about which implementation
> API it is talking to. The dmaengine framework provides the
> dmaengine_prep_interleaved_dma() function to setup two dimensional
> transfers, e.g. take a look at sirf-dma.c or imx-dma.c.
> 

The comments in the include/linux/dmaengine.h state that if you have
non-generic, non-fixed configuration then you can just create your own
structure and embed the dma_slave_config.  Using the container_of you can get
back your structure.

I agree that we should always use the generic structure if possible, but
sometimes there are some non-standard things that you have to do for your
hardware.  I am currently in a bind for adding some quirky features that are
required by peripherals who want to use the QCOM DMA devices.

If the context field in prep_slave_sg and prep_dma_cyclic was exposed to
everyone, that would allow an easy way to pass in hardware specific
configuration without bastardizing the slave_config.  I noticed that rapidio is
the only consumer of that field and that they have their own prep function.

If we are not going to allow people to do their own slave_config when they need
to, then we need to remove the comments from the include file and expose the
context to the dmaengine_prep_slave_sg and dmaengine_prep_dma_cyclic.
Lars-Peter Clausen Feb. 1, 2014, 6:23 p.m. UTC | #25
On 01/31/2014 06:44 PM, Andy Gross wrote:
> On Fri, Jan 24, 2014 at 02:24:27PM +0100, Lars-Peter Clausen wrote:
>> On 01/24/2014 12:16 PM, Srikanth Thokala wrote:
>>> Hi Lars,
>>>
>>> On Thu, Jan 23, 2014 at 4:55 PM, Lars-Peter Clausen <lars@metafoo.de> wrote:
>>>> On 01/22/2014 05:52 PM, Srikanth Thokala wrote:
>>>> [...]
>>>>> +/**
>>>>> + * xilinx_vdma_device_control - Configure DMA channel of the device
>>>>> + * @dchan: DMA Channel pointer
>>>>> + * @cmd: DMA control command
>>>>> + * @arg: Channel configuration
>>>>> + *
>>>>> + * Return: '0' on success and failure value on error
>>>>> + */
>>>>> +static int xilinx_vdma_device_control(struct dma_chan *dchan,
>>>>> +                                   enum dma_ctrl_cmd cmd, unsigned long arg)
>>>>> +{
>>>>> +     struct xilinx_vdma_chan *chan = to_xilinx_chan(dchan);
>>>>> +
>>>>> +     switch (cmd) {
>>>>> +     case DMA_TERMINATE_ALL:
>>>>> +             xilinx_vdma_terminate_all(chan);
>>>>> +             return 0;
>>>>> +     case DMA_SLAVE_CONFIG:
>>>>> +             return xilinx_vdma_slave_config(chan,
>>>>> +                                     (struct xilinx_vdma_config *)arg);
>>>>
>>>> You really shouldn't be overloading the generic API with your own semantics.
>>>> DMA_SLAVE_CONFIG should take a dma_slave_config and nothing else.
>>>
>>> Ok.  The driver needs few additional configuration from the slave
>>> device like Vertical
>>> Size, Horizontal Size,  Stride etc., for the DMA transfers, in that case do you
>>> suggest me to define a separate dma_ctrl_cmd like the one FSLDMA_EXTERNAL_START
>>> defined for Freescale drivers?
>>
>> In my opinion it is not a good idea to have driver implement a generic API,
>> but at the same time let the driver have custom semantics for those API
>> calls. It's a bit like having a gpio driver that expects 23 and 42 as the
>> values passed to gpio_set_value instead of 0 and 1. It completely defeats
>> the purpose of a generic API, namely that you are able to write generic code
>> that makes use of the API without having to know about which implementation
>> API it is talking to. The dmaengine framework provides the
>> dmaengine_prep_interleaved_dma() function to setup two dimensional
>> transfers, e.g. take a look at sirf-dma.c or imx-dma.c.
>>
>
> The comments in the include/linux/dmaengine.h state that if you have
> non-generic, non-fixed configuration then you can just create your own
> structure and embed the dma_slave_config.  Using the container_of you can get
> back your structure.

We should probably revise that, since it is not going to work that well.

>
> I agree that we should always use the generic structure if possible, but
> sometimes there are some non-standard things that you have to do for your
> hardware.  I am currently in a bind for adding some quirky features that are
> required by peripherals who want to use the QCOM DMA devices.

Well there are two types of extensions to the API. The first type changes the 
semantics of the API so it is no longer possible to use the API without knowing 
about the extension. This is in my opinion a complete no-go since goes against 
the very idea of a common API. If you implement the common API with custom 
semantics you have a custom API. It's just better hidden since you use the same 
function names. My opinion on this is if you want/need a custom API make it a 
custom API with custom function names. This on one hand avoids confusion about 
the behavior and on the other hand reduces the maintenance burden for the 
common API (e.g. if somebody makes changes to the common API they don't have to 
bother to update your driver and don't have to try to understand the custom 
semantics). The other kind of extensions are those that add additional 
functionality on top of the common API, while keeping the normal semantics for 
the common API. Which means a user that does not know about the extensions is 
still able to function. A user that knows about the extension can make use of 
the additional features.

That said, everybody always thinks their hardware is special and requires 
special extensions. Usually this is not the case, there will always sooner or 
later somebody else who needs the same extensions. The dmaengine API is not set 
in stone, so if you think something is missing to properly to support your 
hardware it is worth investigating if it makes sense to add the missing parts 
to the common API. As I said before the whole point of the exercise of having a 
common API is that we want to abstract away (hardware) implementation specific 
details. This allows the upper layers to have platform independent common code 
to take care of setting up the DMA transfers. E.g. in ALSA subsystem we went 
from 10+ custom implementations of a PCM driver build on top of dmaengine to 1 
generic implementation that is shared between platforms. All those custom PCM 
drivers had hardcoded assumptions about the behavior and features of the 
underlying dmaengine driver. To be able to have one generic PCM driver it was 
necessarily to extend the dmaengine API to be able to expose these differences 
in features and behavior. So as I said the API is not set in stone if it is 
necessary to extend or modify it to support something properly do it. Other 
subsystems also want to go the direction of having more shared code that makes 
use of the dmaengine API at the subsytem level rather than having every driver 
basically implement the same stuff (with slight variations) over and over 
again. Having custom extensions in your dmaengine will not make it possible to 
write a generic user.

>
> If the context field in prep_slave_sg and prep_dma_cyclic was exposed to
> everyone, that would allow an easy way to pass in hardware specific
> configuration without bastardizing the slave_config.  I noticed that rapidio is
> the only consumer of that field and that they have their own prep function.
>
> If we are not going to allow people to do their own slave_config when they need
> to, then we need to remove the comments from the include file and expose the
> context to the dmaengine_prep_slave_sg and dmaengine_prep_dma_cyclic.

I don't think the way the rapidio stuff is handled is good for the reasons 
stated above. It uses the same names, but has different semantics. A user of 
the dmaengine interface that does not know that the underlying dmaengine driver 
expects rapidio semantics does not work.

- Lars

--
To unsubscribe from this list: send the line "unsubscribe dmaengine" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Vinod Koul Feb. 4, 2014, 5:28 a.m. UTC | #26
On Fri, Jan 31, 2014 at 12:22:52PM +0530, Srikanth Thokala wrote:
> >>> >> [...]
> >>> >>> +/**
> >>> >>> + * xilinx_vdma_device_control - Configure DMA channel of the device
> >>> >>> + * @dchan: DMA Channel pointer
> >>> >>> + * @cmd: DMA control command
> >>> >>> + * @arg: Channel configuration
> >>> >>> + *
> >>> >>> + * Return: '0' on success and failure value on error
> >>> >>> + */
> >>> >>> +static int xilinx_vdma_device_control(struct dma_chan *dchan,
> >>> >>> +                                   enum dma_ctrl_cmd cmd, unsigned long arg)
> >>> >>> +{
> >>> >>> +     struct xilinx_vdma_chan *chan = to_xilinx_chan(dchan);
> >>> >>> +
> >>> >>> +     switch (cmd) {
> >>> >>> +     case DMA_TERMINATE_ALL:
> >>> >>> +             xilinx_vdma_terminate_all(chan);
> >>> >>> +             return 0;
> >>> >>> +     case DMA_SLAVE_CONFIG:
> >>> >>> +             return xilinx_vdma_slave_config(chan,
> >>> >>> +                                     (struct xilinx_vdma_config *)arg);
> >>> >>
> >>> >> You really shouldn't be overloading the generic API with your own semantics.
> >>> >> DMA_SLAVE_CONFIG should take a dma_slave_config and nothing else.
> >>> >
> >>> > Ok.  The driver needs few additional configuration from the slave
> >>> > device like Vertical
> >>> > Size, Horizontal Size,  Stride etc., for the DMA transfers, in that case do you
> >>> > suggest me to define a separate dma_ctrl_cmd like the one FSLDMA_EXTERNAL_START
> >>> > defined for Freescale drivers?
> >>>
> >>> In my opinion it is not a good idea to have driver implement a generic API,
> >>> but at the same time let the driver have custom semantics for those API
> >>> calls. It's a bit like having a gpio driver that expects 23 and 42 as the
> >>> values passed to gpio_set_value instead of 0 and 1. It completely defeats
> >>> the purpose of a generic API, namely that you are able to write generic code
> >>> that makes use of the API without having to know about which implementation
> >>> API it is talking to. The dmaengine framework provides the
> >>> dmaengine_prep_interleaved_dma() function to setup two dimensional
> >>> transfers, e.g. take a look at sirf-dma.c or imx-dma.c.
> >>
> >> The question here i think would be waht this device supports? Is the hardware
> >> capable of doing interleaved transfers, then would make sense.
> >>
> >> While we do try to get users use dma_slave_config, but there will always be
> >> someone who have specfic params. If we can generalize then we might want to add
> >> to the dma_slave_config as well
> >
> > There are many configuration parameters which are specific to IP and I
> > would like to
> > give an overview of some of parameteres here:
> >
> > 1) Park Mode ('cfg->park'): In Park mode, engine will park on frame
> > referenced by
> >     'cfg->park_frm', so user will have control on each frame in this mode.
> >
> > 2) Interrupt Coalesce ('cfg->coalesce'):  Used for setting interrupt
> > threshold. This value
> >    determines the number of frame buffers to process. To use this feature,
> >    'cfg->frm_cnt_en' should be set.
> >
> > 3) Frame Synchronization Source ('cfg->ext_fsync'):  Can be an
> > external/internal frame
> >     synchronization source. Used to synchronize one channel (MM2S/S2MM) with
> >     another (S2MM/MM2S) channel.
> >
> > 4) Genlock Synchronization ('cfg->genlock'): Used to avoid mismatch rate between
> >     master and slave.  In master mode (cfg->master), frames are not dropped and
> >     slave can drop frames to adjust to master frame rate.
> >
> > And in future, this Engine being a soft IP, we could expect some more additional
> > parameters.  Isn't a good idea to have a private member in dma_slave_config for
> > sharing additional configuration between slave device and dma engine? Or a new
> > dma_ctrl_cmd like FSLDMA_EXTERNAL_START?

The idea of a generic API is that we can use it for most of the controllers. Even
if you are planning to support a family of controllers

ATM, lets not discuss the possiblity of private member and try to exhanust all
possible options. Worst case you can embed the dma_slave_config in
xilinx_dma_slave_config and retrieve it in dmac driver
Srikanth Thokala Feb. 4, 2014, 10:35 a.m. UTC | #27
On Tue, Feb 4, 2014 at 10:58 AM, Vinod Koul <vinod.koul@intel.com> wrote:
> On Fri, Jan 31, 2014 at 12:22:52PM +0530, Srikanth Thokala wrote:
>> >>> >> [...]
>> >>> >>> +/**
>> >>> >>> + * xilinx_vdma_device_control - Configure DMA channel of the device
>> >>> >>> + * @dchan: DMA Channel pointer
>> >>> >>> + * @cmd: DMA control command
>> >>> >>> + * @arg: Channel configuration
>> >>> >>> + *
>> >>> >>> + * Return: '0' on success and failure value on error
>> >>> >>> + */
>> >>> >>> +static int xilinx_vdma_device_control(struct dma_chan *dchan,
>> >>> >>> +                                   enum dma_ctrl_cmd cmd, unsigned long arg)
>> >>> >>> +{
>> >>> >>> +     struct xilinx_vdma_chan *chan = to_xilinx_chan(dchan);
>> >>> >>> +
>> >>> >>> +     switch (cmd) {
>> >>> >>> +     case DMA_TERMINATE_ALL:
>> >>> >>> +             xilinx_vdma_terminate_all(chan);
>> >>> >>> +             return 0;
>> >>> >>> +     case DMA_SLAVE_CONFIG:
>> >>> >>> +             return xilinx_vdma_slave_config(chan,
>> >>> >>> +                                     (struct xilinx_vdma_config *)arg);
>> >>> >>
>> >>> >> You really shouldn't be overloading the generic API with your own semantics.
>> >>> >> DMA_SLAVE_CONFIG should take a dma_slave_config and nothing else.
>> >>> >
>> >>> > Ok.  The driver needs few additional configuration from the slave
>> >>> > device like Vertical
>> >>> > Size, Horizontal Size,  Stride etc., for the DMA transfers, in that case do you
>> >>> > suggest me to define a separate dma_ctrl_cmd like the one FSLDMA_EXTERNAL_START
>> >>> > defined for Freescale drivers?
>> >>>
>> >>> In my opinion it is not a good idea to have driver implement a generic API,
>> >>> but at the same time let the driver have custom semantics for those API
>> >>> calls. It's a bit like having a gpio driver that expects 23 and 42 as the
>> >>> values passed to gpio_set_value instead of 0 and 1. It completely defeats
>> >>> the purpose of a generic API, namely that you are able to write generic code
>> >>> that makes use of the API without having to know about which implementation
>> >>> API it is talking to. The dmaengine framework provides the
>> >>> dmaengine_prep_interleaved_dma() function to setup two dimensional
>> >>> transfers, e.g. take a look at sirf-dma.c or imx-dma.c.
>> >>
>> >> The question here i think would be waht this device supports? Is the hardware
>> >> capable of doing interleaved transfers, then would make sense.
>> >>
>> >> While we do try to get users use dma_slave_config, but there will always be
>> >> someone who have specfic params. If we can generalize then we might want to add
>> >> to the dma_slave_config as well
>> >
>> > There are many configuration parameters which are specific to IP and I
>> > would like to
>> > give an overview of some of parameteres here:
>> >
>> > 1) Park Mode ('cfg->park'): In Park mode, engine will park on frame
>> > referenced by
>> >     'cfg->park_frm', so user will have control on each frame in this mode.
>> >
>> > 2) Interrupt Coalesce ('cfg->coalesce'):  Used for setting interrupt
>> > threshold. This value
>> >    determines the number of frame buffers to process. To use this feature,
>> >    'cfg->frm_cnt_en' should be set.
>> >
>> > 3) Frame Synchronization Source ('cfg->ext_fsync'):  Can be an
>> > external/internal frame
>> >     synchronization source. Used to synchronize one channel (MM2S/S2MM) with
>> >     another (S2MM/MM2S) channel.
>> >
>> > 4) Genlock Synchronization ('cfg->genlock'): Used to avoid mismatch rate between
>> >     master and slave.  In master mode (cfg->master), frames are not dropped and
>> >     slave can drop frames to adjust to master frame rate.
>> >
>> > And in future, this Engine being a soft IP, we could expect some more additional
>> > parameters.  Isn't a good idea to have a private member in dma_slave_config for
>> > sharing additional configuration between slave device and dma engine? Or a new
>> > dma_ctrl_cmd like FSLDMA_EXTERNAL_START?
>
> The idea of a generic API is that we can use it for most of the controllers. Even
> if you are planning to support a family of controllers
>
> ATM, lets not discuss the possiblity of private member and try to exhanust all
> possible options. Worst case you can embed the dma_slave_config in
> xilinx_dma_slave_config and retrieve it in dmac driver

Ok.

Srikanth

>
> --
> ~Vinod
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/
--
To unsubscribe from this list: send the line "unsubscribe dmaengine" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Srikanth Thokala Feb. 5, 2014, 4:25 p.m. UTC | #28
On Fri, Jan 31, 2014 at 12:21 PM, Srikanth Thokala <sthokal@xilinx.com> wrote:
> Hi Vinod,
>
> On Tue, Jan 28, 2014 at 8:43 AM, Vinod Koul <vinod.koul@intel.com> wrote:
>> On Mon, Jan 27, 2014 at 06:42:36PM +0530, Srikanth Thokala wrote:
>>> Hi Lars/Vinod,
>>> >> The question here i think would be waht this device supports? Is the hardware
>>> >> capable of doing interleaved transfers, then would make sense.
>>> >
>>> > The hardware does 2D transfers. The parameters for a transfer are height,
>>> > width and stride. That's only a subset of what interleaved transfers can be
>>> > (xt->num_frames must be one for 2d transfers). But if I remember correctly
>>> > there has been some discussion on this in the past and the result of that
>>> > discussion was that using interleaved transfers for 2D transfers is
>>> > preferred over adding a custom API for 2D transfers.
>>>
>>> I went through the prep_interleaved_dma API and I see only one descriptor
>>> is prepared per API call (i.e. per frame).  As our IP supports upto 16 frame
>>> buffers (can be more in future), isn't it less efficient compared to the
>>> prep_slave_sg where we get a single sg list and can prepare all the descriptors
>>> (of non-contiguous buffers) in one go?  Correct me, if am wrong and let me
>>> know your opinions.
>> Well the descriptor maybe one, but that can represent multiple frames, for
>> example 16 as in your case. Can you read up the documentation of how multiple
>> frames are passed. Pls see include/linux/dmaengine.h
>>
>> /**
>>  * Interleaved Transfer Request
>>  * ----------------------------
>>  * A chunk is collection of contiguous bytes to be transfered.
>>  * The gap(in bytes) between two chunks is called inter-chunk-gap(ICG).
>>  * ICGs may or maynot change between chunks.
>>  * A FRAME is the smallest series of contiguous {chunk,icg} pairs,
>>  *  that when repeated an integral number of times, specifies the transfer.
>>  * A transfer template is specification of a Frame, the number of times
>>  *  it is to be repeated and other per-transfer attributes.
>>  *
>>  * Practically, a client driver would have ready a template for each
>>  *  type of transfer it is going to need during its lifetime and
>>  *  set only 'src_start' and 'dst_start' before submitting the requests.
>>  *
>>  *
>>  *  |      Frame-1        |       Frame-2       | ~ |       Frame-'numf'  |
>>  *  |====....==.===...=...|====....==.===...=...| ~ |====....==.===...=...|
>>  *
>>  *    ==  Chunk size
>>  *    ... ICG
>>  */
>
> Yes, it can handle multiple frames specified by 'numf' each of size
> 'frame_size * sgl[0].size'.
> But, I see it only works if all the frames' memory is contiguous and
> in this case we
> can just increment 'src_start' by the total frame size 'numf' number
> of times to fill in
> for each HW descriptor (each frame is one HW descriptor).  So, there
> is no issue when the
> memory is contiguous.  If the frames are non contiguous, we have to
> call this API for each
> frame (hence for each descriptor), as the src_start for each frame is
> different.  Is it correct?
>
> FYI: This hardware has an inbuilt Scatter-Gather engine.
>

Ping?


> Srikanth
>
>>
>> --
>> ~Vinod
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>> Please read the FAQ at  http://www.tux.org/lkml/
--
To unsubscribe from this list: send the line "unsubscribe dmaengine" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Lars-Peter Clausen Feb. 5, 2014, 4:30 p.m. UTC | #29
On 02/05/2014 05:25 PM, Srikanth Thokala wrote:
> On Fri, Jan 31, 2014 at 12:21 PM, Srikanth Thokala <sthokal@xilinx.com> wrote:
>> Hi Vinod,
>>
>> On Tue, Jan 28, 2014 at 8:43 AM, Vinod Koul <vinod.koul@intel.com> wrote:
>>> On Mon, Jan 27, 2014 at 06:42:36PM +0530, Srikanth Thokala wrote:
>>>> Hi Lars/Vinod,
>>>>>> The question here i think would be waht this device supports? Is the hardware
>>>>>> capable of doing interleaved transfers, then would make sense.
>>>>>
>>>>> The hardware does 2D transfers. The parameters for a transfer are height,
>>>>> width and stride. That's only a subset of what interleaved transfers can be
>>>>> (xt->num_frames must be one for 2d transfers). But if I remember correctly
>>>>> there has been some discussion on this in the past and the result of that
>>>>> discussion was that using interleaved transfers for 2D transfers is
>>>>> preferred over adding a custom API for 2D transfers.
>>>>
>>>> I went through the prep_interleaved_dma API and I see only one descriptor
>>>> is prepared per API call (i.e. per frame).  As our IP supports upto 16 frame
>>>> buffers (can be more in future), isn't it less efficient compared to the
>>>> prep_slave_sg where we get a single sg list and can prepare all the descriptors
>>>> (of non-contiguous buffers) in one go?  Correct me, if am wrong and let me
>>>> know your opinions.
>>> Well the descriptor maybe one, but that can represent multiple frames, for
>>> example 16 as in your case. Can you read up the documentation of how multiple
>>> frames are passed. Pls see include/linux/dmaengine.h
>>>
>>> /**
>>>   * Interleaved Transfer Request
>>>   * ----------------------------
>>>   * A chunk is collection of contiguous bytes to be transfered.
>>>   * The gap(in bytes) between two chunks is called inter-chunk-gap(ICG).
>>>   * ICGs may or maynot change between chunks.
>>>   * A FRAME is the smallest series of contiguous {chunk,icg} pairs,
>>>   *  that when repeated an integral number of times, specifies the transfer.
>>>   * A transfer template is specification of a Frame, the number of times
>>>   *  it is to be repeated and other per-transfer attributes.
>>>   *
>>>   * Practically, a client driver would have ready a template for each
>>>   *  type of transfer it is going to need during its lifetime and
>>>   *  set only 'src_start' and 'dst_start' before submitting the requests.
>>>   *
>>>   *
>>>   *  |      Frame-1        |       Frame-2       | ~ |       Frame-'numf'  |
>>>   *  |====....==.===...=...|====....==.===...=...| ~ |====....==.===...=...|
>>>   *
>>>   *    ==  Chunk size
>>>   *    ... ICG
>>>   */
>>
>> Yes, it can handle multiple frames specified by 'numf' each of size
>> 'frame_size * sgl[0].size'.
>> But, I see it only works if all the frames' memory is contiguous and
>> in this case we
>> can just increment 'src_start' by the total frame size 'numf' number
>> of times to fill in
>> for each HW descriptor (each frame is one HW descriptor).  So, there
>> is no issue when the
>> memory is contiguous.  If the frames are non contiguous, we have to
>> call this API for each
>> frame (hence for each descriptor), as the src_start for each frame is
>> different.  Is it correct?
>>
>> FYI: This hardware has an inbuilt Scatter-Gather engine.
>>
>
> Ping?

If you want to submit multiple frames at once I think you should look at how 
the current dmaengine API can be extended to allow that. And also provide an 
explanation on how this is superior over submitting them one by one.

- Lars

--
To unsubscribe from this list: send the line "unsubscribe dmaengine" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Srikanth Thokala Feb. 6, 2014, 1:34 p.m. UTC | #30
On Wed, Feb 5, 2014 at 10:00 PM, Lars-Peter Clausen <lars@metafoo.de> wrote:
> On 02/05/2014 05:25 PM, Srikanth Thokala wrote:
>>
>> On Fri, Jan 31, 2014 at 12:21 PM, Srikanth Thokala <sthokal@xilinx.com>
>> wrote:
>>>
>>> Hi Vinod,
>>>
>>> On Tue, Jan 28, 2014 at 8:43 AM, Vinod Koul <vinod.koul@intel.com> wrote:
>>>>
>>>> On Mon, Jan 27, 2014 at 06:42:36PM +0530, Srikanth Thokala wrote:
>>>>>
>>>>> Hi Lars/Vinod,
>>>>>>>
>>>>>>> The question here i think would be waht this device supports? Is the
>>>>>>> hardware
>>>>>>> capable of doing interleaved transfers, then would make sense.
>>>>>>
>>>>>>
>>>>>> The hardware does 2D transfers. The parameters for a transfer are
>>>>>> height,
>>>>>> width and stride. That's only a subset of what interleaved transfers
>>>>>> can be
>>>>>> (xt->num_frames must be one for 2d transfers). But if I remember
>>>>>> correctly
>>>>>> there has been some discussion on this in the past and the result of
>>>>>> that
>>>>>> discussion was that using interleaved transfers for 2D transfers is
>>>>>> preferred over adding a custom API for 2D transfers.
>>>>>
>>>>>
>>>>> I went through the prep_interleaved_dma API and I see only one
>>>>> descriptor
>>>>> is prepared per API call (i.e. per frame).  As our IP supports upto 16
>>>>> frame
>>>>> buffers (can be more in future), isn't it less efficient compared to
>>>>> the
>>>>> prep_slave_sg where we get a single sg list and can prepare all the
>>>>> descriptors
>>>>> (of non-contiguous buffers) in one go?  Correct me, if am wrong and let
>>>>> me
>>>>> know your opinions.
>>>>
>>>> Well the descriptor maybe one, but that can represent multiple frames,
>>>> for
>>>> example 16 as in your case. Can you read up the documentation of how
>>>> multiple
>>>> frames are passed. Pls see include/linux/dmaengine.h
>>>>
>>>> /**
>>>>   * Interleaved Transfer Request
>>>>   * ----------------------------
>>>>   * A chunk is collection of contiguous bytes to be transfered.
>>>>   * The gap(in bytes) between two chunks is called inter-chunk-gap(ICG).
>>>>   * ICGs may or maynot change between chunks.
>>>>   * A FRAME is the smallest series of contiguous {chunk,icg} pairs,
>>>>   *  that when repeated an integral number of times, specifies the
>>>> transfer.
>>>>   * A transfer template is specification of a Frame, the number of times
>>>>   *  it is to be repeated and other per-transfer attributes.
>>>>   *
>>>>   * Practically, a client driver would have ready a template for each
>>>>   *  type of transfer it is going to need during its lifetime and
>>>>   *  set only 'src_start' and 'dst_start' before submitting the
>>>> requests.
>>>>   *
>>>>   *
>>>>   *  |      Frame-1        |       Frame-2       | ~ |
>>>> Frame-'numf'  |
>>>>   *  |====....==.===...=...|====....==.===...=...| ~
>>>> |====....==.===...=...|
>>>>   *
>>>>   *    ==  Chunk size
>>>>   *    ... ICG
>>>>   */
>>>
>>>
>>> Yes, it can handle multiple frames specified by 'numf' each of size
>>> 'frame_size * sgl[0].size'.
>>> But, I see it only works if all the frames' memory is contiguous and
>>> in this case we
>>> can just increment 'src_start' by the total frame size 'numf' number
>>> of times to fill in
>>> for each HW descriptor (each frame is one HW descriptor).  So, there
>>> is no issue when the
>>> memory is contiguous.  If the frames are non contiguous, we have to
>>> call this API for each
>>> frame (hence for each descriptor), as the src_start for each frame is
>>> different.  Is it correct?
>>>
>>> FYI: This hardware has an inbuilt Scatter-Gather engine.
>>>
>>
>> Ping?
>
>
> If you want to submit multiple frames at once I think you should look at how
> the current dmaengine API can be extended to allow that. And also provide an
> explanation on how this is superior over submitting them one by one.


Sure.  I would start with explaning the current implementation of this driver.

Using prep_slave_sg(), we can define multiple segments in a
async_tx_descriptor where each frame is defined by a segment (a sg
list entry).  So, the slave device could DMA the data (of multiple
frames) with a descriptor by calling tx_submit in a transaction i.e.,

prep_slave_sg(16)  -> tx_submit(1) -> interrupt  (16 frames)

Using interleaved_dma(), we could not divide into segments when we
have scattered memory (for the reasons mentioned in above thread).
This implies we are restricting the slave device to process frame by
frame i.e.,

interleaved_dma(1) -> tx_submit(1) -> interrupt -> interleaved_dma(2)
-> tx_submit (2) -> interrupt -> ........ tx_submit(16) -> interrupt

This implementation makes the hardware to wait until the next frame is
submitted.

To overcome this, I feel it would be a good option if we could extend
interleaved_dma template to modify src_start/dest_start to be a
pointer to an array of addresses.  Here, number of addresses will be
defined by numf. The other option would be to include scatterlist in
the interleaved template. This way we can handle scattered memory
using this API.

Srikanth

>
> - Lars
>
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/
--
To unsubscribe from this list: send the line "unsubscribe dmaengine" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Lars-Peter Clausen Feb. 6, 2014, 3:53 p.m. UTC | #31
On 02/06/2014 02:34 PM, Srikanth Thokala wrote:
> On Wed, Feb 5, 2014 at 10:00 PM, Lars-Peter Clausen <lars@metafoo.de> wrote:
>> On 02/05/2014 05:25 PM, Srikanth Thokala wrote:
>>>
>>> On Fri, Jan 31, 2014 at 12:21 PM, Srikanth Thokala <sthokal@xilinx.com>
>>> wrote:
>>>>
>>>> Hi Vinod,
>>>>
>>>> On Tue, Jan 28, 2014 at 8:43 AM, Vinod Koul <vinod.koul@intel.com> wrote:
>>>>>
>>>>> On Mon, Jan 27, 2014 at 06:42:36PM +0530, Srikanth Thokala wrote:
>>>>>>
>>>>>> Hi Lars/Vinod,
>>>>>>>>
>>>>>>>> The question here i think would be waht this device supports? Is the
>>>>>>>> hardware
>>>>>>>> capable of doing interleaved transfers, then would make sense.
>>>>>>>
>>>>>>>
>>>>>>> The hardware does 2D transfers. The parameters for a transfer are
>>>>>>> height,
>>>>>>> width and stride. That's only a subset of what interleaved transfers
>>>>>>> can be
>>>>>>> (xt->num_frames must be one for 2d transfers). But if I remember
>>>>>>> correctly
>>>>>>> there has been some discussion on this in the past and the result of
>>>>>>> that
>>>>>>> discussion was that using interleaved transfers for 2D transfers is
>>>>>>> preferred over adding a custom API for 2D transfers.
>>>>>>
>>>>>>
>>>>>> I went through the prep_interleaved_dma API and I see only one
>>>>>> descriptor
>>>>>> is prepared per API call (i.e. per frame).  As our IP supports upto 16
>>>>>> frame
>>>>>> buffers (can be more in future), isn't it less efficient compared to
>>>>>> the
>>>>>> prep_slave_sg where we get a single sg list and can prepare all the
>>>>>> descriptors
>>>>>> (of non-contiguous buffers) in one go?  Correct me, if am wrong and let
>>>>>> me
>>>>>> know your opinions.
>>>>>
>>>>> Well the descriptor maybe one, but that can represent multiple frames,
>>>>> for
>>>>> example 16 as in your case. Can you read up the documentation of how
>>>>> multiple
>>>>> frames are passed. Pls see include/linux/dmaengine.h
>>>>>
>>>>> /**
>>>>>    * Interleaved Transfer Request
>>>>>    * ----------------------------
>>>>>    * A chunk is collection of contiguous bytes to be transfered.
>>>>>    * The gap(in bytes) between two chunks is called inter-chunk-gap(ICG).
>>>>>    * ICGs may or maynot change between chunks.
>>>>>    * A FRAME is the smallest series of contiguous {chunk,icg} pairs,
>>>>>    *  that when repeated an integral number of times, specifies the
>>>>> transfer.
>>>>>    * A transfer template is specification of a Frame, the number of times
>>>>>    *  it is to be repeated and other per-transfer attributes.
>>>>>    *
>>>>>    * Practically, a client driver would have ready a template for each
>>>>>    *  type of transfer it is going to need during its lifetime and
>>>>>    *  set only 'src_start' and 'dst_start' before submitting the
>>>>> requests.
>>>>>    *
>>>>>    *
>>>>>    *  |      Frame-1        |       Frame-2       | ~ |
>>>>> Frame-'numf'  |
>>>>>    *  |====....==.===...=...|====....==.===...=...| ~
>>>>> |====....==.===...=...|
>>>>>    *
>>>>>    *    ==  Chunk size
>>>>>    *    ... ICG
>>>>>    */
>>>>
>>>>
>>>> Yes, it can handle multiple frames specified by 'numf' each of size
>>>> 'frame_size * sgl[0].size'.
>>>> But, I see it only works if all the frames' memory is contiguous and
>>>> in this case we
>>>> can just increment 'src_start' by the total frame size 'numf' number
>>>> of times to fill in
>>>> for each HW descriptor (each frame is one HW descriptor).  So, there
>>>> is no issue when the
>>>> memory is contiguous.  If the frames are non contiguous, we have to
>>>> call this API for each
>>>> frame (hence for each descriptor), as the src_start for each frame is
>>>> different.  Is it correct?
>>>>
>>>> FYI: This hardware has an inbuilt Scatter-Gather engine.
>>>>
>>>
>>> Ping?
>>
>>
>> If you want to submit multiple frames at once I think you should look at how
>> the current dmaengine API can be extended to allow that. And also provide an
>> explanation on how this is superior over submitting them one by one.
>
>
> Sure.  I would start with explaning the current implementation of this driver.
>
> Using prep_slave_sg(), we can define multiple segments in a
> async_tx_descriptor where each frame is defined by a segment (a sg
> list entry).  So, the slave device could DMA the data (of multiple
> frames) with a descriptor by calling tx_submit in a transaction i.e.,
>
> prep_slave_sg(16)  -> tx_submit(1) -> interrupt  (16 frames)
>
> Using interleaved_dma(), we could not divide into segments when we
> have scattered memory (for the reasons mentioned in above thread).
> This implies we are restricting the slave device to process frame by
> frame i.e.,
>
> interleaved_dma(1) -> tx_submit(1) -> interrupt -> interleaved_dma(2)
> -> tx_submit (2) -> interrupt -> ........ tx_submit(16) -> interrupt
>

The API allows you to create and submit multiple interleaved descriptors 
before you have to issue them.

interleaved_dma(1) -> tx_submit(1) -> interleaved_dma(2) -> tx_submit(2) -> 
... -> issue_pending() -> interrupt

> This implementation makes the hardware to wait until the next frame is
> submitted.
>
> To overcome this, I feel it would be a good option if we could extend
> interleaved_dma template to modify src_start/dest_start to be a
> pointer to an array of addresses.  Here, number of addresses will be
> defined by numf. The other option would be to include scatterlist in
> the interleaved template. This way we can handle scattered memory
> using this API.

Each "frame" in a interleaved transfer describes a single line in your video 
frame (size = width, icg = stride). numf is the number of lines per video 
frame. So the suggested change does not make that much sense. If you want to 
submit multiple video frames in one batch the best option is in my opinion 
to allow to pass an array of dma_interleaved_template structs instead of a 
single one.

- Lars

>
> Srikanth
>
>>
>> - Lars
>>
>>
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>> Please read the FAQ at  http://www.tux.org/lkml/
> --
> To unsubscribe from this list: send the line "unsubscribe dmaengine" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>

--
To unsubscribe from this list: send the line "unsubscribe dmaengine" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Srikanth Thokala Feb. 10, 2014, 12:51 p.m. UTC | #32
On Thu, Feb 6, 2014 at 9:23 PM, Lars-Peter Clausen <lars@metafoo.de> wrote:
> On 02/06/2014 02:34 PM, Srikanth Thokala wrote:
>>
>> On Wed, Feb 5, 2014 at 10:00 PM, Lars-Peter Clausen <lars@metafoo.de>
>> wrote:
>>>
>>> On 02/05/2014 05:25 PM, Srikanth Thokala wrote:
>>>>
>>>>
>>>> On Fri, Jan 31, 2014 at 12:21 PM, Srikanth Thokala <sthokal@xilinx.com>
>>>> wrote:
>>>>>
>>>>>
>>>>> Hi Vinod,
>>>>>
>>>>> On Tue, Jan 28, 2014 at 8:43 AM, Vinod Koul <vinod.koul@intel.com>
>>>>> wrote:
>>>>>>
>>>>>>
>>>>>> On Mon, Jan 27, 2014 at 06:42:36PM +0530, Srikanth Thokala wrote:
>>>>>>>
>>>>>>>
>>>>>>> Hi Lars/Vinod,
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> The question here i think would be waht this device supports? Is
>>>>>>>>> the
>>>>>>>>> hardware
>>>>>>>>> capable of doing interleaved transfers, then would make sense.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> The hardware does 2D transfers. The parameters for a transfer are
>>>>>>>> height,
>>>>>>>> width and stride. That's only a subset of what interleaved transfers
>>>>>>>> can be
>>>>>>>> (xt->num_frames must be one for 2d transfers). But if I remember
>>>>>>>> correctly
>>>>>>>> there has been some discussion on this in the past and the result of
>>>>>>>> that
>>>>>>>> discussion was that using interleaved transfers for 2D transfers is
>>>>>>>> preferred over adding a custom API for 2D transfers.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> I went through the prep_interleaved_dma API and I see only one
>>>>>>> descriptor
>>>>>>> is prepared per API call (i.e. per frame).  As our IP supports upto
>>>>>>> 16
>>>>>>> frame
>>>>>>> buffers (can be more in future), isn't it less efficient compared to
>>>>>>> the
>>>>>>> prep_slave_sg where we get a single sg list and can prepare all the
>>>>>>> descriptors
>>>>>>> (of non-contiguous buffers) in one go?  Correct me, if am wrong and
>>>>>>> let
>>>>>>> me
>>>>>>> know your opinions.
>>>>>>
>>>>>>
>>>>>> Well the descriptor maybe one, but that can represent multiple frames,
>>>>>> for
>>>>>> example 16 as in your case. Can you read up the documentation of how
>>>>>> multiple
>>>>>> frames are passed. Pls see include/linux/dmaengine.h
>>>>>>
>>>>>> /**
>>>>>>    * Interleaved Transfer Request
>>>>>>    * ----------------------------
>>>>>>    * A chunk is collection of contiguous bytes to be transfered.
>>>>>>    * The gap(in bytes) between two chunks is called
>>>>>> inter-chunk-gap(ICG).
>>>>>>    * ICGs may or maynot change between chunks.
>>>>>>    * A FRAME is the smallest series of contiguous {chunk,icg} pairs,
>>>>>>    *  that when repeated an integral number of times, specifies the
>>>>>> transfer.
>>>>>>    * A transfer template is specification of a Frame, the number of
>>>>>> times
>>>>>>    *  it is to be repeated and other per-transfer attributes.
>>>>>>    *
>>>>>>    * Practically, a client driver would have ready a template for each
>>>>>>    *  type of transfer it is going to need during its lifetime and
>>>>>>    *  set only 'src_start' and 'dst_start' before submitting the
>>>>>> requests.
>>>>>>    *
>>>>>>    *
>>>>>>    *  |      Frame-1        |       Frame-2       | ~ |
>>>>>> Frame-'numf'  |
>>>>>>    *  |====....==.===...=...|====....==.===...=...| ~
>>>>>> |====....==.===...=...|
>>>>>>    *
>>>>>>    *    ==  Chunk size
>>>>>>    *    ... ICG
>>>>>>    */
>>>>>
>>>>>
>>>>>
>>>>> Yes, it can handle multiple frames specified by 'numf' each of size
>>>>> 'frame_size * sgl[0].size'.
>>>>> But, I see it only works if all the frames' memory is contiguous and
>>>>> in this case we
>>>>> can just increment 'src_start' by the total frame size 'numf' number
>>>>> of times to fill in
>>>>> for each HW descriptor (each frame is one HW descriptor).  So, there
>>>>> is no issue when the
>>>>> memory is contiguous.  If the frames are non contiguous, we have to
>>>>> call this API for each
>>>>> frame (hence for each descriptor), as the src_start for each frame is
>>>>> different.  Is it correct?
>>>>>
>>>>> FYI: This hardware has an inbuilt Scatter-Gather engine.
>>>>>
>>>>
>>>> Ping?
>>>
>>>
>>>
>>> If you want to submit multiple frames at once I think you should look at
>>> how
>>> the current dmaengine API can be extended to allow that. And also provide
>>> an
>>> explanation on how this is superior over submitting them one by one.
>>
>>
>>
>> Sure.  I would start with explaning the current implementation of this
>> driver.
>>
>> Using prep_slave_sg(), we can define multiple segments in a
>> async_tx_descriptor where each frame is defined by a segment (a sg
>> list entry).  So, the slave device could DMA the data (of multiple
>> frames) with a descriptor by calling tx_submit in a transaction i.e.,
>>
>> prep_slave_sg(16)  -> tx_submit(1) -> interrupt  (16 frames)
>>
>> Using interleaved_dma(), we could not divide into segments when we
>> have scattered memory (for the reasons mentioned in above thread).
>> This implies we are restricting the slave device to process frame by
>> frame i.e.,
>>
>> interleaved_dma(1) -> tx_submit(1) -> interrupt -> interleaved_dma(2)
>> -> tx_submit (2) -> interrupt -> ........ tx_submit(16) -> interrupt
>>
>
> The API allows you to create and submit multiple interleaved descriptors
> before you have to issue them.
>
> interleaved_dma(1) -> tx_submit(1) -> interleaved_dma(2) -> tx_submit(2) ->
> ... -> issue_pending() -> interrupt
>
>
>> This implementation makes the hardware to wait until the next frame is
>> submitted.
>>
>> To overcome this, I feel it would be a good option if we could extend
>> interleaved_dma template to modify src_start/dest_start to be a
>> pointer to an array of addresses.  Here, number of addresses will be
>> defined by numf. The other option would be to include scatterlist in
>> the interleaved template. This way we can handle scattered memory
>> using this API.
>
>
> Each "frame" in a interleaved transfer describes a single line in your video
> frame (size = width, icg = stride). numf is the number of lines per video
> frame. So the suggested change does not make that much sense. If you want to
> submit multiple video frames in one batch the best option is in my opinion
> to allow to pass an array of dma_interleaved_template structs instead of a
> single one.

Ok.  I agree with you.  Will fix it in v3.

Srikanth

>
> - Lars
>
>>
>> Srikanth
>>
>>>
>>> - Lars
>>>
>>>
>>> --
>>> To unsubscribe from this list: send the line "unsubscribe linux-kernel"
>>> in
>>> the body of a message to majordomo@vger.kernel.org
>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>> Please read the FAQ at  http://www.tux.org/lkml/
>>
>> --
>> To unsubscribe from this list: send the line "unsubscribe dmaengine" in
>>
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/
--
To unsubscribe from this list: send the line "unsubscribe dmaengine" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/Documentation/devicetree/bindings/dma/xilinx/xilinx_vdma.txt b/Documentation/devicetree/bindings/dma/xilinx/xilinx_vdma.txt
new file mode 100644
index 0000000..ab8be1a
--- /dev/null
+++ b/Documentation/devicetree/bindings/dma/xilinx/xilinx_vdma.txt
@@ -0,0 +1,75 @@ 
+Xilinx AXI VDMA engine, it does transfers between memory and video devices.
+It can be configured to have one channel or two channels. If configured
+as two channels, one is to transmit to the video device and another is
+to receive from the video device.
+
+Required properties:
+- compatible: Should be "xlnx,axi-vdma-1.00.a"
+- #dma-cells: Should be <1>, see "dmas" property below
+- reg: Should contain VDMA registers location and length.
+- xlnx,num-fstores: Should be the number of framebuffers as configured in h/w.
+- dma-channel child node: Should have atleast one channel and can have upto
+	two channels per device. This node specifies the properties of each
+	DMA channel (see child node properties below).
+
+Optional properties:
+- xlnx,include-sg: Tells whether configured for Scatter-mode in
+	the hardware.
+- xlnx,flush-fsync: Tells whether which channel to Flush on Frame sync.
+	It takes following values:
+	{1}, flush both channels
+	{2}, flush mm2s channel
+	{3}, flush s2mm channel
+
+Required child node properties:
+- compatible: It should be either "xlnx,axi-vdma-mm2s-channel" or
+	"xlnx,axi-vdma-s2mm-channel".
+- interrupts: Should contain per channel VDMA interrupts.
+- xlnx,data-width: Should contain the stream data width, take values
+	{32,64...1024}.
+
+Option child node properties:
+- xlnx,include-dre: Tells whether hardware is configured for Data
+	Realignment Engine.
+- xlnx,genlock-mode: Tells whether Genlock synchronization is
+	enabled/disabled in hardware.
+
+Example:
+++++++++
+
+axi_vdma_0: axivdma@40030000 {
+	compatible = "xlnx,axi-vdma-1.00.a";
+	#dma_cells = <1>;
+	reg = < 0x40030000 0x10000 >;
+	xlnx,num-fstores = <0x8>;
+	xlnx,flush-fsync = <0x1>;
+	dma-channel@40030000 {
+		compatible = "xlnx,axi-vdma-mm2s-channel";
+		interrupts = < 0 54 4 >;
+		xlnx,datawidth = <0x40>;
+	} ;
+	dma-channel@40030030 {
+		compatible = "xlnx,axi-vdma-s2mm-channel";
+		interrupts = < 0 53 4 >;
+		xlnx,datawidth = <0x40>;
+	} ;
+} ;
+
+
+* DMA client
+
+Required properties:
+- dmas: a list of <[Video DMA device phandle] [Channel ID]> pairs,
+	where Channel ID is '0' for write/tx and '1' for read/rx
+	channel.
+- dma-names: a list of DMA channel names, one per "dmas" entry
+
+Example:
+++++++++
+
+vdmatest_0: vdmatest@0 {
+	compatible ="xlnx,axi-vdma-test-1.00.a";
+	dmas = <&axi_vdma_0 0
+		&axi_vdma_0 1>;
+	dma-names = "vdma0", "vdma1";
+} ;
diff --git a/drivers/dma/Kconfig b/drivers/dma/Kconfig
index c823daa..2a74651 100644
--- a/drivers/dma/Kconfig
+++ b/drivers/dma/Kconfig
@@ -334,6 +334,20 @@  config K3_DMA
 	  Support the DMA engine for Hisilicon K3 platform
 	  devices.
 
+config XILINX_VDMA
+	tristate "Xilinx AXI VDMA Engine"
+	depends on (ARCH_ZYNQ || MICROBLAZE)
+	select DMA_ENGINE
+	help
+	  Enable support for Xilinx AXI VDMA Soft IP.
+
+	  This engine provides high-bandwidth direct memory access
+	  between memory and AXI4-Stream video type target
+	  peripherals including peripherals which support AXI4-
+	  Stream Video Protocol.  It has two stream interfaces/
+	  channels, Memory Mapped to Stream (MM2S) and Stream to
+	  Memory Mapped (S2MM) for the data transfers.
+
 config DMA_ENGINE
 	bool
 
diff --git a/drivers/dma/Makefile b/drivers/dma/Makefile
index 0ce2da9..d84130b 100644
--- a/drivers/dma/Makefile
+++ b/drivers/dma/Makefile
@@ -42,3 +42,4 @@  obj-$(CONFIG_MMP_PDMA) += mmp_pdma.o
 obj-$(CONFIG_DMA_JZ4740) += dma-jz4740.o
 obj-$(CONFIG_TI_CPPI41) += cppi41.o
 obj-$(CONFIG_K3_DMA) += k3dma.o
+obj-y += xilinx/
diff --git a/drivers/dma/xilinx/Makefile b/drivers/dma/xilinx/Makefile
new file mode 100644
index 0000000..3c4e9f2
--- /dev/null
+++ b/drivers/dma/xilinx/Makefile
@@ -0,0 +1 @@ 
+obj-$(CONFIG_XILINX_VDMA) += xilinx_vdma.o
diff --git a/drivers/dma/xilinx/xilinx_vdma.c b/drivers/dma/xilinx/xilinx_vdma.c
new file mode 100644
index 0000000..4c0d04c
--- /dev/null
+++ b/drivers/dma/xilinx/xilinx_vdma.c
@@ -0,0 +1,1486 @@ 
+/*
+ * DMA driver for Xilinx Video DMA Engine
+ *
+ * Copyright (C) 2010-2014 Xilinx, Inc. All rights reserved.
+ *
+ * Based on the Freescale DMA driver.
+ *
+ * Description:
+ * The AXI Video Direct Memory Access (AXI VDMA) core is a soft Xilinx IP
+ * core that provides high-bandwidth direct memory access between memory
+ * and AXI4-Stream type video target peripherals. The core provides efficient
+ * two dimensional DMA operations with independent asynchronous read (S2MM)
+ * and write (MM2S) channel operation. It can be configured to have either
+ * one channel or two channels. If configured as two channels, one is to
+ * transmit to the video device (MM2S) and another is to receive from the
+ * video device (S2MM). Initialization, status, interrupt and management
+ * registers are accessed through an AXI4-Lite slave interface.
+ *
+ * This program is free software: you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation, either version 2 of the License, or
+ * (at your option) any later version.
+ */
+
+#include <linux/amba/xilinx_dma.h>
+#include <linux/bitops.h>
+#include <linux/dmapool.h>
+#include <linux/init.h>
+#include <linux/interrupt.h>
+#include <linux/io.h>
+#include <linux/module.h>
+#include <linux/of_address.h>
+#include <linux/of_dma.h>
+#include <linux/of_platform.h>
+#include <linux/of_irq.h>
+#include <linux/slab.h>
+
+/* Register/Descriptor Offsets */
+#define XILINX_VDMA_MM2S_CTRL_OFFSET		0x0000
+#define XILINX_VDMA_S2MM_CTRL_OFFSET		0x0030
+#define XILINX_VDMA_MM2S_DESC_OFFSET		0x0050
+#define XILINX_VDMA_S2MM_DESC_OFFSET		0x00a0
+
+/* Control Registers */
+#define XILINX_VDMA_REG_DMACR			0x0000
+#define XILINX_VDMA_DMACR_DELAY_MAX		0xff
+#define XILINX_VDMA_DMACR_DELAY_SHIFT		24
+#define XILINX_VDMA_DMACR_FRAME_COUNT_MAX	0xff
+#define XILINX_VDMA_DMACR_FRAME_COUNT_SHIFT	16
+#define XILINX_VDMA_DMACR_ERR_IRQ		BIT(14)
+#define XILINX_VDMA_DMACR_DLY_CNT_IRQ		BIT(13)
+#define XILINX_VDMA_DMACR_FRM_CNT_IRQ		BIT(12)
+#define XILINX_VDMA_DMACR_MASTER_SHIFT		8
+#define XILINX_VDMA_DMACR_FSYNCSRC_SHIFT	5
+#define XILINX_VDMA_DMACR_FRAMECNT_EN		BIT(4)
+#define XILINX_VDMA_DMACR_GENLOCK_EN		BIT(3)
+#define XILINX_VDMA_DMACR_RESET			BIT(2)
+#define XILINX_VDMA_DMACR_CIRC_EN		BIT(1)
+#define XILINX_VDMA_DMACR_RUNSTOP		BIT(0)
+#define XILINX_VDMA_DMACR_DELAY_MASK		\
+				(XILINX_VDMA_DMACR_DELAY_MAX << \
+				XILINX_VDMA_DMACR_DELAY_SHIFT)
+#define XILINX_VDMA_DMACR_FRAME_COUNT_MASK	\
+				(XILINX_VDMA_DMACR_FRAME_COUNT_MAX << \
+				XILINX_VDMA_DMACR_FRAME_COUNT_SHIFT)
+#define XILINX_VDMA_DMACR_MASTER_MASK		\
+				(0xf << XILINX_VDMA_DMACR_MASTER_SHIFT)
+#define XILINX_VDMA_DMACR_FSYNCSRC_MASK		\
+				(3 << XILINX_VDMA_DMACR_FSYNCSRC_SHIFT)
+
+#define XILINX_VDMA_REG_DMASR			0x0004
+#define XILINX_VDMA_DMASR_DELAY_SHIFT		24
+#define XILINX_VDMA_DMASR_FRAME_COUNT_SHIFT	16
+#define XILINX_VDMA_DMASR_EOL_LATE_ERR		BIT(15)
+#define XILINX_VDMA_DMASR_ERR_IRQ		BIT(14)
+#define XILINX_VDMA_DMASR_DLY_CNT_IRQ		BIT(13)
+#define XILINX_VDMA_DMASR_FRM_CNT_IRQ		BIT(12)
+#define XILINX_VDMA_DMASR_SOF_LATE_ERR		BIT(11)
+#define XILINX_VDMA_DMASR_SG_DEC_ERR		BIT(10)
+#define XILINX_VDMA_DMASR_SG_SLV_ERR		BIT(9)
+#define XILINX_VDMA_DMASR_EOF_EARLY_ERR		BIT(8)
+#define XILINX_VDMA_DMASR_SOF_EARLY_ERR		BIT(7)
+#define XILINX_VDMA_DMASR_DMA_DEC_ERR		BIT(6)
+#define XILINX_VDMA_DMASR_DMA_SLAVE_ERR		BIT(5)
+#define XILINX_VDMA_DMASR_DMA_INT_ERR		BIT(4)
+#define XILINX_VDMA_DMASR_IDLE			BIT(1)
+#define XILINX_VDMA_DMASR_HALTED		BIT(0)
+
+#define XILINX_VDMA_DMASR_DELAY_MASK		\
+				(0xff << XILINX_VDMA_DMASR_DELAY_SHIFT)
+#define XILINX_VDMA_DMASR_FRAME_COUNT_MASK	\
+				(0xff << XILINX_VDMA_DMASR_FRAME_COUNT_SHIFT)
+
+#define XILINX_VDMA_REG_CURDESC			0x0008
+#define XILINX_VDMA_REG_TAILDESC		0x0010
+#define XILINX_VDMA_REG_REG_INDEX		0x0014
+#define XILINX_VDMA_REG_FRMSTORE		0x0018
+#define XILINX_VDMA_REG_THRESHOLD		0x001c
+#define XILINX_VDMA_REG_FRMPTR_STS		0x0024
+#define XILINX_VDMA_REG_PARK_PTR		0x0028
+#define XILINX_VDMA_PARK_PTR_WR_REF_SHIFT	8
+#define XILINX_VDMA_PARK_PTR_RD_REF_SHIFT	0
+#define XILINX_VDMA_REG_VDMA_VERSION		0x002c
+
+/* Register Direct Mode Registers */
+#define XILINX_VDMA_REG_VSIZE			0x0000
+#define XILINX_VDMA_REG_HSIZE			0x0004
+
+#define XILINX_VDMA_REG_FRMDLY_STRIDE		0x0008
+#define XILINX_VDMA_FRMDLY_STRIDE_FRMDLY_SHIFT	24
+#define XILINX_VDMA_FRMDLY_STRIDE_STRIDE_SHIFT	0
+#define XILINX_VDMA_FRMDLY_STRIDE_FRMDLY_MASK	\
+				(0x1f <<	\
+				XILINX_VDMA_FRMDLY_STRIDE_FRMDLY_SHIFT)
+#define XILINX_VDMA_FRMDLY_STRIDE_STRIDE_MASK	\
+				(0xffff <<	\
+				XILINX_VDMA_FRMDLY_STRIDE_STRIDE_MASK)
+
+#define XILINX_VDMA_REG_START_ADDRESS(n)	(0x000c + 4 * (n))
+
+/* Hw specific definitions */
+#define XILINX_VDMA_MAX_CHANS_PER_DEVICE	0x2
+
+#define XILINX_VDMA_DMAXR_ALL_IRQ_MASK	(XILINX_VDMA_DMASR_FRM_CNT_IRQ | \
+					 XILINX_VDMA_DMASR_DLY_CNT_IRQ | \
+					 XILINX_VDMA_DMASR_ERR_IRQ)
+
+#define XILINX_VDMA_DMASR_ALL_ERR_MASK	(XILINX_VDMA_DMASR_EOL_LATE_ERR | \
+					 XILINX_VDMA_DMASR_SOF_LATE_ERR | \
+					 XILINX_VDMA_DMASR_SG_DEC_ERR | \
+					 XILINX_VDMA_DMASR_SG_SLV_ERR | \
+					 XILINX_VDMA_DMASR_EOF_EARLY_ERR | \
+					 XILINX_VDMA_DMASR_SOF_EARLY_ERR | \
+					 XILINX_VDMA_DMASR_DMA_DEC_ERR | \
+					 XILINX_VDMA_DMASR_DMA_SLAVE_ERR | \
+					 XILINX_VDMA_DMASR_DMA_INT_ERR)
+
+/*
+ * Recoverable errors are DMA Internal error, SOF Early, EOF Early and SOF Late.
+ * They are only recoverable when C_FLUSH_ON_FSYNC is enabled in the h/w system.
+ */
+#define XILINX_VDMA_DMASR_ERR_RECOVER_MASK	\
+					(XILINX_VDMA_DMASR_SOF_LATE_ERR | \
+					 XILINX_VDMA_DMASR_EOF_EARLY_ERR | \
+					 XILINX_VDMA_DMASR_SOF_EARLY_ERR | \
+					 XILINX_VDMA_DMASR_DMA_INT_ERR)
+
+/* Axi VDMA Flush on Fsync bits */
+#define XILINX_VDMA_FLUSH_S2MM			3
+#define XILINX_VDMA_FLUSH_MM2S			2
+#define XILINX_VDMA_FLUSH_BOTH			1
+
+/* Delay loop counter to prevent hardware failure */
+#define XILINX_VDMA_LOOP_COUNT			1000000
+
+/**
+ * struct xilinx_vdma_desc_hw - Hardware Descriptor
+ * @next_desc: Next Descriptor Pointer @0x00
+ * @pad1: Reserved @0x04
+ * @buf_addr: Buffer address @0x08
+ * @pad2: Reserved @0x0C
+ * @vsize: Vertical Size @0x10
+ * @hsize: Horizontal Size @0x14
+ * @stride: Number of bytes between the first
+ *	    pixels of each horizontal line @0x18
+ */
+struct xilinx_vdma_desc_hw {
+	u32 next_desc;
+	u32 pad1;
+	u32 buf_addr;
+	u32 pad2;
+	u32 vsize;
+	u32 hsize;
+	u32 stride;
+} __aligned(64);
+
+/**
+ * struct xilinx_vdma_tx_segment - Descriptor segment
+ * @hw: Hardware descriptor
+ * @node: Node in the descriptor segments list
+ * @cookie: Segment cookie
+ * @phys: Physical address of segment
+ */
+struct xilinx_vdma_tx_segment {
+	struct xilinx_vdma_desc_hw hw;
+	struct list_head node;
+	dma_cookie_t cookie;
+	dma_addr_t phys;
+} __aligned(64);
+
+/**
+ * struct xilinx_vdma_tx_descriptor - Per Transaction structure
+ * @async_tx: Async transaction descriptor
+ * @segments: TX segments list
+ * @node: Node in the channel descriptors list
+ */
+struct xilinx_vdma_tx_descriptor {
+	struct dma_async_tx_descriptor async_tx;
+	struct list_head segments;
+	struct list_head node;
+};
+
+#define to_vdma_tx_descriptor(tx) \
+	container_of(tx, struct xilinx_vdma_tx_descriptor, async_tx)
+
+/**
+ * struct xilinx_vdma_chan - Driver specific VDMA channel structure
+ * @xdev: Driver specific device structure
+ * @ctrl_offset: Control registers offset
+ * @desc_offset: TX descriptor registers offset
+ * @completed_cookie: Maximum cookie completed
+ * @cookie: The current cookie
+ * @lock: Descriptor operation lock
+ * @pending_list: Descriptors waiting
+ * @active_desc: Active descriptor
+ * @done_list: Complete descriptors
+ * @common: DMA common channel
+ * @desc_pool: Descriptors pool
+ * @dev: The dma device
+ * @irq: Channel IRQ
+ * @id: Channel ID
+ * @direction: Transfer direction
+ * @num_frms: Number of frames
+ * @has_sg: Support scatter transfers
+ * @genlock: Support genlock mode
+ * @err: Channel has errors
+ * @tasklet: Cleanup work after irq
+ * @config: Device configuration info
+ * @flush_on_fsync: Flush on Frame sync
+ */
+struct xilinx_vdma_chan {
+	struct xilinx_vdma_device *xdev;
+	u32 ctrl_offset;
+	u32 desc_offset;
+	dma_cookie_t completed_cookie;
+	dma_cookie_t cookie;
+	spinlock_t lock;
+	struct list_head pending_list;
+	struct xilinx_vdma_tx_descriptor *active_desc;
+	struct list_head done_list;
+	struct dma_chan common;
+	struct dma_pool *desc_pool;
+	struct device *dev;
+	int irq;
+	int id;
+	enum dma_transfer_direction direction;
+	int num_frms;
+	bool has_sg;
+	bool genlock;
+	bool err;
+	struct tasklet_struct tasklet;
+	struct xilinx_vdma_config config;
+	bool flush_on_fsync;
+};
+
+/**
+ * struct xilinx_vdma_device - VDMA device structure
+ * @regs: I/O mapped base address
+ * @dev: Device Structure
+ * @common: DMA device structure
+ * @chan: Driver specific VDMA channel
+ * @has_sg: Specifies whether Scatter-Gather is present or not
+ * @flush_on_fsync: Flush on frame sync
+ */
+struct xilinx_vdma_device {
+	void __iomem *regs;
+	struct device *dev;
+	struct dma_device common;
+	struct xilinx_vdma_chan *chan[XILINX_VDMA_MAX_CHANS_PER_DEVICE];
+	bool has_sg;
+	u32 flush_on_fsync;
+};
+
+#define to_xilinx_chan(chan) \
+			container_of(chan, struct xilinx_vdma_chan, common)
+
+/* IO accessors */
+static inline u32 vdma_read(struct xilinx_vdma_chan *chan, u32 reg)
+{
+	return ioread32(chan->xdev->regs + reg);
+}
+
+static inline void vdma_write(struct xilinx_vdma_chan *chan, u32 reg, u32 value)
+{
+	iowrite32(value, chan->xdev->regs + reg);
+}
+
+static inline void vdma_desc_write(struct xilinx_vdma_chan *chan, u32 reg,
+				   u32 value)
+{
+	vdma_write(chan, chan->desc_offset + reg, value);
+}
+
+static inline u32 vdma_ctrl_read(struct xilinx_vdma_chan *chan, u32 reg)
+{
+	return vdma_read(chan, chan->ctrl_offset + reg);
+}
+
+static inline void vdma_ctrl_write(struct xilinx_vdma_chan *chan, u32 reg,
+				   u32 value)
+{
+	vdma_write(chan, chan->ctrl_offset + reg, value);
+}
+
+static inline void vdma_ctrl_clr(struct xilinx_vdma_chan *chan, u32 reg,
+				 u32 clr)
+{
+	vdma_ctrl_write(chan, reg, vdma_ctrl_read(chan, reg) & ~clr);
+}
+
+static inline void vdma_ctrl_set(struct xilinx_vdma_chan *chan, u32 reg,
+				 u32 set)
+{
+	vdma_ctrl_write(chan, reg, vdma_ctrl_read(chan, reg) | set);
+}
+
+/* -----------------------------------------------------------------------------
+ * Descriptors and segments alloc and free
+ */
+
+/**
+ * xilinx_vdma_alloc_tx_segment - Allocate transaction segment
+ * @chan: Driver specific VDMA channel
+ *
+ * Return: The allocated segment on success and NULL on failure.
+ */
+static struct xilinx_vdma_tx_segment *
+xilinx_vdma_alloc_tx_segment(struct xilinx_vdma_chan *chan)
+{
+	struct xilinx_vdma_tx_segment *segment;
+	dma_addr_t phys;
+
+	segment = dma_pool_alloc(chan->desc_pool, GFP_ATOMIC, &phys);
+	if (!segment)
+		return NULL;
+
+	memset(segment, 0, sizeof(*segment));
+	segment->phys = phys;
+
+	return segment;
+}
+
+/**
+ * xilinx_vdma_free_tx_segment - Free transaction segment
+ * @chan: Driver specific VDMA channel
+ * @segment: VDMA transaction segment
+ */
+static void xilinx_vdma_free_tx_segment(struct xilinx_vdma_chan *chan,
+					struct xilinx_vdma_tx_segment *segment)
+{
+	dma_pool_free(chan->desc_pool, segment, segment->phys);
+}
+
+/**
+ * xilinx_vdma_tx_descriptor - Allocate transaction descriptor
+ * @chan: Driver specific VDMA channel
+ *
+ * Return: The allocated descriptor on success and NULL on failure.
+ */
+static struct xilinx_vdma_tx_descriptor *
+xilinx_vdma_alloc_tx_descriptor(struct xilinx_vdma_chan *chan)
+{
+	struct xilinx_vdma_tx_descriptor *desc;
+
+	desc = kzalloc(sizeof(*desc), GFP_KERNEL);
+	if (!desc)
+		return NULL;
+
+	INIT_LIST_HEAD(&desc->segments);
+
+	return desc;
+}
+
+/**
+ * xilinx_vdma_free_tx_descriptor - Free transaction descriptor
+ * @chan: Driver specific VDMA channel
+ * @desc: VDMA transaction descriptor
+ */
+static void
+xilinx_vdma_free_tx_descriptor(struct xilinx_vdma_chan *chan,
+			       struct xilinx_vdma_tx_descriptor *desc)
+{
+	struct xilinx_vdma_tx_segment *segment, *next;
+
+	if (!desc)
+		return;
+
+	list_for_each_entry_safe(segment, next, &desc->segments, node) {
+		list_del(&segment->node);
+		xilinx_vdma_free_tx_segment(chan, segment);
+	}
+
+	kfree(desc);
+}
+
+/* Required functions */
+
+/**
+ * xilinx_vdma_free_descriptors - Free descriptors list
+ * @chan: Driver specific VDMA channel
+ * @list: List to parse and delete the descriptor
+ */
+static void xilinx_vdma_free_desc_list(struct xilinx_vdma_chan *chan,
+					struct list_head *list)
+{
+	struct xilinx_vdma_tx_descriptor *desc, *next;
+
+	list_for_each_entry_safe(desc, next, list, node) {
+		list_del(&desc->node);
+		xilinx_vdma_free_tx_descriptor(chan, desc);
+	}
+}
+
+/**
+ * xilinx_vdma_free_descriptors - Free channel descriptors
+ * @chan: Driver specific VDMA channel
+ */
+static void xilinx_vdma_free_descriptors(struct xilinx_vdma_chan *chan)
+{
+	unsigned long flags;
+
+	spin_lock_irqsave(&chan->lock, flags);
+
+	xilinx_vdma_free_desc_list(chan, &chan->pending_list);
+	xilinx_vdma_free_desc_list(chan, &chan->done_list);
+
+	xilinx_vdma_free_tx_descriptor(chan, chan->active_desc);
+	chan->active_desc = NULL;
+
+	spin_unlock_irqrestore(&chan->lock, flags);
+}
+
+/**
+ * xilinx_vdma_free_chan_resources - Free channel resources
+ * @dchan: DMA channel
+ */
+static void xilinx_vdma_free_chan_resources(struct dma_chan *dchan)
+{
+	struct xilinx_vdma_chan *chan = to_xilinx_chan(dchan);
+
+	dev_dbg(chan->dev, "Free all channel resources.\n");
+
+	tasklet_kill(&chan->tasklet);
+	xilinx_vdma_free_descriptors(chan);
+	dma_pool_destroy(chan->desc_pool);
+	chan->desc_pool = NULL;
+}
+
+/**
+ * xilinx_vdma_chan_desc_cleanup - Clean channel descriptors
+ * @chan: Driver specific VDMA channel
+ */
+static void xilinx_vdma_chan_desc_cleanup(struct xilinx_vdma_chan *chan)
+{
+	struct xilinx_vdma_tx_descriptor *desc, *next;
+	unsigned long flags;
+
+	spin_lock_irqsave(&chan->lock, flags);
+
+	list_for_each_entry_safe(desc, next, &chan->done_list, node) {
+		dma_async_tx_callback callback;
+		void *callback_param;
+
+		/* Remove from the list of running transactions */
+		list_del(&desc->node);
+
+		/* Run the link descriptor callback function */
+		callback = desc->async_tx.callback;
+		callback_param = desc->async_tx.callback_param;
+		if (callback) {
+			spin_unlock_irqrestore(&chan->lock, flags);
+			callback(callback_param);
+			spin_lock_irqsave(&chan->lock, flags);
+		}
+
+		/* Run any dependencies, then free the descriptor */
+		dma_run_dependencies(&desc->async_tx);
+		xilinx_vdma_free_tx_descriptor(chan, desc);
+	}
+
+	spin_unlock_irqrestore(&chan->lock, flags);
+}
+
+/**
+ * xilinx_vdma_do_tasklet - Schedule completion tasklet
+ * @data: Pointer to the Xilinx VDMA channel structure
+ */
+static void xilinx_vdma_do_tasklet(unsigned long data)
+{
+	struct xilinx_vdma_chan *chan = (struct xilinx_vdma_chan *)data;
+
+	xilinx_vdma_chan_desc_cleanup(chan);
+}
+
+/**
+ * xilinx_vdma_alloc_chan_resources - Allocate channel resources
+ * @dchan: DMA channel
+ *
+ * Return: '1' on success and failure value on error
+ */
+static int xilinx_vdma_alloc_chan_resources(struct dma_chan *dchan)
+{
+	struct xilinx_vdma_chan *chan = to_xilinx_chan(dchan);
+
+	/* Has this channel already been allocated? */
+	if (chan->desc_pool)
+		return 1;
+
+	/*
+	 * We need the descriptor to be aligned to 64bytes
+	 * for meeting Xilinx VDMA specification requirement.
+	 */
+	chan->desc_pool = dma_pool_create("xilinx_vdma_desc_pool",
+				chan->dev,
+				sizeof(struct xilinx_vdma_tx_segment),
+				__alignof__(struct xilinx_vdma_tx_segment), 0);
+	if (!chan->desc_pool) {
+		dev_err(chan->dev,
+			"unable to allocate channel %d descriptor pool\n",
+			chan->id);
+		return -ENOMEM;
+	}
+
+	tasklet_init(&chan->tasklet, xilinx_vdma_do_tasklet,
+			(unsigned long)chan);
+
+	chan->completed_cookie = DMA_MIN_COOKIE;
+	chan->cookie = DMA_MIN_COOKIE;
+
+	/* There is at least one descriptor free to be allocated */
+	return 1;
+}
+
+/**
+ * xilinx_vdma_tx_status - Get VDMA transaction status
+ * @dchan: DMA channel
+ * @cookie: Transaction identifier
+ * @txstate: Transaction state
+ *
+ * Return: DMA transaction status
+ */
+static enum dma_status xilinx_vdma_tx_status(struct dma_chan *dchan,
+					dma_cookie_t cookie,
+					struct dma_tx_state *txstate)
+{
+	struct xilinx_vdma_chan *chan = to_xilinx_chan(dchan);
+	dma_cookie_t last_used;
+	dma_cookie_t last_complete;
+
+	xilinx_vdma_chan_desc_cleanup(chan);
+
+	last_used = dchan->cookie;
+	last_complete = chan->completed_cookie;
+
+	dma_set_tx_state(txstate, last_complete, last_used, 0);
+
+	return dma_async_is_complete(cookie, last_complete, last_used);
+}
+
+/**
+ * xilinx_vdma_is_running - Check if VDMA channel is running
+ * @chan: Driver specific VDMA channel
+ *
+ * Return: '1' if running, '0' if not.
+ */
+static int xilinx_vdma_is_running(struct xilinx_vdma_chan *chan)
+{
+	return !(vdma_ctrl_read(chan, XILINX_VDMA_REG_DMASR) &
+		 XILINX_VDMA_DMASR_HALTED) &&
+		(vdma_ctrl_read(chan, XILINX_VDMA_REG_DMACR) &
+		 XILINX_VDMA_DMACR_RUNSTOP);
+}
+
+/**
+ * xilinx_vdma_is_idle - Check if VDMA channel is idle
+ * @chan: Driver specific VDMA channel
+ *
+ * Return: '1' if idle, '0' if not.
+ */
+static int xilinx_vdma_is_idle(struct xilinx_vdma_chan *chan)
+{
+	return vdma_ctrl_read(chan, XILINX_VDMA_REG_DMASR) &
+		XILINX_VDMA_DMASR_IDLE;
+}
+
+/**
+ * xilinx_vdma_halt - Halt VDMA channel
+ * @chan: Driver specific VDMA channel
+ */
+static void xilinx_vdma_halt(struct xilinx_vdma_chan *chan)
+{
+	int loop = XILINX_VDMA_LOOP_COUNT + 1;
+
+	vdma_ctrl_clr(chan, XILINX_VDMA_REG_DMACR, XILINX_VDMA_DMACR_RUNSTOP);
+
+	/* Wait for the hardware to halt */
+	while (loop--)
+		if (vdma_ctrl_read(chan, XILINX_VDMA_REG_DMASR) &
+		    XILINX_VDMA_DMASR_HALTED)
+			break;
+
+	if (!loop) {
+		dev_err(chan->dev, "Cannot stop channel %p: %x\n",
+			chan, vdma_ctrl_read(chan, XILINX_VDMA_REG_DMASR));
+		chan->err = true;
+	}
+
+	return;
+}
+
+/**
+ * xilinx_vdma_start - Start VDMA channel
+ * @chan: Driver specific VDMA channel
+ */
+static void xilinx_vdma_start(struct xilinx_vdma_chan *chan)
+{
+	int loop = XILINX_VDMA_LOOP_COUNT + 1;
+
+	vdma_ctrl_set(chan, XILINX_VDMA_REG_DMACR, XILINX_VDMA_DMACR_RUNSTOP);
+
+	/* Wait for the hardware to start */
+	while (loop--)
+		if (!(vdma_ctrl_read(chan, XILINX_VDMA_REG_DMASR) &
+		      XILINX_VDMA_DMASR_HALTED))
+			break;
+
+	if (!loop) {
+		dev_err(chan->dev, "Cannot start channel %p: %x\n",
+			chan, vdma_ctrl_read(chan, XILINX_VDMA_REG_DMASR));
+
+		chan->err = true;
+	}
+
+	return;
+}
+
+/**
+ * xilinx_vdma_start_transfer - Starts VDMA transfer
+ * @chan: Driver specific channel struct pointer
+ */
+static void xilinx_vdma_start_transfer(struct xilinx_vdma_chan *chan)
+{
+	struct xilinx_vdma_config *config = &chan->config;
+	struct xilinx_vdma_tx_descriptor *desc;
+	unsigned long flags;
+	u32 reg;
+	struct xilinx_vdma_tx_segment *head, *tail = NULL;
+
+	if (chan->err)
+		return;
+
+	spin_lock_irqsave(&chan->lock, flags);
+
+	/* There's already an active descriptor, bail out. */
+	if (chan->active_desc)
+		goto out_unlock;
+
+	if (list_empty(&chan->pending_list))
+		goto out_unlock;
+
+	desc = list_first_entry(&chan->pending_list,
+				struct xilinx_vdma_tx_descriptor, node);
+
+	/* If it is SG mode and hardware is busy, cannot submit */
+	if (chan->has_sg && xilinx_vdma_is_running(chan) &&
+	    !xilinx_vdma_is_idle(chan)) {
+		dev_dbg(chan->dev, "DMA controller still busy\n");
+		goto out_unlock;
+	}
+
+	if (chan->err)
+		goto out_unlock;
+
+	/*
+	 * If hardware is idle, then all descriptors on the running lists are
+	 * done, start new transfers
+	 */
+	if (chan->has_sg) {
+		head = list_first_entry(&desc->segments,
+					struct xilinx_vdma_tx_segment, node);
+		tail = list_entry(desc->segments.prev,
+				  struct xilinx_vdma_tx_segment, node);
+
+		vdma_ctrl_write(chan, XILINX_VDMA_REG_CURDESC, head->phys);
+	}
+
+	/* Configure the hardware using info in the config structure */
+	reg = vdma_ctrl_read(chan, XILINX_VDMA_REG_DMACR);
+
+	if (config->frm_cnt_en)
+		reg |= XILINX_VDMA_DMACR_FRAMECNT_EN;
+	else
+		reg &= ~XILINX_VDMA_DMACR_FRAMECNT_EN;
+
+	/*
+	 * With SG, start with circular mode, so that BDs can be fetched.
+	 * In direct register mode, if not parking, enable circular mode
+	 */
+	if (chan->has_sg || !config->park)
+		reg |= XILINX_VDMA_DMACR_CIRC_EN;
+
+	if (config->park)
+		reg &= ~XILINX_VDMA_DMACR_CIRC_EN;
+
+	vdma_ctrl_write(chan, XILINX_VDMA_REG_DMACR, reg);
+
+	if (config->park && (config->park_frm >= 0) &&
+			(config->park_frm < chan->num_frms)) {
+		if (chan->direction == DMA_MEM_TO_DEV)
+			vdma_write(chan, XILINX_VDMA_REG_PARK_PTR,
+				config->park_frm <<
+					XILINX_VDMA_PARK_PTR_RD_REF_SHIFT);
+		else
+			vdma_write(chan, XILINX_VDMA_REG_PARK_PTR,
+				config->park_frm <<
+					XILINX_VDMA_PARK_PTR_WR_REF_SHIFT);
+	}
+
+	/* Start the hardware */
+	xilinx_vdma_start(chan);
+
+	if (chan->err)
+		goto out_unlock;
+
+	/* Start the transfer */
+	if (chan->has_sg) {
+		vdma_ctrl_write(chan, XILINX_VDMA_REG_TAILDESC, tail->phys);
+	} else {
+		struct xilinx_vdma_tx_segment *segment;
+		int i = 0;
+
+		list_for_each_entry(segment, &desc->segments, node)
+			vdma_desc_write(chan,
+					XILINX_VDMA_REG_START_ADDRESS(i++),
+					segment->hw.buf_addr);
+
+		vdma_desc_write(chan, XILINX_VDMA_REG_HSIZE, config->hsize);
+		vdma_desc_write(chan, XILINX_VDMA_REG_FRMDLY_STRIDE,
+				(config->frm_dly <<
+				 XILINX_VDMA_FRMDLY_STRIDE_FRMDLY_SHIFT) |
+				(config->stride <<
+				 XILINX_VDMA_FRMDLY_STRIDE_STRIDE_SHIFT));
+		vdma_desc_write(chan, XILINX_VDMA_REG_VSIZE, config->vsize);
+	}
+
+	list_del(&desc->node);
+	chan->active_desc = desc;
+
+out_unlock:
+	spin_unlock_irqrestore(&chan->lock, flags);
+}
+
+/**
+ * xilinx_vdma_issue_pending - Issue pending transactions
+ * @dchan: DMA channel
+ */
+static void xilinx_vdma_issue_pending(struct dma_chan *dchan)
+{
+	struct xilinx_vdma_chan *chan = to_xilinx_chan(dchan);
+
+	xilinx_vdma_start_transfer(chan);
+}
+
+/**
+ * xilinx_vdma_complete_descriptor - Mark the active descriptor as complete
+ * @chan : xilinx DMA channel
+ *
+ * CONTEXT: hardirq
+ */
+static void xilinx_vdma_complete_descriptor(struct xilinx_vdma_chan *chan)
+{
+	struct xilinx_vdma_tx_descriptor *desc;
+	unsigned long flags;
+
+	spin_lock_irqsave(&chan->lock, flags);
+
+	desc = chan->active_desc;
+	if (!desc) {
+		dev_dbg(chan->dev, "no running descriptors\n");
+		goto out_unlock;
+	}
+
+	list_add_tail(&desc->node, &chan->done_list);
+
+	/* Update the completed cookie and reset the active descriptor. */
+	chan->completed_cookie = desc->async_tx.cookie;
+	chan->active_desc = NULL;
+
+out_unlock:
+	spin_unlock_irqrestore(&chan->lock, flags);
+}
+
+/**
+ * xilinx_vdma_reset - Reset VDMA channel
+ * @chan: Driver specific VDMA channel
+ *
+ * Return: '0' on success and failure value on error
+ */
+static int xilinx_vdma_reset(struct xilinx_vdma_chan *chan)
+{
+	int loop = XILINX_VDMA_LOOP_COUNT + 1;
+	u32 tmp;
+
+	vdma_ctrl_set(chan, XILINX_VDMA_REG_DMACR, XILINX_VDMA_DMACR_RESET);
+
+	tmp = vdma_ctrl_read(chan, XILINX_VDMA_REG_DMACR) &
+		XILINX_VDMA_DMACR_RESET;
+
+	/* Wait for the hardware to finish reset */
+	while (loop-- && tmp)
+		tmp = vdma_ctrl_read(chan, XILINX_VDMA_REG_DMACR) &
+			XILINX_VDMA_DMACR_RESET;
+
+	if (!loop) {
+		dev_err(chan->dev, "reset timeout, cr %x, sr %x\n",
+			vdma_ctrl_read(chan, XILINX_VDMA_REG_DMACR),
+			vdma_ctrl_read(chan, XILINX_VDMA_REG_DMASR));
+		return -ETIMEDOUT;
+	}
+
+	chan->err = false;
+
+	return 0;
+}
+
+/**
+ * xilinx_vdma_chan_reset - Reset VDMA channel and enable interrupts
+ * @chan: Driver specific VDMA channel
+ *
+ * Return: '0' on success and failure value on error
+ */
+static int xilinx_vdma_chan_reset(struct xilinx_vdma_chan *chan)
+{
+	int err;
+
+	/* Reset VDMA */
+	err = xilinx_vdma_reset(chan);
+	if (err)
+		return err;
+
+	/* Enable interrupts */
+	vdma_ctrl_set(chan, XILINX_VDMA_REG_DMACR,
+		      XILINX_VDMA_DMAXR_ALL_IRQ_MASK);
+
+	return 0;
+}
+
+/**
+ * xilinx_vdma_irq_handler - VDMA Interrupt handler
+ * @irq: IRQ number
+ * @data: Pointer to the Xilinx VDMA channel structure
+ *
+ * Return: IRQ_HANDLED/IRQ_NONE
+ */
+static irqreturn_t xilinx_vdma_irq_handler(int irq, void *data)
+{
+	struct xilinx_vdma_chan *chan = data;
+	u32 status;
+
+	/* Read the status and ack the interrupts. */
+	status = vdma_ctrl_read(chan, XILINX_VDMA_REG_DMASR);
+	if (!(status & XILINX_VDMA_DMAXR_ALL_IRQ_MASK))
+		return IRQ_NONE;
+
+	vdma_ctrl_write(chan, XILINX_VDMA_REG_DMASR,
+			status & XILINX_VDMA_DMAXR_ALL_IRQ_MASK);
+
+	if (status & XILINX_VDMA_DMASR_ERR_IRQ) {
+		/*
+		 * An error occurred. If C_FLUSH_ON_FSYNC is enabled and the
+		 * error is recoverable, ignore it. Otherwise flag the error.
+		 *
+		 * Only recoverable errors can be cleared in the DMASR register,
+		 * make sure not to write to other error bits to 1.
+		 */
+		u32 errors = status & XILINX_VDMA_DMASR_ALL_ERR_MASK;
+		vdma_ctrl_write(chan, XILINX_VDMA_REG_DMASR,
+				errors & XILINX_VDMA_DMASR_ERR_RECOVER_MASK);
+
+		if (!chan->flush_on_fsync ||
+		    (errors & ~XILINX_VDMA_DMASR_ERR_RECOVER_MASK)) {
+			dev_err(chan->dev,
+				"Channel %p has errors %x, cdr %x tdr %x\n",
+				chan, errors,
+				vdma_ctrl_read(chan, XILINX_VDMA_REG_CURDESC),
+				vdma_ctrl_read(chan, XILINX_VDMA_REG_TAILDESC));
+			chan->err = true;
+		}
+	}
+
+	if (status & XILINX_VDMA_DMASR_DLY_CNT_IRQ) {
+		/*
+		 * Device takes too long to do the transfer when user requires
+		 * responsiveness.
+		 */
+		dev_dbg(chan->dev, "Inter-packet latency too long\n");
+	}
+
+	if (status & XILINX_VDMA_DMASR_FRM_CNT_IRQ) {
+		xilinx_vdma_complete_descriptor(chan);
+		xilinx_vdma_start_transfer(chan);
+	}
+
+	tasklet_schedule(&chan->tasklet);
+	return IRQ_HANDLED;
+}
+
+/**
+ * xilinx_vdma_tx_submit - Submit DMA transaction
+ * @tx: Async transaction descriptor
+ *
+ * Return: cookie value on success and failure value on error
+ */
+static dma_cookie_t xilinx_vdma_tx_submit(struct dma_async_tx_descriptor *tx)
+{
+	struct xilinx_vdma_tx_descriptor *desc = to_vdma_tx_descriptor(tx);
+	struct xilinx_vdma_chan *chan = to_xilinx_chan(tx->chan);
+	struct xilinx_vdma_tx_segment *segment;
+	dma_cookie_t cookie;
+	unsigned long flags;
+	int err;
+
+	if (chan->err) {
+		/*
+		 * If reset fails, need to hard reset the system.
+		 * Channel is no longer functional
+		 */
+		err = xilinx_vdma_chan_reset(chan);
+		if (err < 0)
+			return err;
+	}
+
+	spin_lock_irqsave(&chan->lock, flags);
+
+	/* Assign cookies to all of the segments that make up this transaction.
+	 * Use the cookie of the last segment as the transaction cookie.
+	 */
+	cookie = chan->cookie;
+
+	list_for_each_entry(segment, &desc->segments, node) {
+		if (cookie < DMA_MAX_COOKIE)
+			cookie++;
+		else
+			cookie = DMA_MIN_COOKIE;
+
+		segment->cookie = cookie;
+	}
+
+	tx->cookie = cookie;
+	chan->cookie = cookie;
+
+	/* Append the transaction to the pending transactions queue. */
+	list_add_tail(&desc->node, &chan->pending_list);
+
+	spin_unlock_irqrestore(&chan->lock, flags);
+
+	return cookie;
+}
+
+/**
+ * xilinx_vdma_prep_slave_sg - prepare a descriptor for a DMA_SLAVE transaction
+ * @dchan: DMA channel
+ * @sgl: scatterlist to transfer to/from
+ * @sg_len: number of entries in @sgl
+ * @dir: DMA direction
+ * @flags: transfer ack flags
+ * @context: unused
+ *
+ * Return: Async transaction descriptor on success and NULL on failure
+ */
+static struct dma_async_tx_descriptor *
+xilinx_vdma_prep_slave_sg(struct dma_chan *dchan, struct scatterlist *sgl,
+			  unsigned int sg_len, enum dma_transfer_direction dir,
+			  unsigned long flags, void *context)
+{
+	struct xilinx_vdma_chan *chan = to_xilinx_chan(dchan);
+	struct xilinx_vdma_tx_descriptor *desc;
+	struct xilinx_vdma_tx_segment *segment;
+	struct xilinx_vdma_tx_segment *prev = NULL;
+	struct scatterlist *sg;
+	int i;
+
+	if (chan->direction != dir || sg_len == 0)
+		return NULL;
+
+	/* Enforce one sg entry for one frame. */
+	if (sg_len != chan->num_frms) {
+		dev_err(chan->dev,
+		"number of entries %d not the same as num stores %d\n",
+			sg_len, chan->num_frms);
+		return NULL;
+	}
+
+	/* Allocate a transaction descriptor. */
+	desc = xilinx_vdma_alloc_tx_descriptor(chan);
+	if (!desc)
+		return NULL;
+
+	dma_async_tx_descriptor_init(&desc->async_tx, &chan->common);
+	desc->async_tx.tx_submit = xilinx_vdma_tx_submit;
+	desc->async_tx.cookie = 0;
+	async_tx_ack(&desc->async_tx);
+
+	/* Build the list of transaction segments. */
+	for_each_sg(sgl, sg, sg_len, i) {
+		struct xilinx_vdma_desc_hw *hw;
+
+		/* Allocate the link descriptor from DMA pool */
+		segment = xilinx_vdma_alloc_tx_segment(chan);
+		if (!segment)
+			goto error;
+
+		/* Fill in the hardware descriptor */
+		hw = &segment->hw;
+		hw->buf_addr = sg_dma_address(sg);
+		hw->vsize = chan->config.vsize;
+		hw->hsize = chan->config.hsize;
+		hw->stride = (chan->config.frm_dly <<
+			      XILINX_VDMA_FRMDLY_STRIDE_FRMDLY_SHIFT) |
+			     (chan->config.stride <<
+			      XILINX_VDMA_FRMDLY_STRIDE_STRIDE_SHIFT);
+		if (prev)
+			prev->hw.next_desc = segment->phys;
+
+		/* Insert the segment into the descriptor segments list. */
+		list_add_tail(&segment->node, &desc->segments);
+
+		prev = segment;
+	}
+
+	/* Link the last hardware descriptor with the first. */
+	segment = list_first_entry(&desc->segments,
+				   struct xilinx_vdma_tx_segment, node);
+	prev->hw.next_desc = segment->phys;
+
+	return &desc->async_tx;
+
+error:
+	xilinx_vdma_free_tx_descriptor(chan, desc);
+	return NULL;
+}
+
+/**
+ * xilinx_vdma_terminate_all - Halt the channel and free descriptors
+ * @chan: Driver specific VDMA Channel pointer
+ */
+static void xilinx_vdma_terminate_all(struct xilinx_vdma_chan *chan)
+{
+	/* Halt the DMA engine */
+	xilinx_vdma_halt(chan);
+
+	/* Remove and free all of the descriptors in the lists */
+	xilinx_vdma_free_descriptors(chan);
+}
+
+/**
+ * xilinx_vdma_slave_config - Configure VDMA channel
+ * Run-time configuration for Axi VDMA, supports:
+ * . halt the channel
+ * . configure interrupt coalescing and inter-packet delay threshold
+ * . start/stop parking
+ * . enable genlock
+ * . set transfer information using config struct
+ *
+ * @chan: Driver specific VDMA Channel pointer
+ * @cfg: Channel configuration pointer
+ *
+ * Return: '0' on success and failure value on error
+ */
+static int xilinx_vdma_slave_config(struct xilinx_vdma_chan *chan,
+				    struct xilinx_vdma_config *cfg)
+{
+	u32 dmacr;
+
+	if (cfg->reset)
+		return xilinx_vdma_chan_reset(chan);
+
+	dmacr = vdma_ctrl_read(chan, XILINX_VDMA_REG_DMACR);
+
+	/* If vsize is -1, it is park-related operations */
+	if (cfg->vsize == -1) {
+		if (cfg->park)
+			dmacr &= ~XILINX_VDMA_DMACR_CIRC_EN;
+		else
+			dmacr |= XILINX_VDMA_DMACR_CIRC_EN;
+
+		vdma_ctrl_write(chan, XILINX_VDMA_REG_DMACR, dmacr);
+		return 0;
+	}
+
+	/* If hsize is -1, it is interrupt threshold settings */
+	if (cfg->hsize == -1) {
+		if (cfg->coalesc <= XILINX_VDMA_DMACR_FRAME_COUNT_MAX) {
+			dmacr &= ~XILINX_VDMA_DMACR_FRAME_COUNT_MASK;
+			dmacr |= cfg->coalesc <<
+				 XILINX_VDMA_DMACR_FRAME_COUNT_SHIFT;
+			chan->config.coalesc = cfg->coalesc;
+		}
+
+		if (cfg->delay <= XILINX_VDMA_DMACR_DELAY_MAX) {
+			dmacr &= ~XILINX_VDMA_DMACR_DELAY_MASK;
+			dmacr |= cfg->delay << XILINX_VDMA_DMACR_DELAY_SHIFT;
+			chan->config.delay = cfg->delay;
+		}
+
+		vdma_ctrl_write(chan, XILINX_VDMA_REG_DMACR, dmacr);
+		return 0;
+	}
+
+	/* Transfer information */
+	chan->config.vsize = cfg->vsize;
+	chan->config.hsize = cfg->hsize;
+	chan->config.stride = cfg->stride;
+	chan->config.frm_dly = cfg->frm_dly;
+	chan->config.park = cfg->park;
+
+	/* genlock settings */
+	chan->config.gen_lock = cfg->gen_lock;
+	chan->config.master = cfg->master;
+
+	if (cfg->gen_lock && chan->genlock) {
+		dmacr |= XILINX_VDMA_DMACR_GENLOCK_EN;
+		dmacr |= cfg->master << XILINX_VDMA_DMACR_MASTER_SHIFT;
+	}
+
+	chan->config.frm_cnt_en = cfg->frm_cnt_en;
+	if (cfg->park)
+		chan->config.park_frm = cfg->park_frm;
+	else
+		chan->config.park_frm = -1;
+
+	chan->config.coalesc = cfg->coalesc;
+	chan->config.delay = cfg->delay;
+	if (cfg->coalesc <= XILINX_VDMA_DMACR_FRAME_COUNT_MAX) {
+		dmacr |= cfg->coalesc << XILINX_VDMA_DMACR_FRAME_COUNT_SHIFT;
+		chan->config.coalesc = cfg->coalesc;
+	}
+
+	if (cfg->delay <= XILINX_VDMA_DMACR_DELAY_MAX) {
+		dmacr |= cfg->delay << XILINX_VDMA_DMACR_DELAY_SHIFT;
+		chan->config.delay = cfg->delay;
+	}
+
+	/* FSync Source selection */
+	dmacr &= ~XILINX_VDMA_DMACR_FSYNCSRC_MASK;
+	dmacr |= cfg->ext_fsync << XILINX_VDMA_DMACR_FSYNCSRC_SHIFT;
+
+	vdma_ctrl_write(chan, XILINX_VDMA_REG_DMACR, dmacr);
+	return 0;
+}
+
+/**
+ * xilinx_vdma_device_control - Configure DMA channel of the device
+ * @dchan: DMA Channel pointer
+ * @cmd: DMA control command
+ * @arg: Channel configuration
+ *
+ * Return: '0' on success and failure value on error
+ */
+static int xilinx_vdma_device_control(struct dma_chan *dchan,
+				      enum dma_ctrl_cmd cmd, unsigned long arg)
+{
+	struct xilinx_vdma_chan *chan = to_xilinx_chan(dchan);
+
+	switch (cmd) {
+	case DMA_TERMINATE_ALL:
+		xilinx_vdma_terminate_all(chan);
+		return 0;
+	case DMA_SLAVE_CONFIG:
+		return xilinx_vdma_slave_config(chan,
+					(struct xilinx_vdma_config *)arg);
+	default:
+		return -ENXIO;
+	}
+}
+
+/* -----------------------------------------------------------------------------
+ * Probe and remove
+ */
+
+/**
+ * xilinx_vdma_chan_remove - Per Channel remove function
+ * @chan: Driver specific VDMA channel
+ */
+static void xilinx_vdma_chan_remove(struct xilinx_vdma_chan *chan)
+{
+	/* Disable all interrupts */
+	vdma_ctrl_clr(chan, XILINX_VDMA_REG_DMACR,
+		      XILINX_VDMA_DMAXR_ALL_IRQ_MASK);
+
+	list_del(&chan->common.device_node);
+}
+
+/**
+ * xilinx_vdma_chan_probe - Per Channel Probing
+ * It get channel features from the device tree entry and
+ * initialize special channel handling routines
+ *
+ * @xdev: Driver specific device structure
+ * @node: Device node
+ *
+ * Return: '0' on success and failure value on error
+ */
+static int xilinx_vdma_chan_probe(struct xilinx_vdma_device *xdev,
+				  struct device_node *node)
+{
+	struct xilinx_vdma_chan *chan;
+	bool has_dre = false;
+	u32 value;
+	int err;
+
+	/* Allocate and initialize the channel structure */
+	chan = devm_kzalloc(xdev->dev, sizeof(*chan), GFP_KERNEL);
+	if (!chan)
+		return -ENOMEM;
+
+	chan->dev = xdev->dev;
+	chan->xdev = xdev;
+	chan->has_sg = xdev->has_sg;
+
+	spin_lock_init(&chan->lock);
+	INIT_LIST_HEAD(&chan->pending_list);
+	INIT_LIST_HEAD(&chan->done_list);
+
+	/* Retrieve the channel properties from the device tree */
+	has_dre = of_property_read_bool(node, "xlnx,include-dre");
+
+	chan->genlock = of_property_read_bool(node, "xlnx,genlock-mode");
+
+	err = of_property_read_u32(node, "xlnx,datawidth", &value);
+	if (!err) {
+		u32 width = value >> 3; /* Convert bits to bytes */
+
+		/* If data width is greater than 8 bytes, DRE is not in hw */
+		if (width > 8)
+			has_dre = false;
+
+		if (!has_dre)
+			xdev->common.copy_align = fls(width - 1);
+	} else {
+		dev_err(xdev->dev, "missing xlnx,datawidth property\n");
+		return err;
+	}
+
+	if (of_device_is_compatible(node, "xlnx,axi-vdma-mm2s-channel")) {
+		chan->direction = DMA_MEM_TO_DEV;
+		chan->id = 0;
+
+		chan->ctrl_offset = XILINX_VDMA_MM2S_CTRL_OFFSET;
+		chan->desc_offset = XILINX_VDMA_MM2S_DESC_OFFSET;
+
+		if (xdev->flush_on_fsync == XILINX_VDMA_FLUSH_BOTH ||
+		    xdev->flush_on_fsync == XILINX_VDMA_FLUSH_MM2S)
+			chan->flush_on_fsync = true;
+	} else if (of_device_is_compatible(node,
+					    "xlnx,axi-vdma-s2mm-channel")) {
+		chan->direction = DMA_DEV_TO_MEM;
+		chan->id = 1;
+
+		chan->ctrl_offset = XILINX_VDMA_S2MM_CTRL_OFFSET;
+		chan->desc_offset = XILINX_VDMA_S2MM_DESC_OFFSET;
+
+		if (xdev->flush_on_fsync == XILINX_VDMA_FLUSH_BOTH ||
+		    xdev->flush_on_fsync == XILINX_VDMA_FLUSH_S2MM)
+			chan->flush_on_fsync = true;
+	} else {
+		dev_err(xdev->dev, "Invalid channel compatible node\n");
+		return -EINVAL;
+	}
+
+	/* Request the interrupt */
+	chan->irq = irq_of_parse_and_map(node, 0);
+	err = devm_request_irq(xdev->dev, chan->irq, xilinx_vdma_irq_handler,
+			       IRQF_SHARED, "xilinx-vdma-controller", chan);
+	if (err) {
+		dev_err(xdev->dev, "unable to request IRQ\n");
+		return err;
+	}
+
+	/* Initialize the DMA channel and add it to the DMA engine channels
+	 * list.
+	 */
+	chan->common.device = &xdev->common;
+
+	list_add_tail(&chan->common.device_node, &xdev->common.channels);
+	xdev->chan[chan->id] = chan;
+
+	/* Reset the channel */
+	err = xilinx_vdma_chan_reset(chan);
+	if (err < 0) {
+		dev_err(xdev->dev, "Reset channel failed\n");
+		return err;
+	}
+
+	return 0;
+}
+
+/**
+ * struct of_dma_filter_xilinx_args - Channel filter args
+ * @dev: DMA device structure
+ * @chan_id: Channel id
+ */
+struct of_dma_filter_xilinx_args {
+	struct dma_device *dev;
+	u32 chan_id;
+};
+
+/**
+ * xilinx_vdma_dt_filter - VDMA channel filter function
+ * @chan: DMA channel pointer
+ * @param: Filter match value
+ *
+ * Return: true/false based on the result
+ */
+static bool xilinx_vdma_dt_filter(struct dma_chan *chan, void *param)
+{
+	struct of_dma_filter_xilinx_args *args = param;
+
+	return chan->device == args->dev && chan->chan_id == args->chan_id;
+}
+
+/**
+ * of_dma_xilinx_xlate - Translation function
+ * @dma_spec: Pointer to DMA specifier as found in the device tree
+ * @ofdma: Pointer to DMA controller data
+ *
+ * Return: DMA channel pointer on success and NULL on error
+ */
+static struct dma_chan *of_dma_xilinx_xlate(struct of_phandle_args *dma_spec,
+						struct of_dma *ofdma)
+{
+	struct of_dma_filter_xilinx_args args;
+	dma_cap_mask_t cap;
+
+	args.dev = ofdma->of_dma_data;
+	if (!args.dev)
+		return NULL;
+
+	if (dma_spec->args_count != 1)
+		return NULL;
+
+	dma_cap_zero(cap);
+	dma_cap_set(DMA_SLAVE, cap);
+
+	args.chan_id = dma_spec->args[0];
+
+	return dma_request_channel(cap, xilinx_vdma_dt_filter, &args);
+}
+
+/**
+ * xilinx_vdma_probe - Driver probe function
+ * @pdev: Pointer to the platform_device structure
+ *
+ * Return: '0' on success and failure value on error
+ */
+static int xilinx_vdma_probe(struct platform_device *pdev)
+{
+	struct device_node *node = pdev->dev.of_node;
+	struct xilinx_vdma_device *xdev;
+	struct device_node *child;
+	struct resource *io;
+	u32 num_frames;
+	int i, err;
+
+	dev_info(&pdev->dev, "Probing xilinx axi vdma engine\n");
+
+	/* Allocate and initialize the DMA engine structure */
+	xdev = devm_kzalloc(&pdev->dev, sizeof(*xdev), GFP_KERNEL);
+	if (!xdev)
+		return -ENOMEM;
+
+	xdev->dev = &pdev->dev;
+
+	/* Request and map I/O memory */
+	io = platform_get_resource(pdev, IORESOURCE_MEM, 0);
+	xdev->regs = devm_ioremap_resource(&pdev->dev, io);
+	if (IS_ERR(xdev->regs))
+		return PTR_ERR(xdev->regs);
+
+	/* Retrieve the DMA engine properties from the device tree */
+	xdev->has_sg = of_property_read_bool(node, "xlnx,include-sg");
+
+	err = of_property_read_u32(node, "xlnx,num-fstores", &num_frames);
+	if (err < 0) {
+		dev_err(xdev->dev, "missing xlnx,num-fstores property\n");
+		return err;
+	}
+
+	of_property_read_u32(node, "xlnx,flush-fsync", &xdev->flush_on_fsync);
+
+	/* Initialize the DMA engine */
+	xdev->common.dev = &pdev->dev;
+
+	INIT_LIST_HEAD(&xdev->common.channels);
+	dma_cap_set(DMA_SLAVE, xdev->common.cap_mask);
+	dma_cap_set(DMA_PRIVATE, xdev->common.cap_mask);
+
+	xdev->common.device_alloc_chan_resources =
+				xilinx_vdma_alloc_chan_resources;
+	xdev->common.device_free_chan_resources =
+				xilinx_vdma_free_chan_resources;
+	xdev->common.device_prep_slave_sg = xilinx_vdma_prep_slave_sg;
+	xdev->common.device_control = xilinx_vdma_device_control;
+	xdev->common.device_tx_status = xilinx_vdma_tx_status;
+	xdev->common.device_issue_pending = xilinx_vdma_issue_pending;
+
+	platform_set_drvdata(pdev, xdev);
+
+	/* Initialize the channels */
+	for_each_child_of_node(node, child) {
+		err = xilinx_vdma_chan_probe(xdev, child);
+		if (err < 0)
+			goto error;
+	}
+
+	for (i = 0; i < XILINX_VDMA_MAX_CHANS_PER_DEVICE; i++) {
+		if (xdev->chan[i])
+			xdev->chan[i]->num_frms = num_frames;
+	}
+
+	/* Register the DMA engine with the core */
+	dma_async_device_register(&xdev->common);
+
+	err = of_dma_controller_register(node, of_dma_xilinx_xlate,
+					 &xdev->common);
+	if (err < 0) {
+		dev_err(&pdev->dev, "Unable to register DMA to DT\n");
+		dma_async_device_unregister(&xdev->common);
+		goto error;
+	}
+
+	return 0;
+
+error:
+	for (i = 0; i < XILINX_VDMA_MAX_CHANS_PER_DEVICE; i++) {
+		if (xdev->chan[i])
+			xilinx_vdma_chan_remove(xdev->chan[i]);
+	}
+
+	return err;
+}
+
+/**
+ * xilinx_vdma_remove - Driver remove function
+ * @pdev: Pointer to the platform_device structure
+ *
+ * Return: Always '0'
+ */
+static int xilinx_vdma_remove(struct platform_device *pdev)
+{
+	struct xilinx_vdma_device *xdev;
+	int i;
+
+	of_dma_controller_free(pdev->dev.of_node);
+
+	xdev = platform_get_drvdata(pdev);
+	dma_async_device_unregister(&xdev->common);
+
+	for (i = 0; i < XILINX_VDMA_MAX_CHANS_PER_DEVICE; i++) {
+		if (xdev->chan[i])
+			xilinx_vdma_chan_remove(xdev->chan[i]);
+	}
+
+	return 0;
+}
+
+static const struct of_device_id xilinx_vdma_of_ids[] = {
+	{ .compatible = "xlnx,axi-vdma-1.00.a",},
+	{}
+};
+
+static struct platform_driver xilinx_vdma_driver = {
+	.driver = {
+		.name = "xilinx-vdma",
+		.owner = THIS_MODULE,
+		.of_match_table = xilinx_vdma_of_ids,
+	},
+	.probe = xilinx_vdma_probe,
+	.remove = xilinx_vdma_remove,
+};
+
+module_platform_driver(xilinx_vdma_driver);
+
+MODULE_AUTHOR("Xilinx, Inc.");
+MODULE_DESCRIPTION("Xilinx VDMA driver");
+MODULE_LICENSE("GPL v2");
diff --git a/include/linux/amba/xilinx_dma.h b/include/linux/amba/xilinx_dma.h
new file mode 100644
index 0000000..48a8c8b
--- /dev/null
+++ b/include/linux/amba/xilinx_dma.h
@@ -0,0 +1,50 @@ 
+/*
+ * Xilinx DMA Engine drivers support header file
+ *
+ * Copyright (C) 2010-2014 Xilinx, Inc. All rights reserved.
+ *
+ * This is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ */
+
+#ifndef __DMA_XILINX_DMA_H
+#define __DMA_XILINX_DMA_H
+
+#include <linux/dma-mapping.h>
+#include <linux/dmaengine.h>
+
+/**
+ * struct xilinx_vdma_config - VDMA Configuration structure
+ * @vsize: Vertical size
+ * @hsize: Horizontal size
+ * @stride: Stride
+ * @frm_dly: Frame delay
+ * @gen_lock: Whether in gen-lock mode
+ * @master: Master that it syncs to
+ * @frm_cnt_en: Enable frame count enable
+ * @park: Whether wants to park
+ * @park_frm: Frame to park on
+ * @coalesc: Interrupt coalescing threshold
+ * @delay: Delay counter
+ * @reset: Reset Channel
+ * @ext_fsync: External Frame Sync source
+ */
+struct xilinx_vdma_config {
+	int vsize;
+	int hsize;
+	int stride;
+	int frm_dly;
+	int gen_lock;
+	int master;
+	int frm_cnt_en;
+	int park;
+	int park_frm;
+	int coalesc;
+	int delay;
+	int reset;
+	int ext_fsync;
+};
+
+#endif