From patchwork Wed Oct 16 05:36:48 2013
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: archit taneja
X-Patchwork-Id: 3050481
From: Archit Taneja
To:
CC: , , , Archit Taneja
Subject: [PATCH v5 4/4] v4l: ti-vpe: Add de-interlacer support in VPE
Date: Wed, 16 Oct 2013 11:06:48 +0530
Message-ID: <1381901808-25119-5-git-send-email-archit@ti.com>
X-Mailer: git-send-email 1.8.1.2
In-Reply-To: <1381901808-25119-1-git-send-email-archit@ti.com>
References: <1378462346-10880-1-git-send-email-archit@ti.com>
 <1381901808-25119-1-git-send-email-archit@ti.com>
Sender: linux-media-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: linux-media@vger.kernel.org

Add support for the de-interlacer block in VPE.

For the de-interlacer to work, we need to enable 2 more sets of VPE input
ports which fetch data from the 'last' and 'last to last' fields of the
interlaced video. Apart from that, we need to enable the motion vector
output and input ports, and also allocate DMA buffers for them.

We need to make sure that the two most recent fields in the source queue
are available and in the 'READY' state. Once a mem2mem context gets access
to the VPE HW (in device_run), it extracts the addresses of the 3 buffers
and provides them to the data descriptors for the 3 sets of input ports
((LUMA1, CHROMA1), (LUMA2, CHROMA2), and (LUMA3, CHROMA3)), one set for
each of the 3 consecutive fields. The motion vector and output port
descriptors are configured and the list is submitted to VPDMA.

Once the transaction is done, the v4l2 buffer corresponding to the oldest
field (the 3rd one) is changed to the state 'DONE', and the buffers
corresponding to the 1st and 2nd fields become the 2nd and 3rd fields for
the next de-interlace operation. This way, for each de-interlace operation,
we have the 3 most recent fields.
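
As a rough user-space illustration of the three-field window described
above (this is not driver code: struct field_window, window_load() and
window_retire() are hypothetical names that stand in for ctx->src_vbs[]
and the buffer hand-off done in device_run() and vpe_irq(), and plain
integers stand in for vb2 buffers), the rotation could be sketched as:

```c
#include <assert.h>
#include <stddef.h>

#define MAX_SRC_FIELDS 3	/* mirrors VPE_MAX_SRC_BUFS in the patch */

/* Hypothetical stand-in for the per-context src_vbs[] array. */
struct field_window {
	int slot[MAX_SRC_FIELDS];	/* slot[0] = newest field f, slot[2] = f-2 */
	int primed;			/* non-zero once the two older slots are filled */
};

/*
 * Load the fields for one de-interlace operation from queue[*pos...].
 * The first run takes three fields (the oldest goes into slot[2]);
 * later runs take only one new field into slot[0].
 * Returns 0 on success, -1 if the queue holds too few fields.
 */
static int window_load(struct field_window *w, const int *queue,
		       size_t len, size_t *pos)
{
	size_t needed = w->primed ? 1 : MAX_SRC_FIELDS;

	if (len - *pos < needed)
		return -1;

	if (!w->primed) {
		w->slot[2] = queue[(*pos)++];
		w->slot[1] = queue[(*pos)++];
		w->primed = 1;
	}
	w->slot[0] = queue[(*pos)++];
	return 0;
}

/*
 * Finish one operation: hand back the oldest field (the one the driver
 * would mark DONE) and age the remaining two fields by one slot.
 */
static int window_retire(struct field_window *w)
{
	int done = w->slot[2];

	assert(w->primed);
	w->slot[2] = w->slot[1];
	w->slot[1] = w->slot[0];
	return done;
}
```

With a queue of fields 10, 11, 12, 13, the first operation in this sketch
sees (12, 11, 10) and retires 10; the second sees (13, 12, 11) and
retires 11. Only one new field is consumed per operation once the window
has been primed.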

After each transaction, we also swap the motion vector buffers. The new
input motion vector buffer contains the resultant motion information of all
the previous frames, and the new output motion vector buffer will be used
to hold the updated motion vector to capture the motion changes in the next
field. The motion vector buffers are allocated using the DMA allocation API.

The de-interlacer is removed from bypass mode; it requires some extra
default configurations, which are now added. The chrominance upsampler
coefficients are added for interlaced frames. Some VPDMA parameters like
frame start event and line mode are configured for the 2 extra sets of
input ports.

Acked-by: Hans Verkuil
Signed-off-by: Archit Taneja
---
 drivers/media/platform/ti-vpe/vpe.c | 392 ++++++++++++++++++++++++++++++++----
 1 file changed, 358 insertions(+), 34 deletions(-)

diff --git a/drivers/media/platform/ti-vpe/vpe.c b/drivers/media/platform/ti-vpe/vpe.c
index 3bd9ca6..4e58069 100644
--- a/drivers/media/platform/ti-vpe/vpe.c
+++ b/drivers/media/platform/ti-vpe/vpe.c
@@ -69,6 +69,8 @@
 #define VPE_CHROMA		1
 
 /* per m2m context info */
+#define VPE_MAX_SRC_BUFS	3	/* need 3 src fields to de-interlace */
+
 #define VPE_DEF_BUFS_PER_JOB	1	/* default one buffer per batch job */
 
 /*
@@ -111,6 +113,38 @@ static const struct vpe_us_coeffs us_coeffs[] = {
 		0x00C8, 0x0348, 0x0018, 0x3FD8, 0x3FB8, 0x0378, 0x00E8, 0x3FE8,
 		0x00C8, 0x0348, 0x0018, 0x3FD8, 0x3FB8, 0x0378, 0x00E8, 0x3FE8,
 	},
+	{
+		/* Coefficients for Top Field Interlaced input */
+		0x0051, 0x03D5, 0x3FE3, 0x3FF7, 0x3FB5, 0x02E9, 0x018F, 0x3FD3,
+		/* Coefficients for Bottom Field Interlaced input */
+		0x016B, 0x0247, 0x00B1, 0x3F9D, 0x3FCF, 0x03DB, 0x005D, 0x3FF9,
+	},
+};
+
+/*
+ * the following registers are for configuring some of the parameters of the
+ * motion and edge detection blocks inside DEI, these generally remain the same,
+ * these could be passed later via userspace if some one needs to tweak these.
+ */
+struct vpe_dei_regs {
+	unsigned long mdt_spacial_freq_thr_reg;		/* VPE_DEI_REG2 */
+	unsigned long edi_config_reg;			/* VPE_DEI_REG3 */
+	unsigned long edi_lut_reg0;			/* VPE_DEI_REG4 */
+	unsigned long edi_lut_reg1;			/* VPE_DEI_REG5 */
+	unsigned long edi_lut_reg2;			/* VPE_DEI_REG6 */
+	unsigned long edi_lut_reg3;			/* VPE_DEI_REG7 */
+};
+
+/*
+ * default expert DEI register values, unlikely to be modified.
+ */
+static const struct vpe_dei_regs dei_regs = {
+	0x020C0804u,
+	0x0118100Fu,
+	0x08040200u,
+	0x1010100Cu,
+	0x10101010u,
+	0x10101010u,
 };
 
 /*
@@ -118,6 +152,7 @@ static const struct vpe_us_coeffs us_coeffs[] = {
  */
 struct vpe_port_data {
 	enum vpdma_channel channel;	/* VPDMA channel */
+	u8	vb_index;		/* input frame f, f-1, f-2 index */
 	u8	vb_part;		/* plane index for co-panar formats */
 };
 
@@ -126,6 +161,12 @@ struct vpe_port_data {
  */
 #define VPE_PORT_LUMA1_IN	0
 #define VPE_PORT_CHROMA1_IN	1
+#define VPE_PORT_LUMA2_IN	2
+#define VPE_PORT_CHROMA2_IN	3
+#define VPE_PORT_LUMA3_IN	4
+#define VPE_PORT_CHROMA3_IN	5
+#define VPE_PORT_MV_IN		6
+#define VPE_PORT_MV_OUT		7
 #define VPE_PORT_LUMA_OUT	8
 #define VPE_PORT_CHROMA_OUT	9
 #define VPE_PORT_RGB_OUT	10
@@ -133,12 +174,40 @@ struct vpe_port_data {
 static const struct vpe_port_data port_data[11] = {
 	[VPE_PORT_LUMA1_IN] = {
 		.channel	= VPE_CHAN_LUMA1_IN,
+		.vb_index	= 0,
 		.vb_part	= VPE_LUMA,
 	},
 	[VPE_PORT_CHROMA1_IN] = {
 		.channel	= VPE_CHAN_CHROMA1_IN,
+		.vb_index	= 0,
+		.vb_part	= VPE_CHROMA,
+	},
+	[VPE_PORT_LUMA2_IN] = {
+		.channel	= VPE_CHAN_LUMA2_IN,
+		.vb_index	= 1,
+		.vb_part	= VPE_LUMA,
+	},
+	[VPE_PORT_CHROMA2_IN] = {
+		.channel	= VPE_CHAN_CHROMA2_IN,
+		.vb_index	= 1,
+		.vb_part	= VPE_CHROMA,
+	},
+	[VPE_PORT_LUMA3_IN] = {
+		.channel	= VPE_CHAN_LUMA3_IN,
+		.vb_index	= 2,
+		.vb_part	= VPE_LUMA,
+	},
+	[VPE_PORT_CHROMA3_IN] = {
+		.channel	= VPE_CHAN_CHROMA3_IN,
+		.vb_index	= 2,
 		.vb_part	= VPE_CHROMA,
 	},
+	[VPE_PORT_MV_IN] = {
+		.channel	= VPE_CHAN_MV_IN,
+	},
+	[VPE_PORT_MV_OUT] = {
+		.channel	= VPE_CHAN_MV_OUT,
+	},
 	[VPE_PORT_LUMA_OUT] = {
 		.channel	= VPE_CHAN_LUMA_OUT,
 		.vb_part	= VPE_LUMA,
@@ -210,6 +279,7 @@ struct vpe_q_data {
 	unsigned int		height;				/* frame height */
 	unsigned int		bytesperline[VPE_MAX_PLANES];	/* bytes per line in memory */
 	enum v4l2_colorspace	colorspace;
+	enum v4l2_field		field;				/* supported field value */
 	unsigned int		flags;
 	unsigned int		sizeimage[VPE_MAX_PLANES];	/* image size in memory */
 	struct v4l2_rect	c_rect;				/* crop/compose rectangle */
@@ -219,6 +289,7 @@ struct vpe_q_data {
 /* vpe_q_data flag bits */
 #define	Q_DATA_FRAME_1D		(1 << 0)
 #define	Q_DATA_MODE_TILED	(1 << 1)
+#define	Q_DATA_INTERLACED	(1 << 2)
 
 enum {
 	Q_DATA_SRC = 0,
@@ -270,6 +341,7 @@ struct vpe_ctx {
 	struct v4l2_m2m_ctx	*m2m_ctx;
 	struct v4l2_ctrl_handler hdl;
 
+	unsigned int		field;			/* current field */
 	unsigned int		sequence;		/* current frame/field seq */
 	unsigned int		aborting;		/* abort after next irq */
 
@@ -277,13 +349,19 @@ struct vpe_ctx {
 	unsigned int		bufs_completed;		/* bufs done in this batch */
 
 	struct vpe_q_data	q_data[2];		/* src & dst queue data */
-	struct vb2_buffer	*src_vb;
+	struct vb2_buffer	*src_vbs[VPE_MAX_SRC_BUFS];
 	struct vb2_buffer	*dst_vb;
 
+	dma_addr_t		mv_buf_dma[2];		/* dma addrs of motion vector in/out bufs */
+	void			*mv_buf[2];		/* virtual addrs of motion vector bufs */
+	size_t			mv_buf_size;		/* current motion vector buffer size */
 	struct vpdma_buf	mmr_adb;		/* shadow reg addr/data block */
 	struct vpdma_desc_list	desc_list;		/* DMA descriptor list */
 
+	bool			deinterlacing;		/* using de-interlacer */
 	bool			load_mmrs;		/* have new shadow reg values */
+
+	unsigned int		src_mv_buf_selector;
 };
 
 
@@ -359,8 +437,7 @@ struct vpe_mmr_adb {
 	struct vpdma_adb_hdr	us3_hdr;
 	u32			us3_regs[8];
 	struct vpdma_adb_hdr	dei_hdr;
-	u32			dei_regs[1];
-	u32			dei_pad[3];
+	u32			dei_regs[8];
 	struct vpdma_adb_hdr	sc_hdr;
 	u32			sc_regs[1];
 	u32			sc_pad[3];
@@ -386,6 +463,80 @@ static void init_adb_hdrs(struct vpe_ctx *ctx)
 };
 
 /*
+ * Allocate or re-allocate the motion vector DMA buffers
+ * There are two buffers, one for input and one for output.
+ * However, the roles are reversed after each field is processed.
+ * In other words, after each field is processed, the previous
+ * output (dst) MV buffer becomes the new input (src) MV buffer.
+ */
+static int realloc_mv_buffers(struct vpe_ctx *ctx, size_t size)
+{
+	struct device *dev = ctx->dev->v4l2_dev.dev;
+
+	if (ctx->mv_buf_size == size)
+		return 0;
+
+	if (ctx->mv_buf[0])
+		dma_free_coherent(dev, ctx->mv_buf_size, ctx->mv_buf[0],
+			ctx->mv_buf_dma[0]);
+
+	if (ctx->mv_buf[1])
+		dma_free_coherent(dev, ctx->mv_buf_size, ctx->mv_buf[1],
+			ctx->mv_buf_dma[1]);
+
+	if (size == 0)
+		return 0;
+
+	ctx->mv_buf[0] = dma_alloc_coherent(dev, size, &ctx->mv_buf_dma[0],
+				GFP_KERNEL);
+	if (!ctx->mv_buf[0]) {
+		vpe_err(ctx->dev, "failed to allocate motion vector buffer\n");
+		return -ENOMEM;
+	}
+
+	ctx->mv_buf[1] = dma_alloc_coherent(dev, size, &ctx->mv_buf_dma[1],
+				GFP_KERNEL);
+	if (!ctx->mv_buf[1]) {
+		vpe_err(ctx->dev, "failed to allocate motion vector buffer\n");
+		dma_free_coherent(dev, size, ctx->mv_buf[0],
+			ctx->mv_buf_dma[0]);
+
+		return -ENOMEM;
+	}
+
+	ctx->mv_buf_size = size;
+	ctx->src_mv_buf_selector = 0;
+
+	return 0;
+}
+
+static void free_mv_buffers(struct vpe_ctx *ctx)
+{
+	realloc_mv_buffers(ctx, 0);
+}
+
+/*
+ * While de-interlacing, we keep the two most recent input buffers
+ * around.  This function frees those two buffers when we have
+ * finished processing the current stream.
+ */
+static void free_vbs(struct vpe_ctx *ctx)
+{
+	struct vpe_dev *dev = ctx->dev;
+	unsigned long flags;
+
+	if (ctx->src_vbs[2] == NULL)
+		return;
+
+	spin_lock_irqsave(&dev->lock, flags);
+	if (ctx->src_vbs[2]) {
+		v4l2_m2m_buf_done(ctx->src_vbs[2], VB2_BUF_STATE_DONE);
+		v4l2_m2m_buf_done(ctx->src_vbs[1], VB2_BUF_STATE_DONE);
+	}
+	spin_unlock_irqrestore(&dev->lock, flags);
+}
+
+/*
  * Enable or disable the VPE clocks
  */
 static void vpe_set_clock_enable(struct vpe_dev *dev, bool on)
@@ -426,6 +577,7 @@ static void vpe_top_vpdma_reset(struct vpe_dev *dev)
 static void set_us_coefficients(struct vpe_ctx *ctx)
 {
 	struct vpe_mmr_adb *mmr_adb = ctx->mmr_adb.addr;
+	struct vpe_q_data *s_q_data = &ctx->q_data[Q_DATA_SRC];
 	u32 *us1_reg = &mmr_adb->us1_regs[0];
 	u32 *us2_reg = &mmr_adb->us2_regs[0];
 	u32 *us3_reg = &mmr_adb->us3_regs[0];
@@ -433,6 +585,9 @@ static void set_us_coefficients(struct vpe_ctx *ctx)
 
 	cp = &us_coeffs[0].anchor_fid0_c0;
 
+	if (s_q_data->flags & Q_DATA_INTERLACED)	/* interlaced */
+		cp += sizeof(us_coeffs[0]) / sizeof(*cp);
+
 	end_cp = cp + sizeof(us_coeffs[0]) / sizeof(*cp);
 
 	while (cp < end_cp) {
@@ -473,14 +628,28 @@ static void set_cfg_and_line_modes(struct vpe_ctx *ctx)
 
 	/* regs for now */
 	vpdma_set_line_mode(ctx->dev->vpdma, line_mode, VPE_CHAN_CHROMA1_IN);
+	vpdma_set_line_mode(ctx->dev->vpdma, line_mode, VPE_CHAN_CHROMA2_IN);
+	vpdma_set_line_mode(ctx->dev->vpdma, line_mode, VPE_CHAN_CHROMA3_IN);
 
 	/* frame start for input luma */
 	vpdma_set_frame_start_event(ctx->dev->vpdma, VPDMA_FSEVENT_CHANNEL_ACTIVE,
 		VPE_CHAN_LUMA1_IN);
+	vpdma_set_frame_start_event(ctx->dev->vpdma, VPDMA_FSEVENT_CHANNEL_ACTIVE,
+		VPE_CHAN_LUMA2_IN);
+	vpdma_set_frame_start_event(ctx->dev->vpdma, VPDMA_FSEVENT_CHANNEL_ACTIVE,
+		VPE_CHAN_LUMA3_IN);
 
 	/* frame start for input chroma */
 	vpdma_set_frame_start_event(ctx->dev->vpdma, VPDMA_FSEVENT_CHANNEL_ACTIVE,
 		VPE_CHAN_CHROMA1_IN);
+	vpdma_set_frame_start_event(ctx->dev->vpdma, VPDMA_FSEVENT_CHANNEL_ACTIVE,
+		VPE_CHAN_CHROMA2_IN);
+	vpdma_set_frame_start_event(ctx->dev->vpdma, VPDMA_FSEVENT_CHANNEL_ACTIVE,
+		VPE_CHAN_CHROMA3_IN);
+
+	/* frame start for MV in client */
+	vpdma_set_frame_start_event(ctx->dev->vpdma, VPDMA_FSEVENT_CHANNEL_ACTIVE,
+		VPE_CHAN_MV_IN);
 
 	ctx->load_mmrs = true;
 }
@@ -524,13 +693,14 @@ static void set_dst_registers(struct vpe_ctx *ctx)
 /*
  * Set the de-interlacer shadow register values
  */
-static void set_dei_regs_bypass(struct vpe_ctx *ctx)
+static void set_dei_regs(struct vpe_ctx *ctx)
 {
 	struct vpe_mmr_adb *mmr_adb = ctx->mmr_adb.addr;
 	struct vpe_q_data *s_q_data = &ctx->q_data[Q_DATA_SRC];
 	unsigned int src_h = s_q_data->c_rect.height;
 	unsigned int src_w = s_q_data->c_rect.width;
 	u32 *dei_mmr0 = &mmr_adb->dei_regs[0];
+	bool deinterlace = true;
 	u32 val = 0;
 
 	/*
@@ -539,7 +709,13 @@ static void set_dei_regs_bypass(struct vpe_ctx *ctx)
 	 * for both progressive and interlace content in interlace bypass mode.
 	 * It has been recommended not to use progressive bypass mode.
 	 */
-	val = VPE_DEI_INTERLACE_BYPASS;
+	if ((!ctx->deinterlacing && (s_q_data->flags & Q_DATA_INTERLACED)) ||
+			!(s_q_data->flags & Q_DATA_INTERLACED)) {
+		deinterlace = false;
+		val = VPE_DEI_INTERLACE_BYPASS;
+	}
+
+	src_h = deinterlace ? src_h * 2 : src_h;
 
 	val |= (src_h << VPE_DEI_HEIGHT_SHIFT) |
 		(src_w << VPE_DEI_WIDTH_SHIFT) |
@@ -550,6 +726,22 @@ static void set_dei_regs_bypass(struct vpe_ctx *ctx)
 	ctx->load_mmrs = true;
 }
 
+static void set_dei_shadow_registers(struct vpe_ctx *ctx)
+{
+	struct vpe_mmr_adb *mmr_adb = ctx->mmr_adb.addr;
+	u32 *dei_mmr = &mmr_adb->dei_regs[0];
+	const struct vpe_dei_regs *cur = &dei_regs;
+
+	dei_mmr[2]  = cur->mdt_spacial_freq_thr_reg;
+	dei_mmr[3]  = cur->edi_config_reg;
+	dei_mmr[4]  = cur->edi_lut_reg0;
+	dei_mmr[5]  = cur->edi_lut_reg1;
+	dei_mmr[6]  = cur->edi_lut_reg2;
+	dei_mmr[7]  = cur->edi_lut_reg3;
+
+	ctx->load_mmrs = true;
+}
+
 static void set_csc_coeff_bypass(struct vpe_ctx *ctx)
 {
 	struct vpe_mmr_adb *mmr_adb = ctx->mmr_adb.addr;
@@ -578,10 +770,35 @@ static void set_sc_regs_bypass(struct vpe_ctx *ctx)
  */
 static int set_srcdst_params(struct vpe_ctx *ctx)
 {
+	struct vpe_q_data *s_q_data =  &ctx->q_data[Q_DATA_SRC];
+	struct vpe_q_data *d_q_data =  &ctx->q_data[Q_DATA_DST];
+	size_t mv_buf_size;
+	int ret;
+
 	ctx->sequence = 0;
+	ctx->field = V4L2_FIELD_TOP;
+
+	if ((s_q_data->flags & Q_DATA_INTERLACED) &&
+			!(d_q_data->flags & Q_DATA_INTERLACED)) {
+		const struct vpdma_data_format *mv =
+			&vpdma_misc_fmts[VPDMA_DATA_FMT_MV];
+
+		ctx->deinterlacing = 1;
+		mv_buf_size =
+			(s_q_data->width * s_q_data->height * mv->depth) >> 3;
+	} else {
+		ctx->deinterlacing = 0;
+		mv_buf_size = 0;
+	}
+
+	free_vbs(ctx);
+
+	ret = realloc_mv_buffers(ctx, mv_buf_size);
+	if (ret)
+		return ret;
 
 	set_cfg_and_line_modes(ctx);
-	set_dei_regs_bypass(ctx);
+	set_dei_regs(ctx);
 	set_csc_coeff_bypass(ctx);
 	set_sc_regs_bypass(ctx);
 
@@ -608,6 +825,9 @@ static int job_ready(void *priv)
 	struct vpe_ctx *ctx = priv;
 	int needed = ctx->bufs_per_job;
 
+	if (ctx->deinterlacing && ctx->src_vbs[2] == NULL)
+		needed += 2;	/* need additional two most recent fields */
+
 	if (v4l2_m2m_num_src_bufs_ready(ctx->m2m_ctx) < needed)
 		return 0;
 
@@ -735,17 +955,25 @@ static void add_out_dtd(struct vpe_ctx *ctx, int port)
 	struct v4l2_rect *c_rect = &q_data->c_rect;
 	struct vpe_fmt *fmt = q_data->fmt;
 	const struct vpdma_data_format *vpdma_fmt;
-	int plane = fmt->coplanar ? p_data->vb_part : 0;
+	int mv_buf_selector = !ctx->src_mv_buf_selector;
 	dma_addr_t dma_addr;
 	u32 flags = 0;
 
-	vpdma_fmt = fmt->vpdma_fmt[plane];
-	dma_addr = vb2_dma_contig_plane_dma_addr(vb, plane);
-	if (!dma_addr) {
-		vpe_err(ctx->dev,
-			"acquiring output buffer(%d) dma_addr failed\n",
-			port);
-		return;
+	if (port == VPE_PORT_MV_OUT) {
+		vpdma_fmt = &vpdma_misc_fmts[VPDMA_DATA_FMT_MV];
+		dma_addr = ctx->mv_buf_dma[mv_buf_selector];
+	} else {
+		/* to incorporate interleaved formats */
+		int plane = fmt->coplanar ? p_data->vb_part : 0;
+
+		vpdma_fmt = fmt->vpdma_fmt[plane];
+		dma_addr = vb2_dma_contig_plane_dma_addr(vb, plane);
+		if (!dma_addr) {
+			vpe_err(ctx->dev,
+				"acquiring output buffer(%d) dma_addr failed\n",
+				port);
+			return;
+		}
 	}
 
 	if (q_data->flags & Q_DATA_FRAME_1D)
@@ -761,23 +989,31 @@ static void add_in_dtd(struct vpe_ctx *ctx, int port)
 {
 	struct vpe_q_data *q_data = &ctx->q_data[Q_DATA_SRC];
 	const struct vpe_port_data *p_data = &port_data[port];
-	struct vb2_buffer *vb = ctx->src_vb;
+	struct vb2_buffer *vb = ctx->src_vbs[p_data->vb_index];
 	struct v4l2_rect *c_rect = &q_data->c_rect;
 	struct vpe_fmt *fmt = q_data->fmt;
 	const struct vpdma_data_format *vpdma_fmt;
-	int plane = fmt->coplanar ? p_data->vb_part : 0;
-	int field = 0;
+	int mv_buf_selector = ctx->src_mv_buf_selector;
+	int field = vb->v4l2_buf.field == V4L2_FIELD_BOTTOM;
 	dma_addr_t dma_addr;
 	u32 flags = 0;
 
-	vpdma_fmt = fmt->vpdma_fmt[plane];
+	if (port == VPE_PORT_MV_IN) {
+		vpdma_fmt = &vpdma_misc_fmts[VPDMA_DATA_FMT_MV];
+		dma_addr = ctx->mv_buf_dma[mv_buf_selector];
+	} else {
+		/* to incorporate interleaved formats */
+		int plane = fmt->coplanar ? p_data->vb_part : 0;
 
-	dma_addr = vb2_dma_contig_plane_dma_addr(vb, plane);
-	if (!dma_addr) {
-		vpe_err(ctx->dev,
-			"acquiring input buffer(%d) dma_addr failed\n",
-			port);
-		return;
+		vpdma_fmt = fmt->vpdma_fmt[plane];
+
+		dma_addr = vb2_dma_contig_plane_dma_addr(vb, plane);
+		if (!dma_addr) {
+			vpe_err(ctx->dev,
+				"acquiring input buffer(%d) dma_addr failed\n",
+				port);
+			return;
+		}
 	}
 
 	if (q_data->flags & Q_DATA_FRAME_1D)
@@ -795,7 +1031,8 @@ static void add_in_dtd(struct vpe_ctx *ctx, int port)
 static void enable_irqs(struct vpe_ctx *ctx)
 {
 	write_reg(ctx->dev, VPE_INT0_ENABLE0_SET, VPE_INT0_LIST0_COMPLETE);
-	write_reg(ctx->dev, VPE_INT0_ENABLE1_SET, VPE_DS1_UV_ERROR_INT);
+	write_reg(ctx->dev, VPE_INT0_ENABLE1_SET, VPE_DEI_ERROR_INT |
+				VPE_DS1_UV_ERROR_INT);
 
 	vpdma_enable_list_complete_irq(ctx->dev->vpdma, 0, true);
 }
@@ -818,8 +1055,15 @@ static void device_run(void *priv)
 	struct vpe_ctx *ctx = priv;
 	struct vpe_q_data *d_q_data = &ctx->q_data[Q_DATA_DST];
 
-	ctx->src_vb = v4l2_m2m_src_buf_remove(ctx->m2m_ctx);
-	WARN_ON(ctx->src_vb == NULL);
+	if (ctx->deinterlacing && ctx->src_vbs[2] == NULL) {
+		ctx->src_vbs[2] = v4l2_m2m_src_buf_remove(ctx->m2m_ctx);
+		WARN_ON(ctx->src_vbs[2] == NULL);
+		ctx->src_vbs[1] = v4l2_m2m_src_buf_remove(ctx->m2m_ctx);
+		WARN_ON(ctx->src_vbs[1] == NULL);
+	}
+
+	ctx->src_vbs[0] = v4l2_m2m_src_buf_remove(ctx->m2m_ctx);
+	WARN_ON(ctx->src_vbs[0] == NULL);
 	ctx->dst_vb = v4l2_m2m_dst_buf_remove(ctx->m2m_ctx);
 	WARN_ON(ctx->dst_vb == NULL);
 
@@ -831,28 +1075,67 @@ static void device_run(void *priv)
 		ctx->load_mmrs = false;
 	}
 
+	/* output data descriptors */
+	if (ctx->deinterlacing)
+		add_out_dtd(ctx, VPE_PORT_MV_OUT);
+
 	add_out_dtd(ctx, VPE_PORT_LUMA_OUT);
 	if (d_q_data->fmt->coplanar)
 		add_out_dtd(ctx, VPE_PORT_CHROMA_OUT);
 
+	/* input data descriptors */
+	if (ctx->deinterlacing) {
+		add_in_dtd(ctx, VPE_PORT_LUMA3_IN);
+		add_in_dtd(ctx, VPE_PORT_CHROMA3_IN);
+
+		add_in_dtd(ctx, VPE_PORT_LUMA2_IN);
+		add_in_dtd(ctx, VPE_PORT_CHROMA2_IN);
+	}
+
 	add_in_dtd(ctx, VPE_PORT_LUMA1_IN);
 	add_in_dtd(ctx, VPE_PORT_CHROMA1_IN);
 
+	if (ctx->deinterlacing)
+		add_in_dtd(ctx, VPE_PORT_MV_IN);
+
 	/* sync on channel control descriptors for input ports */
 	vpdma_add_sync_on_channel_ctd(&ctx->desc_list, VPE_CHAN_LUMA1_IN);
 	vpdma_add_sync_on_channel_ctd(&ctx->desc_list, VPE_CHAN_CHROMA1_IN);
 
+	if (ctx->deinterlacing) {
+		vpdma_add_sync_on_channel_ctd(&ctx->desc_list,
+			VPE_CHAN_LUMA2_IN);
+		vpdma_add_sync_on_channel_ctd(&ctx->desc_list,
+			VPE_CHAN_CHROMA2_IN);
+
+		vpdma_add_sync_on_channel_ctd(&ctx->desc_list,
+			VPE_CHAN_LUMA3_IN);
+		vpdma_add_sync_on_channel_ctd(&ctx->desc_list,
+			VPE_CHAN_CHROMA3_IN);
+
+		vpdma_add_sync_on_channel_ctd(&ctx->desc_list, VPE_CHAN_MV_IN);
+	}
+
 	/* sync on channel control descriptors for output ports */
 	vpdma_add_sync_on_channel_ctd(&ctx->desc_list, VPE_CHAN_LUMA_OUT);
 	if (d_q_data->fmt->coplanar)
 		vpdma_add_sync_on_channel_ctd(&ctx->desc_list, VPE_CHAN_CHROMA_OUT);
 
+	if (ctx->deinterlacing)
+		vpdma_add_sync_on_channel_ctd(&ctx->desc_list, VPE_CHAN_MV_OUT);
+
 	enable_irqs(ctx);
 
 	vpdma_map_desc_buf(ctx->dev->vpdma, &ctx->desc_list.buf);
 	vpdma_submit_descs(ctx->dev->vpdma, &ctx->desc_list);
 }
 
+static void dei_error(struct vpe_ctx *ctx)
+{
+	dev_warn(ctx->dev->v4l2_dev.dev,
+		"received DEI error interrupt\n");
+}
+
 static void ds1_uv_error(struct vpe_ctx *ctx)
 {
 	dev_warn(ctx->dev->v4l2_dev.dev,
@@ -863,6 +1146,7 @@ static irqreturn_t vpe_irq(int irq_vpe, void *data)
 {
 	struct vpe_dev *dev = (struct vpe_dev *)data;
 	struct vpe_ctx *ctx;
+	struct vpe_q_data *d_q_data;
 	struct vb2_buffer *s_vb, *d_vb;
 	struct v4l2_buffer *s_buf, *d_buf;
 	unsigned long flags;
@@ -886,9 +1170,15 @@ static irqreturn_t vpe_irq(int irq_vpe, void *data)
 		goto handled;
 	}
 
-	if (irqst1 & VPE_DS1_UV_ERROR_INT) {
-		irqst1 &= ~VPE_DS1_UV_ERROR_INT;
-		ds1_uv_error(ctx);
+	if (irqst1) {
+		if (irqst1 & VPE_DEI_ERROR_INT) {
+			irqst1 &= ~VPE_DEI_ERROR_INT;
+			dei_error(ctx);
+		}
+		if (irqst1 & VPE_DS1_UV_ERROR_INT) {
+			irqst1 &= ~VPE_DS1_UV_ERROR_INT;
+			ds1_uv_error(ctx);
+		}
 	}
 
 	if (irqst0) {
@@ -911,10 +1201,13 @@ static irqreturn_t vpe_irq(int irq_vpe, void *data)
 
 	vpdma_reset_desc_list(&ctx->desc_list);
 
+	 /* the previous dst mv buffer becomes the next src mv buffer */
+	ctx->src_mv_buf_selector = !ctx->src_mv_buf_selector;
+
 	if (ctx->aborting)
 		goto finished;
 
-	s_vb = ctx->src_vb;
+	s_vb = ctx->src_vbs[0];
 	d_vb = ctx->dst_vb;
 	s_buf = &s_vb->v4l2_buf;
 	d_buf = &d_vb->v4l2_buf;
@@ -924,16 +1217,35 @@ static irqreturn_t vpe_irq(int irq_vpe, void *data)
 		d_buf->flags |= V4L2_BUF_FLAG_TIMECODE;
 		d_buf->timecode = s_buf->timecode;
 	}
-
 	d_buf->sequence = ctx->sequence;
+	d_buf->field = ctx->field;
+
+	d_q_data = &ctx->q_data[Q_DATA_DST];
+	if (d_q_data->flags & Q_DATA_INTERLACED) {
+		if (ctx->field == V4L2_FIELD_BOTTOM) {
+			ctx->sequence++;
+			ctx->field = V4L2_FIELD_TOP;
+		} else {
+			WARN_ON(ctx->field != V4L2_FIELD_TOP);
+			ctx->field = V4L2_FIELD_BOTTOM;
+		}
+	} else {
+		ctx->sequence++;
+	}
 
-	ctx->sequence++;
+	if (ctx->deinterlacing)
+		s_vb = ctx->src_vbs[2];
 
 	spin_lock_irqsave(&dev->lock, flags);
 	v4l2_m2m_buf_done(s_vb, VB2_BUF_STATE_DONE);
 	v4l2_m2m_buf_done(d_vb, VB2_BUF_STATE_DONE);
 	spin_unlock_irqrestore(&dev->lock, flags);
 
+	if (ctx->deinterlacing) {
+		ctx->src_vbs[2] = ctx->src_vbs[1];
+		ctx->src_vbs[1] = ctx->src_vbs[0];
+	}
+
 	ctx->bufs_completed++;
 	if (ctx->bufs_completed < ctx->bufs_per_job) {
 		device_run(ctx);
@@ -1012,6 +1324,7 @@ static int vpe_g_fmt(struct file *file, void *priv, struct v4l2_format *f)
 	pix->width = q_data->width;
 	pix->height = q_data->height;
 	pix->pixelformat = q_data->fmt->fourcc;
+	pix->field = q_data->field;
 
 	if (V4L2_TYPE_IS_OUTPUT(f->type)) {
 		pix->colorspace = q_data->colorspace;
@@ -1047,7 +1360,8 @@ static int __vpe_try_fmt(struct vpe_ctx *ctx, struct v4l2_format *f,
 		return -EINVAL;
 	}
 
-	pix->field = V4L2_FIELD_NONE;
+	if (pix->field != V4L2_FIELD_NONE && pix->field != V4L2_FIELD_ALTERNATE)
+		pix->field = V4L2_FIELD_NONE;
 
 	v4l_bound_align_image(&pix->width, MIN_W, MAX_W, W_ALIGN,
 			      &pix->height, MIN_H, MAX_H, H_ALIGN,
@@ -1124,6 +1438,7 @@ static int __vpe_s_fmt(struct vpe_ctx *ctx, struct v4l2_format *f)
 	q_data->width		= pix->width;
 	q_data->height		= pix->height;
 	q_data->colorspace	= pix->colorspace;
+	q_data->field		= pix->field;
 
 	for (i = 0; i < pix->num_planes; i++) {
 		plane_fmt = &pix->plane_fmt[i];
@@ -1137,6 +1452,11 @@ static int __vpe_s_fmt(struct vpe_ctx *ctx, struct v4l2_format *f)
 	q_data->c_rect.width	= q_data->width;
 	q_data->c_rect.height	= q_data->height;
 
+	if (q_data->field == V4L2_FIELD_ALTERNATE)
+		q_data->flags |= Q_DATA_INTERLACED;
+	else
+		q_data->flags &= ~Q_DATA_INTERLACED;
+
 	vpe_dbg(ctx->dev, "Setting format for type %d, wxh: %dx%d, fmt: %d bpl_y %d",
 		f->type, q_data->width, q_data->height, q_data->fmt->fourcc,
 		q_data->bytesperline[VPE_LUMA]);
@@ -1451,6 +1771,7 @@ static int vpe_open(struct file *file)
 	s_q_data->sizeimage[VPE_LUMA] = (s_q_data->width * s_q_data->height *
 			s_q_data->fmt->vpdma_fmt[VPE_LUMA]->depth) >> 3;
 	s_q_data->colorspace = V4L2_COLORSPACE_SMPTE240M;
+	s_q_data->field = V4L2_FIELD_NONE;
 	s_q_data->c_rect.left = 0;
 	s_q_data->c_rect.top = 0;
 	s_q_data->c_rect.width = s_q_data->width;
@@ -1459,6 +1780,7 @@ static int vpe_open(struct file *file)
 
 	ctx->q_data[Q_DATA_DST] = *s_q_data;
 
+	set_dei_shadow_registers(ctx);
 	set_src_registers(ctx);
 	set_dst_registers(ctx);
 	ret = set_srcdst_params(ctx);
@@ -1513,6 +1835,8 @@ static int vpe_release(struct file *file)
 	vpe_dbg(dev, "releasing instance %p\n", ctx);
 
 	mutex_lock(&dev->dev_mutex);
+	free_vbs(ctx);
+	free_mv_buffers(ctx);
 	vpdma_free_desc_list(&ctx->desc_list);
 	vpdma_free_desc_buf(&ctx->mmr_adb);
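
The motion vector handling in the diff above boils down to a size
computation and a two-buffer ping-pong. A minimal user-space sketch of
both follows. It is not driver code: struct mv_state and the mv_*()
helpers are hypothetical names that stand in for ctx->src_mv_buf_selector
and the selection done in add_in_dtd()/add_out_dtd(), and the depth
passed to mv_buf_bytes() is whatever the MV data format reports (the
value used in the note below is only an assumed example).

```c
#include <stddef.h>

/* Hypothetical stand-in for the MV selector kept in struct vpe_ctx. */
struct mv_state {
	unsigned int src_sel;	/* index (0 or 1) of the buffer used as MV input */
};

/*
 * Bytes needed for one MV buffer: width * height * depth bits,
 * shifted right by 3 to convert bits to bytes (as in set_srcdst_params()).
 */
static size_t mv_buf_bytes(unsigned int width, unsigned int height,
			   unsigned int depth_bits)
{
	return ((size_t)width * height * depth_bits) >> 3;
}

/* The MV input port reads from the selected buffer... */
static unsigned int mv_in_index(const struct mv_state *s)
{
	return s->src_sel;
}

/* ...and the MV output port always writes to the other one. */
static unsigned int mv_out_index(const struct mv_state *s)
{
	return !s->src_sel;
}

/* After each completed field, the previous output becomes the next input. */
static void mv_swap(struct mv_state *s)
{
	s->src_sel = !s->src_sel;
}
```

For example, with an assumed depth of 4 bits per pixel, a 720x240 field
would need 86400 bytes per MV buffer in this sketch, and the input and
output indices alternate between (0, 1) and (1, 0) on every mv_swap().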