From patchwork Mon Sep 27 15:19:49 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrzej Pietrasiewicz X-Patchwork-Id: 12520169 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id DDFF7C433FE for ; Mon, 27 Sep 2021 15:20:11 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id BD4C061058 for ; Mon, 27 Sep 2021 15:20:11 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S235145AbhI0PVs (ORCPT ); Mon, 27 Sep 2021 11:21:48 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:41136 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S234972AbhI0PVo (ORCPT ); Mon, 27 Sep 2021 11:21:44 -0400 Received: from bhuna.collabora.co.uk (bhuna.collabora.co.uk [IPv6:2a00:1098:0:82:1000:25:2eeb:e3e3]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id E544DC061714; Mon, 27 Sep 2021 08:20:06 -0700 (PDT) Received: from [127.0.0.1] (localhost [127.0.0.1]) (Authenticated sender: andrzej.p) with ESMTPSA id A03EC1F42DB8 From: Andrzej Pietrasiewicz To: linux-media@vger.kernel.org, linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org, linux-rockchip@lists.infradead.org, linux-staging@lists.linux.dev Cc: Andrzej Pietrasiewicz , Benjamin Gaignard , Boris Brezillon , Ezequiel Garcia , Fabio Estevam , Greg Kroah-Hartman , Hans Verkuil , Heiko Stuebner , Jernej Skrabec , Mauro Carvalho Chehab , Nicolas Dufresne , NXP Linux Team , Pengutronix Kernel Team , Philipp Zabel , Sascha Hauer , Shawn Guo , kernel@collabora.com, Ezequiel Garcia Subject: [PATCH v6 01/10] hantro: postproc: Fix motion vector space size Date: Mon, 27 Sep 2021 17:19:49 +0200 Message-Id: 
<20210927151958.24426-2-andrzej.p@collabora.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20210927151958.24426-1-andrzej.p@collabora.com> References: <20210927151958.24426-1-andrzej.p@collabora.com> Precedence: bulk List-ID: X-Mailing-List: linux-media@vger.kernel.org From: Ezequiel Garcia When the post-processor hardware block is enabled, the driver allocates an internal queue of buffers for the decoder engine, and uses the vb2 queue for the post-processor engine. For instance, on a G1 core, the decoder engine produces NV12 buffers and the post-processor engine can produce YUY2 buffers. The decoder engine expects motion vectors to be appended to the NV12 buffers, but this is only required for CODECs that need motion vectors, such as H.264. Fix the post-processor logic accordingly. Signed-off-by: Ezequiel Garcia Signed-off-by: Andrzej Pietrasiewicz --- drivers/staging/media/hantro/hantro_postproc.c | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/drivers/staging/media/hantro/hantro_postproc.c b/drivers/staging/media/hantro/hantro_postproc.c index ed8916c950a4..07842152003f 100644 --- a/drivers/staging/media/hantro/hantro_postproc.c +++ b/drivers/staging/media/hantro/hantro_postproc.c @@ -132,9 +132,10 @@ int hantro_postproc_alloc(struct hantro_ctx *ctx) unsigned int num_buffers = cap_queue->num_buffers; unsigned int i, buf_size; - buf_size = ctx->dst_fmt.plane_fmt[0].sizeimage + - hantro_h264_mv_size(ctx->dst_fmt.width, - ctx->dst_fmt.height); + buf_size = ctx->dst_fmt.plane_fmt[0].sizeimage; + if (ctx->vpu_src_fmt->fourcc == V4L2_PIX_FMT_H264_SLICE) + buf_size += hantro_h264_mv_size(ctx->dst_fmt.width, + ctx->dst_fmt.height); for (i = 0; i < num_buffers; ++i) { struct hantro_aux_buf *priv = &ctx->postproc.dec_q[i]; From patchwork Mon Sep 27 15:19:50 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrzej Pietrasiewicz X-Patchwork-Id: 12520171 Return-Path: 
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id E833DC4332F for ; Mon, 27 Sep 2021 15:20:12 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id D345C61157 for ; Mon, 27 Sep 2021 15:20:12 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S235139AbhI0PVt (ORCPT ); Mon, 27 Sep 2021 11:21:49 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:41142 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S235128AbhI0PVp (ORCPT ); Mon, 27 Sep 2021 11:21:45 -0400 Received: from bhuna.collabora.co.uk (bhuna.collabora.co.uk [IPv6:2a00:1098:0:82:1000:25:2eeb:e3e3]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 8578CC061740; Mon, 27 Sep 2021 08:20:07 -0700 (PDT) Received: from [127.0.0.1] (localhost [127.0.0.1]) (Authenticated sender: andrzej.p) with ESMTPSA id 912601F42DBC From: Andrzej Pietrasiewicz To: linux-media@vger.kernel.org, linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org, linux-rockchip@lists.infradead.org, linux-staging@lists.linux.dev Cc: Andrzej Pietrasiewicz , Benjamin Gaignard , Boris Brezillon , Ezequiel Garcia , Fabio Estevam , Greg Kroah-Hartman , Hans Verkuil , Heiko Stuebner , Jernej Skrabec , Mauro Carvalho Chehab , Nicolas Dufresne , NXP Linux Team , Pengutronix Kernel Team , Philipp Zabel , Sascha Hauer , Shawn Guo , kernel@collabora.com, Ezequiel Garcia Subject: [PATCH v6 02/10] hantro: postproc: Introduce struct hantro_postproc_ops Date: Mon, 27 Sep 2021 17:19:50 +0200 Message-Id: <20210927151958.24426-3-andrzej.p@collabora.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20210927151958.24426-1-andrzej.p@collabora.com> References: <20210927151958.24426-1-andrzej.p@collabora.com> Precedence: bulk List-ID: 
X-Mailing-List: linux-media@vger.kernel.org From: Ezequiel Garcia Turns out the post-processor block on the G2 core is substantially different from the one on the G1 core. Introduce hantro_postproc_ops with .enable and .disable methods, which will make it possible to support the G2 post-processor cleanly. Signed-off-by: Ezequiel Garcia Signed-off-by: Andrzej Pietrasiewicz Reviewed-by: Benjamin Gaignard --- drivers/staging/media/hantro/hantro.h | 5 +-- drivers/staging/media/hantro/hantro_hw.h | 13 +++++++- .../staging/media/hantro/hantro_postproc.c | 33 ++++++++++++++----- drivers/staging/media/hantro/imx8m_vpu_hw.c | 2 +- .../staging/media/hantro/rockchip_vpu_hw.c | 6 ++-- .../staging/media/hantro/sama5d4_vdec_hw.c | 2 +- 6 files changed, 44 insertions(+), 17 deletions(-) diff --git a/drivers/staging/media/hantro/hantro.h b/drivers/staging/media/hantro/hantro.h index c2e2dca38628..c2e01959dc00 100644 --- a/drivers/staging/media/hantro/hantro.h +++ b/drivers/staging/media/hantro/hantro.h @@ -28,6 +28,7 @@ struct hantro_ctx; struct hantro_codec_ops; +struct hantro_postproc_ops; #define HANTRO_JPEG_ENCODER BIT(0) #define HANTRO_ENCODERS 0x0000ffff @@ -59,6 +60,7 @@ struct hantro_irq { * @num_dec_fmts: Number of decoder formats. * @postproc_fmts: Post-processor formats. * @num_postproc_fmts: Number of post-processor formats. + * @postproc_ops: Post-processor ops. * @codec: Supported codecs * @codec_ops: Codec ops. * @init: Initialize hardware, optional. 
@@ -69,7 +71,6 @@ struct hantro_irq { * @num_clocks: number of clocks in the array * @reg_names: array of register range names * @num_regs: number of register range names in the array - * @postproc_regs: &struct hantro_postproc_regs pointer */ struct hantro_variant { unsigned int enc_offset; @@ -80,6 +81,7 @@ struct hantro_variant { unsigned int num_dec_fmts; const struct hantro_fmt *postproc_fmts; unsigned int num_postproc_fmts; + const struct hantro_postproc_ops *postproc_ops; unsigned int codec; const struct hantro_codec_ops *codec_ops; int (*init)(struct hantro_dev *vpu); @@ -90,7 +92,6 @@ struct hantro_variant { int num_clocks; const char * const *reg_names; int num_regs; - const struct hantro_postproc_regs *postproc_regs; }; /** diff --git a/drivers/staging/media/hantro/hantro_hw.h b/drivers/staging/media/hantro/hantro_hw.h index df7b5e3a57b9..4323e63dfbfc 100644 --- a/drivers/staging/media/hantro/hantro_hw.h +++ b/drivers/staging/media/hantro/hantro_hw.h @@ -170,6 +170,17 @@ struct hantro_postproc_ctx { struct hantro_aux_buf dec_q[VB2_MAX_FRAME]; }; +/** + * struct hantro_postproc_ops - post-processor operations + * + * @enable: Enable the post-processor block. Optional. + * @disable: Disable the post-processor block. Optional. 
+ */ +struct hantro_postproc_ops { + void (*enable)(struct hantro_ctx *ctx); + void (*disable)(struct hantro_ctx *ctx); +}; + /** * struct hantro_codec_ops - codec mode specific operations * @@ -217,7 +228,7 @@ extern const struct hantro_variant rk3328_vpu_variant; extern const struct hantro_variant rk3399_vpu_variant; extern const struct hantro_variant sama5d4_vdec_variant; -extern const struct hantro_postproc_regs hantro_g1_postproc_regs; +extern const struct hantro_postproc_ops hantro_g1_postproc_ops; extern const u32 hantro_vp8_dec_mc_filter[8][6]; diff --git a/drivers/staging/media/hantro/hantro_postproc.c b/drivers/staging/media/hantro/hantro_postproc.c index 07842152003f..882fb8bc5ddd 100644 --- a/drivers/staging/media/hantro/hantro_postproc.c +++ b/drivers/staging/media/hantro/hantro_postproc.c @@ -15,14 +15,14 @@ #define HANTRO_PP_REG_WRITE(vpu, reg_name, val) \ { \ hantro_reg_write(vpu, \ - &(vpu)->variant->postproc_regs->reg_name, \ + &hantro_g1_postproc_regs.reg_name, \ val); \ } #define HANTRO_PP_REG_WRITE_S(vpu, reg_name, val) \ { \ hantro_reg_write_s(vpu, \ - &(vpu)->variant->postproc_regs->reg_name, \ + &hantro_g1_postproc_regs.reg_name, \ val); \ } @@ -64,16 +64,13 @@ bool hantro_needs_postproc(const struct hantro_ctx *ctx, return fmt->fourcc != V4L2_PIX_FMT_NV12; } -void hantro_postproc_enable(struct hantro_ctx *ctx) +static void hantro_postproc_g1_enable(struct hantro_ctx *ctx) { struct hantro_dev *vpu = ctx->dev; struct vb2_v4l2_buffer *dst_buf; u32 src_pp_fmt, dst_pp_fmt; dma_addr_t dst_dma; - if (!vpu->variant->postproc_regs) - return; - /* Turn on pipeline mode. Must be done first. 
*/ HANTRO_PP_REG_WRITE_S(vpu, pipeline_en, 0x1); @@ -154,12 +151,30 @@ int hantro_postproc_alloc(struct hantro_ctx *ctx) return 0; } +static void hantro_postproc_g1_disable(struct hantro_ctx *ctx) +{ + struct hantro_dev *vpu = ctx->dev; + + HANTRO_PP_REG_WRITE_S(vpu, pipeline_en, 0x0); +} + void hantro_postproc_disable(struct hantro_ctx *ctx) { struct hantro_dev *vpu = ctx->dev; - if (!vpu->variant->postproc_regs) - return; + if (vpu->variant->postproc_ops && vpu->variant->postproc_ops->disable) + vpu->variant->postproc_ops->disable(ctx); +} - HANTRO_PP_REG_WRITE_S(vpu, pipeline_en, 0x0); +void hantro_postproc_enable(struct hantro_ctx *ctx) +{ + struct hantro_dev *vpu = ctx->dev; + + if (vpu->variant->postproc_ops && vpu->variant->postproc_ops->enable) + vpu->variant->postproc_ops->enable(ctx); } + +const struct hantro_postproc_ops hantro_g1_postproc_ops = { + .enable = hantro_postproc_g1_enable, + .disable = hantro_postproc_g1_disable, +}; diff --git a/drivers/staging/media/hantro/imx8m_vpu_hw.c b/drivers/staging/media/hantro/imx8m_vpu_hw.c index ea919bfb9891..22fa7d2f3b64 100644 --- a/drivers/staging/media/hantro/imx8m_vpu_hw.c +++ b/drivers/staging/media/hantro/imx8m_vpu_hw.c @@ -262,7 +262,7 @@ const struct hantro_variant imx8mq_vpu_variant = { .num_dec_fmts = ARRAY_SIZE(imx8m_vpu_dec_fmts), .postproc_fmts = imx8m_vpu_postproc_fmts, .num_postproc_fmts = ARRAY_SIZE(imx8m_vpu_postproc_fmts), - .postproc_regs = &hantro_g1_postproc_regs, + .postproc_ops = &hantro_g1_postproc_ops, .codec = HANTRO_MPEG2_DECODER | HANTRO_VP8_DECODER | HANTRO_H264_DECODER, .codec_ops = imx8mq_vpu_codec_ops, diff --git a/drivers/staging/media/hantro/rockchip_vpu_hw.c b/drivers/staging/media/hantro/rockchip_vpu_hw.c index d4f52957cc53..6c1ad5534ce5 100644 --- a/drivers/staging/media/hantro/rockchip_vpu_hw.c +++ b/drivers/staging/media/hantro/rockchip_vpu_hw.c @@ -460,7 +460,7 @@ const struct hantro_variant rk3036_vpu_variant = { .num_dec_fmts = ARRAY_SIZE(rk3066_vpu_dec_fmts), 
.postproc_fmts = rockchip_vpu1_postproc_fmts, .num_postproc_fmts = ARRAY_SIZE(rockchip_vpu1_postproc_fmts), - .postproc_regs = &hantro_g1_postproc_regs, + .postproc_ops = &hantro_g1_postproc_ops, .codec = HANTRO_MPEG2_DECODER | HANTRO_VP8_DECODER | HANTRO_H264_DECODER, .codec_ops = rk3036_vpu_codec_ops, @@ -485,7 +485,7 @@ const struct hantro_variant rk3066_vpu_variant = { .num_dec_fmts = ARRAY_SIZE(rk3066_vpu_dec_fmts), .postproc_fmts = rockchip_vpu1_postproc_fmts, .num_postproc_fmts = ARRAY_SIZE(rockchip_vpu1_postproc_fmts), - .postproc_regs = &hantro_g1_postproc_regs, + .postproc_ops = &hantro_g1_postproc_ops, .codec = HANTRO_JPEG_ENCODER | HANTRO_MPEG2_DECODER | HANTRO_VP8_DECODER | HANTRO_H264_DECODER, .codec_ops = rk3066_vpu_codec_ops, @@ -505,7 +505,7 @@ const struct hantro_variant rk3288_vpu_variant = { .num_dec_fmts = ARRAY_SIZE(rk3288_vpu_dec_fmts), .postproc_fmts = rockchip_vpu1_postproc_fmts, .num_postproc_fmts = ARRAY_SIZE(rockchip_vpu1_postproc_fmts), - .postproc_regs = &hantro_g1_postproc_regs, + .postproc_ops = &hantro_g1_postproc_ops, .codec = HANTRO_JPEG_ENCODER | HANTRO_MPEG2_DECODER | HANTRO_VP8_DECODER | HANTRO_H264_DECODER, .codec_ops = rk3288_vpu_codec_ops, diff --git a/drivers/staging/media/hantro/sama5d4_vdec_hw.c b/drivers/staging/media/hantro/sama5d4_vdec_hw.c index 9c3b8cd0b239..f3fecc7248c4 100644 --- a/drivers/staging/media/hantro/sama5d4_vdec_hw.c +++ b/drivers/staging/media/hantro/sama5d4_vdec_hw.c @@ -100,7 +100,7 @@ const struct hantro_variant sama5d4_vdec_variant = { .num_dec_fmts = ARRAY_SIZE(sama5d4_vdec_fmts), .postproc_fmts = sama5d4_vdec_postproc_fmts, .num_postproc_fmts = ARRAY_SIZE(sama5d4_vdec_postproc_fmts), - .postproc_regs = &hantro_g1_postproc_regs, + .postproc_ops = &hantro_g1_postproc_ops, .codec = HANTRO_MPEG2_DECODER | HANTRO_VP8_DECODER | HANTRO_H264_DECODER, .codec_ops = sama5d4_vdec_codec_ops, From patchwork Mon Sep 27 15:19:51 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 
Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrzej Pietrasiewicz X-Patchwork-Id: 12520173 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 099EBC433F5 for ; Mon, 27 Sep 2021 15:20:14 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id EA0376101A for ; Mon, 27 Sep 2021 15:20:13 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S235162AbhI0PVu (ORCPT ); Mon, 27 Sep 2021 11:21:50 -0400 Received: from bhuna.collabora.co.uk ([46.235.227.227]:54122 "EHLO bhuna.collabora.co.uk" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S235129AbhI0PVq (ORCPT ); Mon, 27 Sep 2021 11:21:46 -0400 Received: from [127.0.0.1] (localhost [127.0.0.1]) (Authenticated sender: andrzej.p) with ESMTPSA id 818E01F42E29 From: Andrzej Pietrasiewicz To: linux-media@vger.kernel.org, linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org, linux-rockchip@lists.infradead.org, linux-staging@lists.linux.dev Cc: Andrzej Pietrasiewicz , Benjamin Gaignard , Boris Brezillon , Ezequiel Garcia , Fabio Estevam , Greg Kroah-Hartman , Hans Verkuil , Heiko Stuebner , Jernej Skrabec , Mauro Carvalho Chehab , Nicolas Dufresne , NXP Linux Team , Pengutronix Kernel Team , Philipp Zabel , Sascha Hauer , Shawn Guo , kernel@collabora.com, Ezequiel Garcia Subject: [PATCH v6 03/10] hantro: Simplify postprocessor Date: Mon, 27 Sep 2021 17:19:51 +0200 Message-Id: <20210927151958.24426-4-andrzej.p@collabora.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20210927151958.24426-1-andrzej.p@collabora.com> References: <20210927151958.24426-1-andrzej.p@collabora.com> Precedence: bulk List-ID: X-Mailing-List: linux-media@vger.kernel.org From: Ezequiel Garcia Add a 'postprocessed' boolean property to struct hantro_fmt 
to signal that a format is produced by the post-processor. This will make it possible to introduce the G2 post-processor in a simple way. Signed-off-by: Ezequiel Garcia Signed-off-by: Andrzej Pietrasiewicz --- drivers/staging/media/hantro/hantro.h | 2 ++ drivers/staging/media/hantro/hantro_postproc.c | 8 +------- drivers/staging/media/hantro/imx8m_vpu_hw.c | 1 + drivers/staging/media/hantro/rockchip_vpu_hw.c | 1 + drivers/staging/media/hantro/sama5d4_vdec_hw.c | 1 + 5 files changed, 6 insertions(+), 7 deletions(-) diff --git a/drivers/staging/media/hantro/hantro.h b/drivers/staging/media/hantro/hantro.h index c2e01959dc00..dd5e56765d4e 100644 --- a/drivers/staging/media/hantro/hantro.h +++ b/drivers/staging/media/hantro/hantro.h @@ -263,6 +263,7 @@ struct hantro_ctx { * @max_depth: Maximum depth, for bitstream formats * @enc_fmt: Format identifier for encoder registers. * @frmsize: Supported range of frame sizes (only for bitstream formats). + * @postprocessed: Indicates if this format needs the post-processor. 
*/ struct hantro_fmt { char *name; @@ -272,6 +273,7 @@ struct hantro_fmt { int max_depth; enum hantro_enc_fmt enc_fmt; struct v4l2_frmsize_stepwise frmsize; + bool postprocessed; }; struct hantro_reg { diff --git a/drivers/staging/media/hantro/hantro_postproc.c b/drivers/staging/media/hantro/hantro_postproc.c index 882fb8bc5ddd..4549aec08feb 100644 --- a/drivers/staging/media/hantro/hantro_postproc.c +++ b/drivers/staging/media/hantro/hantro_postproc.c @@ -53,15 +53,9 @@ const struct hantro_postproc_regs hantro_g1_postproc_regs = { bool hantro_needs_postproc(const struct hantro_ctx *ctx, const struct hantro_fmt *fmt) { - struct hantro_dev *vpu = ctx->dev; - if (ctx->is_encoder) return false; - - if (!vpu->variant->postproc_fmts) - return false; - - return fmt->fourcc != V4L2_PIX_FMT_NV12; + return fmt->postprocessed; } static void hantro_postproc_g1_enable(struct hantro_ctx *ctx) diff --git a/drivers/staging/media/hantro/imx8m_vpu_hw.c b/drivers/staging/media/hantro/imx8m_vpu_hw.c index 22fa7d2f3b64..02e61438220a 100644 --- a/drivers/staging/media/hantro/imx8m_vpu_hw.c +++ b/drivers/staging/media/hantro/imx8m_vpu_hw.c @@ -82,6 +82,7 @@ static const struct hantro_fmt imx8m_vpu_postproc_fmts[] = { { .fourcc = V4L2_PIX_FMT_YUYV, .codec_mode = HANTRO_MODE_NONE, + .postprocessed = true, }, }; diff --git a/drivers/staging/media/hantro/rockchip_vpu_hw.c b/drivers/staging/media/hantro/rockchip_vpu_hw.c index 6c1ad5534ce5..f372f767d4ff 100644 --- a/drivers/staging/media/hantro/rockchip_vpu_hw.c +++ b/drivers/staging/media/hantro/rockchip_vpu_hw.c @@ -62,6 +62,7 @@ static const struct hantro_fmt rockchip_vpu1_postproc_fmts[] = { { .fourcc = V4L2_PIX_FMT_YUYV, .codec_mode = HANTRO_MODE_NONE, + .postprocessed = true, }, }; diff --git a/drivers/staging/media/hantro/sama5d4_vdec_hw.c b/drivers/staging/media/hantro/sama5d4_vdec_hw.c index f3fecc7248c4..b2fc1c5613e1 100644 --- a/drivers/staging/media/hantro/sama5d4_vdec_hw.c +++ b/drivers/staging/media/hantro/sama5d4_vdec_hw.c @@ 
-15,6 +15,7 @@ static const struct hantro_fmt sama5d4_vdec_postproc_fmts[] = { { .fourcc = V4L2_PIX_FMT_YUYV, .codec_mode = HANTRO_MODE_NONE, + .postprocessed = true, }, }; From patchwork Mon Sep 27 15:19:52 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrzej Pietrasiewicz X-Patchwork-Id: 12520177 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id AE590C433EF for ; Mon, 27 Sep 2021 15:20:16 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 981D36101A for ; Mon, 27 Sep 2021 15:20:16 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234972AbhI0PVw (ORCPT ); Mon, 27 Sep 2021 11:21:52 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:41156 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S235146AbhI0PVs (ORCPT ); Mon, 27 Sep 2021 11:21:48 -0400 Received: from bhuna.collabora.co.uk (bhuna.collabora.co.uk [IPv6:2a00:1098:0:82:1000:25:2eeb:e3e3]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 86EFCC061575; Mon, 27 Sep 2021 08:20:10 -0700 (PDT) Received: from [127.0.0.1] (localhost [127.0.0.1]) (Authenticated sender: andrzej.p) with ESMTPSA id 70B641F42E30 From: Andrzej Pietrasiewicz To: linux-media@vger.kernel.org, linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org, linux-rockchip@lists.infradead.org, linux-staging@lists.linux.dev Cc: Andrzej Pietrasiewicz , Benjamin Gaignard , Boris Brezillon , Ezequiel Garcia , Fabio Estevam , Greg Kroah-Hartman , Hans Verkuil , Heiko Stuebner , Jernej Skrabec , Mauro Carvalho Chehab , Nicolas Dufresne , NXP Linux Team , Pengutronix Kernel Team , Philipp Zabel , Sascha Hauer , Shawn Guo , 
kernel@collabora.com, Ezequiel Garcia Subject: [PATCH v6 04/10] hantro: Add quirk for NV12/NV12_4L4 capture format Date: Mon, 27 Sep 2021 17:19:52 +0200 Message-Id: <20210927151958.24426-5-andrzej.p@collabora.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20210927151958.24426-1-andrzej.p@collabora.com> References: <20210927151958.24426-1-andrzej.p@collabora.com> Precedence: bulk List-ID: X-Mailing-List: linux-media@vger.kernel.org From: Ezequiel Garcia The G2 core decoder engine produces NV12_4L4 format, which is a simple NV12 4x4 tiled format. The driver currently hides this format by always enabling the post-processor engine, and therefore offering NV12 directly. This is done without using the logic in hantro_postproc.c and therefore makes it difficult to add VP9 cleanly. Since fixing this is not easy, add a small quirk to force NV12 if HEVC was configured, but otherwise declare NV12_4L4 as the pixel format in imx8mq_vpu_g2_variant.dec_fmts. This will be used by the VP9 decoder which will be added soon. Signed-off-by: Ezequiel Garcia Signed-off-by: Andrzej Pietrasiewicz --- drivers/staging/media/hantro/hantro_v4l2.c | 14 ++++++++++++++ drivers/staging/media/hantro/imx8m_vpu_hw.c | 2 +- 2 files changed, 15 insertions(+), 1 deletion(-) diff --git a/drivers/staging/media/hantro/hantro_v4l2.c b/drivers/staging/media/hantro/hantro_v4l2.c index bcb0bdff4a9a..d1f060c55fed 100644 --- a/drivers/staging/media/hantro/hantro_v4l2.c +++ b/drivers/staging/media/hantro/hantro_v4l2.c @@ -150,6 +150,20 @@ static int vidioc_enum_fmt(struct file *file, void *priv, unsigned int num_fmts, i, j = 0; bool skip_mode_none; + /* + * The HEVC decoder on the G2 core needs a little quirk to offer NV12 + * only on the capture side. Once the post-processor logic is used, + * we will be able to expose NV12_4L4 and NV12 as the other cases, + * and therefore remove this quirk. 
+ */ + if (capture && ctx->vpu_src_fmt->fourcc == V4L2_PIX_FMT_HEVC_SLICE) { + if (f->index == 0) { + f->pixelformat = V4L2_PIX_FMT_NV12; + return 0; + } + return -EINVAL; + } + /* * When dealing with an encoder: * - on the capture side we want to filter out all MODE_NONE formats. diff --git a/drivers/staging/media/hantro/imx8m_vpu_hw.c b/drivers/staging/media/hantro/imx8m_vpu_hw.c index 02e61438220a..a40b161e5956 100644 --- a/drivers/staging/media/hantro/imx8m_vpu_hw.c +++ b/drivers/staging/media/hantro/imx8m_vpu_hw.c @@ -134,7 +134,7 @@ static const struct hantro_fmt imx8m_vpu_dec_fmts[] = { static const struct hantro_fmt imx8m_vpu_g2_dec_fmts[] = { { - .fourcc = V4L2_PIX_FMT_NV12, + .fourcc = V4L2_PIX_FMT_NV12_4L4, .codec_mode = HANTRO_MODE_NONE, }, { From patchwork Mon Sep 27 15:19:53 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrzej Pietrasiewicz X-Patchwork-Id: 12520175 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 0C397C433FE for ; Mon, 27 Sep 2021 15:20:15 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id E66BB610A2 for ; Mon, 27 Sep 2021 15:20:14 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S235143AbhI0PVv (ORCPT ); Mon, 27 Sep 2021 11:21:51 -0400 Received: from bhuna.collabora.co.uk ([46.235.227.227]:54122 "EHLO bhuna.collabora.co.uk" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S235142AbhI0PVt (ORCPT ); Mon, 27 Sep 2021 11:21:49 -0400 Received: from [127.0.0.1] (localhost [127.0.0.1]) (Authenticated sender: andrzej.p) with ESMTPSA id 8F4AA1F42DB7 From: Andrzej Pietrasiewicz To: linux-media@vger.kernel.org, linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org, 
linux-rockchip@lists.infradead.org, linux-staging@lists.linux.dev Cc: Andrzej Pietrasiewicz , Benjamin Gaignard , Boris Brezillon , Ezequiel Garcia , Fabio Estevam , Greg Kroah-Hartman , Hans Verkuil , Heiko Stuebner , Jernej Skrabec , Mauro Carvalho Chehab , Nicolas Dufresne , NXP Linux Team , Pengutronix Kernel Team , Philipp Zabel , Sascha Hauer , Shawn Guo , kernel@collabora.com, Ezequiel Garcia , Adrian Ratiu , Daniel Almeida Subject: [PATCH v6 05/10] media: uapi: Add VP9 stateless decoder controls Date: Mon, 27 Sep 2021 17:19:53 +0200 Message-Id: <20210927151958.24426-6-andrzej.p@collabora.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20210927151958.24426-1-andrzej.p@collabora.com> References: <20210927151958.24426-1-andrzej.p@collabora.com> Precedence: bulk List-ID: X-Mailing-List: linux-media@vger.kernel.org Add the VP9 stateless decoder controls plus the documentation that goes with it. Signed-off-by: Boris Brezillon Co-developed-by: Ezequiel Garcia Signed-off-by: Ezequiel Garcia Signed-off-by: Adrian Ratiu Signed-off-by: Andrzej Pietrasiewicz Co-developed-by: Daniel Almeida Signed-off-by: Daniel Almeida --- .../userspace-api/media/v4l/biblio.rst | 10 + .../media/v4l/ext-ctrls-codec-stateless.rst | 573 ++++++++++++++++++ .../media/v4l/pixfmt-compressed.rst | 15 + .../media/v4l/vidioc-g-ext-ctrls.rst | 8 + .../media/v4l/vidioc-queryctrl.rst | 12 + .../media/videodev2.h.rst.exceptions | 2 + drivers/media/v4l2-core/v4l2-ctrls-core.c | 180 ++++++ drivers/media/v4l2-core/v4l2-ctrls-defs.c | 8 + drivers/media/v4l2-core/v4l2-ioctl.c | 1 + include/media/v4l2-ctrls.h | 4 + include/uapi/linux/v4l2-controls.h | 284 +++++++++ include/uapi/linux/videodev2.h | 6 + 12 files changed, 1103 insertions(+) diff --git a/Documentation/userspace-api/media/v4l/biblio.rst b/Documentation/userspace-api/media/v4l/biblio.rst index 7b8e6738ff9e..9cd18c153d19 100644 --- a/Documentation/userspace-api/media/v4l/biblio.rst +++ b/Documentation/userspace-api/media/v4l/biblio.rst @@ 
-417,3 +417,13 @@ VP8 :title: RFC 6386: "VP8 Data Format and Decoding Guide" :author: J. Bankoski et al. + +.. _vp9: + +VP9 +=== + + +:title: VP9 Bitstream & Decoding Process Specification + +:author: Adrian Grange (Google), Peter de Rivaz (Argon Design), Jonathan Hunt (Argon Design) diff --git a/Documentation/userspace-api/media/v4l/ext-ctrls-codec-stateless.rst b/Documentation/userspace-api/media/v4l/ext-ctrls-codec-stateless.rst index 72f5e85b4f34..cc080c4257d0 100644 --- a/Documentation/userspace-api/media/v4l/ext-ctrls-codec-stateless.rst +++ b/Documentation/userspace-api/media/v4l/ext-ctrls-codec-stateless.rst @@ -1458,3 +1458,576 @@ FWHT Flags .. raw:: latex \normalsize + +.. _v4l2-codec-stateless-vp9: + +``V4L2_CID_STATELESS_VP9_COMPRESSED_HDR (struct)`` + Stores VP9 probabilities updates as parsed from the current compressed frame + header. A value of zero in an array element means no update of the relevant + probability. Motion vector-related updates contain a new value or zero. All + other updates contain values translated with inv_map_table[] (see 6.3.5 in + :ref:`vp9`). + +.. c:type:: v4l2_ctrl_vp9_compressed_hdr + +.. tabularcolumns:: |p{1cm}|p{4.8cm}|p{11.4cm}| + +.. cssclass:: longtable + +.. flat-table:: struct v4l2_ctrl_vp9_compressed_hdr + :header-rows: 0 + :stub-columns: 0 + :widths: 1 1 2 + + * - __u8 + - ``tx_mode`` + - Specifies the TX mode. See :ref:`TX Mode <vp9_tx_mode>` for more details. + * - __u8 + - ``tx8[2][1]`` + - TX 8x8 probabilities delta. + * - __u8 + - ``tx16[2][2]`` + - TX 16x16 probabilities delta. + * - __u8 + - ``tx32[2][3]`` + - TX 32x32 probabilities delta. + * - __u8 + - ``coef[4][2][2][6][6][3]`` + - Coefficient probabilities delta. + * - __u8 + - ``skip[3]`` + - Skip probabilities delta. + * - __u8 + - ``inter_mode[7][3]`` + - Inter prediction mode probabilities delta. + * - __u8 + - ``interp_filter[4][2]`` + - Interpolation filter probabilities delta. + * - __u8 + - ``is_inter[4]`` + - Is inter-block probabilities delta. 
+ * - __u8 + - ``comp_mode[5]`` + - Compound prediction mode probabilities delta. + * - __u8 + - ``single_ref[5][2]`` + - Single reference probabilities delta. + * - __u8 + - ``comp_ref[5]`` + - Compound reference probabilities delta. + * - __u8 + - ``y_mode[4][9]`` + - Y prediction mode probabilities delta. + * - __u8 + - ``uv_mode[10][9]`` + - UV prediction mode probabilities delta. + * - __u8 + - ``partition[16][3]`` + - Partition probabilities delta. + * - __u8 + - ``mv.joint[3]`` + - Motion vector joint probabilities delta. + * - __u8 + - ``mv.sign[2]`` + - Motion vector sign probabilities delta. + * - __u8 + - ``mv.classes[2][10]`` + - Motion vector class probabilities delta. + * - __u8 + - ``mv.class0_bit[2]`` + - Motion vector class0 bit probabilities delta. + * - __u8 + - ``mv.bits[2][10]`` + - Motion vector bits probabilities delta. + * - __u8 + - ``mv.class0_fr[2][2][3]`` + - Motion vector class0 fractional bit probabilities delta. + * - __u8 + - ``mv.fr[2][3]`` + - Motion vector fractional bit probabilities delta. + * - __u8 + - ``mv.class0_hp[2]`` + - Motion vector class0 high precision fractional bit probabilities delta. + * - __u8 + - ``mv.hp[2]`` + - Motion vector high precision fractional bit probabilities delta. + +.. _vp9_tx_mode: + +``TX Mode`` + +.. tabularcolumns:: |p{6.5cm}|p{0.5cm}|p{10.3cm}| + +.. flat-table:: + :header-rows: 0 + :stub-columns: 0 + :widths: 1 1 2 + + * - ``V4L2_VP9_TX_MODE_ONLY_4X4`` + - 0 + - Transform size is 4x4. + * - ``V4L2_VP9_TX_MODE_ALLOW_8X8`` + - 1 + - Transform size can be up to 8x8. + * - ``V4L2_VP9_TX_MODE_ALLOW_16X16`` + - 2 + - Transform size can be up to 16x16. + * - ``V4L2_VP9_TX_MODE_ALLOW_32X32`` + - 3 + - Transform size can be up to 32x32. + * - ``V4L2_VP9_TX_MODE_SELECT`` + - 4 + - Bitstream contains the transform size for each block. + +See section '7.3.1 Tx mode semantics' of the :ref:`vp9` specification for more details. 
+ +``V4L2_CID_STATELESS_VP9_FRAME (struct)`` + Specifies the frame parameters for the associated VP9 frame decode request. + This includes the necessary parameters for configuring a stateless hardware + decoding pipeline for VP9. The bitstream parameters are defined according + to :ref:`vp9`. + +.. c:type:: v4l2_ctrl_vp9_frame + +.. raw:: latex + + \small + +.. tabularcolumns:: |p{4.7cm}|p{5.5cm}|p{7.1cm}| + +.. cssclass:: longtable + +.. flat-table:: struct v4l2_ctrl_vp9_frame + :header-rows: 0 + :stub-columns: 0 + :widths: 1 1 2 + + * - struct :c:type:`v4l2_vp9_loop_filter` + - ``lf`` + - Loop filter parameters. See struct :c:type:`v4l2_vp9_loop_filter` for more details. + * - struct :c:type:`v4l2_vp9_quantization` + - ``quant`` + - Quantization parameters. See :c:type:`v4l2_vp9_quantization` for more details. + * - struct :c:type:`v4l2_vp9_segmentation` + - ``seg`` + - Segmentation parameters. See :c:type:`v4l2_vp9_segmentation` for more details. + * - __u32 + - ``flags`` + - Combination of V4L2_VP9_FRAME_FLAG_* flags. See :ref:`Frame Flags<vp9_frame_flags>`. + * - __u16 + - ``compressed_header_size`` + - Compressed header size in bytes. + * - __u16 + - ``uncompressed_header_size`` + - Uncompressed header size in bytes. + * - __u16 + - ``frame_width_minus_1`` + - Add 1 to get the frame width expressed in pixels. See section 7.2.3 in :ref:`vp9`. + * - __u16 + - ``frame_height_minus_1`` + - Add 1 to get the frame height expressed in pixels. See section 7.2.3 in :ref:`vp9`. + * - __u16 + - ``render_width_minus_1`` + - Add 1 to get the expected render width expressed in pixels. This is + not used during the decoding process but might be used by HW scalers to + prepare a frame that's ready for scanout. See section 7.2.4 in :ref:`vp9`. + * - __u16 + - ``render_height_minus_1`` + - Add 1 to get the expected render height expressed in pixels. This is + not used during the decoding process but might be used by HW scalers to + prepare a frame that's ready for scanout. 
See section 7.2.4 in :ref:`vp9`. + * - __u64 + - ``last_frame_ts`` + - "last" reference buffer timestamp. + The timestamp refers to the ``timestamp`` field in + struct :c:type:`v4l2_buffer`. Use the :c:func:`v4l2_timeval_to_ns()` + function to convert the struct :c:type:`timeval` in struct + :c:type:`v4l2_buffer` to a __u64. + * - __u64 + - ``golden_frame_ts`` + - "golden" reference buffer timestamp. + The timestamp refers to the ``timestamp`` field in + struct :c:type:`v4l2_buffer`. Use the :c:func:`v4l2_timeval_to_ns()` + function to convert the struct :c:type:`timeval` in struct + :c:type:`v4l2_buffer` to a __u64. + * - __u64 + - ``alt_frame_ts`` + - "alt" reference buffer timestamp. + The timestamp refers to the ``timestamp`` field in + struct :c:type:`v4l2_buffer`. Use the :c:func:`v4l2_timeval_to_ns()` + function to convert the struct :c:type:`timeval` in struct + :c:type:`v4l2_buffer` to a __u64. + * - __u8 + - ``ref_frame_sign_bias`` + - a bitfield specifying whether the sign bias is set for a given + reference frame. See :ref:`Reference Frame Sign Bias` + for more details. + * - __u8 + - ``reset_frame_context`` + - specifies whether the frame context should be reset to default values. See + :ref:`Reset Frame Context` for more details. + * - __u8 + - ``frame_context_idx`` + - Frame context that should be used/updated. + * - __u8 + - ``profile`` + - VP9 profile. Can be 0, 1, 2 or 3. + * - __u8 + - ``bit_depth`` + - Component depth in bits. Can be 8, 10 or 12. Note that not all profiles + support 10 and/or 12 bits depths. + * - __u8 + - ``interpolation_filter`` + - Specifies the filter selection used for performing inter prediction. See + :ref:`Interpolation Filter` for more details. + * - __u8 + - ``tile_cols_log2`` + - Specifies the base 2 logarithm of the width of each tile (where the + width is measured in units of 8x8 blocks). Shall be less than or equal + to 6. 
+ * - __u8 + - ``tile_rows_log2`` + - Specifies the base 2 logarithm of the height of each tile (where the + height is measured in units of 8x8 blocks). + * - __u8 + - ``reference_mode`` + - Specifies the type of inter prediction to be used. See + :ref:`Reference Mode` for more details. + * - __u8 + - ``reserved[7]`` + - Applications and drivers must set this to zero. + +.. raw:: latex + + \normalsize + +.. _vp9_frame_flags: + +``Frame Flags`` + +.. tabularcolumns:: |p{10.0cm}|p{1.2cm}|p{6.1cm}| + +.. flat-table:: + :header-rows: 0 + :stub-columns: 0 + :widths: 1 1 2 + + * - ``V4L2_VP9_FRAME_FLAG_KEY_FRAME`` + - 0x001 + - The frame is a key frame. + * - ``V4L2_VP9_FRAME_FLAG_SHOW_FRAME`` + - 0x002 + - The frame should be displayed. + * - ``V4L2_VP9_FRAME_FLAG_ERROR_RESILIENT`` + - 0x004 + - The decoding should be error resilient. + * - ``V4L2_VP9_FRAME_FLAG_INTRA_ONLY`` + - 0x008 + - The frame does not reference other frames. + * - ``V4L2_VP9_FRAME_FLAG_ALLOW_HIGH_PREC_MV`` + - 0x010 + - The frame can use high precision motion vectors. + * - ``V4L2_VP9_FRAME_FLAG_REFRESH_FRAME_CTX`` + - 0x020 + - Frame context should be updated after decoding. + * - ``V4L2_VP9_FRAME_FLAG_PARALLEL_DEC_MODE`` + - 0x040 + - Parallel decoding is used. + * - ``V4L2_VP9_FRAME_FLAG_X_SUBSAMPLING`` + - 0x080 + - Vertical subsampling is enabled. + * - ``V4L2_VP9_FRAME_FLAG_Y_SUBSAMPLING`` + - 0x100 + - Horizontal subsampling is enabled. + * - ``V4L2_VP9_FRAME_FLAG_COLOR_RANGE_FULL_SWING`` + - 0x200 + - The full UV range is used. + +.. _vp9_ref_frame_sign_bias: + +``Reference Frame Sign Bias`` + +.. tabularcolumns:: |p{7.0cm}|p{1.2cm}|p{9.1cm}| + +.. flat-table:: + :header-rows: 0 + :stub-columns: 0 + :widths: 1 1 2 + + * - ``V4L2_VP9_SIGN_BIAS_LAST`` + - 0x1 + - Sign bias is set for the last reference frame. + * - ``V4L2_VP9_SIGN_BIAS_GOLDEN`` + - 0x2 + - Sign bias is set for the golden reference frame. 
+ * - ``V4L2_VP9_SIGN_BIAS_ALT`` + - 0x4 + - Sign bias is set for the alt reference frame. + +.. _vp9_reset_frame_context: + +``Reset Frame Context`` + +.. tabularcolumns:: |p{7.0cm}|p{1.2cm}|p{9.1cm}| + +.. flat-table:: + :header-rows: 0 + :stub-columns: 0 + :widths: 1 1 2 + + * - ``V4L2_VP9_RESET_FRAME_CTX_NONE`` + - 0 + - Do not reset any frame context. + * - ``V4L2_VP9_RESET_FRAME_CTX_SPEC`` + - 1 + - Reset the frame context pointed to by + :c:type:`v4l2_ctrl_vp9_frame`.frame_context_idx. + * - ``V4L2_VP9_RESET_FRAME_CTX_ALL`` + - 2 + - Reset all frame contexts. + +See section '7.2 Uncompressed header semantics' of the :ref:`vp9` specification +for more details. + +.. _vp9_interpolation_filter: + +``Interpolation Filter`` + +.. tabularcolumns:: |p{9.0cm}|p{1.2cm}|p{7.1cm}| + +.. flat-table:: + :header-rows: 0 + :stub-columns: 0 + :widths: 1 1 2 + + * - ``V4L2_VP9_INTERP_FILTER_EIGHTTAP`` + - 0 + - Eight tap filter. + * - ``V4L2_VP9_INTERP_FILTER_EIGHTTAP_SMOOTH`` + - 1 + - Eight tap smooth filter. + * - ``V4L2_VP9_INTERP_FILTER_EIGHTTAP_SHARP`` + - 2 + - Eight tap sharp filter. + * - ``V4L2_VP9_INTERP_FILTER_BILINEAR`` + - 3 + - Bilinear filter. + * - ``V4L2_VP9_INTERP_FILTER_SWITCHABLE`` + - 4 + - Filter selection is signaled at the block level. + +See section '7.2.7 Interpolation filter semantics' of the :ref:`vp9` specification +for more details. + +.. _vp9_reference_mode: + +``Reference Mode`` + +.. tabularcolumns:: |p{9.6cm}|p{0.5cm}|p{7.2cm}| + +.. flat-table:: + :header-rows: 0 + :stub-columns: 0 + :widths: 1 1 2 + + * - ``V4L2_VP9_REFERENCE_MODE_SINGLE_REFERENCE`` + - 0 + - Indicates that all the inter blocks use only a single reference frame + to generate motion compensated prediction. + * - ``V4L2_VP9_REFERENCE_MODE_COMPOUND_REFERENCE`` + - 1 + - Requires all the inter blocks to use compound mode. Single reference + frame prediction is not allowed.
+ * - ``V4L2_VP9_REFERENCE_MODE_SELECT`` + - 2 + - Allows each individual inter block to select between single and + compound prediction modes. + +See section '7.3.6 Frame reference mode semantics' of the :ref:`vp9` specification for more details. + +.. c:type:: v4l2_vp9_segmentation + +Encodes the segmentation parameters. See section '7.2.10 Segmentation +params syntax' of the :ref:`vp9` specification for more details. + +.. tabularcolumns:: |p{0.8cm}|p{5cm}|p{11.4cm}| + +.. cssclass:: longtable + +.. flat-table:: struct v4l2_vp9_segmentation + :header-rows: 0 + :stub-columns: 0 + :widths: 1 1 2 + + * - __s16 + - ``feature_data[8][4]`` + - Data attached to each feature. Data entry is only valid if the feature + is enabled. The array shall be indexed with segment number as the first dimension + (0..7) and one of V4L2_VP9_SEG_LVL_* as the second dimension. + See :ref:`Segment Feature IDs`. + * - __u8 + - ``feature_enabled[8]`` + - Bitmask defining which features are enabled in each segment. The value for each + segment is a combination of V4L2_VP9_SEGMENT_FEATURE_ENABLED(id) values where id is + one of V4L2_VP9_SEG_LVL_*. See :ref:`Segment Feature IDs`. + * - __u8 + - ``tree_probs[7]`` + - Specifies the probability values to be used when decoding a Segment-ID. + See '5.15. Segmentation map' section of :ref:`vp9` for more details. + * - __u8 + - ``pred_probs[3]`` + - Specifies the probability values to be used when decoding a + Predicted-Segment-ID. See '6.4.14. Get segment id syntax' + section of :ref:`vp9` for more details. + * - __u8 + - ``flags`` + - Combination of V4L2_VP9_SEGMENTATION_FLAG_* flags. See + :ref:`Segmentation Flags`. + * - __u8 + - ``reserved[5]`` + - Applications and drivers must set this to zero. + +.. _vp9_segment_feature: + +``Segment feature IDs`` + +.. tabularcolumns:: |p{6.0cm}|p{1cm}|p{10.3cm}| + +.. flat-table:: + :header-rows: 0 + :stub-columns: 0 + :widths: 1 1 2 + + * - ``V4L2_VP9_SEG_LVL_ALT_Q`` + - 0 + - Quantizer segment feature.
+ * - ``V4L2_VP9_SEG_LVL_ALT_L`` + - 1 + - Loop filter segment feature. + * - ``V4L2_VP9_SEG_LVL_REF_FRAME`` + - 2 + - Reference frame segment feature. + * - ``V4L2_VP9_SEG_LVL_SKIP`` + - 3 + - Skip segment feature. + * - ``V4L2_VP9_SEG_LVL_MAX`` + - 4 + - Number of segment features. + +.. _vp9_segmentation_flags: + +``Segmentation Flags`` + +.. tabularcolumns:: |p{10.6cm}|p{0.8cm}|p{5.9cm}| + +.. flat-table:: + :header-rows: 0 + :stub-columns: 0 + :widths: 1 1 2 + + * - ``V4L2_VP9_SEGMENTATION_FLAG_ENABLED`` + - 0x01 + - Indicates that this frame makes use of the segmentation tool. + * - ``V4L2_VP9_SEGMENTATION_FLAG_UPDATE_MAP`` + - 0x02 + - Indicates that the segmentation map should be updated during the + decoding of this frame. + * - ``V4L2_VP9_SEGMENTATION_FLAG_TEMPORAL_UPDATE`` + - 0x04 + - Indicates that the updates to the segmentation map are coded + relative to the existing segmentation map. + * - ``V4L2_VP9_SEGMENTATION_FLAG_UPDATE_DATA`` + - 0x08 + - Indicates that new parameters are about to be specified for each + segment. + * - ``V4L2_VP9_SEGMENTATION_FLAG_ABS_OR_DELTA_UPDATE`` + - 0x10 + - Indicates that the segmentation parameters represent the actual values + to be used. + +.. c:type:: v4l2_vp9_quantization + +Encodes the quantization parameters. See section '7.2.9 Quantization params +syntax' of the VP9 specification for more details. + +.. tabularcolumns:: |p{0.8cm}|p{4cm}|p{12.4cm}| + +.. cssclass:: longtable + +.. flat-table:: struct v4l2_vp9_quantization + :header-rows: 0 + :stub-columns: 0 + :widths: 1 1 2 + + * - __u8 + - ``base_q_idx`` + - Indicates the base frame qindex. + * - __s8 + - ``delta_q_y_dc`` + - Indicates the Y DC quantizer relative to base_q_idx. + * - __s8 + - ``delta_q_uv_dc`` + - Indicates the UV DC quantizer relative to base_q_idx. + * - __s8 + - ``delta_q_uv_ac`` + - Indicates the UV AC quantizer relative to base_q_idx. + * - __u8 + - ``reserved[4]`` + - Applications and drivers must set this to zero. + +.. 
c:type:: v4l2_vp9_loop_filter + +This structure contains all loop filter related parameters. See sections +'7.2.8 Loop filter semantics' of the :ref:`vp9` specification for more details. + +.. tabularcolumns:: |p{0.8cm}|p{4cm}|p{12.4cm}| + +.. cssclass:: longtable + +.. flat-table:: struct v4l2_vp9_loop_filter + :header-rows: 0 + :stub-columns: 0 + :widths: 1 1 2 + + * - __s8 + - ``ref_deltas[4]`` + - Contains the adjustment needed for the filter level based on the chosen + reference frame. + * - __s8 + - ``mode_deltas[2]`` + - Contains the adjustment needed for the filter level based on the chosen + mode. + * - __u8 + - ``level`` + - Indicates the loop filter strength. + * - __u8 + - ``sharpness`` + - Indicates the sharpness level. + * - __u8 + - ``flags`` + - Combination of V4L2_VP9_LOOP_FILTER_FLAG_* flags. + See :ref:`Loop Filter Flags `. + * - __u8 + - ``reserved[7]`` + - Applications and drivers must set this to zero. + + +.. _vp9_loop_filter_flags: + +``Loop Filter Flags`` + +.. tabularcolumns:: |p{9.6cm}|p{0.5cm}|p{7.2cm}| + +.. flat-table:: + :header-rows: 0 + :stub-columns: 0 + :widths: 1 1 2 + + * - ``V4L2_VP9_LOOP_FILTER_FLAG_DELTA_ENABLED`` + - 0x1 + - When set, the filter level depends on the mode and reference frame used + to predict a block. + * - ``V4L2_VP9_LOOP_FILTER_FLAG_DELTA_UPDATE`` + - 0x2 + - When set, the bitstream contains additional syntax elements that + specify which mode and reference frame deltas are to be updated. diff --git a/Documentation/userspace-api/media/v4l/pixfmt-compressed.rst b/Documentation/userspace-api/media/v4l/pixfmt-compressed.rst index 0ede39907ee2..967fc803ef94 100644 --- a/Documentation/userspace-api/media/v4l/pixfmt-compressed.rst +++ b/Documentation/userspace-api/media/v4l/pixfmt-compressed.rst @@ -172,6 +172,21 @@ Compressed Formats - VP9 compressed video frame. The encoder generates one compressed frame per buffer, and the decoder requires one compressed frame per buffer. + * .. 
_V4L2-PIX-FMT-VP9-FRAME: + + - ``V4L2_PIX_FMT_VP9_FRAME`` + - 'VP9F' + - VP9 parsed frame, including the frame header, as extracted from the container. + This format is adapted for stateless video decoders that implement a + VP9 pipeline with the :ref:`stateless_decoder`. + Metadata associated with the frame to decode is required to be passed + through the ``V4L2_CID_STATELESS_VP9_FRAME`` and + the ``V4L2_CID_STATELESS_VP9_COMPRESSED_HDR`` controls. + See the :ref:`associated Codec Control IDs `. + Exactly one output and one capture buffer must be provided for use with + this pixel format. The output buffer must contain the appropriate number + of macroblocks to decode a full corresponding frame to the matching + capture buffer. * .. _V4L2-PIX-FMT-HEVC: - ``V4L2_PIX_FMT_HEVC`` diff --git a/Documentation/userspace-api/media/v4l/vidioc-g-ext-ctrls.rst b/Documentation/userspace-api/media/v4l/vidioc-g-ext-ctrls.rst index 2d6bc8d94380..d2bdd3db076f 100644 --- a/Documentation/userspace-api/media/v4l/vidioc-g-ext-ctrls.rst +++ b/Documentation/userspace-api/media/v4l/vidioc-g-ext-ctrls.rst @@ -233,6 +233,14 @@ still cause this situation. - ``p_mpeg2_quantisation`` - A pointer to a struct :c:type:`v4l2_ctrl_mpeg2_quantisation`. Valid if this control is of type ``V4L2_CTRL_TYPE_MPEG2_QUANTISATION``. + * - struct :c:type:`v4l2_ctrl_vp9_compressed_hdr` * + - ``p_vp9_compressed_hdr_probs`` + - A pointer to a struct :c:type:`v4l2_ctrl_vp9_compressed_hdr`. Valid if this + control is of type ``V4L2_CTRL_TYPE_VP9_COMPRESSED_HDR``. + * - struct :c:type:`v4l2_ctrl_vp9_frame` * + - ``p_vp9_frame`` + - A pointer to a struct :c:type:`v4l2_ctrl_vp9_frame`. Valid if this + control is of type ``V4L2_CTRL_TYPE_VP9_FRAME``. * - struct :c:type:`v4l2_ctrl_hdr10_cll_info` * - ``p_hdr10_cll`` - A pointer to a struct :c:type:`v4l2_ctrl_hdr10_cll_info`. 
Valid if this control is diff --git a/Documentation/userspace-api/media/v4l/vidioc-queryctrl.rst b/Documentation/userspace-api/media/v4l/vidioc-queryctrl.rst index f9ecf6276129..9ad930823960 100644 --- a/Documentation/userspace-api/media/v4l/vidioc-queryctrl.rst +++ b/Documentation/userspace-api/media/v4l/vidioc-queryctrl.rst @@ -507,6 +507,18 @@ See also the examples in :ref:`control`. - n/a - A struct :c:type:`v4l2_ctrl_hevc_decode_params`, containing HEVC decoding parameters for stateless video decoders. + * - ``V4L2_CTRL_TYPE_VP9_COMPRESSED_HDR`` + - n/a + - n/a + - n/a + - A struct :c:type:`v4l2_ctrl_vp9_compressed_hdr`, containing VP9 + probabilities updates for stateless video decoders. + * - ``V4L2_CTRL_TYPE_VP9_FRAME`` + - n/a + - n/a + - n/a + - A struct :c:type:`v4l2_ctrl_vp9_frame`, containing VP9 + frame decode parameters for stateless video decoders. .. raw:: latex diff --git a/Documentation/userspace-api/media/videodev2.h.rst.exceptions b/Documentation/userspace-api/media/videodev2.h.rst.exceptions index eb0b1cd37abd..9cbb7a0c354a 100644 --- a/Documentation/userspace-api/media/videodev2.h.rst.exceptions +++ b/Documentation/userspace-api/media/videodev2.h.rst.exceptions @@ -149,6 +149,8 @@ replace symbol V4L2_CTRL_TYPE_HEVC_SLICE_PARAMS :c:type:`v4l2_ctrl_type` replace symbol V4L2_CTRL_TYPE_AREA :c:type:`v4l2_ctrl_type` replace symbol V4L2_CTRL_TYPE_FWHT_PARAMS :c:type:`v4l2_ctrl_type` replace symbol V4L2_CTRL_TYPE_VP8_FRAME :c:type:`v4l2_ctrl_type` +replace symbol V4L2_CTRL_TYPE_VP9_COMPRESSED_HDR :c:type:`v4l2_ctrl_type` +replace symbol V4L2_CTRL_TYPE_VP9_FRAME :c:type:`v4l2_ctrl_type` replace symbol V4L2_CTRL_TYPE_HDR10_CLL_INFO :c:type:`v4l2_ctrl_type` replace symbol V4L2_CTRL_TYPE_HDR10_MASTERING_DISPLAY :c:type:`v4l2_ctrl_type` diff --git a/drivers/media/v4l2-core/v4l2-ctrls-core.c b/drivers/media/v4l2-core/v4l2-ctrls-core.c index c4b5082849b6..52b9ff46ab26 100644 --- a/drivers/media/v4l2-core/v4l2-ctrls-core.c +++ 
b/drivers/media/v4l2-core/v4l2-ctrls-core.c @@ -283,6 +283,12 @@ static void std_log(const struct v4l2_ctrl *ctrl) case V4L2_CTRL_TYPE_MPEG2_PICTURE: pr_cont("MPEG2_PICTURE"); break; + case V4L2_CTRL_TYPE_VP9_COMPRESSED_HDR: + pr_cont("VP9_COMPRESSED_HDR"); + break; + case V4L2_CTRL_TYPE_VP9_FRAME: + pr_cont("VP9_FRAME"); + break; default: pr_cont("unknown type %d", ctrl->type); break; @@ -317,6 +323,168 @@ static void std_log(const struct v4l2_ctrl *ctrl) #define zero_reserved(s) \ memset(&(s).reserved, 0, sizeof((s).reserved)) +static int +validate_vp9_lf_params(struct v4l2_vp9_loop_filter *lf) +{ + unsigned int i; + + if (lf->flags & ~(V4L2_VP9_LOOP_FILTER_FLAG_DELTA_ENABLED | + V4L2_VP9_LOOP_FILTER_FLAG_DELTA_UPDATE)) + return -EINVAL; + + /* Check that all values are in the accepted range. */ + if (lf->level > GENMASK(5, 0)) + return -EINVAL; + + if (lf->sharpness > GENMASK(2, 0)) + return -EINVAL; + + for (i = 0; i < ARRAY_SIZE(lf->ref_deltas); i++) + if (lf->ref_deltas[i] < -63 || lf->ref_deltas[i] > 63) + return -EINVAL; + + for (i = 0; i < ARRAY_SIZE(lf->mode_deltas); i++) + if (lf->mode_deltas[i] < -63 || lf->mode_deltas[i] > 63) + return -EINVAL; + + zero_reserved(*lf); + return 0; +} + +static int +validate_vp9_quant_params(struct v4l2_vp9_quantization *quant) +{ + if (quant->delta_q_y_dc < -15 || quant->delta_q_y_dc > 15 || + quant->delta_q_uv_dc < -15 || quant->delta_q_uv_dc > 15 || + quant->delta_q_uv_ac < -15 || quant->delta_q_uv_ac > 15) + return -EINVAL; + + zero_reserved(*quant); + return 0; +} + +static int +validate_vp9_seg_params(struct v4l2_vp9_segmentation *seg) +{ + unsigned int i, j; + + if (seg->flags & ~(V4L2_VP9_SEGMENTATION_FLAG_ENABLED | + V4L2_VP9_SEGMENTATION_FLAG_UPDATE_MAP | + V4L2_VP9_SEGMENTATION_FLAG_TEMPORAL_UPDATE | + V4L2_VP9_SEGMENTATION_FLAG_UPDATE_DATA | + V4L2_VP9_SEGMENTATION_FLAG_ABS_OR_DELTA_UPDATE)) + return -EINVAL; + + for (i = 0; i < ARRAY_SIZE(seg->feature_enabled); i++) { + if (seg->feature_enabled[i] & +
~V4L2_VP9_SEGMENT_FEATURE_ENABLED_MASK) + return -EINVAL; + } + + for (i = 0; i < ARRAY_SIZE(seg->feature_data); i++) { + const int range[] = { 255, 63, 3, 0 }; + + for (j = 0; j < ARRAY_SIZE(seg->feature_data[i]); j++) { + if (seg->feature_data[i][j] < -range[j] || + seg->feature_data[i][j] > range[j]) + return -EINVAL; + } + } + + zero_reserved(*seg); + return 0; +} + +static int +validate_vp9_compressed_hdr(struct v4l2_ctrl_vp9_compressed_hdr *hdr) +{ + if (hdr->tx_mode > V4L2_VP9_TX_MODE_SELECT) + return -EINVAL; + + return 0; +} + +static int +validate_vp9_frame(struct v4l2_ctrl_vp9_frame *frame) +{ + int ret; + + /* Make sure we're not passed invalid flags. */ + if (frame->flags & ~(V4L2_VP9_FRAME_FLAG_KEY_FRAME | + V4L2_VP9_FRAME_FLAG_SHOW_FRAME | + V4L2_VP9_FRAME_FLAG_ERROR_RESILIENT | + V4L2_VP9_FRAME_FLAG_INTRA_ONLY | + V4L2_VP9_FRAME_FLAG_ALLOW_HIGH_PREC_MV | + V4L2_VP9_FRAME_FLAG_REFRESH_FRAME_CTX | + V4L2_VP9_FRAME_FLAG_PARALLEL_DEC_MODE | + V4L2_VP9_FRAME_FLAG_X_SUBSAMPLING | + V4L2_VP9_FRAME_FLAG_Y_SUBSAMPLING | + V4L2_VP9_FRAME_FLAG_COLOR_RANGE_FULL_SWING)) + return -EINVAL; + + if (frame->flags & V4L2_VP9_FRAME_FLAG_ERROR_RESILIENT && + frame->flags & V4L2_VP9_FRAME_FLAG_REFRESH_FRAME_CTX) + return -EINVAL; + + if (frame->profile > V4L2_VP9_PROFILE_MAX) + return -EINVAL; + + if (frame->reset_frame_context > V4L2_VP9_RESET_FRAME_CTX_ALL) + return -EINVAL; + + if (frame->frame_context_idx >= V4L2_VP9_NUM_FRAME_CTX) + return -EINVAL; + + /* + * Profiles 0 and 1 only support 8-bit depth, profiles 2 and 3 only 10 + * and 12 bit depths. + */ + if ((frame->profile < 2 && frame->bit_depth != 8) || + (frame->profile >= 2 && + (frame->bit_depth != 10 && frame->bit_depth != 12))) + return -EINVAL; + + /* Profile 0 and 2 only accept YUV 4:2:0.
*/ + if ((frame->profile == 0 || frame->profile == 2) && + (!(frame->flags & V4L2_VP9_FRAME_FLAG_X_SUBSAMPLING) || + !(frame->flags & V4L2_VP9_FRAME_FLAG_Y_SUBSAMPLING))) + return -EINVAL; + + /* Profile 1 and 3 only accept YUV 4:2:2, 4:4:0 and 4:4:4. */ + if ((frame->profile == 1 || frame->profile == 3) && + ((frame->flags & V4L2_VP9_FRAME_FLAG_X_SUBSAMPLING) && + (frame->flags & V4L2_VP9_FRAME_FLAG_Y_SUBSAMPLING))) + return -EINVAL; + + if (frame->interpolation_filter > V4L2_VP9_INTERP_FILTER_SWITCHABLE) + return -EINVAL; + + /* + * According to the spec, tile_cols_log2 shall be less than or equal + * to 6. + */ + if (frame->tile_cols_log2 > 6) + return -EINVAL; + + if (frame->reference_mode > V4L2_VP9_REFERENCE_MODE_SELECT) + return -EINVAL; + + ret = validate_vp9_lf_params(&frame->lf); + if (ret) + return ret; + + ret = validate_vp9_quant_params(&frame->quant); + if (ret) + return ret; + + ret = validate_vp9_seg_params(&frame->seg); + if (ret) + return ret; + + zero_reserved(*frame); + return 0; +} + /* * Compound controls validation requires setting unused fields/flags to zero * in order to properly detect unchanged controls with std_equal's memcmp. 
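Note (not part of the patch): the profile, bit depth and subsampling rules enforced by validate_vp9_frame() above can be mirrored in userspace before a control is submitted. The following is only a minimal sketch under stated assumptions: vp9_profile_format_ok() and the VP9_SKETCH_* macros are hypothetical names introduced here for illustration, and the two flag values are copied from the V4L2_VP9_FRAME_FLAG_{X,Y}_SUBSAMPLING definitions added by this patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Values copied from V4L2_VP9_FRAME_FLAG_{X,Y}_SUBSAMPLING in this patch. */
#define VP9_SKETCH_FLAG_X_SUBSAMPLING 0x080
#define VP9_SKETCH_FLAG_Y_SUBSAMPLING 0x100

/*
 * Hypothetical helper (not a kernel or libv4l API): returns true when the
 * profile / bit depth / chroma subsampling combination would pass the
 * corresponding checks in validate_vp9_frame().
 */
static bool vp9_profile_format_ok(uint8_t profile, uint8_t bit_depth,
				  uint32_t flags)
{
	bool sub_x = flags & VP9_SKETCH_FLAG_X_SUBSAMPLING;
	bool sub_y = flags & VP9_SKETCH_FLAG_Y_SUBSAMPLING;
	bool is_420 = sub_x && sub_y;

	/* Only profiles 0..3 exist. */
	if (profile > 3)
		return false;

	/* Profiles 0 and 1 are 8-bit only. */
	if (profile < 2 && bit_depth != 8)
		return false;

	/* Profiles 2 and 3 are 10-bit or 12-bit only. */
	if (profile >= 2 && bit_depth != 10 && bit_depth != 12)
		return false;

	/* Profiles 0 and 2 require 4:2:0, profiles 1 and 3 anything else. */
	if (profile == 0 || profile == 2)
		return is_420;

	return !is_420;
}
```

With both subsampling flags set (0x180), profile 0 at 8 bits is accepted while profile 1 at 8 bits is rejected, which matches the -EINVAL paths in the kernel function.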
@@ -687,6 +855,12 @@ static int std_validate_compound(const struct v4l2_ctrl *ctrl, u32 idx, break; + case V4L2_CTRL_TYPE_VP9_COMPRESSED_HDR: + return validate_vp9_compressed_hdr(p); + + case V4L2_CTRL_TYPE_VP9_FRAME: + return validate_vp9_frame(p); + case V4L2_CTRL_TYPE_AREA: area = p; if (!area->width || !area->height) @@ -1249,6 +1423,12 @@ static struct v4l2_ctrl *v4l2_ctrl_new(struct v4l2_ctrl_handler *hdl, case V4L2_CTRL_TYPE_HDR10_MASTERING_DISPLAY: elem_size = sizeof(struct v4l2_ctrl_hdr10_mastering_display); break; + case V4L2_CTRL_TYPE_VP9_COMPRESSED_HDR: + elem_size = sizeof(struct v4l2_ctrl_vp9_compressed_hdr); + break; + case V4L2_CTRL_TYPE_VP9_FRAME: + elem_size = sizeof(struct v4l2_ctrl_vp9_frame); + break; case V4L2_CTRL_TYPE_AREA: elem_size = sizeof(struct v4l2_area); break; diff --git a/drivers/media/v4l2-core/v4l2-ctrls-defs.c b/drivers/media/v4l2-core/v4l2-ctrls-defs.c index 421300e13a41..5845c1b6bb2a 100644 --- a/drivers/media/v4l2-core/v4l2-ctrls-defs.c +++ b/drivers/media/v4l2-core/v4l2-ctrls-defs.c @@ -1175,6 +1175,8 @@ const char *v4l2_ctrl_get_name(u32 id) case V4L2_CID_STATELESS_MPEG2_SEQUENCE: return "MPEG-2 Sequence Header"; case V4L2_CID_STATELESS_MPEG2_PICTURE: return "MPEG-2 Picture Header"; case V4L2_CID_STATELESS_MPEG2_QUANTISATION: return "MPEG-2 Quantisation Matrices"; + case V4L2_CID_STATELESS_VP9_COMPRESSED_HDR: return "VP9 Probabilities Updates"; + case V4L2_CID_STATELESS_VP9_FRAME: return "VP9 Frame Decode Parameters"; /* Colorimetry controls */ /* Keep the order of the 'case's the same as in v4l2-controls.h! 
*/ @@ -1493,6 +1495,12 @@ void v4l2_ctrl_fill(u32 id, const char **name, enum v4l2_ctrl_type *type, case V4L2_CID_MPEG_VIDEO_HEVC_DECODE_PARAMS: *type = V4L2_CTRL_TYPE_HEVC_DECODE_PARAMS; break; + case V4L2_CID_STATELESS_VP9_COMPRESSED_HDR: + *type = V4L2_CTRL_TYPE_VP9_COMPRESSED_HDR; + break; + case V4L2_CID_STATELESS_VP9_FRAME: + *type = V4L2_CTRL_TYPE_VP9_FRAME; + break; case V4L2_CID_UNIT_CELL_SIZE: *type = V4L2_CTRL_TYPE_AREA; *flags |= V4L2_CTRL_FLAG_READ_ONLY; diff --git a/drivers/media/v4l2-core/v4l2-ioctl.c b/drivers/media/v4l2-core/v4l2-ioctl.c index ec6fc1ef291e..7a5e8120d733 100644 --- a/drivers/media/v4l2-core/v4l2-ioctl.c +++ b/drivers/media/v4l2-core/v4l2-ioctl.c @@ -1394,6 +1394,7 @@ static void v4l_fill_fmtdesc(struct v4l2_fmtdesc *fmt) case V4L2_PIX_FMT_VP8: descr = "VP8"; break; case V4L2_PIX_FMT_VP8_FRAME: descr = "VP8 Frame"; break; case V4L2_PIX_FMT_VP9: descr = "VP9"; break; + case V4L2_PIX_FMT_VP9_FRAME: descr = "VP9 Frame"; break; case V4L2_PIX_FMT_HEVC: descr = "HEVC"; break; /* aka H.265 */ case V4L2_PIX_FMT_HEVC_SLICE: descr = "HEVC Parsed Slice Data"; break; case V4L2_PIX_FMT_FWHT: descr = "FWHT"; break; /* used in vicodec */ diff --git a/include/media/v4l2-ctrls.h b/include/media/v4l2-ctrls.h index 575b59fbac77..b3ce438f1329 100644 --- a/include/media/v4l2-ctrls.h +++ b/include/media/v4l2-ctrls.h @@ -50,6 +50,8 @@ struct video_device; * @p_h264_decode_params: Pointer to a struct v4l2_ctrl_h264_decode_params. * @p_h264_pred_weights: Pointer to a struct v4l2_ctrl_h264_pred_weights. * @p_vp8_frame: Pointer to a VP8 frame params structure. + * @p_vp9_compressed_hdr_probs: Pointer to a VP9 frame compressed header probs structure. + * @p_vp9_frame: Pointer to a VP9 frame params structure. * @p_hevc_sps: Pointer to an HEVC sequence parameter set structure. * @p_hevc_pps: Pointer to an HEVC picture parameter set structure. * @p_hevc_slice_params: Pointer to an HEVC slice parameters structure. 
@@ -80,6 +82,8 @@ union v4l2_ctrl_ptr { struct v4l2_ctrl_hevc_sps *p_hevc_sps; struct v4l2_ctrl_hevc_pps *p_hevc_pps; struct v4l2_ctrl_hevc_slice_params *p_hevc_slice_params; + struct v4l2_ctrl_vp9_compressed_hdr *p_vp9_compressed_hdr_probs; + struct v4l2_ctrl_vp9_frame *p_vp9_frame; struct v4l2_ctrl_hdr10_cll_info *p_hdr10_cll; struct v4l2_ctrl_hdr10_mastering_display *p_hdr10_mastering; struct v4l2_area *p_area; diff --git a/include/uapi/linux/v4l2-controls.h b/include/uapi/linux/v4l2-controls.h index 5532b5f68493..36c82ad98030 100644 --- a/include/uapi/linux/v4l2-controls.h +++ b/include/uapi/linux/v4l2-controls.h @@ -2010,6 +2010,290 @@ struct v4l2_ctrl_hdr10_mastering_display { __u32 min_display_mastering_luminance; }; +/* Stateless VP9 controls */ + +#define V4L2_VP9_LOOP_FILTER_FLAG_DELTA_ENABLED 0x1 +#define V4L2_VP9_LOOP_FILTER_FLAG_DELTA_UPDATE 0x2 + +/** + * struct v4l2_vp9_loop_filter - VP9 loop filter parameters + * + * @ref_deltas: contains the adjustment needed for the filter level based on the + * chosen reference frame. If this syntax element is not present in the bitstream, + * users should pass its last value. + * @mode_deltas: contains the adjustment needed for the filter level based on the + * chosen mode. If this syntax element is not present in the bitstream, users should + * pass its last value. + * @level: indicates the loop filter strength. + * @sharpness: indicates the sharpness level. + * @flags: combination of V4L2_VP9_LOOP_FILTER_FLAG_{} flags. + * @reserved: padding field. Should be zeroed by applications. + * + * This structure contains all loop filter related parameters. See sections + * '7.2.8 Loop filter semantics' of the VP9 specification for more details. + */ +struct v4l2_vp9_loop_filter { + __s8 ref_deltas[4]; + __s8 mode_deltas[2]; + __u8 level; + __u8 sharpness; + __u8 flags; + __u8 reserved[7]; +}; + +/** + * struct v4l2_vp9_quantization - VP9 quantization parameters + * + * @base_q_idx: indicates the base frame qindex. 
+ * @delta_q_y_dc: indicates the Y DC quantizer relative to base_q_idx. + * @delta_q_uv_dc: indicates the UV DC quantizer relative to base_q_idx. + * @delta_q_uv_ac: indicates the UV AC quantizer relative to base_q_idx. + * @reserved: padding field. Should be zeroed by applications. + * + * Encodes the quantization parameters. See section '7.2.9 Quantization params + * syntax' of the VP9 specification for more details. + */ +struct v4l2_vp9_quantization { + __u8 base_q_idx; + __s8 delta_q_y_dc; + __s8 delta_q_uv_dc; + __s8 delta_q_uv_ac; + __u8 reserved[4]; +}; + +#define V4L2_VP9_SEGMENTATION_FLAG_ENABLED 0x01 +#define V4L2_VP9_SEGMENTATION_FLAG_UPDATE_MAP 0x02 +#define V4L2_VP9_SEGMENTATION_FLAG_TEMPORAL_UPDATE 0x04 +#define V4L2_VP9_SEGMENTATION_FLAG_UPDATE_DATA 0x08 +#define V4L2_VP9_SEGMENTATION_FLAG_ABS_OR_DELTA_UPDATE 0x10 + +#define V4L2_VP9_SEG_LVL_ALT_Q 0 +#define V4L2_VP9_SEG_LVL_ALT_L 1 +#define V4L2_VP9_SEG_LVL_REF_FRAME 2 +#define V4L2_VP9_SEG_LVL_SKIP 3 +#define V4L2_VP9_SEG_LVL_MAX 4 + +#define V4L2_VP9_SEGMENT_FEATURE_ENABLED(id) (1 << (id)) +#define V4L2_VP9_SEGMENT_FEATURE_ENABLED_MASK 0xf + +/** + * struct v4l2_vp9_segmentation - VP9 segmentation parameters + * + * @feature_data: data attached to each feature. Data entry is only valid if + * the feature is enabled. The array shall be indexed with segment number as + * the first dimension (0..7) and one of V4L2_VP9_SEG_{} as the second dimension. + * @feature_enabled: bitmask defining which features are enabled in each segment. + * The value for each segment is a combination of V4L2_VP9_SEGMENT_FEATURE_ENABLED(id) + * values where id is one of V4L2_VP9_SEG_LVL_{}. + * @tree_probs: specifies the probability values to be used when decoding a + * Segment-ID. See '5.15. Segmentation map' section of the VP9 specification + * for more details. + * @pred_probs: specifies the probability values to be used when decoding a + * Predicted-Segment-ID. See '6.4.14. 
Get segment id syntax' section of the VP9 specification + * for more details. + * @flags: combination of V4L2_VP9_SEGMENTATION_FLAG_{} flags. + * @reserved: padding field. Should be zeroed by applications. + * + * Encodes the segmentation parameters. See section '7.2.10 Segmentation params syntax' of + * the VP9 specification for more details. + */ +struct v4l2_vp9_segmentation { + __s16 feature_data[8][4]; + __u8 feature_enabled[8]; + __u8 tree_probs[7]; + __u8 pred_probs[3]; + __u8 flags; + __u8 reserved[5]; +}; + +#define V4L2_VP9_FRAME_FLAG_KEY_FRAME 0x001 +#define V4L2_VP9_FRAME_FLAG_SHOW_FRAME 0x002 +#define V4L2_VP9_FRAME_FLAG_ERROR_RESILIENT 0x004 +#define V4L2_VP9_FRAME_FLAG_INTRA_ONLY 0x008 +#define V4L2_VP9_FRAME_FLAG_ALLOW_HIGH_PREC_MV 0x010 +#define V4L2_VP9_FRAME_FLAG_REFRESH_FRAME_CTX 0x020 +#define V4L2_VP9_FRAME_FLAG_PARALLEL_DEC_MODE 0x040 +#define V4L2_VP9_FRAME_FLAG_X_SUBSAMPLING 0x080 +#define V4L2_VP9_FRAME_FLAG_Y_SUBSAMPLING 0x100 +#define V4L2_VP9_FRAME_FLAG_COLOR_RANGE_FULL_SWING 0x200 + +#define V4L2_VP9_SIGN_BIAS_LAST 0x1 +#define V4L2_VP9_SIGN_BIAS_GOLDEN 0x2 +#define V4L2_VP9_SIGN_BIAS_ALT 0x4 + +#define V4L2_VP9_RESET_FRAME_CTX_NONE 0 +#define V4L2_VP9_RESET_FRAME_CTX_SPEC 1 +#define V4L2_VP9_RESET_FRAME_CTX_ALL 2 + +#define V4L2_VP9_INTERP_FILTER_EIGHTTAP 0 +#define V4L2_VP9_INTERP_FILTER_EIGHTTAP_SMOOTH 1 +#define V4L2_VP9_INTERP_FILTER_EIGHTTAP_SHARP 2 +#define V4L2_VP9_INTERP_FILTER_BILINEAR 3 +#define V4L2_VP9_INTERP_FILTER_SWITCHABLE 4 + +#define V4L2_VP9_REFERENCE_MODE_SINGLE_REFERENCE 0 +#define V4L2_VP9_REFERENCE_MODE_COMPOUND_REFERENCE 1 +#define V4L2_VP9_REFERENCE_MODE_SELECT 2 + +#define V4L2_VP9_PROFILE_MAX 3 + +#define V4L2_CID_STATELESS_VP9_FRAME (V4L2_CID_CODEC_STATELESS_BASE + 300) +/** + * struct v4l2_ctrl_vp9_frame - VP9 frame decoding control + * + * @lf: loop filter parameters. See &v4l2_vp9_loop_filter for more details. + * @quant: quantization parameters. See &v4l2_vp9_quantization for more details.
+ * @seg: segmentation parameters. See &v4l2_vp9_segmentation for more details. + * @flags: combination of V4L2_VP9_FRAME_FLAG_{} flags. + * @compressed_header_size: compressed header size in bytes. + * @uncompressed_header_size: uncompressed header size in bytes. + * @frame_width_minus_1: add 1 to it and you'll get the frame width expressed in pixels. + * @frame_height_minus_1: add 1 to it and you'll get the frame height expressed in pixels. + * @render_width_minus_1: add 1 to it and you'll get the expected render width expressed in + * pixels. This is not used during the decoding process but might be used by HW scalers + * to prepare a frame that's ready for scanout. + * @render_height_minus_1: add 1 to it and you'll get the expected render height expressed in + * pixels. This is not used during the decoding process but might be used by HW scalers + * to prepare a frame that's ready for scanout. + * @last_frame_ts: "last" reference buffer timestamp. + * The timestamp refers to the timestamp field in struct v4l2_buffer. + * Use v4l2_timeval_to_ns() to convert the struct timeval to a __u64. + * @golden_frame_ts: "golden" reference buffer timestamp. + * The timestamp refers to the timestamp field in struct v4l2_buffer. + * Use v4l2_timeval_to_ns() to convert the struct timeval to a __u64. + * @alt_frame_ts: "alt" reference buffer timestamp. + * The timestamp refers to the timestamp field in struct v4l2_buffer. + * Use v4l2_timeval_to_ns() to convert the struct timeval to a __u64. + * @ref_frame_sign_bias: a bitfield specifying whether the sign bias is set for a given + * reference frame. Either of V4L2_VP9_SIGN_BIAS_{}. + * @reset_frame_context: specifies whether the frame context should be reset to default values. + * Either of V4L2_VP9_RESET_FRAME_CTX_{}. + * @frame_context_idx: frame context that should be used/updated. + * @profile: VP9 profile. Can be 0, 1, 2 or 3. + * @bit_depth: bits per components. Can be 8, 10 or 12. 
Note that not all profiles support + * 10 and/or 12 bits depths. + * @interpolation_filter: specifies the filter selection used for performing inter prediction. + * Set to one of V4L2_VP9_INTERP_FILTER_{}. + * @tile_cols_log2: specifies the base 2 logarithm of the width of each tile (where the width + * is measured in units of 8x8 blocks). Shall be less than or equal to 6. + * @tile_rows_log2: specifies the base 2 logarithm of the height of each tile (where the height + * is measured in units of 8x8 blocks). + * @reference_mode: specifies the type of inter prediction to be used. + * Set to one of V4L2_VP9_REFERENCE_MODE_{}. + * @reserved: padding field. Should be zeroed by applications. + */ +struct v4l2_ctrl_vp9_frame { + struct v4l2_vp9_loop_filter lf; + struct v4l2_vp9_quantization quant; + struct v4l2_vp9_segmentation seg; + __u32 flags; + __u16 compressed_header_size; + __u16 uncompressed_header_size; + __u16 frame_width_minus_1; + __u16 frame_height_minus_1; + __u16 render_width_minus_1; + __u16 render_height_minus_1; + __u64 last_frame_ts; + __u64 golden_frame_ts; + __u64 alt_frame_ts; + __u8 ref_frame_sign_bias; + __u8 reset_frame_context; + __u8 frame_context_idx; + __u8 profile; + __u8 bit_depth; + __u8 interpolation_filter; + __u8 tile_cols_log2; + __u8 tile_rows_log2; + __u8 reference_mode; + __u8 reserved[7]; +}; + +#define V4L2_VP9_NUM_FRAME_CTX 4 + +/** + * struct v4l2_vp9_mv_probs - VP9 Motion vector probability updates + * @joint: motion vector joint probability updates. + * @sign: motion vector sign probability updates. + * @classes: motion vector class probability updates. + * @class0_bit: motion vector class0 bit probability updates. + * @bits: motion vector bits probability updates. + * @class0_fr: motion vector class0 fractional bit probability updates. + * @fr: motion vector fractional bit probability updates. + * @class0_hp: motion vector class0 high precision fractional bit probability updates. 
+ * @hp: motion vector high precision fractional bit probability updates. + * + * This structure contains new values of motion vector probabilities. + * A value of zero in an array element means there is no update of the relevant probability. + * See `struct v4l2_vp9_prob_updates` for details. + */ +struct v4l2_vp9_mv_probs { + __u8 joint[3]; + __u8 sign[2]; + __u8 classes[2][10]; + __u8 class0_bit[2]; + __u8 bits[2][10]; + __u8 class0_fr[2][2][3]; + __u8 fr[2][3]; + __u8 class0_hp[2]; + __u8 hp[2]; +}; + +#define V4L2_CID_STATELESS_VP9_COMPRESSED_HDR (V4L2_CID_CODEC_STATELESS_BASE + 301) + +#define V4L2_VP9_TX_MODE_ONLY_4X4 0 +#define V4L2_VP9_TX_MODE_ALLOW_8X8 1 +#define V4L2_VP9_TX_MODE_ALLOW_16X16 2 +#define V4L2_VP9_TX_MODE_ALLOW_32X32 3 +#define V4L2_VP9_TX_MODE_SELECT 4 + +/** + * struct v4l2_ctrl_vp9_compressed_hdr - VP9 probability updates control + * @tx_mode: specifies the TX mode. Set to one of V4L2_VP9_TX_MODE_{}. + * @tx8: TX 8x8 probability updates. + * @tx16: TX 16x16 probability updates. + * @tx32: TX 32x32 probability updates. + * @coef: coefficient probability updates. + * @skip: skip probability updates. + * @inter_mode: inter mode probability updates. + * @interp_filter: interpolation filter probability updates. + * @is_inter: is inter-block probability updates. + * @comp_mode: compound prediction mode probability updates. + * @single_ref: single ref probability updates. + * @comp_ref: compound ref probability updates. + * @y_mode: Y prediction mode probability updates. + * @uv_mode: UV prediction mode probability updates. + * @partition: partition probability updates. + * @mv: motion vector probability updates. + * + * This structure holds the probability updates as parsed from the compressed + * header (Spec 6.3). Each value is a probability update after + * being translated with inv_map_table[] (see 6.3.5). A value of zero in an array element + * means that there is no update of the relevant probability.
+ * + * This control is optional and must be used with hardware that is + * not capable of parsing the compressed header itself. Only drivers that need it will + * implement it. + */ +struct v4l2_ctrl_vp9_compressed_hdr { + __u8 tx_mode; + __u8 tx8[2][1]; + __u8 tx16[2][2]; + __u8 tx32[2][3]; + __u8 coef[4][2][2][6][6][3]; + __u8 skip[3]; + __u8 inter_mode[7][3]; + __u8 interp_filter[4][2]; + __u8 is_inter[4]; + __u8 comp_mode[5]; + __u8 single_ref[5][2]; + __u8 comp_ref[5]; + __u8 y_mode[4][9]; + __u8 uv_mode[10][9]; + __u8 partition[16][3]; + + struct v4l2_vp9_mv_probs mv; +}; + /* MPEG-compression definitions kept for backwards compatibility */ #ifndef __KERNEL__ #define V4L2_CTRL_CLASS_MPEG V4L2_CTRL_CLASS_CODEC diff --git a/include/uapi/linux/videodev2.h b/include/uapi/linux/videodev2.h index 58392dcd3bf5..2cd8f7e432c5 100644 --- a/include/uapi/linux/videodev2.h +++ b/include/uapi/linux/videodev2.h @@ -703,6 +703,7 @@ struct v4l2_pix_format { #define V4L2_PIX_FMT_VP8 v4l2_fourcc('V', 'P', '8', '0') /* VP8 */ #define V4L2_PIX_FMT_VP8_FRAME v4l2_fourcc('V', 'P', '8', 'F') /* VP8 parsed frame */ #define V4L2_PIX_FMT_VP9 v4l2_fourcc('V', 'P', '9', '0') /* VP9 */ +#define V4L2_PIX_FMT_VP9_FRAME v4l2_fourcc('V', 'P', '9', 'F') /* VP9 parsed frame */ #define V4L2_PIX_FMT_HEVC v4l2_fourcc('H', 'E', 'V', 'C') /* HEVC aka H.265 */ #define V4L2_PIX_FMT_FWHT v4l2_fourcc('F', 'W', 'H', 'T') /* Fast Walsh Hadamard Transform (vicodec) */ #define V4L2_PIX_FMT_FWHT_STATELESS v4l2_fourcc('S', 'F', 'W', 'H') /* Stateless FWHT (vicodec) */ @@ -1755,6 +1756,8 @@ struct v4l2_ext_control { struct v4l2_ctrl_mpeg2_sequence __user *p_mpeg2_sequence; struct v4l2_ctrl_mpeg2_picture __user *p_mpeg2_picture; struct v4l2_ctrl_mpeg2_quantisation __user *p_mpeg2_quantisation; + struct v4l2_ctrl_vp9_compressed_hdr __user *p_vp9_compressed_hdr_probs; + struct v4l2_ctrl_vp9_frame __user *p_vp9_frame; void __user *ptr; }; } __attribute__ ((packed)); @@ -1819,6 +1822,9 @@
enum v4l2_ctrl_type { V4L2_CTRL_TYPE_MPEG2_QUANTISATION = 0x0250, V4L2_CTRL_TYPE_MPEG2_SEQUENCE = 0x0251, V4L2_CTRL_TYPE_MPEG2_PICTURE = 0x0252, + + V4L2_CTRL_TYPE_VP9_COMPRESSED_HDR = 0x0260, + V4L2_CTRL_TYPE_VP9_FRAME = 0x0261, }; /* Used in the VIDIOC_QUERYCTRL ioctl for querying controls */ From patchwork Mon Sep 27 15:19:54 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrzej Pietrasiewicz X-Patchwork-Id: 12520181 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 48A88C433EF for ; Mon, 27 Sep 2021 15:20:19 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 22A716101A for ; Mon, 27 Sep 2021 15:20:19 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S235091AbhI0PVz (ORCPT ); Mon, 27 Sep 2021 11:21:55 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:41176 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S235140AbhI0PVu (ORCPT ); Mon, 27 Sep 2021 11:21:50 -0400 Received: from bhuna.collabora.co.uk (bhuna.collabora.co.uk [IPv6:2a00:1098:0:82:1000:25:2eeb:e3e3]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 9C7ABC061575; Mon, 27 Sep 2021 08:20:12 -0700 (PDT) Received: from [127.0.0.1] (localhost [127.0.0.1]) (Authenticated sender: andrzej.p) with ESMTPSA id A00B51F42E3E From: Andrzej Pietrasiewicz To: linux-media@vger.kernel.org, linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org, linux-rockchip@lists.infradead.org, linux-staging@lists.linux.dev Cc: Andrzej Pietrasiewicz , Benjamin Gaignard , Boris Brezillon , Ezequiel Garcia , Fabio Estevam , Greg Kroah-Hartman , Hans Verkuil , Heiko Stuebner , Jernej Skrabec , Mauro Carvalho Chehab 
, Nicolas Dufresne , NXP Linux Team , Pengutronix Kernel Team , Philipp Zabel , Sascha Hauer , Shawn Guo , kernel@collabora.com, Ezequiel Garcia Subject: [PATCH v6 06/10] media: Add VP9 v4l2 library Date: Mon, 27 Sep 2021 17:19:54 +0200 Message-Id: <20210927151958.24426-7-andrzej.p@collabora.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20210927151958.24426-1-andrzej.p@collabora.com> References: <20210927151958.24426-1-andrzej.p@collabora.com> Precedence: bulk List-ID: X-Mailing-List: linux-media@vger.kernel.org Provide code common to vp9 drivers in one central location. Signed-off-by: Andrzej Pietrasiewicz Signed-off-by: Ezequiel Garcia --- drivers/media/v4l2-core/Kconfig | 4 + drivers/media/v4l2-core/Makefile | 1 + drivers/media/v4l2-core/v4l2-vp9.c | 1850 ++++++++++++++++++++++++++++ include/media/v4l2-vp9.h | 182 +++ 4 files changed, 2037 insertions(+) create mode 100644 drivers/media/v4l2-core/v4l2-vp9.c create mode 100644 include/media/v4l2-vp9.h diff --git a/drivers/media/v4l2-core/Kconfig b/drivers/media/v4l2-core/Kconfig index 02dc1787e953..6ee75c6c820e 100644 --- a/drivers/media/v4l2-core/Kconfig +++ b/drivers/media/v4l2-core/Kconfig @@ -52,6 +52,10 @@ config V4L2_JPEG_HELPER config V4L2_H264 tristate +# Used by drivers that need v4l2-vp9.ko +config V4L2_VP9 + tristate + # Used by drivers that need v4l2-mem2mem.ko config V4L2_MEM2MEM_DEV tristate diff --git a/drivers/media/v4l2-core/Makefile b/drivers/media/v4l2-core/Makefile index 66a78c556c98..83fac5c746f5 100644 --- a/drivers/media/v4l2-core/Makefile +++ b/drivers/media/v4l2-core/Makefile @@ -24,6 +24,7 @@ obj-$(CONFIG_VIDEO_TUNER) += tuner.o obj-$(CONFIG_V4L2_MEM2MEM_DEV) += v4l2-mem2mem.o obj-$(CONFIG_V4L2_H264) += v4l2-h264.o +obj-$(CONFIG_V4L2_VP9) += v4l2-vp9.o obj-$(CONFIG_V4L2_FLASH_LED_CLASS) += v4l2-flash-led-class.o diff --git a/drivers/media/v4l2-core/v4l2-vp9.c b/drivers/media/v4l2-core/v4l2-vp9.c new file mode 100644 index 000000000000..859589f1fd35 --- /dev/null +++ 
b/drivers/media/v4l2-core/v4l2-vp9.c @@ -0,0 +1,1850 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * V4L2 VP9 helpers. + * + * Copyright (C) 2021 Collabora, Ltd. + * + * Author: Andrzej Pietrasiewicz + */ + +#include + +#include + +const u8 v4l2_vp9_kf_y_mode_prob[10][10][9] = { + { + /* above = dc */ + { 137, 30, 42, 148, 151, 207, 70, 52, 91 }, /*left = dc */ + { 92, 45, 102, 136, 116, 180, 74, 90, 100 }, /*left = v */ + { 73, 32, 19, 187, 222, 215, 46, 34, 100 }, /*left = h */ + { 91, 30, 32, 116, 121, 186, 93, 86, 94 }, /*left = d45 */ + { 72, 35, 36, 149, 68, 206, 68, 63, 105 }, /*left = d135*/ + { 73, 31, 28, 138, 57, 124, 55, 122, 151 }, /*left = d117*/ + { 67, 23, 21, 140, 126, 197, 40, 37, 171 }, /*left = d153*/ + { 86, 27, 28, 128, 154, 212, 45, 43, 53 }, /*left = d207*/ + { 74, 32, 27, 107, 86, 160, 63, 134, 102 }, /*left = d63 */ + { 59, 67, 44, 140, 161, 202, 78, 67, 119 }, /*left = tm */ + }, { /* above = v */ + { 63, 36, 126, 146, 123, 158, 60, 90, 96 }, /*left = dc */ + { 43, 46, 168, 134, 107, 128, 69, 142, 92 }, /*left = v */ + { 44, 29, 68, 159, 201, 177, 50, 57, 77 }, /*left = h */ + { 58, 38, 76, 114, 97, 172, 78, 133, 92 }, /*left = d45 */ + { 46, 41, 76, 140, 63, 184, 69, 112, 57 }, /*left = d135*/ + { 38, 32, 85, 140, 46, 112, 54, 151, 133 }, /*left = d117*/ + { 39, 27, 61, 131, 110, 175, 44, 75, 136 }, /*left = d153*/ + { 52, 30, 74, 113, 130, 175, 51, 64, 58 }, /*left = d207*/ + { 47, 35, 80, 100, 74, 143, 64, 163, 74 }, /*left = d63 */ + { 36, 61, 116, 114, 128, 162, 80, 125, 82 }, /*left = tm */ + }, { /* above = h */ + { 82, 26, 26, 171, 208, 204, 44, 32, 105 }, /*left = dc */ + { 55, 44, 68, 166, 179, 192, 57, 57, 108 }, /*left = v */ + { 42, 26, 11, 199, 241, 228, 23, 15, 85 }, /*left = h */ + { 68, 42, 19, 131, 160, 199, 55, 52, 83 }, /*left = d45 */ + { 58, 50, 25, 139, 115, 232, 39, 52, 118 }, /*left = d135*/ + { 50, 35, 33, 153, 104, 162, 64, 59, 131 }, /*left = d117*/ + { 44, 24, 16, 150, 177, 202, 33, 19, 156 }, /*left = d153*/ 
+ { 55, 27, 12, 153, 203, 218, 26, 27, 49 }, /*left = d207*/ + { 53, 49, 21, 110, 116, 168, 59, 80, 76 }, /*left = d63 */ + { 38, 72, 19, 168, 203, 212, 50, 50, 107 }, /*left = tm */ + }, { /* above = d45 */ + { 103, 26, 36, 129, 132, 201, 83, 80, 93 }, /*left = dc */ + { 59, 38, 83, 112, 103, 162, 98, 136, 90 }, /*left = v */ + { 62, 30, 23, 158, 200, 207, 59, 57, 50 }, /*left = h */ + { 67, 30, 29, 84, 86, 191, 102, 91, 59 }, /*left = d45 */ + { 60, 32, 33, 112, 71, 220, 64, 89, 104 }, /*left = d135*/ + { 53, 26, 34, 130, 56, 149, 84, 120, 103 }, /*left = d117*/ + { 53, 21, 23, 133, 109, 210, 56, 77, 172 }, /*left = d153*/ + { 77, 19, 29, 112, 142, 228, 55, 66, 36 }, /*left = d207*/ + { 61, 29, 29, 93, 97, 165, 83, 175, 162 }, /*left = d63 */ + { 47, 47, 43, 114, 137, 181, 100, 99, 95 }, /*left = tm */ + }, { /* above = d135 */ + { 69, 23, 29, 128, 83, 199, 46, 44, 101 }, /*left = dc */ + { 53, 40, 55, 139, 69, 183, 61, 80, 110 }, /*left = v */ + { 40, 29, 19, 161, 180, 207, 43, 24, 91 }, /*left = h */ + { 60, 34, 19, 105, 61, 198, 53, 64, 89 }, /*left = d45 */ + { 52, 31, 22, 158, 40, 209, 58, 62, 89 }, /*left = d135*/ + { 44, 31, 29, 147, 46, 158, 56, 102, 198 }, /*left = d117*/ + { 35, 19, 12, 135, 87, 209, 41, 45, 167 }, /*left = d153*/ + { 55, 25, 21, 118, 95, 215, 38, 39, 66 }, /*left = d207*/ + { 51, 38, 25, 113, 58, 164, 70, 93, 97 }, /*left = d63 */ + { 47, 54, 34, 146, 108, 203, 72, 103, 151 }, /*left = tm */ + }, { /* above = d117 */ + { 64, 19, 37, 156, 66, 138, 49, 95, 133 }, /*left = dc */ + { 46, 27, 80, 150, 55, 124, 55, 121, 135 }, /*left = v */ + { 36, 23, 27, 165, 149, 166, 54, 64, 118 }, /*left = h */ + { 53, 21, 36, 131, 63, 163, 60, 109, 81 }, /*left = d45 */ + { 40, 26, 35, 154, 40, 185, 51, 97, 123 }, /*left = d135*/ + { 35, 19, 34, 179, 19, 97, 48, 129, 124 }, /*left = d117*/ + { 36, 20, 26, 136, 62, 164, 33, 77, 154 }, /*left = d153*/ + { 45, 18, 32, 130, 90, 157, 40, 79, 91 }, /*left = d207*/ + { 45, 26, 28, 129, 45, 129, 49, 147, 123 
}, /*left = d63 */ + { 38, 44, 51, 136, 74, 162, 57, 97, 121 }, /*left = tm */ + }, { /* above = d153 */ + { 75, 17, 22, 136, 138, 185, 32, 34, 166 }, /*left = dc */ + { 56, 39, 58, 133, 117, 173, 48, 53, 187 }, /*left = v */ + { 35, 21, 12, 161, 212, 207, 20, 23, 145 }, /*left = h */ + { 56, 29, 19, 117, 109, 181, 55, 68, 112 }, /*left = d45 */ + { 47, 29, 17, 153, 64, 220, 59, 51, 114 }, /*left = d135*/ + { 46, 16, 24, 136, 76, 147, 41, 64, 172 }, /*left = d117*/ + { 34, 17, 11, 108, 152, 187, 13, 15, 209 }, /*left = d153*/ + { 51, 24, 14, 115, 133, 209, 32, 26, 104 }, /*left = d207*/ + { 55, 30, 18, 122, 79, 179, 44, 88, 116 }, /*left = d63 */ + { 37, 49, 25, 129, 168, 164, 41, 54, 148 }, /*left = tm */ + }, { /* above = d207 */ + { 82, 22, 32, 127, 143, 213, 39, 41, 70 }, /*left = dc */ + { 62, 44, 61, 123, 105, 189, 48, 57, 64 }, /*left = v */ + { 47, 25, 17, 175, 222, 220, 24, 30, 86 }, /*left = h */ + { 68, 36, 17, 106, 102, 206, 59, 74, 74 }, /*left = d45 */ + { 57, 39, 23, 151, 68, 216, 55, 63, 58 }, /*left = d135*/ + { 49, 30, 35, 141, 70, 168, 82, 40, 115 }, /*left = d117*/ + { 51, 25, 15, 136, 129, 202, 38, 35, 139 }, /*left = d153*/ + { 68, 26, 16, 111, 141, 215, 29, 28, 28 }, /*left = d207*/ + { 59, 39, 19, 114, 75, 180, 77, 104, 42 }, /*left = d63 */ + { 40, 61, 26, 126, 152, 206, 61, 59, 93 }, /*left = tm */ + }, { /* above = d63 */ + { 78, 23, 39, 111, 117, 170, 74, 124, 94 }, /*left = dc */ + { 48, 34, 86, 101, 92, 146, 78, 179, 134 }, /*left = v */ + { 47, 22, 24, 138, 187, 178, 68, 69, 59 }, /*left = h */ + { 56, 25, 33, 105, 112, 187, 95, 177, 129 }, /*left = d45 */ + { 48, 31, 27, 114, 63, 183, 82, 116, 56 }, /*left = d135*/ + { 43, 28, 37, 121, 63, 123, 61, 192, 169 }, /*left = d117*/ + { 42, 17, 24, 109, 97, 177, 56, 76, 122 }, /*left = d153*/ + { 58, 18, 28, 105, 139, 182, 70, 92, 63 }, /*left = d207*/ + { 46, 23, 32, 74, 86, 150, 67, 183, 88 }, /*left = d63 */ + { 36, 38, 48, 92, 122, 165, 88, 137, 91 }, /*left = tm */ + }, { /* above = tm 
*/ + { 65, 70, 60, 155, 159, 199, 61, 60, 81 }, /*left = dc */ + { 44, 78, 115, 132, 119, 173, 71, 112, 93 }, /*left = v */ + { 39, 38, 21, 184, 227, 206, 42, 32, 64 }, /*left = h */ + { 58, 47, 36, 124, 137, 193, 80, 82, 78 }, /*left = d45 */ + { 49, 50, 35, 144, 95, 205, 63, 78, 59 }, /*left = d135*/ + { 41, 53, 52, 148, 71, 142, 65, 128, 51 }, /*left = d117*/ + { 40, 36, 28, 143, 143, 202, 40, 55, 137 }, /*left = d153*/ + { 52, 34, 29, 129, 183, 227, 42, 35, 43 }, /*left = d207*/ + { 42, 44, 44, 104, 105, 164, 64, 130, 80 }, /*left = d63 */ + { 43, 81, 53, 140, 169, 204, 68, 84, 72 }, /*left = tm */ + } +}; +EXPORT_SYMBOL_GPL(v4l2_vp9_kf_y_mode_prob); + +const u8 v4l2_vp9_kf_partition_probs[16][3] = { + /* 8x8 -> 4x4 */ + { 158, 97, 94 }, /* a/l both not split */ + { 93, 24, 99 }, /* a split, l not split */ + { 85, 119, 44 }, /* l split, a not split */ + { 62, 59, 67 }, /* a/l both split */ + /* 16x16 -> 8x8 */ + { 149, 53, 53 }, /* a/l both not split */ + { 94, 20, 48 }, /* a split, l not split */ + { 83, 53, 24 }, /* l split, a not split */ + { 52, 18, 18 }, /* a/l both split */ + /* 32x32 -> 16x16 */ + { 150, 40, 39 }, /* a/l both not split */ + { 78, 12, 26 }, /* a split, l not split */ + { 67, 33, 11 }, /* l split, a not split */ + { 24, 7, 5 }, /* a/l both split */ + /* 64x64 -> 32x32 */ + { 174, 35, 49 }, /* a/l both not split */ + { 68, 11, 27 }, /* a split, l not split */ + { 57, 15, 9 }, /* l split, a not split */ + { 12, 3, 3 }, /* a/l both split */ +}; +EXPORT_SYMBOL_GPL(v4l2_vp9_kf_partition_probs); + +const u8 v4l2_vp9_kf_uv_mode_prob[10][9] = { + { 144, 11, 54, 157, 195, 130, 46, 58, 108 }, /* y = dc */ + { 118, 15, 123, 148, 131, 101, 44, 93, 131 }, /* y = v */ + { 113, 12, 23, 188, 226, 142, 26, 32, 125 }, /* y = h */ + { 120, 11, 50, 123, 163, 135, 64, 77, 103 }, /* y = d45 */ + { 113, 9, 36, 155, 111, 157, 32, 44, 161 }, /* y = d135 */ + { 116, 9, 55, 176, 76, 96, 37, 61, 149 }, /* y = d117 */ + { 115, 9, 28, 141, 161, 167, 21, 25, 193 }, /* y 
= d153 */ + { 120, 12, 32, 145, 195, 142, 32, 38, 86 }, /* y = d207 */ + { 116, 12, 64, 120, 140, 125, 49, 115, 121 }, /* y = d63 */ + { 102, 19, 66, 162, 182, 122, 35, 59, 128 } /* y = tm */ +}; +EXPORT_SYMBOL_GPL(v4l2_vp9_kf_uv_mode_prob); + +const struct v4l2_vp9_frame_context v4l2_vp9_default_probs = { + .tx8 = { + { 100 }, + { 66 }, + }, + .tx16 = { + { 20, 152 }, + { 15, 101 }, + }, + .tx32 = { + { 3, 136, 37 }, + { 5, 52, 13 }, + }, + .coef = { + { /* tx = 4x4 */ + { /* block Type 0 */ + { /* Intra */ + { /* Coeff Band 0 */ + { 195, 29, 183 }, + { 84, 49, 136 }, + { 8, 42, 71 }, + { 0, 0, 0 }, + { 0, 0, 0 }, + { 0, 0, 0 }, + }, + { /* Coeff Band 1 */ + { 31, 107, 169 }, + { 35, 99, 159 }, + { 17, 82, 140 }, + { 8, 66, 114 }, + { 2, 44, 76 }, + { 1, 19, 32 }, + }, + { /* Coeff Band 2 */ + { 40, 132, 201 }, + { 29, 114, 187 }, + { 13, 91, 157 }, + { 7, 75, 127 }, + { 3, 58, 95 }, + { 1, 28, 47 }, + }, + { /* Coeff Band 3 */ + { 69, 142, 221 }, + { 42, 122, 201 }, + { 15, 91, 159 }, + { 6, 67, 121 }, + { 1, 42, 77 }, + { 1, 17, 31 }, + }, + { /* Coeff Band 4 */ + { 102, 148, 228 }, + { 67, 117, 204 }, + { 17, 82, 154 }, + { 6, 59, 114 }, + { 2, 39, 75 }, + { 1, 15, 29 }, + }, + { /* Coeff Band 5 */ + { 156, 57, 233 }, + { 119, 57, 212 }, + { 58, 48, 163 }, + { 29, 40, 124 }, + { 12, 30, 81 }, + { 3, 12, 31 } + }, + }, + { /* Inter */ + { /* Coeff Band 0 */ + { 191, 107, 226 }, + { 124, 117, 204 }, + { 25, 99, 155 }, + { 0, 0, 0 }, + { 0, 0, 0 }, + { 0, 0, 0 }, + }, + { /* Coeff Band 1 */ + { 29, 148, 210 }, + { 37, 126, 194 }, + { 8, 93, 157 }, + { 2, 68, 118 }, + { 1, 39, 69 }, + { 1, 17, 33 }, + }, + { /* Coeff Band 2 */ + { 41, 151, 213 }, + { 27, 123, 193 }, + { 3, 82, 144 }, + { 1, 58, 105 }, + { 1, 32, 60 }, + { 1, 13, 26 }, + }, + { /* Coeff Band 3 */ + { 59, 159, 220 }, + { 23, 126, 198 }, + { 4, 88, 151 }, + { 1, 66, 114 }, + { 1, 38, 71 }, + { 1, 18, 34 }, + }, + { /* Coeff Band 4 */ + { 114, 136, 232 }, + { 51, 114, 207 }, + { 11, 83, 155 }, + { 3, 
56, 105 }, + { 1, 33, 65 }, + { 1, 17, 34 }, + }, + { /* Coeff Band 5 */ + { 149, 65, 234 }, + { 121, 57, 215 }, + { 61, 49, 166 }, + { 28, 36, 114 }, + { 12, 25, 76 }, + { 3, 16, 42 }, + }, + }, + }, + { /* block Type 1 */ + { /* Intra */ + { /* Coeff Band 0 */ + { 214, 49, 220 }, + { 132, 63, 188 }, + { 42, 65, 137 }, + { 0, 0, 0 }, + { 0, 0, 0 }, + { 0, 0, 0 }, + }, + { /* Coeff Band 1 */ + { 85, 137, 221 }, + { 104, 131, 216 }, + { 49, 111, 192 }, + { 21, 87, 155 }, + { 2, 49, 87 }, + { 1, 16, 28 }, + }, + { /* Coeff Band 2 */ + { 89, 163, 230 }, + { 90, 137, 220 }, + { 29, 100, 183 }, + { 10, 70, 135 }, + { 2, 42, 81 }, + { 1, 17, 33 }, + }, + { /* Coeff Band 3 */ + { 108, 167, 237 }, + { 55, 133, 222 }, + { 15, 97, 179 }, + { 4, 72, 135 }, + { 1, 45, 85 }, + { 1, 19, 38 }, + }, + { /* Coeff Band 4 */ + { 124, 146, 240 }, + { 66, 124, 224 }, + { 17, 88, 175 }, + { 4, 58, 122 }, + { 1, 36, 75 }, + { 1, 18, 37 }, + }, + { /* Coeff Band 5 */ + { 141, 79, 241 }, + { 126, 70, 227 }, + { 66, 58, 182 }, + { 30, 44, 136 }, + { 12, 34, 96 }, + { 2, 20, 47 }, + }, + }, + { /* Inter */ + { /* Coeff Band 0 */ + { 229, 99, 249 }, + { 143, 111, 235 }, + { 46, 109, 192 }, + { 0, 0, 0 }, + { 0, 0, 0 }, + { 0, 0, 0 }, + }, + { /* Coeff Band 1 */ + { 82, 158, 236 }, + { 94, 146, 224 }, + { 25, 117, 191 }, + { 9, 87, 149 }, + { 3, 56, 99 }, + { 1, 33, 57 }, + }, + { /* Coeff Band 2 */ + { 83, 167, 237 }, + { 68, 145, 222 }, + { 10, 103, 177 }, + { 2, 72, 131 }, + { 1, 41, 79 }, + { 1, 20, 39 }, + }, + { /* Coeff Band 3 */ + { 99, 167, 239 }, + { 47, 141, 224 }, + { 10, 104, 178 }, + { 2, 73, 133 }, + { 1, 44, 85 }, + { 1, 22, 47 }, + }, + { /* Coeff Band 4 */ + { 127, 145, 243 }, + { 71, 129, 228 }, + { 17, 93, 177 }, + { 3, 61, 124 }, + { 1, 41, 84 }, + { 1, 21, 52 }, + }, + { /* Coeff Band 5 */ + { 157, 78, 244 }, + { 140, 72, 231 }, + { 69, 58, 184 }, + { 31, 44, 137 }, + { 14, 38, 105 }, + { 8, 23, 61 }, + }, + }, + }, + }, + { /* tx = 8x8 */ + { /* block Type 0 */ + { /* 
Intra */ + { /* Coeff Band 0 */ + { 125, 34, 187 }, + { 52, 41, 133 }, + { 6, 31, 56 }, + { 0, 0, 0 }, + { 0, 0, 0 }, + { 0, 0, 0 }, + }, + { /* Coeff Band 1 */ + { 37, 109, 153 }, + { 51, 102, 147 }, + { 23, 87, 128 }, + { 8, 67, 101 }, + { 1, 41, 63 }, + { 1, 19, 29 }, + }, + { /* Coeff Band 2 */ + { 31, 154, 185 }, + { 17, 127, 175 }, + { 6, 96, 145 }, + { 2, 73, 114 }, + { 1, 51, 82 }, + { 1, 28, 45 }, + }, + { /* Coeff Band 3 */ + { 23, 163, 200 }, + { 10, 131, 185 }, + { 2, 93, 148 }, + { 1, 67, 111 }, + { 1, 41, 69 }, + { 1, 14, 24 }, + }, + { /* Coeff Band 4 */ + { 29, 176, 217 }, + { 12, 145, 201 }, + { 3, 101, 156 }, + { 1, 69, 111 }, + { 1, 39, 63 }, + { 1, 14, 23 }, + }, + { /* Coeff Band 5 */ + { 57, 192, 233 }, + { 25, 154, 215 }, + { 6, 109, 167 }, + { 3, 78, 118 }, + { 1, 48, 69 }, + { 1, 21, 29 }, + }, + }, + { /* Inter */ + { /* Coeff Band 0 */ + { 202, 105, 245 }, + { 108, 106, 216 }, + { 18, 90, 144 }, + { 0, 0, 0 }, + { 0, 0, 0 }, + { 0, 0, 0 }, + }, + { /* Coeff Band 1 */ + { 33, 172, 219 }, + { 64, 149, 206 }, + { 14, 117, 177 }, + { 5, 90, 141 }, + { 2, 61, 95 }, + { 1, 37, 57 }, + }, + { /* Coeff Band 2 */ + { 33, 179, 220 }, + { 11, 140, 198 }, + { 1, 89, 148 }, + { 1, 60, 104 }, + { 1, 33, 57 }, + { 1, 12, 21 }, + }, + { /* Coeff Band 3 */ + { 30, 181, 221 }, + { 8, 141, 198 }, + { 1, 87, 145 }, + { 1, 58, 100 }, + { 1, 31, 55 }, + { 1, 12, 20 }, + }, + { /* Coeff Band 4 */ + { 32, 186, 224 }, + { 7, 142, 198 }, + { 1, 86, 143 }, + { 1, 58, 100 }, + { 1, 31, 55 }, + { 1, 12, 22 }, + }, + { /* Coeff Band 5 */ + { 57, 192, 227 }, + { 20, 143, 204 }, + { 3, 96, 154 }, + { 1, 68, 112 }, + { 1, 42, 69 }, + { 1, 19, 32 }, + }, + }, + }, + { /* block Type 1 */ + { /* Intra */ + { /* Coeff Band 0 */ + { 212, 35, 215 }, + { 113, 47, 169 }, + { 29, 48, 105 }, + { 0, 0, 0 }, + { 0, 0, 0 }, + { 0, 0, 0 }, + }, + { /* Coeff Band 1 */ + { 74, 129, 203 }, + { 106, 120, 203 }, + { 49, 107, 178 }, + { 19, 84, 144 }, + { 4, 50, 84 }, + { 1, 15, 25 }, + }, 
+ { /* Coeff Band 2 */ + { 71, 172, 217 }, + { 44, 141, 209 }, + { 15, 102, 173 }, + { 6, 76, 133 }, + { 2, 51, 89 }, + { 1, 24, 42 }, + }, + { /* Coeff Band 3 */ + { 64, 185, 231 }, + { 31, 148, 216 }, + { 8, 103, 175 }, + { 3, 74, 131 }, + { 1, 46, 81 }, + { 1, 18, 30 }, + }, + { /* Coeff Band 4 */ + { 65, 196, 235 }, + { 25, 157, 221 }, + { 5, 105, 174 }, + { 1, 67, 120 }, + { 1, 38, 69 }, + { 1, 15, 30 }, + }, + { /* Coeff Band 5 */ + { 65, 204, 238 }, + { 30, 156, 224 }, + { 7, 107, 177 }, + { 2, 70, 124 }, + { 1, 42, 73 }, + { 1, 18, 34 }, + }, + }, + { /* Inter */ + { /* Coeff Band 0 */ + { 225, 86, 251 }, + { 144, 104, 235 }, + { 42, 99, 181 }, + { 0, 0, 0 }, + { 0, 0, 0 }, + { 0, 0, 0 }, + }, + { /* Coeff Band 1 */ + { 85, 175, 239 }, + { 112, 165, 229 }, + { 29, 136, 200 }, + { 12, 103, 162 }, + { 6, 77, 123 }, + { 2, 53, 84 }, + }, + { /* Coeff Band 2 */ + { 75, 183, 239 }, + { 30, 155, 221 }, + { 3, 106, 171 }, + { 1, 74, 128 }, + { 1, 44, 76 }, + { 1, 17, 28 }, + }, + { /* Coeff Band 3 */ + { 73, 185, 240 }, + { 27, 159, 222 }, + { 2, 107, 172 }, + { 1, 75, 127 }, + { 1, 42, 73 }, + { 1, 17, 29 }, + }, + { /* Coeff Band 4 */ + { 62, 190, 238 }, + { 21, 159, 222 }, + { 2, 107, 172 }, + { 1, 72, 122 }, + { 1, 40, 71 }, + { 1, 18, 32 }, + }, + { /* Coeff Band 5 */ + { 61, 199, 240 }, + { 27, 161, 226 }, + { 4, 113, 180 }, + { 1, 76, 129 }, + { 1, 46, 80 }, + { 1, 23, 41 }, + }, + }, + }, + }, + { /* tx = 16x16 */ + { /* block Type 0 */ + { /* Intra */ + { /* Coeff Band 0 */ + { 7, 27, 153 }, + { 5, 30, 95 }, + { 1, 16, 30 }, + { 0, 0, 0 }, + { 0, 0, 0 }, + { 0, 0, 0 }, + }, + { /* Coeff Band 1 */ + { 50, 75, 127 }, + { 57, 75, 124 }, + { 27, 67, 108 }, + { 10, 54, 86 }, + { 1, 33, 52 }, + { 1, 12, 18 }, + }, + { /* Coeff Band 2 */ + { 43, 125, 151 }, + { 26, 108, 148 }, + { 7, 83, 122 }, + { 2, 59, 89 }, + { 1, 38, 60 }, + { 1, 17, 27 }, + }, + { /* Coeff Band 3 */ + { 23, 144, 163 }, + { 13, 112, 154 }, + { 2, 75, 117 }, + { 1, 50, 81 }, + { 1, 31, 51 }, 
+ { 1, 14, 23 }, + }, + { /* Coeff Band 4 */ + { 18, 162, 185 }, + { 6, 123, 171 }, + { 1, 78, 125 }, + { 1, 51, 86 }, + { 1, 31, 54 }, + { 1, 14, 23 }, + }, + { /* Coeff Band 5 */ + { 15, 199, 227 }, + { 3, 150, 204 }, + { 1, 91, 146 }, + { 1, 55, 95 }, + { 1, 30, 53 }, + { 1, 11, 20 }, + } + }, + { /* Inter */ + { /* Coeff Band 0 */ + { 19, 55, 240 }, + { 19, 59, 196 }, + { 3, 52, 105 }, + { 0, 0, 0 }, + { 0, 0, 0 }, + { 0, 0, 0 }, + }, + { /* Coeff Band 1 */ + { 41, 166, 207 }, + { 104, 153, 199 }, + { 31, 123, 181 }, + { 14, 101, 152 }, + { 5, 72, 106 }, + { 1, 36, 52 }, + }, + { /* Coeff Band 2 */ + { 35, 176, 211 }, + { 12, 131, 190 }, + { 2, 88, 144 }, + { 1, 60, 101 }, + { 1, 36, 60 }, + { 1, 16, 28 }, + }, + { /* Coeff Band 3 */ + { 28, 183, 213 }, + { 8, 134, 191 }, + { 1, 86, 142 }, + { 1, 56, 96 }, + { 1, 30, 53 }, + { 1, 12, 20 }, + }, + { /* Coeff Band 4 */ + { 20, 190, 215 }, + { 4, 135, 192 }, + { 1, 84, 139 }, + { 1, 53, 91 }, + { 1, 28, 49 }, + { 1, 11, 20 }, + }, + { /* Coeff Band 5 */ + { 13, 196, 216 }, + { 2, 137, 192 }, + { 1, 86, 143 }, + { 1, 57, 99 }, + { 1, 32, 56 }, + { 1, 13, 24 }, + }, + }, + }, + { /* block Type 1 */ + { /* Intra */ + { /* Coeff Band 0 */ + { 211, 29, 217 }, + { 96, 47, 156 }, + { 22, 43, 87 }, + { 0, 0, 0 }, + { 0, 0, 0 }, + { 0, 0, 0 }, + }, + { /* Coeff Band 1 */ + { 78, 120, 193 }, + { 111, 116, 186 }, + { 46, 102, 164 }, + { 15, 80, 128 }, + { 2, 49, 76 }, + { 1, 18, 28 }, + }, + { /* Coeff Band 2 */ + { 71, 161, 203 }, + { 42, 132, 192 }, + { 10, 98, 150 }, + { 3, 69, 109 }, + { 1, 44, 70 }, + { 1, 18, 29 }, + }, + { /* Coeff Band 3 */ + { 57, 186, 211 }, + { 30, 140, 196 }, + { 4, 93, 146 }, + { 1, 62, 102 }, + { 1, 38, 65 }, + { 1, 16, 27 }, + }, + { /* Coeff Band 4 */ + { 47, 199, 217 }, + { 14, 145, 196 }, + { 1, 88, 142 }, + { 1, 57, 98 }, + { 1, 36, 62 }, + { 1, 15, 26 }, + }, + { /* Coeff Band 5 */ + { 26, 219, 229 }, + { 5, 155, 207 }, + { 1, 94, 151 }, + { 1, 60, 104 }, + { 1, 36, 62 }, + { 1, 16, 28 }, 
+ } + }, + { /* Inter */ + { /* Coeff Band 0 */ + { 233, 29, 248 }, + { 146, 47, 220 }, + { 43, 52, 140 }, + { 0, 0, 0 }, + { 0, 0, 0 }, + { 0, 0, 0 }, + }, + { /* Coeff Band 1 */ + { 100, 163, 232 }, + { 179, 161, 222 }, + { 63, 142, 204 }, + { 37, 113, 174 }, + { 26, 89, 137 }, + { 18, 68, 97 }, + }, + { /* Coeff Band 2 */ + { 85, 181, 230 }, + { 32, 146, 209 }, + { 7, 100, 164 }, + { 3, 71, 121 }, + { 1, 45, 77 }, + { 1, 18, 30 }, + }, + { /* Coeff Band 3 */ + { 65, 187, 230 }, + { 20, 148, 207 }, + { 2, 97, 159 }, + { 1, 68, 116 }, + { 1, 40, 70 }, + { 1, 14, 29 }, + }, + { /* Coeff Band 4 */ + { 40, 194, 227 }, + { 8, 147, 204 }, + { 1, 94, 155 }, + { 1, 65, 112 }, + { 1, 39, 66 }, + { 1, 14, 26 }, + }, + { /* Coeff Band 5 */ + { 16, 208, 228 }, + { 3, 151, 207 }, + { 1, 98, 160 }, + { 1, 67, 117 }, + { 1, 41, 74 }, + { 1, 17, 31 }, + }, + }, + }, + }, + { /* tx = 32x32 */ + { /* block Type 0 */ + { /* Intra */ + { /* Coeff Band 0 */ + { 17, 38, 140 }, + { 7, 34, 80 }, + { 1, 17, 29 }, + { 0, 0, 0 }, + { 0, 0, 0 }, + { 0, 0, 0 }, + }, + { /* Coeff Band 1 */ + { 37, 75, 128 }, + { 41, 76, 128 }, + { 26, 66, 116 }, + { 12, 52, 94 }, + { 2, 32, 55 }, + { 1, 10, 16 }, + }, + { /* Coeff Band 2 */ + { 50, 127, 154 }, + { 37, 109, 152 }, + { 16, 82, 121 }, + { 5, 59, 85 }, + { 1, 35, 54 }, + { 1, 13, 20 }, + }, + { /* Coeff Band 3 */ + { 40, 142, 167 }, + { 17, 110, 157 }, + { 2, 71, 112 }, + { 1, 44, 72 }, + { 1, 27, 45 }, + { 1, 11, 17 }, + }, + { /* Coeff Band 4 */ + { 30, 175, 188 }, + { 9, 124, 169 }, + { 1, 74, 116 }, + { 1, 48, 78 }, + { 1, 30, 49 }, + { 1, 11, 18 }, + }, + { /* Coeff Band 5 */ + { 10, 222, 223 }, + { 2, 150, 194 }, + { 1, 83, 128 }, + { 1, 48, 79 }, + { 1, 27, 45 }, + { 1, 11, 17 }, + }, + }, + { /* Inter */ + { /* Coeff Band 0 */ + { 36, 41, 235 }, + { 29, 36, 193 }, + { 10, 27, 111 }, + { 0, 0, 0 }, + { 0, 0, 0 }, + { 0, 0, 0 }, + }, + { /* Coeff Band 1 */ + { 85, 165, 222 }, + { 177, 162, 215 }, + { 110, 135, 195 }, + { 57, 113, 168 }, + { 
23, 83, 120 }, + { 10, 49, 61 }, + }, + { /* Coeff Band 2 */ + { 85, 190, 223 }, + { 36, 139, 200 }, + { 5, 90, 146 }, + { 1, 60, 103 }, + { 1, 38, 65 }, + { 1, 18, 30 }, + }, + { /* Coeff Band 3 */ + { 72, 202, 223 }, + { 23, 141, 199 }, + { 2, 86, 140 }, + { 1, 56, 97 }, + { 1, 36, 61 }, + { 1, 16, 27 }, + }, + { /* Coeff Band 4 */ + { 55, 218, 225 }, + { 13, 145, 200 }, + { 1, 86, 141 }, + { 1, 57, 99 }, + { 1, 35, 61 }, + { 1, 13, 22 }, + }, + { /* Coeff Band 5 */ + { 15, 235, 212 }, + { 1, 132, 184 }, + { 1, 84, 139 }, + { 1, 57, 97 }, + { 1, 34, 56 }, + { 1, 14, 23 }, + }, + }, + }, + { /* block Type 1 */ + { /* Intra */ + { /* Coeff Band 0 */ + { 181, 21, 201 }, + { 61, 37, 123 }, + { 10, 38, 71 }, + { 0, 0, 0 }, + { 0, 0, 0 }, + { 0, 0, 0 }, + }, + { /* Coeff Band 1 */ + { 47, 106, 172 }, + { 95, 104, 173 }, + { 42, 93, 159 }, + { 18, 77, 131 }, + { 4, 50, 81 }, + { 1, 17, 23 }, + }, + { /* Coeff Band 2 */ + { 62, 147, 199 }, + { 44, 130, 189 }, + { 28, 102, 154 }, + { 18, 75, 115 }, + { 2, 44, 65 }, + { 1, 12, 19 }, + }, + { /* Coeff Band 3 */ + { 55, 153, 210 }, + { 24, 130, 194 }, + { 3, 93, 146 }, + { 1, 61, 97 }, + { 1, 31, 50 }, + { 1, 10, 16 }, + }, + { /* Coeff Band 4 */ + { 49, 186, 223 }, + { 17, 148, 204 }, + { 1, 96, 142 }, + { 1, 53, 83 }, + { 1, 26, 44 }, + { 1, 11, 17 }, + }, + { /* Coeff Band 5 */ + { 13, 217, 212 }, + { 2, 136, 180 }, + { 1, 78, 124 }, + { 1, 50, 83 }, + { 1, 29, 49 }, + { 1, 14, 23 }, + }, + }, + { /* Inter */ + { /* Coeff Band 0 */ + { 197, 13, 247 }, + { 82, 17, 222 }, + { 25, 17, 162 }, + { 0, 0, 0 }, + { 0, 0, 0 }, + { 0, 0, 0 }, + }, + { /* Coeff Band 1 */ + { 126, 186, 247 }, + { 234, 191, 243 }, + { 176, 177, 234 }, + { 104, 158, 220 }, + { 66, 128, 186 }, + { 55, 90, 137 }, + }, + { /* Coeff Band 2 */ + { 111, 197, 242 }, + { 46, 158, 219 }, + { 9, 104, 171 }, + { 2, 65, 125 }, + { 1, 44, 80 }, + { 1, 17, 91 }, + }, + { /* Coeff Band 3 */ + { 104, 208, 245 }, + { 39, 168, 224 }, + { 3, 109, 162 }, + { 1, 79, 124 }, 
+ { 1, 50, 102 }, + { 1, 43, 102 }, + }, + { /* Coeff Band 4 */ + { 84, 220, 246 }, + { 31, 177, 231 }, + { 2, 115, 180 }, + { 1, 79, 134 }, + { 1, 55, 77 }, + { 1, 60, 79 }, + }, + { /* Coeff Band 5 */ + { 43, 243, 240 }, + { 8, 180, 217 }, + { 1, 115, 166 }, + { 1, 84, 121 }, + { 1, 51, 67 }, + { 1, 16, 6 }, + }, + }, + }, + }, + }, + + .skip = { 192, 128, 64 }, + .inter_mode = { + { 2, 173, 34 }, + { 7, 145, 85 }, + { 7, 166, 63 }, + { 7, 94, 66 }, + { 8, 64, 46 }, + { 17, 81, 31 }, + { 25, 29, 30 }, + }, + .interp_filter = { + { 235, 162 }, + { 36, 255 }, + { 34, 3 }, + { 149, 144 }, + }, + .is_inter = { 9, 102, 187, 225 }, + .comp_mode = { 239, 183, 119, 96, 41 }, + .single_ref = { + { 33, 16 }, + { 77, 74 }, + { 142, 142 }, + { 172, 170 }, + { 238, 247 }, + }, + .comp_ref = { 50, 126, 123, 221, 226 }, + .y_mode = { + { 65, 32, 18, 144, 162, 194, 41, 51, 98 }, + { 132, 68, 18, 165, 217, 196, 45, 40, 78 }, + { 173, 80, 19, 176, 240, 193, 64, 35, 46 }, + { 221, 135, 38, 194, 248, 121, 96, 85, 29 }, + }, + .uv_mode = { + { 120, 7, 76, 176, 208, 126, 28, 54, 103 } /* y = dc */, + { 48, 12, 154, 155, 139, 90, 34, 117, 119 } /* y = v */, + { 67, 6, 25, 204, 243, 158, 13, 21, 96 } /* y = h */, + { 97, 5, 44, 131, 176, 139, 48, 68, 97 } /* y = d45 */, + { 83, 5, 42, 156, 111, 152, 26, 49, 152 } /* y = d135 */, + { 80, 5, 58, 178, 74, 83, 33, 62, 145 } /* y = d117 */, + { 86, 5, 32, 154, 192, 168, 14, 22, 163 } /* y = d153 */, + { 85, 5, 32, 156, 216, 148, 19, 29, 73 } /* y = d207 */, + { 77, 7, 64, 116, 132, 122, 37, 126, 120 } /* y = d63 */, + { 101, 21, 107, 181, 192, 103, 19, 67, 125 } /* y = tm */ + }, + .partition = { + /* 8x8 -> 4x4 */ + { 199, 122, 141 } /* a/l both not split */, + { 147, 63, 159 } /* a split, l not split */, + { 148, 133, 118 } /* l split, a not split */, + { 121, 104, 114 } /* a/l both split */, + /* 16x16 -> 8x8 */ + { 174, 73, 87 } /* a/l both not split */, + { 92, 41, 83 } /* a split, l not split */, + { 82, 99, 50 } /* l split, a not 
split */, + { 53, 39, 39 } /* a/l both split */, + /* 32x32 -> 16x16 */ + { 177, 58, 59 } /* a/l both not split */, + { 68, 26, 63 } /* a split, l not split */, + { 52, 79, 25 } /* l split, a not split */, + { 17, 14, 12 } /* a/l both split */, + /* 64x64 -> 32x32 */ + { 222, 34, 30 } /* a/l both not split */, + { 72, 16, 44 } /* a split, l not split */, + { 58, 32, 12 } /* l split, a not split */, + { 10, 7, 6 } /* a/l both split */, + }, + + .mv = { + .joint = { 32, 64, 96 }, + .sign = { 128, 128 }, + .classes = { + { 224, 144, 192, 168, 192, 176, 192, 198, 198, 245 }, + { 216, 128, 176, 160, 176, 176, 192, 198, 198, 208 }, + }, + .class0_bit = { 216, 208 }, + .bits = { + { 136, 140, 148, 160, 176, 192, 224, 234, 234, 240}, + { 136, 140, 148, 160, 176, 192, 224, 234, 234, 240}, + }, + .class0_fr = { + { + { 128, 128, 64 }, + { 96, 112, 64 }, + }, + { + { 128, 128, 64 }, + { 96, 112, 64 }, + }, + }, + .fr = { + { 64, 96, 64 }, + { 64, 96, 64 }, + }, + .class0_hp = { 160, 160 }, + .hp = { 128, 128 }, + }, +}; +EXPORT_SYMBOL_GPL(v4l2_vp9_default_probs); + +static u32 fastdiv(u32 dividend, u16 divisor) +{ +#define DIV_INV(d) ((u32)(((1ULL << 32) + ((d) - 1)) / (d))) +#define DIVS_INV(d0, d1, d2, d3, d4, d5, d6, d7, d8, d9) \ + DIV_INV(d0), DIV_INV(d1), DIV_INV(d2), DIV_INV(d3), \ + DIV_INV(d4), DIV_INV(d5), DIV_INV(d6), DIV_INV(d7), \ + DIV_INV(d8), DIV_INV(d9) + + static const u32 inv[] = { + DIV_INV(2), DIV_INV(3), DIV_INV(4), DIV_INV(5), + DIV_INV(6), DIV_INV(7), DIV_INV(8), DIV_INV(9), + DIVS_INV(10, 11, 12, 13, 14, 15, 16, 17, 18, 19), + DIVS_INV(20, 21, 22, 23, 24, 25, 26, 27, 28, 29), + DIVS_INV(30, 31, 32, 33, 34, 35, 36, 37, 38, 39), + DIVS_INV(40, 41, 42, 43, 44, 45, 46, 47, 48, 49), + DIVS_INV(50, 51, 52, 53, 54, 55, 56, 57, 58, 59), + DIVS_INV(60, 61, 62, 63, 64, 65, 66, 67, 68, 69), + DIVS_INV(70, 71, 72, 73, 74, 75, 76, 77, 78, 79), + DIVS_INV(80, 81, 82, 83, 84, 85, 86, 87, 88, 89), + DIVS_INV(90, 91, 92, 93, 94, 95, 96, 97, 98, 99), + DIVS_INV(100, 
101, 102, 103, 104, 105, 106, 107, 108, 109), + DIVS_INV(110, 111, 112, 113, 114, 115, 116, 117, 118, 119), + DIVS_INV(120, 121, 122, 123, 124, 125, 126, 127, 128, 129), + DIVS_INV(130, 131, 132, 133, 134, 135, 136, 137, 138, 139), + DIVS_INV(140, 141, 142, 143, 144, 145, 146, 147, 148, 149), + DIVS_INV(150, 151, 152, 153, 154, 155, 156, 157, 158, 159), + DIVS_INV(160, 161, 162, 163, 164, 165, 166, 167, 168, 169), + DIVS_INV(170, 171, 172, 173, 174, 175, 176, 177, 178, 179), + DIVS_INV(180, 181, 182, 183, 184, 185, 186, 187, 188, 189), + DIVS_INV(190, 191, 192, 193, 194, 195, 196, 197, 198, 199), + DIVS_INV(200, 201, 202, 203, 204, 205, 206, 207, 208, 209), + DIVS_INV(210, 211, 212, 213, 214, 215, 216, 217, 218, 219), + DIVS_INV(220, 221, 222, 223, 224, 225, 226, 227, 228, 229), + DIVS_INV(230, 231, 232, 233, 234, 235, 236, 237, 238, 239), + DIVS_INV(240, 241, 242, 243, 244, 245, 246, 247, 248, 249), + DIV_INV(250), DIV_INV(251), DIV_INV(252), DIV_INV(253), + DIV_INV(254), DIV_INV(255), DIV_INV(256), + }; + + if (divisor == 0) + return 0; + else if (divisor == 1) + return dividend; + + if (WARN_ON(divisor - 2 >= ARRAY_SIZE(inv))) + return dividend; + + return ((u64)dividend * inv[divisor - 2]) >> 32; +} + +/* 6.3.6 inv_recenter_nonneg(v, m) */ +static int inv_recenter_nonneg(int v, int m) +{ + if (v > 2 * m) + return v; + + if (v & 1) + return m - ((v + 1) >> 1); + + return m + (v >> 1); +} + +/* + * part of 6.3.5 inv_remap_prob(deltaProb, prob) + * delta = inv_map_table[deltaProb] done by userspace + */ +static int update_prob(int delta, int prob) +{ + if (!delta) + return prob; + + return prob <= 128 ? 
+ 1 + inv_recenter_nonneg(delta, prob - 1) : + 255 - inv_recenter_nonneg(delta, 255 - prob); +} + +/* Counterpart to 6.3.2 tx_mode_probs() */ +static void update_tx_probs(struct v4l2_vp9_frame_context *probs, + const struct v4l2_ctrl_vp9_compressed_hdr *deltas) +{ + int i; + + for (i = 0; i < ARRAY_SIZE(probs->tx8); i++) { + u8 *p8x8 = probs->tx8[i]; + u8 *p16x16 = probs->tx16[i]; + u8 *p32x32 = probs->tx32[i]; + const u8 *d8x8 = deltas->tx8[i]; + const u8 *d16x16 = deltas->tx16[i]; + const u8 *d32x32 = deltas->tx32[i]; + + p8x8[0] = update_prob(d8x8[0], p8x8[0]); + p16x16[0] = update_prob(d16x16[0], p16x16[0]); + p16x16[1] = update_prob(d16x16[1], p16x16[1]); + p32x32[0] = update_prob(d32x32[0], p32x32[0]); + p32x32[1] = update_prob(d32x32[1], p32x32[1]); + p32x32[2] = update_prob(d32x32[2], p32x32[2]); + } +} + +#define BAND_6(band) ((band) == 0 ? 3 : 6) + +static void update_coeff(const u8 deltas[6][6][3], u8 probs[6][6][3]) +{ + int l, m, n; + + for (l = 0; l < 6; l++) + for (m = 0; m < BAND_6(l); m++) { + u8 *p = probs[l][m]; + const u8 *d = deltas[l][m]; + + for (n = 0; n < 3; n++) + p[n] = update_prob(d[n], p[n]); + } +} + +/* Counterpart to 6.3.7 read_coef_probs() */ +static void update_coef_probs(struct v4l2_vp9_frame_context *probs, + const struct v4l2_ctrl_vp9_compressed_hdr *deltas, + const struct v4l2_ctrl_vp9_frame *dec_params) +{ + int i, j, k; + + for (i = 0; i < ARRAY_SIZE(probs->coef); i++) { + for (j = 0; j < ARRAY_SIZE(probs->coef[0]); j++) + for (k = 0; k < ARRAY_SIZE(probs->coef[0][0]); k++) + update_coeff(deltas->coef[i][j][k], probs->coef[i][j][k]); + + if (deltas->tx_mode == i) + break; + } +} + +/* Counterpart to 6.3.8 read_skip_prob() */ +static void update_skip_probs(struct v4l2_vp9_frame_context *probs, + const struct v4l2_ctrl_vp9_compressed_hdr *deltas) +{ + int i; + + for (i = 0; i < ARRAY_SIZE(probs->skip); i++) + probs->skip[i] = update_prob(deltas->skip[i], probs->skip[i]); +} + +/* Counterpart to 6.3.9 read_inter_mode_probs() */ 
+static void update_inter_mode_probs(struct v4l2_vp9_frame_context *probs, + const struct v4l2_ctrl_vp9_compressed_hdr *deltas) +{ + int i; + + for (i = 0; i < ARRAY_SIZE(probs->inter_mode); i++) { + u8 *p = probs->inter_mode[i]; + const u8 *d = deltas->inter_mode[i]; + + p[0] = update_prob(d[0], p[0]); + p[1] = update_prob(d[1], p[1]); + p[2] = update_prob(d[2], p[2]); + } +} + +/* Counterpart to 6.3.10 read_interp_filter_probs() */ +static void update_interp_filter_probs(struct v4l2_vp9_frame_context *probs, + const struct v4l2_ctrl_vp9_compressed_hdr *deltas) +{ + int i; + + for (i = 0; i < ARRAY_SIZE(probs->interp_filter); i++) { + u8 *p = probs->interp_filter[i]; + const u8 *d = deltas->interp_filter[i]; + + p[0] = update_prob(d[0], p[0]); + p[1] = update_prob(d[1], p[1]); + } +} + +/* Counterpart to 6.3.11 read_is_inter_probs() */ +static void update_is_inter_probs(struct v4l2_vp9_frame_context *probs, + const struct v4l2_ctrl_vp9_compressed_hdr *deltas) +{ + int i; + + for (i = 0; i < ARRAY_SIZE(probs->is_inter); i++) + probs->is_inter[i] = update_prob(deltas->is_inter[i], probs->is_inter[i]); +} + +/* 6.3.12 frame_reference_mode() done entirely in userspace */ + +/* Counterpart to 6.3.13 frame_reference_mode_probs() */ +static void +update_frame_reference_mode_probs(unsigned int reference_mode, + struct v4l2_vp9_frame_context *probs, + const struct v4l2_ctrl_vp9_compressed_hdr *deltas) +{ + int i; + + if (reference_mode == V4L2_VP9_REFERENCE_MODE_SELECT) + for (i = 0; i < ARRAY_SIZE(probs->comp_mode); i++) + probs->comp_mode[i] = update_prob(deltas->comp_mode[i], + probs->comp_mode[i]); + + if (reference_mode != V4L2_VP9_REFERENCE_MODE_COMPOUND_REFERENCE) + for (i = 0; i < ARRAY_SIZE(probs->single_ref); i++) { + u8 *p = probs->single_ref[i]; + const u8 *d = deltas->single_ref[i]; + + p[0] = update_prob(d[0], p[0]); + p[1] = update_prob(d[1], p[1]); + } + + if (reference_mode != V4L2_VP9_REFERENCE_MODE_SINGLE_REFERENCE) + for (i = 0; i < 
ARRAY_SIZE(probs->comp_ref); i++) + probs->comp_ref[i] = update_prob(deltas->comp_ref[i], probs->comp_ref[i]); +} + +/* Counterpart to 6.3.14 read_y_mode_probs() */ +static void update_y_mode_probs(struct v4l2_vp9_frame_context *probs, + const struct v4l2_ctrl_vp9_compressed_hdr *deltas) +{ + int i, j; + + for (i = 0; i < ARRAY_SIZE(probs->y_mode); i++) + for (j = 0; j < ARRAY_SIZE(probs->y_mode[0]); ++j) + probs->y_mode[i][j] = + update_prob(deltas->y_mode[i][j], probs->y_mode[i][j]); +} + +/* Counterpart to 6.3.15 read_partition_probs() */ +static void update_partition_probs(struct v4l2_vp9_frame_context *probs, + const struct v4l2_ctrl_vp9_compressed_hdr *deltas) +{ + int i, j; + + for (i = 0; i < 4; i++) + for (j = 0; j < 4; j++) { + u8 *p = probs->partition[i * 4 + j]; + const u8 *d = deltas->partition[i * 4 + j]; + + p[0] = update_prob(d[0], p[0]); + p[1] = update_prob(d[1], p[1]); + p[2] = update_prob(d[2], p[2]); + } +} + +static inline int update_mv_prob(int delta, int prob) +{ + if (!delta) + return prob; + + return delta; +} + +/* Counterpart to 6.3.16 mv_probs() */ +static void update_mv_probs(struct v4l2_vp9_frame_context *probs, + const struct v4l2_ctrl_vp9_compressed_hdr *deltas, + const struct v4l2_ctrl_vp9_frame *dec_params) +{ + u8 *p = probs->mv.joint; + const u8 *d = deltas->mv.joint; + unsigned int i, j; + + p[0] = update_mv_prob(d[0], p[0]); + p[1] = update_mv_prob(d[1], p[1]); + p[2] = update_mv_prob(d[2], p[2]); + + for (i = 0; i < ARRAY_SIZE(probs->mv.sign); i++) { + p = probs->mv.sign; + d = deltas->mv.sign; + p[i] = update_mv_prob(d[i], p[i]); + + p = probs->mv.classes[i]; + d = deltas->mv.classes[i]; + for (j = 0; j < ARRAY_SIZE(probs->mv.classes[0]); j++) + p[j] = update_mv_prob(d[j], p[j]); + + p = probs->mv.class0_bit; + d = deltas->mv.class0_bit; + p[i] = update_mv_prob(d[i], p[i]); + + p = probs->mv.bits[i]; + d = deltas->mv.bits[i]; + for (j = 0; j < ARRAY_SIZE(probs->mv.bits[0]); j++) + p[j] = update_mv_prob(d[j], p[j]); + + for 
(j = 0; j < ARRAY_SIZE(probs->mv.class0_fr[0]); j++) { + p = probs->mv.class0_fr[i][j]; + d = deltas->mv.class0_fr[i][j]; + + p[0] = update_mv_prob(d[0], p[0]); + p[1] = update_mv_prob(d[1], p[1]); + p[2] = update_mv_prob(d[2], p[2]); + } + + p = probs->mv.fr[i]; + d = deltas->mv.fr[i]; + for (j = 0; j < ARRAY_SIZE(probs->mv.fr[i]); j++) + p[j] = update_mv_prob(d[j], p[j]); + + if (dec_params->flags & V4L2_VP9_FRAME_FLAG_ALLOW_HIGH_PREC_MV) { + p = probs->mv.class0_hp; + d = deltas->mv.class0_hp; + p[i] = update_mv_prob(d[i], p[i]); + + p = probs->mv.hp; + d = deltas->mv.hp; + p[i] = update_mv_prob(d[i], p[i]); + } + } +} + +/* Counterpart to 6.3 compressed_header(), but parsing has been done in userspace. */ +void v4l2_vp9_fw_update_probs(struct v4l2_vp9_frame_context *probs, + const struct v4l2_ctrl_vp9_compressed_hdr *deltas, + const struct v4l2_ctrl_vp9_frame *dec_params) +{ + if (deltas->tx_mode == V4L2_VP9_TX_MODE_SELECT) + update_tx_probs(probs, deltas); + + update_coef_probs(probs, deltas, dec_params); + + update_skip_probs(probs, deltas); + + if (dec_params->flags & V4L2_VP9_FRAME_FLAG_KEY_FRAME || + dec_params->flags & V4L2_VP9_FRAME_FLAG_INTRA_ONLY) + return; + + update_inter_mode_probs(probs, deltas); + + if (dec_params->interpolation_filter == V4L2_VP9_INTERP_FILTER_SWITCHABLE) + update_interp_filter_probs(probs, deltas); + + update_is_inter_probs(probs, deltas); + + update_frame_reference_mode_probs(dec_params->reference_mode, probs, deltas); + + update_y_mode_probs(probs, deltas); + + update_partition_probs(probs, deltas); + + update_mv_probs(probs, deltas, dec_params); +} +EXPORT_SYMBOL_GPL(v4l2_vp9_fw_update_probs); + +u8 v4l2_vp9_reset_frame_ctx(const struct v4l2_ctrl_vp9_frame *dec_params, + struct v4l2_vp9_frame_context *frame_context) +{ + int i; + + u8 fctx_idx = dec_params->frame_context_idx; + + if (dec_params->flags & V4L2_VP9_FRAME_FLAG_KEY_FRAME || + dec_params->flags & V4L2_VP9_FRAME_FLAG_INTRA_ONLY || + dec_params->flags & 
V4L2_VP9_FRAME_FLAG_ERROR_RESILIENT) { + /* + * setup_past_independence() + * We do nothing here. Instead of storing default probs in some intermediate + * location and then copying from that location to appropriate contexts + * in save_probs() below, we skip that step and save default probs directly + * to appropriate contexts. + */ + if (dec_params->flags & V4L2_VP9_FRAME_FLAG_KEY_FRAME || + dec_params->flags & V4L2_VP9_FRAME_FLAG_ERROR_RESILIENT || + dec_params->reset_frame_context == V4L2_VP9_RESET_FRAME_CTX_ALL) + for (i = 0; i < 4; ++i) + /* save_probs(i) */ + memcpy(&frame_context[i], &v4l2_vp9_default_probs, + sizeof(v4l2_vp9_default_probs)); + else if (dec_params->reset_frame_context == V4L2_VP9_RESET_FRAME_CTX_SPEC) + /* save_probs(fctx_idx) */ + memcpy(&frame_context[fctx_idx], &v4l2_vp9_default_probs, + sizeof(v4l2_vp9_default_probs)); + fctx_idx = 0; + } + + return fctx_idx; +} +EXPORT_SYMBOL_GPL(v4l2_vp9_reset_frame_ctx); + +/* 8.4.1 Merge prob process */ +static u8 merge_prob(u8 pre_prob, u32 ct0, u32 ct1, u16 count_sat, u32 max_update_factor) +{ + u32 den, prob, count, factor; + + den = ct0 + ct1; + if (!den) { + /* + * prob = 128, count = 0, update_factor = 0 + * Round2's argument: pre_prob * 256 + * (pre_prob * 256 + 128) >> 8 == pre_prob + */ + return pre_prob; + } + + prob = clamp(((ct0 << 8) + (den >> 1)) / den, (u32)1, (u32)255); + count = min_t(u32, den, count_sat); + factor = fastdiv(max_update_factor * count, count_sat); + + /* + * Round2(pre_prob * (256 - factor) + prob * factor, 8) + * Round2(pre_prob * 256 + (prob - pre_prob) * factor, 8) + * (pre_prob * 256 >> 8) + (((prob - pre_prob) * factor + 128) >> 8) + */ + return pre_prob + (((prob - pre_prob) * factor + 128) >> 8); +} + +static inline u8 noncoef_merge_prob(u8 pre_prob, u32 ct0, u32 ct1) +{ + return merge_prob(pre_prob, ct0, ct1, 20, 128); +} + +/* 8.4.2 Merge probs process */ +/* + * merge_probs() is a recursive function in the spec. We avoid recursion in the kernel. 
+ * That said, the "tree" parameter of merge_probs() controls how deep the recursion goes. + * It turns out that in all cases the recursive calls boil down to a short-ish series + * of merge_prob() invocations (note no "s"). + * + * Variant A + * --------- + * merge_probs(small_token_tree, 2): + * merge_prob(p[1], c[0], c[1] + c[2]) + * merge_prob(p[2], c[1], c[2]) + * + * Variant B + * --------- + * merge_probs(binary_tree, 0) or + * merge_probs(tx_size_8_tree, 0): + * merge_prob(p[0], c[0], c[1]) + * + * Variant C + * --------- + * merge_probs(inter_mode_tree, 0): + * merge_prob(p[0], c[2], c[1] + c[0] + c[3]) + * merge_prob(p[1], c[0], c[1] + c[3]) + * merge_prob(p[2], c[1], c[3]) + * + * Variant D + * --------- + * merge_probs(intra_mode_tree, 0): + * merge_prob(p[0], c[0], c[1] + ... + c[9]) + * merge_prob(p[1], c[9], c[1] + ... + c[8]) + * merge_prob(p[2], c[1], c[2] + ... + c[8]) + * merge_prob(p[3], c[2] + c[4] + c[5], c[3] + c[8] + c[6] + c[7]) + * merge_prob(p[4], c[2], c[4] + c[5]) + * merge_prob(p[5], c[4], c[5]) + * merge_prob(p[6], c[3], c[8] + c[6] + c[7]) + * merge_prob(p[7], c[8], c[6] + c[7]) + * merge_prob(p[8], c[6], c[7]) + * + * Variant E + * --------- + * merge_probs(partition_tree, 0) or + * merge_probs(tx_size_32_tree, 0) or + * merge_probs(mv_joint_tree, 0) or + * merge_probs(mv_fr_tree, 0): + * merge_prob(p[0], c[0], c[1] + c[2] + c[3]) + * merge_prob(p[1], c[1], c[2] + c[3]) + * merge_prob(p[2], c[2], c[3]) + * + * Variant F + * --------- + * merge_probs(interp_filter_tree, 0) or + * merge_probs(tx_size_16_tree, 0): + * merge_prob(p[0], c[0], c[1] + c[2]) + * merge_prob(p[1], c[1], c[2]) + * + * Variant G + * --------- + * merge_probs(mv_class_tree, 0): + * merge_prob(p[0], c[0], c[1] + ... + c[10]) + * merge_prob(p[1], c[1], c[2] + ... + c[10]) + * merge_prob(p[2], c[2] + c[3], c[4] + ... + c[10]) + * merge_prob(p[3], c[2], c[3]) + * merge_prob(p[4], c[4] + c[5], c[6] + ... 
+ c[10]) + * merge_prob(p[5], c[4], c[5]) + * merge_prob(p[6], c[6], c[7] + ... + c[10]) + * merge_prob(p[7], c[7] + c[8], c[9] + c[10]) + * merge_prob(p[8], c[7], c[8]) + * merge_prob(p[9], c[9], c[10]) + */ + +static inline void merge_probs_variant_a(u8 *p, const u32 *c, u16 count_sat, u32 update_factor) +{ + p[1] = merge_prob(p[1], c[0], c[1] + c[2], count_sat, update_factor); + p[2] = merge_prob(p[2], c[1], c[2], count_sat, update_factor); +} + +static inline void merge_probs_variant_b(u8 *p, const u32 *c, u16 count_sat, u32 update_factor) +{ + p[0] = merge_prob(p[0], c[0], c[1], count_sat, update_factor); +} + +static inline void merge_probs_variant_c(u8 *p, const u32 *c) +{ + p[0] = noncoef_merge_prob(p[0], c[2], c[1] + c[0] + c[3]); + p[1] = noncoef_merge_prob(p[1], c[0], c[1] + c[3]); + p[2] = noncoef_merge_prob(p[2], c[1], c[3]); +} + +static void merge_probs_variant_d(u8 *p, const u32 *c) +{ + u32 sum = 0, s2; + + sum = c[1] + c[2] + c[3] + c[4] + c[5] + c[6] + c[7] + c[8] + c[9]; + + p[0] = noncoef_merge_prob(p[0], c[0], sum); + sum -= c[9]; + p[1] = noncoef_merge_prob(p[1], c[9], sum); + sum -= c[1]; + p[2] = noncoef_merge_prob(p[2], c[1], sum); + s2 = c[2] + c[4] + c[5]; + sum -= s2; + p[3] = noncoef_merge_prob(p[3], s2, sum); + s2 -= c[2]; + p[4] = noncoef_merge_prob(p[4], c[2], s2); + p[5] = noncoef_merge_prob(p[5], c[4], c[5]); + sum -= c[3]; + p[6] = noncoef_merge_prob(p[6], c[3], sum); + sum -= c[8]; + p[7] = noncoef_merge_prob(p[7], c[8], sum); + p[8] = noncoef_merge_prob(p[8], c[6], c[7]); +} + +static inline void merge_probs_variant_e(u8 *p, const u32 *c) +{ + p[0] = noncoef_merge_prob(p[0], c[0], c[1] + c[2] + c[3]); + p[1] = noncoef_merge_prob(p[1], c[1], c[2] + c[3]); + p[2] = noncoef_merge_prob(p[2], c[2], c[3]); +} + +static inline void merge_probs_variant_f(u8 *p, const u32 *c) +{ + p[0] = noncoef_merge_prob(p[0], c[0], c[1] + c[2]); + p[1] = noncoef_merge_prob(p[1], c[1], c[2]); +} + +static void merge_probs_variant_g(u8 *p, const u32 *c)
+{ + u32 sum; + + sum = c[1] + c[2] + c[3] + c[4] + c[5] + c[6] + c[7] + c[8] + c[9] + c[10]; + p[0] = noncoef_merge_prob(p[0], c[0], sum); + sum -= c[1]; + p[1] = noncoef_merge_prob(p[1], c[1], sum); + sum -= c[2] + c[3]; + p[2] = noncoef_merge_prob(p[2], c[2] + c[3], sum); + p[3] = noncoef_merge_prob(p[3], c[2], c[3]); + sum -= c[4] + c[5]; + p[4] = noncoef_merge_prob(p[4], c[4] + c[5], sum); + p[5] = noncoef_merge_prob(p[5], c[4], c[5]); + sum -= c[6]; + p[6] = noncoef_merge_prob(p[6], c[6], sum); + p[7] = noncoef_merge_prob(p[7], c[7] + c[8], c[9] + c[10]); + p[8] = noncoef_merge_prob(p[8], c[7], c[8]); + p[9] = noncoef_merge_prob(p[9], c[9], c[10]); +} + +/* 8.4.3 Coefficient probability adaptation process */ +static inline void adapt_probs_variant_a_coef(u8 *p, const u32 *c, u32 update_factor) +{ + merge_probs_variant_a(p, c, 24, update_factor); +} + +static inline void adapt_probs_variant_b_coef(u8 *p, const u32 *c, u32 update_factor) +{ + merge_probs_variant_b(p, c, 24, update_factor); +} + +static void _adapt_coeff(unsigned int i, unsigned int j, unsigned int k, + struct v4l2_vp9_frame_context *probs, + const struct v4l2_vp9_frame_symbol_counts *counts, + u32 uf) +{ + s32 l, m; + + for (l = 0; l < ARRAY_SIZE(probs->coef[0][0][0]); l++) { + for (m = 0; m < BAND_6(l); m++) { + u8 *p = probs->coef[i][j][k][l][m]; + const u32 counts_more_coefs[2] = { + *counts->eob[i][j][k][l][m][1], + *counts->eob[i][j][k][l][m][0] - *counts->eob[i][j][k][l][m][1], + }; + + adapt_probs_variant_a_coef(p, *counts->coeff[i][j][k][l][m], uf); + adapt_probs_variant_b_coef(p, counts_more_coefs, uf); + } + } +} + +static void _adapt_coef_probs(struct v4l2_vp9_frame_context *probs, + const struct v4l2_vp9_frame_symbol_counts *counts, + unsigned int uf) +{ + unsigned int i, j, k; + + for (i = 0; i < ARRAY_SIZE(probs->coef); i++) + for (j = 0; j < ARRAY_SIZE(probs->coef[0]); j++) + for (k = 0; k < ARRAY_SIZE(probs->coef[0][0]); k++) + _adapt_coeff(i, j, k, probs, counts, uf); +} + 
+void v4l2_vp9_adapt_coef_probs(struct v4l2_vp9_frame_context *probs, + struct v4l2_vp9_frame_symbol_counts *counts, + bool use_128, + bool frame_is_intra) +{ + if (frame_is_intra) { + _adapt_coef_probs(probs, counts, 112); + } else { + if (use_128) + _adapt_coef_probs(probs, counts, 128); + else + _adapt_coef_probs(probs, counts, 112); + } +} +EXPORT_SYMBOL_GPL(v4l2_vp9_adapt_coef_probs); + +/* 8.4.4 Non coefficient probability adaptation process, adapt_probs() */ +static inline void adapt_probs_variant_b(u8 *p, const u32 *c) +{ + merge_probs_variant_b(p, c, 20, 128); +} + +static inline void adapt_probs_variant_c(u8 *p, const u32 *c) +{ + merge_probs_variant_c(p, c); +} + +static inline void adapt_probs_variant_d(u8 *p, const u32 *c) +{ + merge_probs_variant_d(p, c); +} + +static inline void adapt_probs_variant_e(u8 *p, const u32 *c) +{ + merge_probs_variant_e(p, c); +} + +static inline void adapt_probs_variant_f(u8 *p, const u32 *c) +{ + merge_probs_variant_f(p, c); +} + +static inline void adapt_probs_variant_g(u8 *p, const u32 *c) +{ + merge_probs_variant_g(p, c); +} + +/* 8.4.4 Non coefficient probability adaptation process, adapt_prob() */ +static inline u8 adapt_prob(u8 prob, const u32 counts[2]) +{ + return noncoef_merge_prob(prob, counts[0], counts[1]); +} + +/* 8.4.4 Non coefficient probability adaptation process */ +void v4l2_vp9_adapt_noncoef_probs(struct v4l2_vp9_frame_context *probs, + struct v4l2_vp9_frame_symbol_counts *counts, + u8 reference_mode, u8 interpolation_filter, u8 tx_mode, + u32 flags) +{ + unsigned int i, j; + + for (i = 0; i < ARRAY_SIZE(probs->is_inter); i++) + probs->is_inter[i] = adapt_prob(probs->is_inter[i], (*counts->intra_inter)[i]); + + for (i = 0; i < ARRAY_SIZE(probs->comp_mode); i++) + probs->comp_mode[i] = adapt_prob(probs->comp_mode[i], (*counts->comp)[i]); + + for (i = 0; i < ARRAY_SIZE(probs->comp_ref); i++) + probs->comp_ref[i] = adapt_prob(probs->comp_ref[i], (*counts->comp_ref)[i]); + + if (reference_mode != 
V4L2_VP9_REFERENCE_MODE_COMPOUND_REFERENCE) + for (i = 0; i < ARRAY_SIZE(probs->single_ref); i++) + for (j = 0; j < ARRAY_SIZE(probs->single_ref[0]); j++) + probs->single_ref[i][j] = adapt_prob(probs->single_ref[i][j], + (*counts->single_ref)[i][j]); + + for (i = 0; i < ARRAY_SIZE(probs->inter_mode); i++) + adapt_probs_variant_c(probs->inter_mode[i], (*counts->mv_mode)[i]); + + for (i = 0; i < ARRAY_SIZE(probs->y_mode); i++) + adapt_probs_variant_d(probs->y_mode[i], (*counts->y_mode)[i]); + + for (i = 0; i < ARRAY_SIZE(probs->uv_mode); i++) + adapt_probs_variant_d(probs->uv_mode[i], (*counts->uv_mode)[i]); + + for (i = 0; i < ARRAY_SIZE(probs->partition); i++) + adapt_probs_variant_e(probs->partition[i], (*counts->partition)[i]); + + for (i = 0; i < ARRAY_SIZE(probs->skip); i++) + probs->skip[i] = adapt_prob(probs->skip[i], (*counts->skip)[i]); + + if (interpolation_filter == V4L2_VP9_INTERP_FILTER_SWITCHABLE) + for (i = 0; i < ARRAY_SIZE(probs->interp_filter); i++) + adapt_probs_variant_f(probs->interp_filter[i], (*counts->filter)[i]); + + if (tx_mode == V4L2_VP9_TX_MODE_SELECT) + for (i = 0; i < ARRAY_SIZE(probs->tx8); i++) { + adapt_probs_variant_b(probs->tx8[i], (*counts->tx8p)[i]); + adapt_probs_variant_f(probs->tx16[i], (*counts->tx16p)[i]); + adapt_probs_variant_e(probs->tx32[i], (*counts->tx32p)[i]); + } + + adapt_probs_variant_e(probs->mv.joint, *counts->mv_joint); + + for (i = 0; i < ARRAY_SIZE(probs->mv.sign); i++) { + probs->mv.sign[i] = adapt_prob(probs->mv.sign[i], (*counts->sign)[i]); + + adapt_probs_variant_g(probs->mv.classes[i], (*counts->classes)[i]); + + probs->mv.class0_bit[i] = adapt_prob(probs->mv.class0_bit[i], (*counts->class0)[i]); + + for (j = 0; j < ARRAY_SIZE(probs->mv.bits[0]); j++) + probs->mv.bits[i][j] = adapt_prob(probs->mv.bits[i][j], + (*counts->bits)[i][j]); + + for (j = 0; j < ARRAY_SIZE(probs->mv.class0_fr[0]); j++) + adapt_probs_variant_e(probs->mv.class0_fr[i][j], + (*counts->class0_fp)[i][j]); + + 
adapt_probs_variant_e(probs->mv.fr[i], (*counts->fp)[i]); + + if (!(flags & V4L2_VP9_FRAME_FLAG_ALLOW_HIGH_PREC_MV)) + continue; + + probs->mv.class0_hp[i] = adapt_prob(probs->mv.class0_hp[i], + (*counts->class0_hp)[i]); + + probs->mv.hp[i] = adapt_prob(probs->mv.hp[i], (*counts->hp)[i]); + } +} +EXPORT_SYMBOL_GPL(v4l2_vp9_adapt_noncoef_probs); + +bool +v4l2_vp9_seg_feat_enabled(const u8 *feature_enabled, + unsigned int feature, + unsigned int segid) +{ + u8 mask = V4L2_VP9_SEGMENT_FEATURE_ENABLED(feature); + + return !!(feature_enabled[segid] & mask); +} +EXPORT_SYMBOL_GPL(v4l2_vp9_seg_feat_enabled); + +MODULE_LICENSE("GPL"); +MODULE_DESCRIPTION("V4L2 VP9 Helpers"); +MODULE_AUTHOR("Andrzej Pietrasiewicz "); diff --git a/include/media/v4l2-vp9.h b/include/media/v4l2-vp9.h new file mode 100644 index 000000000000..3415608dbc7c --- /dev/null +++ b/include/media/v4l2-vp9.h @@ -0,0 +1,182 @@ +/* SPDX-License-Identifier: GPL-2.0-or-later */ +/* + * Helper functions for vp9 codecs. + * + * Copyright (c) 2021 Collabora, Ltd. + * + * Author: Andrzej Pietrasiewicz + */ + +#ifndef _MEDIA_V4L2_VP9_H +#define _MEDIA_V4L2_VP9_H + +#include + +/** + * struct v4l2_vp9_frame_mv_context - motion vector-related probabilities + * + * A member of v4l2_vp9_frame_context. + */ +struct v4l2_vp9_frame_mv_context { + u8 joint[3]; + u8 sign[2]; + u8 classes[2][10]; + u8 class0_bit[2]; + u8 bits[2][10]; + u8 class0_fr[2][2][3]; + u8 fr[2][3]; + u8 class0_hp[2]; + u8 hp[2]; +}; + +/** + * struct v4l2_vp9_frame_context - frame probabilities, including motion-vector related + * + * Drivers which need to keep track of frame context(s) can use this struct. + * The members correspond to probability tables, which are specified only implicitly in the + * vp9 spec. Section 10.5 "Default probability tables" contains all the types of involved + * tables, i.e. 
the actual tables are of the same kind, and when they are reset (which is + * mandated by the spec sometimes) they are overwritten with values from the default tables. + */ +struct v4l2_vp9_frame_context { + u8 tx8[2][1]; + u8 tx16[2][2]; + u8 tx32[2][3]; + u8 coef[4][2][2][6][6][3]; + u8 skip[3]; + u8 inter_mode[7][3]; + u8 interp_filter[4][2]; + u8 is_inter[4]; + u8 comp_mode[5]; + u8 single_ref[5][2]; + u8 comp_ref[5]; + u8 y_mode[4][9]; + u8 uv_mode[10][9]; + u8 partition[16][3]; + + struct v4l2_vp9_frame_mv_context mv; +}; + +/** + * struct v4l2_vp9_frame_symbol_counts - pointers to arrays of symbol counts + * + * The fields correspond to what is specified in section 8.3 "Clear counts process" of the spec. + * Different pieces of hardware can report the counts in different order, so we cannot rely on + * simply overlaying a struct on a relevant block of memory. Instead we provide pointers to + * arrays or array of pointers to arrays in case of coeff, or array of pointers for eob. + */ +struct v4l2_vp9_frame_symbol_counts { + u32 (*partition)[16][4]; + u32 (*skip)[3][2]; + u32 (*intra_inter)[4][2]; + u32 (*tx32p)[2][4]; + u32 (*tx16p)[2][4]; + u32 (*tx8p)[2][2]; + u32 (*y_mode)[4][10]; + u32 (*uv_mode)[10][10]; + u32 (*comp)[5][2]; + u32 (*comp_ref)[5][2]; + u32 (*single_ref)[5][2][2]; + u32 (*mv_mode)[7][4]; + u32 (*filter)[4][3]; + u32 (*mv_joint)[4]; + u32 (*sign)[2][2]; + u32 (*classes)[2][11]; + u32 (*class0)[2][2]; + u32 (*bits)[2][10][2]; + u32 (*class0_fp)[2][2][4]; + u32 (*fp)[2][4]; + u32 (*class0_hp)[2][2]; + u32 (*hp)[2][2]; + u32 (*coeff[4][2][2][6][6])[3]; + u32 *eob[4][2][2][6][6][2]; +}; + +extern const u8 v4l2_vp9_kf_y_mode_prob[10][10][9]; /* Section 10.4 of the spec */ +extern const u8 v4l2_vp9_kf_partition_probs[16][3]; /* Section 10.4 of the spec */ +extern const u8 v4l2_vp9_kf_uv_mode_prob[10][9]; /* Section 10.4 of the spec */ +extern const struct v4l2_vp9_frame_context v4l2_vp9_default_probs; /* Section 10.5 of the spec */ + +/** + * 
v4l2_vp9_fw_update_probs() - Perform forward update of vp9 probabilities + * + * @probs: current probabilities values + * @deltas: delta values from compressed header + * @dec_params: vp9 frame decoding parameters + * + * This function performs forward updates of probabilities for the vp9 boolean decoder. + * The frame header can contain a directive to update the probabilities (deltas); if so, + * the deltas are provided in the header, too. Userspace parses them and passes the resulting + * deltas struct to the kernel. + */ +void v4l2_vp9_fw_update_probs(struct v4l2_vp9_frame_context *probs, + const struct v4l2_ctrl_vp9_compressed_hdr *deltas, + const struct v4l2_ctrl_vp9_frame *dec_params); + +/** + * v4l2_vp9_reset_frame_ctx() - Reset appropriate frame context + * + * @dec_params: vp9 frame decoding parameters + * @frame_context: array of the 4 frame contexts + * + * This function resets appropriate frame contexts, based on what's in dec_params. + * + * Returns the frame context index after the update, which might be reset to zero if + * mandated by the spec. + */ +u8 v4l2_vp9_reset_frame_ctx(const struct v4l2_ctrl_vp9_frame *dec_params, + struct v4l2_vp9_frame_context *frame_context); + +/** + * v4l2_vp9_adapt_coef_probs() - Perform backward update of vp9 coefficients probabilities + * + * @probs: current probabilities values + * @counts: values of symbol counts after the current frame has been decoded + * @use_128: flag to request that 128 is used as update factor if true, otherwise 112 is used + * @frame_is_intra: flag indicating that FrameIsIntra is true + * + * This function performs backward updates of coefficients probabilities for the vp9 boolean + * decoder. After a frame has been decoded the counts of how many times a given symbol has + * occurred are known and are used to update the probability of each symbol.
+ */ +void v4l2_vp9_adapt_coef_probs(struct v4l2_vp9_frame_context *probs, + struct v4l2_vp9_frame_symbol_counts *counts, + bool use_128, + bool frame_is_intra); + +/** + * v4l2_vp9_adapt_noncoef_probs() - Perform backward update of vp9 non-coefficients probabilities + * + * @probs: current probabilities values + * @counts: values of symbol counts after the current frame has been decoded + * @reference_mode: specifies the type of inter prediction to be used. See + * &v4l2_vp9_reference_mode for more details + * @interpolation_filter: specifies the filter selection used for performing inter prediction. + * See &v4l2_vp9_interpolation_filter for more details + * @tx_mode: specifies the TX mode. See &v4l2_vp9_tx_mode for more details + * @flags: combination of V4L2_VP9_FRAME_FLAG_* flags + * + * This function performs backward updates of non-coefficients probabilities for the vp9 boolean + * decoder. After a frame has been decoded the counts of how many times a given symbol has + * occurred are known and are used to update the probability of each symbol. + */ +void v4l2_vp9_adapt_noncoef_probs(struct v4l2_vp9_frame_context *probs, + struct v4l2_vp9_frame_symbol_counts *counts, + u8 reference_mode, u8 interpolation_filter, u8 tx_mode, + u32 flags); + +/** + * v4l2_vp9_seg_feat_enabled() - Check if a segmentation feature is enabled + * + * @feature_enabled: array of 8-bit flags (for all segments) + * @feature: id of the feature to check + * @segid: id of the segment to look up + * + * This function returns true if a given feature is active in a given segment.
+ */ +bool +v4l2_vp9_seg_feat_enabled(const u8 *feature_enabled, + unsigned int feature, + unsigned int segid); + +#endif /* _MEDIA_V4L2_VP9_H */ From patchwork Mon Sep 27 15:19:55 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrzej Pietrasiewicz X-Patchwork-Id: 12520179 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 911A4C433F5 for ; Mon, 27 Sep 2021 15:20:18 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 74060610A2 for ; Mon, 27 Sep 2021 15:20:18 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S235177AbhI0PVy (ORCPT ); Mon, 27 Sep 2021 11:21:54 -0400 Received: from bhuna.collabora.co.uk ([46.235.227.227]:54122 "EHLO bhuna.collabora.co.uk" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S235150AbhI0PVv (ORCPT ); Mon, 27 Sep 2021 11:21:51 -0400 Received: from [127.0.0.1] (localhost [127.0.0.1]) (Authenticated sender: andrzej.p) with ESMTPSA id A040D1F42E4D From: Andrzej Pietrasiewicz To: linux-media@vger.kernel.org, linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org, linux-rockchip@lists.infradead.org, linux-staging@lists.linux.dev Cc: Andrzej Pietrasiewicz , Benjamin Gaignard , Boris Brezillon , Ezequiel Garcia , Fabio Estevam , Greg Kroah-Hartman , Hans Verkuil , Heiko Stuebner , Jernej Skrabec , Mauro Carvalho Chehab , Nicolas Dufresne , NXP Linux Team , Pengutronix Kernel Team , Philipp Zabel , Sascha Hauer , Shawn Guo , kernel@collabora.com, Ezequiel Garcia , Adrian Ratiu Subject: [PATCH v6 07/10] media: rkvdec: Add the VP9 backend Date: Mon, 27 Sep 2021 17:19:55 +0200 Message-Id: <20210927151958.24426-8-andrzej.p@collabora.com> X-Mailer: git-send-email 2.17.1 
In-Reply-To: <20210927151958.24426-1-andrzej.p@collabora.com> References: <20210927151958.24426-1-andrzej.p@collabora.com> Precedence: bulk List-ID: X-Mailing-List: linux-media@vger.kernel.org From: Boris Brezillon The Rockchip VDEC supports VP9 profile 0 up to 4096x2304@30fps. Add a backend for this new format. Signed-off-by: Boris Brezillon Signed-off-by: Ezequiel Garcia Signed-off-by: Adrian Ratiu Co-developed-by: Andrzej Pietrasiewicz Signed-off-by: Andrzej Pietrasiewicz --- drivers/staging/media/rkvdec/Kconfig | 1 + drivers/staging/media/rkvdec/Makefile | 2 +- drivers/staging/media/rkvdec/rkvdec-vp9.c | 1078 +++++++++++++++++++++ drivers/staging/media/rkvdec/rkvdec.c | 52 +- drivers/staging/media/rkvdec/rkvdec.h | 12 +- 5 files changed, 1137 insertions(+), 8 deletions(-) create mode 100644 drivers/staging/media/rkvdec/rkvdec-vp9.c diff --git a/drivers/staging/media/rkvdec/Kconfig b/drivers/staging/media/rkvdec/Kconfig index c02199b5e0fd..dc7292f346fa 100644 --- a/drivers/staging/media/rkvdec/Kconfig +++ b/drivers/staging/media/rkvdec/Kconfig @@ -9,6 +9,7 @@ config VIDEO_ROCKCHIP_VDEC select VIDEOBUF2_VMALLOC select V4L2_MEM2MEM_DEV select V4L2_H264 + select V4L2_VP9 help Support for the Rockchip Video Decoder IP present on Rockchip SoCs, which accelerates video decoding. 
diff --git a/drivers/staging/media/rkvdec/Makefile b/drivers/staging/media/rkvdec/Makefile index c08fed0a39f9..cb86b429cfaa 100644 --- a/drivers/staging/media/rkvdec/Makefile +++ b/drivers/staging/media/rkvdec/Makefile @@ -1,3 +1,3 @@ obj-$(CONFIG_VIDEO_ROCKCHIP_VDEC) += rockchip-vdec.o -rockchip-vdec-y += rkvdec.o rkvdec-h264.o +rockchip-vdec-y += rkvdec.o rkvdec-h264.o rkvdec-vp9.o diff --git a/drivers/staging/media/rkvdec/rkvdec-vp9.c b/drivers/staging/media/rkvdec/rkvdec-vp9.c new file mode 100644 index 000000000000..ca463f18651a --- /dev/null +++ b/drivers/staging/media/rkvdec/rkvdec-vp9.c @@ -0,0 +1,1078 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Rockchip Video Decoder VP9 backend + * + * Copyright (C) 2019 Collabora, Ltd. + * Boris Brezillon + * Copyright (C) 2021 Collabora, Ltd. + * Andrzej Pietrasiewicz + * + * Copyright (C) 2016 Rockchip Electronics Co., Ltd. + * Alpha Lin + */ + +/* + * For following the vp9 spec please start reading this driver + * code from rkvdec_vp9_run() followed by rkvdec_vp9_done(). 
+ */ + +#include +#include +#include +#include + +#include "rkvdec.h" +#include "rkvdec-regs.h" + +#define RKVDEC_VP9_PROBE_SIZE 4864 +#define RKVDEC_VP9_COUNT_SIZE 13232 +#define RKVDEC_VP9_MAX_SEGMAP_SIZE 73728 + +struct rkvdec_vp9_intra_mode_probs { + u8 y_mode[105]; + u8 uv_mode[23]; +}; + +struct rkvdec_vp9_intra_only_frame_probs { + u8 coef_intra[4][2][128]; + struct rkvdec_vp9_intra_mode_probs intra_mode[10]; +}; + +struct rkvdec_vp9_inter_frame_probs { + u8 y_mode[4][9]; + u8 comp_mode[5]; + u8 comp_ref[5]; + u8 single_ref[5][2]; + u8 inter_mode[7][3]; + u8 interp_filter[4][2]; + u8 padding0[11]; + u8 coef[2][4][2][128]; + u8 uv_mode_0_2[3][9]; + u8 padding1[5]; + u8 uv_mode_3_5[3][9]; + u8 padding2[5]; + u8 uv_mode_6_8[3][9]; + u8 padding3[5]; + u8 uv_mode_9[9]; + u8 padding4[7]; + u8 padding5[16]; + struct { + u8 joint[3]; + u8 sign[2]; + u8 classes[2][10]; + u8 class0_bit[2]; + u8 bits[2][10]; + u8 class0_fr[2][2][3]; + u8 fr[2][3]; + u8 class0_hp[2]; + u8 hp[2]; + } mv; +}; + +struct rkvdec_vp9_probs { + u8 partition[16][3]; + u8 pred[3]; + u8 tree[7]; + u8 skip[3]; + u8 tx32[2][3]; + u8 tx16[2][2]; + u8 tx8[2][1]; + u8 is_inter[4]; + /* 128 bit alignment */ + u8 padding0[3]; + union { + struct rkvdec_vp9_inter_frame_probs inter; + struct rkvdec_vp9_intra_only_frame_probs intra_only; + }; +}; + +/* Data structure describing auxiliary buffer format. 
*/ +struct rkvdec_vp9_priv_tbl { + struct rkvdec_vp9_probs probs; + u8 segmap[2][RKVDEC_VP9_MAX_SEGMAP_SIZE]; +}; + +struct rkvdec_vp9_refs_counts { + u32 eob[2]; + u32 coeff[3]; +}; + +struct rkvdec_vp9_inter_frame_symbol_counts { + u32 partition[16][4]; + u32 skip[3][2]; + u32 inter[4][2]; + u32 tx32p[2][4]; + u32 tx16p[2][4]; + u32 tx8p[2][2]; + u32 y_mode[4][10]; + u32 uv_mode[10][10]; + u32 comp[5][2]; + u32 comp_ref[5][2]; + u32 single_ref[5][2][2]; + u32 mv_mode[7][4]; + u32 filter[4][3]; + u32 mv_joint[4]; + u32 sign[2][2]; + /* add 1 element for align */ + u32 classes[2][11 + 1]; + u32 class0[2][2]; + u32 bits[2][10][2]; + u32 class0_fp[2][2][4]; + u32 fp[2][4]; + u32 class0_hp[2][2]; + u32 hp[2][2]; + struct rkvdec_vp9_refs_counts ref_cnt[2][4][2][6][6]; +}; + +struct rkvdec_vp9_intra_frame_symbol_counts { + u32 partition[4][4][4]; + u32 skip[3][2]; + u32 intra[4][2]; + u32 tx32p[2][4]; + u32 tx16p[2][4]; + u32 tx8p[2][2]; + struct rkvdec_vp9_refs_counts ref_cnt[2][4][2][6][6]; +}; + +struct rkvdec_vp9_run { + struct rkvdec_run base; + const struct v4l2_ctrl_vp9_frame *decode_params; +}; + +struct rkvdec_vp9_frame_info { + u32 valid : 1; + u32 segmapid : 1; + u32 frame_context_idx : 2; + u32 reference_mode : 2; + u32 tx_mode : 3; + u32 interpolation_filter : 3; + u32 flags; + u64 timestamp; + struct v4l2_vp9_segmentation seg; + struct v4l2_vp9_loop_filter lf; +}; + +struct rkvdec_vp9_ctx { + struct rkvdec_aux_buf priv_tbl; + struct rkvdec_aux_buf count_tbl; + struct v4l2_vp9_frame_symbol_counts inter_cnts; + struct v4l2_vp9_frame_symbol_counts intra_cnts; + struct v4l2_vp9_frame_context probability_tables; + struct v4l2_vp9_frame_context frame_context[4]; + struct rkvdec_vp9_frame_info cur; + struct rkvdec_vp9_frame_info last; +}; + +static void write_coeff_plane(const u8 coef[6][6][3], u8 *coeff_plane) +{ + unsigned int idx = 0, byte_count = 0; + int k, m, n; + u8 p; + + for (k = 0; k < 6; k++) { + for (m = 0; m < 6; m++) { + for (n = 0; n < 3; n++) { + 
p = coef[k][m][n]; + coeff_plane[idx++] = p; + byte_count++; + if (byte_count == 27) { + idx += 5; + byte_count = 0; + } + } + } + } +} + +static void init_intra_only_probs(struct rkvdec_ctx *ctx, + const struct rkvdec_vp9_run *run) +{ + const struct v4l2_ctrl_vp9_frame *dec_params; + struct rkvdec_vp9_ctx *vp9_ctx = ctx->priv; + struct rkvdec_vp9_priv_tbl *tbl = vp9_ctx->priv_tbl.cpu; + struct rkvdec_vp9_intra_only_frame_probs *rkprobs; + const struct v4l2_vp9_frame_context *probs; + unsigned int i, j, k, m; + + rkprobs = &tbl->probs.intra_only; + dec_params = run->decode_params; + probs = &vp9_ctx->probability_tables; + + /* + * intra only 149 x 128 bits, aligned to 152 x 128 bits coeff related + * prob 64 x 128 bits + */ + for (i = 0; i < ARRAY_SIZE(probs->coef); i++) { + for (j = 0; j < ARRAY_SIZE(probs->coef[0]); j++) + write_coeff_plane(probs->coef[i][j][0], + rkprobs->coef_intra[i][j]); + } + + /* intra mode prob 80 x 128 bits */ + for (i = 0; i < ARRAY_SIZE(v4l2_vp9_kf_y_mode_prob); i++) { + unsigned int byte_count = 0; + int idx = 0; + + /* vp9_kf_y_mode_prob */ + for (j = 0; j < ARRAY_SIZE(v4l2_vp9_kf_y_mode_prob[0]); j++) { + for (k = 0; k < ARRAY_SIZE(v4l2_vp9_kf_y_mode_prob[0][0]); + k++) { + u8 val = v4l2_vp9_kf_y_mode_prob[i][j][k]; + + rkprobs->intra_mode[i].y_mode[idx++] = val; + byte_count++; + if (byte_count == 27) { + byte_count = 0; + idx += 5; + } + } + } + + idx = 0; + if (i < 4) { + for (m = 0; m < (i < 3 ? 
23 : 21); m++) { + const u8 *ptr = (const u8 *)v4l2_vp9_kf_uv_mode_prob; + + rkprobs->intra_mode[i].uv_mode[idx++] = ptr[i * 23 + m]; + } + } + } +} + +static void init_inter_probs(struct rkvdec_ctx *ctx, + const struct rkvdec_vp9_run *run) +{ + const struct v4l2_ctrl_vp9_frame *dec_params; + struct rkvdec_vp9_ctx *vp9_ctx = ctx->priv; + struct rkvdec_vp9_priv_tbl *tbl = vp9_ctx->priv_tbl.cpu; + struct rkvdec_vp9_inter_frame_probs *rkprobs; + const struct v4l2_vp9_frame_context *probs; + unsigned int i, j, k; + + rkprobs = &tbl->probs.inter; + dec_params = run->decode_params; + probs = &vp9_ctx->probability_tables; + + /* + * inter probs + * 151 x 128 bits, aligned to 152 x 128 bits + * inter only + * intra_y_mode & inter_block info 6 x 128 bits + */ + + memcpy(rkprobs->y_mode, probs->y_mode, sizeof(rkprobs->y_mode)); + memcpy(rkprobs->comp_mode, probs->comp_mode, + sizeof(rkprobs->comp_mode)); + memcpy(rkprobs->comp_ref, probs->comp_ref, + sizeof(rkprobs->comp_ref)); + memcpy(rkprobs->single_ref, probs->single_ref, + sizeof(rkprobs->single_ref)); + memcpy(rkprobs->inter_mode, probs->inter_mode, + sizeof(rkprobs->inter_mode)); + memcpy(rkprobs->interp_filter, probs->interp_filter, + sizeof(rkprobs->interp_filter)); + + /* 128 x 128 bits coeff related */ + for (i = 0; i < ARRAY_SIZE(probs->coef); i++) { + for (j = 0; j < ARRAY_SIZE(probs->coef[0]); j++) { + for (k = 0; k < ARRAY_SIZE(probs->coef[0][0]); k++) + write_coeff_plane(probs->coef[i][j][k], + rkprobs->coef[k][i][j]); + } + } + + /* intra uv mode 6 x 128 */ + memcpy(rkprobs->uv_mode_0_2, &probs->uv_mode[0], + sizeof(rkprobs->uv_mode_0_2)); + memcpy(rkprobs->uv_mode_3_5, &probs->uv_mode[3], + sizeof(rkprobs->uv_mode_3_5)); + memcpy(rkprobs->uv_mode_6_8, &probs->uv_mode[6], + sizeof(rkprobs->uv_mode_6_8)); + memcpy(rkprobs->uv_mode_9, &probs->uv_mode[9], + sizeof(rkprobs->uv_mode_9)); + + /* mv related 6 x 128 */ + memcpy(rkprobs->mv.joint, probs->mv.joint, + sizeof(rkprobs->mv.joint)); + 
memcpy(rkprobs->mv.sign, probs->mv.sign, + sizeof(rkprobs->mv.sign)); + memcpy(rkprobs->mv.classes, probs->mv.classes, + sizeof(rkprobs->mv.classes)); + memcpy(rkprobs->mv.class0_bit, probs->mv.class0_bit, + sizeof(rkprobs->mv.class0_bit)); + memcpy(rkprobs->mv.bits, probs->mv.bits, + sizeof(rkprobs->mv.bits)); + memcpy(rkprobs->mv.class0_fr, probs->mv.class0_fr, + sizeof(rkprobs->mv.class0_fr)); + memcpy(rkprobs->mv.fr, probs->mv.fr, + sizeof(rkprobs->mv.fr)); + memcpy(rkprobs->mv.class0_hp, probs->mv.class0_hp, + sizeof(rkprobs->mv.class0_hp)); + memcpy(rkprobs->mv.hp, probs->mv.hp, + sizeof(rkprobs->mv.hp)); +} + +static void init_probs(struct rkvdec_ctx *ctx, + const struct rkvdec_vp9_run *run) +{ + const struct v4l2_ctrl_vp9_frame *dec_params; + struct rkvdec_vp9_ctx *vp9_ctx = ctx->priv; + struct rkvdec_vp9_priv_tbl *tbl = vp9_ctx->priv_tbl.cpu; + struct rkvdec_vp9_probs *rkprobs = &tbl->probs; + const struct v4l2_vp9_segmentation *seg; + const struct v4l2_vp9_frame_context *probs; + bool intra_only; + + dec_params = run->decode_params; + probs = &vp9_ctx->probability_tables; + seg = &dec_params->seg; + + memset(rkprobs, 0, sizeof(*rkprobs)); + + intra_only = !!(dec_params->flags & + (V4L2_VP9_FRAME_FLAG_KEY_FRAME | + V4L2_VP9_FRAME_FLAG_INTRA_ONLY)); + + /* sb info 5 x 128 bit */ + memcpy(rkprobs->partition, + intra_only ? 
v4l2_vp9_kf_partition_probs : probs->partition, + sizeof(rkprobs->partition)); + + memcpy(rkprobs->pred, seg->pred_probs, sizeof(rkprobs->pred)); + memcpy(rkprobs->tree, seg->tree_probs, sizeof(rkprobs->tree)); + memcpy(rkprobs->skip, probs->skip, sizeof(rkprobs->skip)); + memcpy(rkprobs->tx32, probs->tx32, sizeof(rkprobs->tx32)); + memcpy(rkprobs->tx16, probs->tx16, sizeof(rkprobs->tx16)); + memcpy(rkprobs->tx8, probs->tx8, sizeof(rkprobs->tx8)); + memcpy(rkprobs->is_inter, probs->is_inter, sizeof(rkprobs->is_inter)); + + if (intra_only) + init_intra_only_probs(ctx, run); + else + init_inter_probs(ctx, run); +} + +struct rkvdec_vp9_ref_reg { + u32 reg_frm_size; + u32 reg_hor_stride; + u32 reg_y_stride; + u32 reg_yuv_stride; + u32 reg_ref_base; +}; + +static struct rkvdec_vp9_ref_reg ref_regs[] = { + { + .reg_frm_size = RKVDEC_REG_VP9_FRAME_SIZE(0), + .reg_hor_stride = RKVDEC_VP9_HOR_VIRSTRIDE(0), + .reg_y_stride = RKVDEC_VP9_LAST_FRAME_YSTRIDE, + .reg_yuv_stride = RKVDEC_VP9_LAST_FRAME_YUVSTRIDE, + .reg_ref_base = RKVDEC_REG_VP9_LAST_FRAME_BASE, + }, + { + .reg_frm_size = RKVDEC_REG_VP9_FRAME_SIZE(1), + .reg_hor_stride = RKVDEC_VP9_HOR_VIRSTRIDE(1), + .reg_y_stride = RKVDEC_VP9_GOLDEN_FRAME_YSTRIDE, + .reg_yuv_stride = 0, + .reg_ref_base = RKVDEC_REG_VP9_GOLDEN_FRAME_BASE, + }, + { + .reg_frm_size = RKVDEC_REG_VP9_FRAME_SIZE(2), + .reg_hor_stride = RKVDEC_VP9_HOR_VIRSTRIDE(2), + .reg_y_stride = RKVDEC_VP9_ALTREF_FRAME_YSTRIDE, + .reg_yuv_stride = 0, + .reg_ref_base = RKVDEC_REG_VP9_ALTREF_FRAME_BASE, + } +}; + +static struct rkvdec_decoded_buffer * +get_ref_buf(struct rkvdec_ctx *ctx, struct vb2_v4l2_buffer *dst, u64 timestamp) +{ + struct v4l2_m2m_ctx *m2m_ctx = ctx->fh.m2m_ctx; + struct vb2_queue *cap_q = &m2m_ctx->cap_q_ctx.q; + int buf_idx; + + /* + * If a ref is unused or invalid, address of current destination + * buffer is returned. 
+ */ + buf_idx = vb2_find_timestamp(cap_q, timestamp, 0); + if (buf_idx < 0) + return vb2_to_rkvdec_decoded_buf(&dst->vb2_buf); + + return vb2_to_rkvdec_decoded_buf(vb2_get_buffer(cap_q, buf_idx)); +} + +static dma_addr_t get_mv_base_addr(struct rkvdec_decoded_buffer *buf) +{ + unsigned int aligned_pitch, aligned_height, yuv_len; + + aligned_height = round_up(buf->vp9.height, 64); + aligned_pitch = round_up(buf->vp9.width * buf->vp9.bit_depth, 512) / 8; + yuv_len = (aligned_height * aligned_pitch * 3) / 2; + + return vb2_dma_contig_plane_dma_addr(&buf->base.vb.vb2_buf, 0) + + yuv_len; +} + +static void config_ref_registers(struct rkvdec_ctx *ctx, + const struct rkvdec_vp9_run *run, + struct rkvdec_decoded_buffer *ref_buf, + struct rkvdec_vp9_ref_reg *ref_reg) +{ + unsigned int aligned_pitch, aligned_height, y_len, yuv_len; + struct rkvdec_dev *rkvdec = ctx->dev; + + aligned_height = round_up(ref_buf->vp9.height, 64); + writel_relaxed(RKVDEC_VP9_FRAMEWIDTH(ref_buf->vp9.width) | + RKVDEC_VP9_FRAMEHEIGHT(ref_buf->vp9.height), + rkvdec->regs + ref_reg->reg_frm_size); + + writel_relaxed(vb2_dma_contig_plane_dma_addr(&ref_buf->base.vb.vb2_buf, 0), + rkvdec->regs + ref_reg->reg_ref_base); + + if (&ref_buf->base.vb == run->base.bufs.dst) + return; + + aligned_pitch = round_up(ref_buf->vp9.width * ref_buf->vp9.bit_depth, 512) / 8; + y_len = aligned_height * aligned_pitch; + yuv_len = (y_len * 3) / 2; + + writel_relaxed(RKVDEC_HOR_Y_VIRSTRIDE(aligned_pitch / 16) | + RKVDEC_HOR_UV_VIRSTRIDE(aligned_pitch / 16), + rkvdec->regs + ref_reg->reg_hor_stride); + writel_relaxed(RKVDEC_VP9_REF_YSTRIDE(y_len / 16), + rkvdec->regs + ref_reg->reg_y_stride); + + if (!ref_reg->reg_yuv_stride) + return; + + writel_relaxed(RKVDEC_VP9_REF_YUVSTRIDE(yuv_len / 16), + rkvdec->regs + ref_reg->reg_yuv_stride); +} + +static void config_seg_registers(struct rkvdec_ctx *ctx, unsigned int segid) +{ + struct rkvdec_vp9_ctx *vp9_ctx = ctx->priv; + const struct v4l2_vp9_segmentation *seg; + struct 
rkvdec_dev *rkvdec = ctx->dev; + s16 feature_val; + int feature_id; + u32 val = 0; + + seg = vp9_ctx->last.valid ? &vp9_ctx->last.seg : &vp9_ctx->cur.seg; + feature_id = V4L2_VP9_SEG_LVL_ALT_Q; + if (v4l2_vp9_seg_feat_enabled(seg->feature_enabled, feature_id, segid)) { + feature_val = seg->feature_data[segid][feature_id]; + val |= RKVDEC_SEGID_FRAME_QP_DELTA_EN(1) | + RKVDEC_SEGID_FRAME_QP_DELTA(feature_val); + } + + feature_id = V4L2_VP9_SEG_LVL_ALT_L; + if (v4l2_vp9_seg_feat_enabled(seg->feature_enabled, feature_id, segid)) { + feature_val = seg->feature_data[segid][feature_id]; + val |= RKVDEC_SEGID_FRAME_LOOPFILTER_VALUE_EN(1) | + RKVDEC_SEGID_FRAME_LOOPFILTER_VALUE(feature_val); + } + + feature_id = V4L2_VP9_SEG_LVL_REF_FRAME; + if (v4l2_vp9_seg_feat_enabled(seg->feature_enabled, feature_id, segid)) { + feature_val = seg->feature_data[segid][feature_id]; + val |= RKVDEC_SEGID_REFERINFO_EN(1) | + RKVDEC_SEGID_REFERINFO(feature_val); + } + + feature_id = V4L2_VP9_SEG_LVL_SKIP; + if (v4l2_vp9_seg_feat_enabled(seg->feature_enabled, feature_id, segid)) + val |= RKVDEC_SEGID_FRAME_SKIP_EN(1); + + if (!segid && + (seg->flags & V4L2_VP9_SEGMENTATION_FLAG_ABS_OR_DELTA_UPDATE)) + val |= RKVDEC_SEGID_ABS_DELTA(1); + + writel_relaxed(val, rkvdec->regs + RKVDEC_VP9_SEGID_GRP(segid)); +} + +static void update_dec_buf_info(struct rkvdec_decoded_buffer *buf, + const struct v4l2_ctrl_vp9_frame *dec_params) +{ + buf->vp9.width = dec_params->frame_width_minus_1 + 1; + buf->vp9.height = dec_params->frame_height_minus_1 + 1; + buf->vp9.bit_depth = dec_params->bit_depth; +} + +static void update_ctx_cur_info(struct rkvdec_vp9_ctx *vp9_ctx, + struct rkvdec_decoded_buffer *buf, + const struct v4l2_ctrl_vp9_frame *dec_params) +{ + vp9_ctx->cur.valid = true; + vp9_ctx->cur.reference_mode = dec_params->reference_mode; + vp9_ctx->cur.interpolation_filter = dec_params->interpolation_filter; + vp9_ctx->cur.flags = dec_params->flags; + vp9_ctx->cur.timestamp = 
buf->base.vb.vb2_buf.timestamp; + vp9_ctx->cur.seg = dec_params->seg; + vp9_ctx->cur.lf = dec_params->lf; +} + +static void update_ctx_last_info(struct rkvdec_vp9_ctx *vp9_ctx) +{ + vp9_ctx->last = vp9_ctx->cur; +} + +static void config_registers(struct rkvdec_ctx *ctx, + const struct rkvdec_vp9_run *run) +{ + unsigned int y_len, uv_len, yuv_len, bit_depth, aligned_height, aligned_pitch, stream_len; + const struct v4l2_ctrl_vp9_frame *dec_params; + struct rkvdec_decoded_buffer *ref_bufs[3]; + struct rkvdec_decoded_buffer *dst, *last, *mv_ref; + struct rkvdec_vp9_ctx *vp9_ctx = ctx->priv; + u32 val, last_frame_info = 0; + const struct v4l2_vp9_segmentation *seg; + struct rkvdec_dev *rkvdec = ctx->dev; + dma_addr_t addr; + bool intra_only; + unsigned int i; + + dec_params = run->decode_params; + dst = vb2_to_rkvdec_decoded_buf(&run->base.bufs.dst->vb2_buf); + ref_bufs[0] = get_ref_buf(ctx, &dst->base.vb, dec_params->last_frame_ts); + ref_bufs[1] = get_ref_buf(ctx, &dst->base.vb, dec_params->golden_frame_ts); + ref_bufs[2] = get_ref_buf(ctx, &dst->base.vb, dec_params->alt_frame_ts); + + if (vp9_ctx->last.valid) + last = get_ref_buf(ctx, &dst->base.vb, vp9_ctx->last.timestamp); + else + last = dst; + + update_dec_buf_info(dst, dec_params); + update_ctx_cur_info(vp9_ctx, dst, dec_params); + seg = &dec_params->seg; + + intra_only = !!(dec_params->flags & + (V4L2_VP9_FRAME_FLAG_KEY_FRAME | + V4L2_VP9_FRAME_FLAG_INTRA_ONLY)); + + writel_relaxed(RKVDEC_MODE(RKVDEC_MODE_VP9), + rkvdec->regs + RKVDEC_REG_SYSCTRL); + + bit_depth = dec_params->bit_depth; + aligned_height = round_up(ctx->decoded_fmt.fmt.pix_mp.height, 64); + + aligned_pitch = round_up(ctx->decoded_fmt.fmt.pix_mp.width * + bit_depth, + 512) / 8; + y_len = aligned_height * aligned_pitch; + uv_len = y_len / 2; + yuv_len = y_len + uv_len; + + writel_relaxed(RKVDEC_Y_HOR_VIRSTRIDE(aligned_pitch / 16) | + RKVDEC_UV_HOR_VIRSTRIDE(aligned_pitch / 16), + rkvdec->regs + RKVDEC_REG_PICPAR); + 
writel_relaxed(RKVDEC_Y_VIRSTRIDE(y_len / 16), + rkvdec->regs + RKVDEC_REG_Y_VIRSTRIDE); + writel_relaxed(RKVDEC_YUV_VIRSTRIDE(yuv_len / 16), + rkvdec->regs + RKVDEC_REG_YUV_VIRSTRIDE); + + stream_len = vb2_get_plane_payload(&run->base.bufs.src->vb2_buf, 0); + writel_relaxed(RKVDEC_STRM_LEN(stream_len), + rkvdec->regs + RKVDEC_REG_STRM_LEN); + + /* + * Reset the count buffer: the decoder only outputs intra-related + * syntax counts when decoding an intra frame, but the entropy update + * needs to update all the probabilities. + */ + if (intra_only) + memset(vp9_ctx->count_tbl.cpu, 0, vp9_ctx->count_tbl.size); + + vp9_ctx->cur.segmapid = vp9_ctx->last.segmapid; + if (!intra_only && + !(dec_params->flags & V4L2_VP9_FRAME_FLAG_ERROR_RESILIENT) && + (!(seg->flags & V4L2_VP9_SEGMENTATION_FLAG_ENABLED) || + (seg->flags & V4L2_VP9_SEGMENTATION_FLAG_UPDATE_MAP))) + vp9_ctx->cur.segmapid++; + + for (i = 0; i < ARRAY_SIZE(ref_bufs); i++) + config_ref_registers(ctx, run, ref_bufs[i], &ref_regs[i]); + + for (i = 0; i < 8; i++) + config_seg_registers(ctx, i); + + writel_relaxed(RKVDEC_VP9_TX_MODE(vp9_ctx->cur.tx_mode) | + RKVDEC_VP9_FRAME_REF_MODE(dec_params->reference_mode), + rkvdec->regs + RKVDEC_VP9_CPRHEADER_CONFIG); + + if (!intra_only) { + const struct v4l2_vp9_loop_filter *lf; + s8 delta; + + if (vp9_ctx->last.valid) + lf = &vp9_ctx->last.lf; + else + lf = &vp9_ctx->cur.lf; + + val = 0; + for (i = 0; i < ARRAY_SIZE(lf->ref_deltas); i++) { + delta = lf->ref_deltas[i]; + val |= RKVDEC_REF_DELTAS_LASTFRAME(i, delta); + } + + writel_relaxed(val, + rkvdec->regs + RKVDEC_VP9_REF_DELTAS_LASTFRAME); + + for (i = 0; i < ARRAY_SIZE(lf->mode_deltas); i++) { + delta = lf->mode_deltas[i]; + last_frame_info |= RKVDEC_MODE_DELTAS_LASTFRAME(i, + delta); + } + } + + if (vp9_ctx->last.valid && !intra_only && + vp9_ctx->last.seg.flags & V4L2_VP9_SEGMENTATION_FLAG_ENABLED) + last_frame_info |= RKVDEC_SEG_EN_LASTFRAME; + + if (vp9_ctx->last.valid && + vp9_ctx->last.flags & 
V4L2_VP9_FRAME_FLAG_SHOW_FRAME) + last_frame_info |= RKVDEC_LAST_SHOW_FRAME; + + if (vp9_ctx->last.valid && + vp9_ctx->last.flags & + (V4L2_VP9_FRAME_FLAG_KEY_FRAME | V4L2_VP9_FRAME_FLAG_INTRA_ONLY)) + last_frame_info |= RKVDEC_LAST_INTRA_ONLY; + + if (vp9_ctx->last.valid && + last->vp9.width == dst->vp9.width && + last->vp9.height == dst->vp9.height) + last_frame_info |= RKVDEC_LAST_WIDHHEIGHT_EQCUR; + + writel_relaxed(last_frame_info, + rkvdec->regs + RKVDEC_VP9_INFO_LASTFRAME); + + writel_relaxed(stream_len - dec_params->compressed_header_size - + dec_params->uncompressed_header_size, + rkvdec->regs + RKVDEC_VP9_LASTTILE_SIZE); + + for (i = 0; !intra_only && i < ARRAY_SIZE(ref_bufs); i++) { + unsigned int refw = ref_bufs[i]->vp9.width; + unsigned int refh = ref_bufs[i]->vp9.height; + u32 hscale, vscale; + + hscale = (refw << 14) / dst->vp9.width; + vscale = (refh << 14) / dst->vp9.height; + writel_relaxed(RKVDEC_VP9_REF_HOR_SCALE(hscale) | + RKVDEC_VP9_REF_VER_SCALE(vscale), + rkvdec->regs + RKVDEC_VP9_REF_SCALE(i)); + } + + addr = vb2_dma_contig_plane_dma_addr(&dst->base.vb.vb2_buf, 0); + writel_relaxed(addr, rkvdec->regs + RKVDEC_REG_DECOUT_BASE); + addr = vb2_dma_contig_plane_dma_addr(&run->base.bufs.src->vb2_buf, 0); + writel_relaxed(addr, rkvdec->regs + RKVDEC_REG_STRM_RLC_BASE); + writel_relaxed(vp9_ctx->priv_tbl.dma + + offsetof(struct rkvdec_vp9_priv_tbl, probs), + rkvdec->regs + RKVDEC_REG_CABACTBL_PROB_BASE); + writel_relaxed(vp9_ctx->count_tbl.dma, + rkvdec->regs + RKVDEC_REG_VP9COUNT_BASE); + + writel_relaxed(vp9_ctx->priv_tbl.dma + + offsetof(struct rkvdec_vp9_priv_tbl, segmap) + + (RKVDEC_VP9_MAX_SEGMAP_SIZE * vp9_ctx->cur.segmapid), + rkvdec->regs + RKVDEC_REG_VP9_SEGIDCUR_BASE); + writel_relaxed(vp9_ctx->priv_tbl.dma + + offsetof(struct rkvdec_vp9_priv_tbl, segmap) + + (RKVDEC_VP9_MAX_SEGMAP_SIZE * (!vp9_ctx->cur.segmapid)), + rkvdec->regs + RKVDEC_REG_VP9_SEGIDLAST_BASE); + + if (!intra_only && + !(dec_params->flags & 
V4L2_VP9_FRAME_FLAG_ERROR_RESILIENT) && + vp9_ctx->last.valid) + mv_ref = last; + else + mv_ref = dst; + + writel_relaxed(get_mv_base_addr(mv_ref), + rkvdec->regs + RKVDEC_VP9_REF_COLMV_BASE); + + writel_relaxed(ctx->decoded_fmt.fmt.pix_mp.width | + (ctx->decoded_fmt.fmt.pix_mp.height << 16), + rkvdec->regs + RKVDEC_REG_PERFORMANCE_CYCLE); +} + +static int validate_dec_params(struct rkvdec_ctx *ctx, + const struct v4l2_ctrl_vp9_frame *dec_params) +{ + unsigned int aligned_width, aligned_height; + + /* We only support profile 0. */ + if (dec_params->profile != 0) { + dev_err(ctx->dev->dev, "unsupported profile %d\n", + dec_params->profile); + return -EINVAL; + } + + aligned_width = round_up(dec_params->frame_width_minus_1 + 1, 64); + aligned_height = round_up(dec_params->frame_height_minus_1 + 1, 64); + + /* + * Userspace should update the capture/decoded format when the + * resolution changes. + */ + if (aligned_width != ctx->decoded_fmt.fmt.pix_mp.width || + aligned_height != ctx->decoded_fmt.fmt.pix_mp.height) { + dev_err(ctx->dev->dev, + "unexpected bitstream resolution %dx%d\n", + dec_params->frame_width_minus_1 + 1, + dec_params->frame_height_minus_1 + 1); + return -EINVAL; + } + + return 0; +} + +static int rkvdec_vp9_run_preamble(struct rkvdec_ctx *ctx, + struct rkvdec_vp9_run *run) +{ + const struct v4l2_ctrl_vp9_frame *dec_params; + const struct v4l2_ctrl_vp9_compressed_hdr *prob_updates; + struct rkvdec_vp9_ctx *vp9_ctx = ctx->priv; + struct v4l2_ctrl *ctrl; + unsigned int fctx_idx; + int ret; + + /* v4l2-specific stuff */ + rkvdec_run_preamble(ctx, &run->base); + + ctrl = v4l2_ctrl_find(&ctx->ctrl_hdl, + V4L2_CID_STATELESS_VP9_FRAME); + if (WARN_ON(!ctrl)) + return -EINVAL; + dec_params = ctrl->p_cur.p; + + ret = validate_dec_params(ctx, dec_params); + if (ret) + return ret; + + run->decode_params = dec_params; + + ctrl = v4l2_ctrl_find(&ctx->ctrl_hdl, V4L2_CID_STATELESS_VP9_COMPRESSED_HDR); + if (WARN_ON(!ctrl)) + return -EINVAL; + prob_updates = 
ctrl->p_cur.p; + vp9_ctx->cur.tx_mode = prob_updates->tx_mode; + + /* + * vp9 stuff + * + * by this point the userspace has done all parts of 6.2 uncompressed_header() + * except this fragment: + * if ( FrameIsIntra || error_resilient_mode ) { + * setup_past_independence ( ) + * if ( frame_type == KEY_FRAME || error_resilient_mode == 1 || + * reset_frame_context == 3 ) { + * for ( i = 0; i < 4; i ++ ) { + * save_probs( i ) + * } + * } else if ( reset_frame_context == 2 ) { + * save_probs( frame_context_idx ) + * } + * frame_context_idx = 0 + * } + */ + fctx_idx = v4l2_vp9_reset_frame_ctx(dec_params, vp9_ctx->frame_context); + vp9_ctx->cur.frame_context_idx = fctx_idx; + + /* 6.1 frame(sz): load_probs() and load_probs2() */ + vp9_ctx->probability_tables = vp9_ctx->frame_context[fctx_idx]; + + /* + * The userspace has also performed 6.3 compressed_header(), but handling the + * probs in a special way. All probs which need updating, except MV-related, + * have been read from the bitstream and translated through inv_map_table[], + * but no 6.3.6 inv_recenter_nonneg(v, m) has been performed. The values passed + * by userspace are either translated values (there are no 0 values in + * inv_map_table[]), or zero to indicate no update. All MV-related probs which need + * updating have been read from the bitstream and (mv_prob << 1) | 1 has been + * performed. The values passed by userspace are either new values + * to replace old ones (the above mentioned shift and bitwise or never result in + * a zero) or zero to indicate no update. + * fw_update_probs() performs actual probs updates or leaves probs as-is + * for values for which a zero was passed from userspace. 
+ */ + v4l2_vp9_fw_update_probs(&vp9_ctx->probability_tables, prob_updates, dec_params); + + return 0; +} + +static int rkvdec_vp9_run(struct rkvdec_ctx *ctx) +{ + struct rkvdec_dev *rkvdec = ctx->dev; + struct rkvdec_vp9_run run = { }; + int ret; + + ret = rkvdec_vp9_run_preamble(ctx, &run); + if (ret) { + rkvdec_run_postamble(ctx, &run.base); + return ret; + } + + /* Prepare probs. */ + init_probs(ctx, &run); + + /* Configure hardware registers. */ + config_registers(ctx, &run); + + rkvdec_run_postamble(ctx, &run.base); + + schedule_delayed_work(&rkvdec->watchdog_work, msecs_to_jiffies(2000)); + + writel(1, rkvdec->regs + RKVDEC_REG_PREF_LUMA_CACHE_COMMAND); + writel(1, rkvdec->regs + RKVDEC_REG_PREF_CHR_CACHE_COMMAND); + + writel(0xe, rkvdec->regs + RKVDEC_REG_STRMD_ERR_EN); + /* Start decoding! */ + writel(RKVDEC_INTERRUPT_DEC_E | RKVDEC_CONFIG_DEC_CLK_GATE_E | + RKVDEC_TIMEOUT_E | RKVDEC_BUF_EMPTY_E, + rkvdec->regs + RKVDEC_REG_INTERRUPT); + + return 0; +} + +#define copy_tx_and_skip(p1, p2) \ +do { \ + memcpy((p1)->tx8, (p2)->tx8, sizeof((p1)->tx8)); \ + memcpy((p1)->tx16, (p2)->tx16, sizeof((p1)->tx16)); \ + memcpy((p1)->tx32, (p2)->tx32, sizeof((p1)->tx32)); \ + memcpy((p1)->skip, (p2)->skip, sizeof((p1)->skip)); \ +} while (0) + +static void rkvdec_vp9_done(struct rkvdec_ctx *ctx, + struct vb2_v4l2_buffer *src_buf, + struct vb2_v4l2_buffer *dst_buf, + enum vb2_buffer_state result) +{ + struct rkvdec_vp9_ctx *vp9_ctx = ctx->priv; + unsigned int fctx_idx; + + /* v4l2-specific stuff */ + if (result == VB2_BUF_STATE_ERROR) + goto out_update_last; + + /* + * vp9 stuff + * + * 6.1.2 refresh_probs() + * + * In the spec a complementary condition goes last in 6.1.2 refresh_probs(), + * but it makes no sense to perform all the activities from the first "if" + * there if we actually are not refreshing the frame context. On top of that, + * because of 6.2 uncompressed_header() whenever error_resilient_mode == 1, + * refresh_frame_context == 0. 
Consequently, if we don't jump to out_update_last + * it means error_resilient_mode must be 0. + */ + if (!(vp9_ctx->cur.flags & V4L2_VP9_FRAME_FLAG_REFRESH_FRAME_CTX)) + goto out_update_last; + + fctx_idx = vp9_ctx->cur.frame_context_idx; + + if (!(vp9_ctx->cur.flags & V4L2_VP9_FRAME_FLAG_PARALLEL_DEC_MODE)) { + /* error_resilient_mode == 0 && frame_parallel_decoding_mode == 0 */ + struct v4l2_vp9_frame_context *probs = &vp9_ctx->probability_tables; + bool frame_is_intra = vp9_ctx->cur.flags & + (V4L2_VP9_FRAME_FLAG_KEY_FRAME | V4L2_VP9_FRAME_FLAG_INTRA_ONLY); + struct tx_and_skip { + u8 tx8[2][1]; + u8 tx16[2][2]; + u8 tx32[2][3]; + u8 skip[3]; + } _tx_skip, *tx_skip = &_tx_skip; + struct v4l2_vp9_frame_symbol_counts *counts; + + /* buffer the forward-updated TX and skip probs */ + if (frame_is_intra) + copy_tx_and_skip(tx_skip, probs); + + /* 6.1.2 refresh_probs(): load_probs() and load_probs2() */ + *probs = vp9_ctx->frame_context[fctx_idx]; + + /* if FrameIsIntra then undo the effect of load_probs2() */ + if (frame_is_intra) + copy_tx_and_skip(probs, tx_skip); + + counts = frame_is_intra ? 
&vp9_ctx->intra_cnts : &vp9_ctx->inter_cnts; + v4l2_vp9_adapt_coef_probs(probs, counts, + !vp9_ctx->last.valid || + vp9_ctx->last.flags & V4L2_VP9_FRAME_FLAG_KEY_FRAME, + frame_is_intra); + if (!frame_is_intra) { + const struct rkvdec_vp9_inter_frame_symbol_counts *inter_cnts; + u32 classes[2][11]; + int i; + + inter_cnts = vp9_ctx->count_tbl.cpu; + for (i = 0; i < ARRAY_SIZE(classes); ++i) + memcpy(classes[i], inter_cnts->classes[i], sizeof(classes[0])); + counts->classes = &classes; + + /* load_probs2() already done */ + v4l2_vp9_adapt_noncoef_probs(&vp9_ctx->probability_tables, counts, + vp9_ctx->cur.reference_mode, + vp9_ctx->cur.interpolation_filter, + vp9_ctx->cur.tx_mode, vp9_ctx->cur.flags); + } + } + + /* 6.1.2 refresh_probs(): save_probs(fctx_idx) */ + vp9_ctx->frame_context[fctx_idx] = vp9_ctx->probability_tables; + +out_update_last: + update_ctx_last_info(vp9_ctx); +} + +static void rkvdec_init_v4l2_vp9_count_tbl(struct rkvdec_ctx *ctx) +{ + struct rkvdec_vp9_ctx *vp9_ctx = ctx->priv; + struct rkvdec_vp9_intra_frame_symbol_counts *intra_cnts = vp9_ctx->count_tbl.cpu; + struct rkvdec_vp9_inter_frame_symbol_counts *inter_cnts = vp9_ctx->count_tbl.cpu; + int i, j, k, l, m; + + vp9_ctx->inter_cnts.partition = &inter_cnts->partition; + vp9_ctx->inter_cnts.skip = &inter_cnts->skip; + vp9_ctx->inter_cnts.intra_inter = &inter_cnts->inter; + vp9_ctx->inter_cnts.tx32p = &inter_cnts->tx32p; + vp9_ctx->inter_cnts.tx16p = &inter_cnts->tx16p; + vp9_ctx->inter_cnts.tx8p = &inter_cnts->tx8p; + + vp9_ctx->intra_cnts.partition = (u32 (*)[16][4])(&intra_cnts->partition); + vp9_ctx->intra_cnts.skip = &intra_cnts->skip; + vp9_ctx->intra_cnts.intra_inter = &intra_cnts->intra; + vp9_ctx->intra_cnts.tx32p = &intra_cnts->tx32p; + vp9_ctx->intra_cnts.tx16p = &intra_cnts->tx16p; + vp9_ctx->intra_cnts.tx8p = &intra_cnts->tx8p; + + vp9_ctx->inter_cnts.y_mode = &inter_cnts->y_mode; + vp9_ctx->inter_cnts.uv_mode = &inter_cnts->uv_mode; + vp9_ctx->inter_cnts.comp = &inter_cnts->comp; 
+ vp9_ctx->inter_cnts.comp_ref = &inter_cnts->comp_ref; + vp9_ctx->inter_cnts.single_ref = &inter_cnts->single_ref; + vp9_ctx->inter_cnts.mv_mode = &inter_cnts->mv_mode; + vp9_ctx->inter_cnts.filter = &inter_cnts->filter; + vp9_ctx->inter_cnts.mv_joint = &inter_cnts->mv_joint; + vp9_ctx->inter_cnts.sign = &inter_cnts->sign; + /* + * rk hardware actually uses "u32 classes[2][11 + 1];" + * instead of "u32 classes[2][11];", so this must be explicitly + * copied into vp9_ctx->classes when passing the data to the + * vp9 library function + */ + vp9_ctx->inter_cnts.class0 = &inter_cnts->class0; + vp9_ctx->inter_cnts.bits = &inter_cnts->bits; + vp9_ctx->inter_cnts.class0_fp = &inter_cnts->class0_fp; + vp9_ctx->inter_cnts.fp = &inter_cnts->fp; + vp9_ctx->inter_cnts.class0_hp = &inter_cnts->class0_hp; + vp9_ctx->inter_cnts.hp = &inter_cnts->hp; + +#define INNERMOST_LOOP \ + do { \ + for (m = 0; m < ARRAY_SIZE(vp9_ctx->inter_cnts.coeff[0][0][0][0]); ++m) {\ + vp9_ctx->inter_cnts.coeff[i][j][k][l][m] = \ + &inter_cnts->ref_cnt[k][i][j][l][m].coeff; \ + vp9_ctx->inter_cnts.eob[i][j][k][l][m][0] = \ + &inter_cnts->ref_cnt[k][i][j][l][m].eob[0]; \ + vp9_ctx->inter_cnts.eob[i][j][k][l][m][1] = \ + &inter_cnts->ref_cnt[k][i][j][l][m].eob[1]; \ + \ + vp9_ctx->intra_cnts.coeff[i][j][k][l][m] = \ + &intra_cnts->ref_cnt[k][i][j][l][m].coeff; \ + vp9_ctx->intra_cnts.eob[i][j][k][l][m][0] = \ + &intra_cnts->ref_cnt[k][i][j][l][m].eob[0]; \ + vp9_ctx->intra_cnts.eob[i][j][k][l][m][1] = \ + &intra_cnts->ref_cnt[k][i][j][l][m].eob[1]; \ + } \ + } while (0) + + for (i = 0; i < ARRAY_SIZE(vp9_ctx->inter_cnts.coeff); ++i) + for (j = 0; j < ARRAY_SIZE(vp9_ctx->inter_cnts.coeff[0]); ++j) + for (k = 0; k < ARRAY_SIZE(vp9_ctx->inter_cnts.coeff[0][0]); ++k) + for (l = 0; l < ARRAY_SIZE(vp9_ctx->inter_cnts.coeff[0][0][0]); ++l) + INNERMOST_LOOP; +#undef INNERMOST_LOOP +} + +static int rkvdec_vp9_start(struct rkvdec_ctx *ctx) +{ + struct rkvdec_dev *rkvdec = ctx->dev; + struct rkvdec_vp9_priv_tbl 
*priv_tbl; + struct rkvdec_vp9_ctx *vp9_ctx; + unsigned char *count_tbl; + int ret; + + vp9_ctx = kzalloc(sizeof(*vp9_ctx), GFP_KERNEL); + if (!vp9_ctx) + return -ENOMEM; + + ctx->priv = vp9_ctx; + + priv_tbl = dma_alloc_coherent(rkvdec->dev, sizeof(*priv_tbl), + &vp9_ctx->priv_tbl.dma, GFP_KERNEL); + if (!priv_tbl) { + ret = -ENOMEM; + goto err_free_ctx; + } + + vp9_ctx->priv_tbl.size = sizeof(*priv_tbl); + vp9_ctx->priv_tbl.cpu = priv_tbl; + memset(priv_tbl, 0, sizeof(*priv_tbl)); + + count_tbl = dma_alloc_coherent(rkvdec->dev, RKVDEC_VP9_COUNT_SIZE, + &vp9_ctx->count_tbl.dma, GFP_KERNEL); + if (!count_tbl) { + ret = -ENOMEM; + goto err_free_priv_tbl; + } + + vp9_ctx->count_tbl.size = RKVDEC_VP9_COUNT_SIZE; + vp9_ctx->count_tbl.cpu = count_tbl; + memset(count_tbl, 0, RKVDEC_VP9_COUNT_SIZE); + rkvdec_init_v4l2_vp9_count_tbl(ctx); + + return 0; + +err_free_priv_tbl: + dma_free_coherent(rkvdec->dev, vp9_ctx->priv_tbl.size, + vp9_ctx->priv_tbl.cpu, vp9_ctx->priv_tbl.dma); + +err_free_ctx: + kfree(vp9_ctx); + return ret; +} + +static void rkvdec_vp9_stop(struct rkvdec_ctx *ctx) +{ + struct rkvdec_vp9_ctx *vp9_ctx = ctx->priv; + struct rkvdec_dev *rkvdec = ctx->dev; + + dma_free_coherent(rkvdec->dev, vp9_ctx->count_tbl.size, + vp9_ctx->count_tbl.cpu, vp9_ctx->count_tbl.dma); + dma_free_coherent(rkvdec->dev, vp9_ctx->priv_tbl.size, + vp9_ctx->priv_tbl.cpu, vp9_ctx->priv_tbl.dma); + kfree(vp9_ctx); +} + +static int rkvdec_vp9_adjust_fmt(struct rkvdec_ctx *ctx, + struct v4l2_format *f) +{ + struct v4l2_pix_format_mplane *fmt = &f->fmt.pix_mp; + + fmt->num_planes = 1; + if (!fmt->plane_fmt[0].sizeimage) + fmt->plane_fmt[0].sizeimage = fmt->width * fmt->height * 2; + return 0; +} + +const struct rkvdec_coded_fmt_ops rkvdec_vp9_fmt_ops = { + .adjust_fmt = rkvdec_vp9_adjust_fmt, + .start = rkvdec_vp9_start, + .stop = rkvdec_vp9_stop, + .run = rkvdec_vp9_run, + .done = rkvdec_vp9_done, +}; diff --git a/drivers/staging/media/rkvdec/rkvdec.c b/drivers/staging/media/rkvdec/rkvdec.c 
index 7131156c1f2c..6aa8aca66547 100644 --- a/drivers/staging/media/rkvdec/rkvdec.c +++ b/drivers/staging/media/rkvdec/rkvdec.c @@ -99,10 +99,30 @@ static const struct rkvdec_ctrls rkvdec_h264_ctrls = { .num_ctrls = ARRAY_SIZE(rkvdec_h264_ctrl_descs), }; -static const u32 rkvdec_h264_decoded_fmts[] = { +static const u32 rkvdec_h264_vp9_decoded_fmts[] = { V4L2_PIX_FMT_NV12, }; +static const struct rkvdec_ctrl_desc rkvdec_vp9_ctrl_descs[] = { + { + .cfg.id = V4L2_CID_STATELESS_VP9_FRAME, + }, + { + .cfg.id = V4L2_CID_STATELESS_VP9_COMPRESSED_HDR, + }, + { + .cfg.id = V4L2_CID_MPEG_VIDEO_VP9_PROFILE, + .cfg.min = V4L2_MPEG_VIDEO_VP9_PROFILE_0, + .cfg.max = V4L2_MPEG_VIDEO_VP9_PROFILE_0, + .cfg.def = V4L2_MPEG_VIDEO_VP9_PROFILE_0, + }, +}; + +static const struct rkvdec_ctrls rkvdec_vp9_ctrls = { + .ctrls = rkvdec_vp9_ctrl_descs, + .num_ctrls = ARRAY_SIZE(rkvdec_vp9_ctrl_descs), +}; + static const struct rkvdec_coded_fmt_desc rkvdec_coded_fmts[] = { { .fourcc = V4L2_PIX_FMT_H264_SLICE, @@ -116,8 +136,23 @@ static const struct rkvdec_coded_fmt_desc rkvdec_coded_fmts[] = { }, .ctrls = &rkvdec_h264_ctrls, .ops = &rkvdec_h264_fmt_ops, - .num_decoded_fmts = ARRAY_SIZE(rkvdec_h264_decoded_fmts), - .decoded_fmts = rkvdec_h264_decoded_fmts, + .num_decoded_fmts = ARRAY_SIZE(rkvdec_h264_vp9_decoded_fmts), + .decoded_fmts = rkvdec_h264_vp9_decoded_fmts, + }, + { + .fourcc = V4L2_PIX_FMT_VP9_FRAME, + .frmsize = { + .min_width = 64, + .max_width = 4096, + .step_width = 64, + .min_height = 64, + .max_height = 2304, + .step_height = 64, + }, + .ctrls = &rkvdec_vp9_ctrls, + .ops = &rkvdec_vp9_fmt_ops, + .num_decoded_fmts = ARRAY_SIZE(rkvdec_h264_vp9_decoded_fmts), + .decoded_fmts = rkvdec_h264_vp9_decoded_fmts, } }; @@ -319,7 +354,7 @@ static int rkvdec_s_output_fmt(struct file *file, void *priv, struct v4l2_m2m_ctx *m2m_ctx = ctx->fh.m2m_ctx; const struct rkvdec_coded_fmt_desc *desc; struct v4l2_format *cap_fmt; - struct vb2_queue *peer_vq; + struct vb2_queue *peer_vq, *vq; int ret; 
/* @@ -331,6 +366,15 @@ static int rkvdec_s_output_fmt(struct file *file, void *priv, if (vb2_is_busy(peer_vq)) return -EBUSY; + /* + * Some codecs like VP9 can contain dynamic resolution changes which + * are currently not supported by the V4L2 API or driver, so return + * an error if userspace tries to reconfigure the output format. + */ + vq = v4l2_m2m_get_vq(m2m_ctx, V4L2_BUF_TYPE_VIDEO_OUTPUT_MPLANE); + if (vb2_is_busy(vq)) + return -EINVAL; + ret = rkvdec_s_fmt(file, priv, f, rkvdec_try_output_fmt); if (ret) return ret; diff --git a/drivers/staging/media/rkvdec/rkvdec.h b/drivers/staging/media/rkvdec/rkvdec.h index 52ac3874c5e5..2f4ea1786b93 100644 --- a/drivers/staging/media/rkvdec/rkvdec.h +++ b/drivers/staging/media/rkvdec/rkvdec.h @@ -42,14 +42,18 @@ struct rkvdec_run { struct rkvdec_vp9_decoded_buffer_info { /* Info needed when the decoded frame serves as a reference frame. */ - u16 width; - u16 height; - u32 bit_depth : 4; + unsigned short width; + unsigned short height; + unsigned int bit_depth : 4; }; struct rkvdec_decoded_buffer { /* Must be the first field in this struct. 
*/ struct v4l2_m2m_buffer base; + + union { + struct rkvdec_vp9_decoded_buffer_info vp9; + }; }; static inline struct rkvdec_decoded_buffer * @@ -116,4 +120,6 @@ void rkvdec_run_preamble(struct rkvdec_ctx *ctx, struct rkvdec_run *run); void rkvdec_run_postamble(struct rkvdec_ctx *ctx, struct rkvdec_run *run); extern const struct rkvdec_coded_fmt_ops rkvdec_h264_fmt_ops; +extern const struct rkvdec_coded_fmt_ops rkvdec_vp9_fmt_ops; + #endif /* RKVDEC_H_ */ From patchwork Mon Sep 27 15:19:56 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrzej Pietrasiewicz X-Patchwork-Id: 12520183 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 141E8C433FE for ; Mon, 27 Sep 2021 15:20:20 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id E139C6101A for ; Mon, 27 Sep 2021 15:20:19 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S235153AbhI0PV4 (ORCPT ); Mon, 27 Sep 2021 11:21:56 -0400 Received: from bhuna.collabora.co.uk ([46.235.227.227]:54214 "EHLO bhuna.collabora.co.uk" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S235166AbhI0PVw (ORCPT ); Mon, 27 Sep 2021 11:21:52 -0400 Received: from [127.0.0.1] (localhost [127.0.0.1]) (Authenticated sender: andrzej.p) with ESMTPSA id 9EC311F42E8B From: Andrzej Pietrasiewicz To: linux-media@vger.kernel.org, linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org, linux-rockchip@lists.infradead.org, linux-staging@lists.linux.dev Cc: Andrzej Pietrasiewicz , Benjamin Gaignard , Boris Brezillon , Ezequiel Garcia , Fabio Estevam , Greg Kroah-Hartman , Hans Verkuil , Heiko Stuebner , Jernej Skrabec , Mauro Carvalho Chehab , Nicolas Dufresne , NXP Linux Team , Pengutronix 
Kernel Team , Philipp Zabel , Sascha Hauer , Shawn Guo , kernel@collabora.com Subject: [PATCH v6 08/10] media: hantro: Prepare for other G2 codecs Date: Mon, 27 Sep 2021 17:19:56 +0200 Message-Id: <20210927151958.24426-9-andrzej.p@collabora.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20210927151958.24426-1-andrzej.p@collabora.com> References: <20210927151958.24426-1-andrzej.p@collabora.com> Precedence: bulk List-ID: X-Mailing-List: linux-media@vger.kernel.org The VeriSilicon Hantro G2 core supports other codecs besides HEVC. Factor out some common code in preparation for VP9 support. Signed-off-by: Andrzej Pietrasiewicz Reviewed-by: Benjamin Gaignard --- drivers/staging/media/hantro/Makefile | 1 + drivers/staging/media/hantro/hantro.h | 7 +++++ drivers/staging/media/hantro/hantro_g2.c | 27 ++++++++++++++++ .../staging/media/hantro/hantro_g2_hevc_dec.c | 31 ------------------- drivers/staging/media/hantro/hantro_g2_regs.h | 7 +++++ drivers/staging/media/hantro/hantro_hw.h | 2 ++ 6 files changed, 44 insertions(+), 31 deletions(-) create mode 100644 drivers/staging/media/hantro/hantro_g2.c diff --git a/drivers/staging/media/hantro/Makefile b/drivers/staging/media/hantro/Makefile index 90036831fec4..fe6d84871d07 100644 --- a/drivers/staging/media/hantro/Makefile +++ b/drivers/staging/media/hantro/Makefile @@ -12,6 +12,7 @@ hantro-vpu-y += \ hantro_g1_mpeg2_dec.o \ hantro_g2_hevc_dec.o \ hantro_g1_vp8_dec.o \ + hantro_g2.o \ rockchip_vpu2_hw_jpeg_enc.o \ rockchip_vpu2_hw_h264_dec.o \ rockchip_vpu2_hw_mpeg2_dec.o \ diff --git a/drivers/staging/media/hantro/hantro.h b/drivers/staging/media/hantro/hantro.h index dd5e56765d4e..d91eb2b1c509 100644 --- a/drivers/staging/media/hantro/hantro.h +++ b/drivers/staging/media/hantro/hantro.h @@ -369,6 +369,13 @@ static inline void vdpu_write(struct hantro_dev *vpu, u32 val, u32 reg) writel(val, vpu->dec_base + reg); } +static inline void hantro_write_addr(struct hantro_dev *vpu, + unsigned long offset, + dma_addr_t addr) +{ + 
vdpu_write(vpu, addr & 0xffffffff, offset); +} + static inline u32 vdpu_read(struct hantro_dev *vpu, u32 reg) { u32 val = readl(vpu->dec_base + reg); diff --git a/drivers/staging/media/hantro/hantro_g2.c b/drivers/staging/media/hantro/hantro_g2.c new file mode 100644 index 000000000000..5f7bb27913de --- /dev/null +++ b/drivers/staging/media/hantro/hantro_g2.c @@ -0,0 +1,27 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Hantro VPU codec driver + * + * Copyright (C) 2021 Collabora Ltd, Andrzej Pietrasiewicz + */ + +#include "hantro_hw.h" +#include "hantro_g2_regs.h" + +void hantro_g2_check_idle(struct hantro_dev *vpu) +{ + int i; + + for (i = 0; i < 3; i++) { + u32 status; + + /* Make sure the VPU is idle */ + status = vdpu_read(vpu, G2_REG_INTERRUPT); + if (status & G2_REG_INTERRUPT_DEC_E) { + dev_warn(vpu->dev, "device still running, aborting"); + status |= G2_REG_INTERRUPT_DEC_ABORT_E | G2_REG_INTERRUPT_DEC_IRQ_DIS; + vdpu_write(vpu, status, G2_REG_INTERRUPT); + } + } +} + diff --git a/drivers/staging/media/hantro/hantro_g2_hevc_dec.c b/drivers/staging/media/hantro/hantro_g2_hevc_dec.c index 340efb57fd18..226cecda9495 100644 --- a/drivers/staging/media/hantro/hantro_g2_hevc_dec.c +++ b/drivers/staging/media/hantro/hantro_g2_hevc_dec.c @@ -8,20 +8,6 @@ #include "hantro_hw.h" #include "hantro_g2_regs.h" -#define HEVC_DEC_MODE 0xC - -#define BUS_WIDTH_32 0 -#define BUS_WIDTH_64 1 -#define BUS_WIDTH_128 2 -#define BUS_WIDTH_256 3 - -static inline void hantro_write_addr(struct hantro_dev *vpu, - unsigned long offset, - dma_addr_t addr) -{ - vdpu_write(vpu, addr & 0xffffffff, offset); -} - static void prepare_tile_info_buffer(struct hantro_ctx *ctx) { struct hantro_dev *vpu = ctx->dev; @@ -516,23 +502,6 @@ static void set_buffers(struct hantro_ctx *ctx) hantro_write_addr(vpu, G2_TILE_BSD, ctx->hevc_dec.tile_bsd.dma); } -static void hantro_g2_check_idle(struct hantro_dev *vpu) -{ - int i; - - for (i = 0; i < 3; i++) { - u32 status; - - /* Make sure the VPU is idle */ - 
status = vdpu_read(vpu, G2_REG_INTERRUPT); - if (status & G2_REG_INTERRUPT_DEC_E) { - dev_warn(vpu->dev, "device still running, aborting"); - status |= G2_REG_INTERRUPT_DEC_ABORT_E | G2_REG_INTERRUPT_DEC_IRQ_DIS; - vdpu_write(vpu, status, G2_REG_INTERRUPT); - } - } -} - int hantro_g2_hevc_dec_run(struct hantro_ctx *ctx) { struct hantro_dev *vpu = ctx->dev; diff --git a/drivers/staging/media/hantro/hantro_g2_regs.h b/drivers/staging/media/hantro/hantro_g2_regs.h index bb22fa921914..0ac0ba375e80 100644 --- a/drivers/staging/media/hantro/hantro_g2_regs.h +++ b/drivers/staging/media/hantro/hantro_g2_regs.h @@ -27,6 +27,13 @@ #define G2_REG_INTERRUPT_DEC_IRQ_DIS BIT(4) #define G2_REG_INTERRUPT_DEC_E BIT(0) +#define HEVC_DEC_MODE 0xc + +#define BUS_WIDTH_32 0 +#define BUS_WIDTH_64 1 +#define BUS_WIDTH_128 2 +#define BUS_WIDTH_256 3 + #define g2_strm_swap G2_DEC_REG(2, 28, 0xf) #define g2_dirmv_swap G2_DEC_REG(2, 20, 0xf) diff --git a/drivers/staging/media/hantro/hantro_hw.h b/drivers/staging/media/hantro/hantro_hw.h index 4323e63dfbfc..42b3f3961f75 100644 --- a/drivers/staging/media/hantro/hantro_hw.h +++ b/drivers/staging/media/hantro/hantro_hw.h @@ -308,4 +308,6 @@ void hantro_vp8_dec_exit(struct hantro_ctx *ctx); void hantro_vp8_prob_update(struct hantro_ctx *ctx, const struct v4l2_ctrl_vp8_frame *hdr); +void hantro_g2_check_idle(struct hantro_dev *vpu); + #endif /* HANTRO_HW_H_ */ From patchwork Mon Sep 27 15:19:57 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrzej Pietrasiewicz X-Patchwork-Id: 12520185 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id E9C55C43219 for ; Mon, 27 Sep 2021 15:20:20 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 
C1DCE6120D for ; Mon, 27 Sep 2021 15:20:20 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S235195AbhI0PV5 (ORCPT ); Mon, 27 Sep 2021 11:21:57 -0400 Received: from bhuna.collabora.co.uk ([46.235.227.227]:54122 "EHLO bhuna.collabora.co.uk" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S235169AbhI0PVx (ORCPT ); Mon, 27 Sep 2021 11:21:53 -0400 Received: from [127.0.0.1] (localhost [127.0.0.1]) (Authenticated sender: andrzej.p) with ESMTPSA id 8344A1F42E7E From: Andrzej Pietrasiewicz To: linux-media@vger.kernel.org, linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org, linux-rockchip@lists.infradead.org, linux-staging@lists.linux.dev Cc: Andrzej Pietrasiewicz , Benjamin Gaignard , Boris Brezillon , Ezequiel Garcia , Fabio Estevam , Greg Kroah-Hartman , Hans Verkuil , Heiko Stuebner , Jernej Skrabec , Mauro Carvalho Chehab , Nicolas Dufresne , NXP Linux Team , Pengutronix Kernel Team , Philipp Zabel , Sascha Hauer , Shawn Guo , kernel@collabora.com Subject: [PATCH v6 09/10] media: hantro: Support VP9 on the G2 core Date: Mon, 27 Sep 2021 17:19:57 +0200 Message-Id: <20210927151958.24426-10-andrzej.p@collabora.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20210927151958.24426-1-andrzej.p@collabora.com> References: <20210927151958.24426-1-andrzej.p@collabora.com> Precedence: bulk List-ID: X-Mailing-List: linux-media@vger.kernel.org The VeriSilicon Hantro G2 core supports the VP9 codec. 
Signed-off-by: Andrzej Pietrasiewicz Reviewed-by: Benjamin Gaignard --- drivers/staging/media/hantro/Kconfig | 1 + drivers/staging/media/hantro/Makefile | 6 +- drivers/staging/media/hantro/hantro.h | 26 + drivers/staging/media/hantro/hantro_drv.c | 18 +- drivers/staging/media/hantro/hantro_g2_regs.h | 97 ++ .../staging/media/hantro/hantro_g2_vp9_dec.c | 978 ++++++++++++++++++ drivers/staging/media/hantro/hantro_hw.h | 67 ++ drivers/staging/media/hantro/hantro_v4l2.c | 6 + drivers/staging/media/hantro/hantro_vp9.c | 240 +++++ drivers/staging/media/hantro/hantro_vp9.h | 103 ++ drivers/staging/media/hantro/imx8m_vpu_hw.c | 22 +- 11 files changed, 1560 insertions(+), 4 deletions(-) create mode 100644 drivers/staging/media/hantro/hantro_g2_vp9_dec.c create mode 100644 drivers/staging/media/hantro/hantro_vp9.c create mode 100644 drivers/staging/media/hantro/hantro_vp9.h diff --git a/drivers/staging/media/hantro/Kconfig b/drivers/staging/media/hantro/Kconfig index 20b1f6d7b69c..00a57d88c92e 100644 --- a/drivers/staging/media/hantro/Kconfig +++ b/drivers/staging/media/hantro/Kconfig @@ -9,6 +9,7 @@ config VIDEO_HANTRO select VIDEOBUF2_VMALLOC select V4L2_MEM2MEM_DEV select V4L2_H264 + select V4L2_VP9 help Support for the Hantro IP based Video Processing Units present on Rockchip and NXP i.MX8M SoCs, which accelerate video and image diff --git a/drivers/staging/media/hantro/Makefile b/drivers/staging/media/hantro/Makefile index fe6d84871d07..28af0a1ee4bf 100644 --- a/drivers/staging/media/hantro/Makefile +++ b/drivers/staging/media/hantro/Makefile @@ -10,9 +10,10 @@ hantro-vpu-y += \ hantro_g1.o \ hantro_g1_h264_dec.o \ hantro_g1_mpeg2_dec.o \ - hantro_g2_hevc_dec.o \ hantro_g1_vp8_dec.o \ hantro_g2.o \ + hantro_g2_hevc_dec.o \ + hantro_g2_vp9_dec.o \ rockchip_vpu2_hw_jpeg_enc.o \ rockchip_vpu2_hw_h264_dec.o \ rockchip_vpu2_hw_mpeg2_dec.o \ @@ -21,7 +22,8 @@ hantro-vpu-y += \ hantro_h264.o \ hantro_hevc.o \ hantro_mpeg2.o \ - hantro_vp8.o + hantro_vp8.o \ + hantro_vp9.o 
hantro-vpu-$(CONFIG_VIDEO_HANTRO_IMX8M) += \ imx8m_vpu_hw.o diff --git a/drivers/staging/media/hantro/hantro.h b/drivers/staging/media/hantro/hantro.h index d91eb2b1c509..1e8c1a6e3eb0 100644 --- a/drivers/staging/media/hantro/hantro.h +++ b/drivers/staging/media/hantro/hantro.h @@ -36,6 +36,7 @@ struct hantro_postproc_ops; #define HANTRO_VP8_DECODER BIT(17) #define HANTRO_H264_DECODER BIT(18) #define HANTRO_HEVC_DECODER BIT(19) +#define HANTRO_VP9_DECODER BIT(20) #define HANTRO_DECODERS 0xffff0000 /** @@ -110,6 +111,7 @@ enum hantro_codec_mode { HANTRO_MODE_MPEG2_DEC, HANTRO_MODE_VP8_DEC, HANTRO_MODE_HEVC_DEC, + HANTRO_MODE_VP9_DEC, }; /* @@ -223,6 +225,7 @@ struct hantro_dev { * @mpeg2_dec: MPEG-2-decoding context. * @vp8_dec: VP8-decoding context. * @hevc_dec: HEVC-decoding context. + * @vp9_dec: VP9-decoding context. */ struct hantro_ctx { struct hantro_dev *dev; @@ -250,6 +253,7 @@ struct hantro_ctx { struct hantro_mpeg2_dec_hw_ctx mpeg2_dec; struct hantro_vp8_dec_hw_ctx vp8_dec; struct hantro_hevc_dec_hw_ctx hevc_dec; + struct hantro_vp9_dec_hw_ctx vp9_dec; }; }; @@ -299,6 +303,22 @@ struct hantro_postproc_regs { struct hantro_reg display_width; }; +struct hantro_vp9_decoded_buffer_info { + /* Info needed when the decoded frame serves as a reference frame. */ + unsigned short width; + unsigned short height; + u32 bit_depth : 4; +}; + +struct hantro_decoded_buffer { + /* Must be the first field in this struct. 
*/ + struct v4l2_m2m_buffer base; + + union { + struct hantro_vp9_decoded_buffer_info vp9; + }; +}; + /* Logging helpers */ /** @@ -436,6 +456,12 @@ hantro_get_dec_buf_addr(struct hantro_ctx *ctx, struct vb2_buffer *vb) return vb2_dma_contig_plane_dma_addr(vb, 0); } +static inline struct hantro_decoded_buffer * +vb2_to_hantro_decoded_buf(struct vb2_buffer *buf) +{ + return container_of(buf, struct hantro_decoded_buffer, base.vb.vb2_buf); +} + void hantro_postproc_disable(struct hantro_ctx *ctx); void hantro_postproc_enable(struct hantro_ctx *ctx); void hantro_postproc_free(struct hantro_ctx *ctx); diff --git a/drivers/staging/media/hantro/hantro_drv.c b/drivers/staging/media/hantro/hantro_drv.c index 8a2edd67f2c6..800c8879aee0 100644 --- a/drivers/staging/media/hantro/hantro_drv.c +++ b/drivers/staging/media/hantro/hantro_drv.c @@ -232,7 +232,7 @@ queue_init(void *priv, struct vb2_queue *src_vq, struct vb2_queue *dst_vq) dst_vq->io_modes = VB2_MMAP | VB2_DMABUF; dst_vq->drv_priv = ctx; dst_vq->ops = &hantro_queue_ops; - dst_vq->buf_struct_size = sizeof(struct v4l2_m2m_buffer); + dst_vq->buf_struct_size = sizeof(struct hantro_decoded_buffer); dst_vq->timestamp_flags = V4L2_BUF_FLAG_TIMESTAMP_COPY; dst_vq->lock = &ctx->dev->vpu_mutex; dst_vq->dev = ctx->dev->v4l2_dev.dev; @@ -266,6 +266,12 @@ static int hantro_try_ctrl(struct v4l2_ctrl *ctrl) if (sps->flags & V4L2_HEVC_SPS_FLAG_SCALING_LIST_ENABLED) /* No scaling support */ return -EINVAL; + } else if (ctrl->id == V4L2_CID_STATELESS_VP9_FRAME) { + const struct v4l2_ctrl_vp9_frame *dec_params = ctrl->p_new.p_vp9_frame; + + /* We only support profile 0 */ + if (dec_params->profile != 0) + return -EINVAL; } return 0; } @@ -459,6 +465,16 @@ static const struct hantro_ctrl controls[] = { .step = 1, .ops = &hantro_hevc_ctrl_ops, }, + }, { + .codec = HANTRO_VP9_DECODER, + .cfg = { + .id = V4L2_CID_STATELESS_VP9_FRAME, + }, + }, { + .codec = HANTRO_VP9_DECODER, + .cfg = { + .id = V4L2_CID_STATELESS_VP9_COMPRESSED_HDR, + }, 
}, }; diff --git a/drivers/staging/media/hantro/hantro_g2_regs.h b/drivers/staging/media/hantro/hantro_g2_regs.h index 0ac0ba375e80..21ca21648614 100644 --- a/drivers/staging/media/hantro/hantro_g2_regs.h +++ b/drivers/staging/media/hantro/hantro_g2_regs.h @@ -28,6 +28,7 @@ #define G2_REG_INTERRUPT_DEC_E BIT(0) #define HEVC_DEC_MODE 0xc +#define VP9_DEC_MODE 0xd #define BUS_WIDTH_32 0 #define BUS_WIDTH_64 1 @@ -49,6 +50,7 @@ #define g2_pic_height_in_cbs G2_DEC_REG(4, 6, 0x1fff) #define g2_num_ref_frames G2_DEC_REG(4, 0, 0x1f) +#define g2_start_bit G2_DEC_REG(5, 25, 0x7f) #define g2_scaling_list_e G2_DEC_REG(5, 24, 0x1) #define g2_cb_qp_offset G2_DEC_REG(5, 19, 0x1f) #define g2_cr_qp_offset G2_DEC_REG(5, 14, 0x1f) @@ -84,6 +86,7 @@ #define g2_bit_depth_y_minus8 G2_DEC_REG(8, 6, 0x3) #define g2_bit_depth_c_minus8 G2_DEC_REG(8, 4, 0x3) #define g2_output_8_bits G2_DEC_REG(8, 3, 0x1) +#define g2_output_format G2_DEC_REG(8, 0, 0x7) #define g2_refidx1_active G2_DEC_REG(9, 19, 0x1f) #define g2_refidx0_active G2_DEC_REG(9, 14, 0x1f) @@ -96,6 +99,14 @@ #define g2_tile_e G2_DEC_REG(10, 1, 0x1) #define g2_entropy_sync_e G2_DEC_REG(10, 0, 0x1) +#define vp9_transform_mode G2_DEC_REG(11, 27, 0x7) +#define vp9_filt_sharpness G2_DEC_REG(11, 21, 0x7) +#define vp9_mcomp_filt_type G2_DEC_REG(11, 8, 0x7) +#define vp9_high_prec_mv_e G2_DEC_REG(11, 7, 0x1) +#define vp9_comp_pred_mode G2_DEC_REG(11, 4, 0x3) +#define vp9_gref_sign_bias G2_DEC_REG(11, 2, 0x1) +#define vp9_aref_sign_bias G2_DEC_REG(11, 0, 0x1) + #define g2_refer_lterm_e G2_DEC_REG(12, 16, 0xffff) #define g2_min_cb_size G2_DEC_REG(12, 13, 0x7) #define g2_max_cb_size G2_DEC_REG(12, 10, 0x7) @@ -154,6 +165,50 @@ #define g2_partial_ctb_y G2_DEC_REG(20, 30, 0x1) #define g2_pic_width_4x4 G2_DEC_REG(20, 16, 0xfff) #define g2_pic_height_4x4 G2_DEC_REG(20, 0, 0xfff) + +#define vp9_qp_delta_y_dc G2_DEC_REG(13, 23, 0x3f) +#define vp9_qp_delta_ch_dc G2_DEC_REG(13, 17, 0x3f) +#define vp9_qp_delta_ch_ac G2_DEC_REG(13, 11, 0x3f) +#define 
vp9_last_sign_bias G2_DEC_REG(13, 10, 0x1) +#define vp9_lossless_e G2_DEC_REG(13, 9, 0x1) +#define vp9_comp_pred_var_ref1 G2_DEC_REG(13, 7, 0x3) +#define vp9_comp_pred_var_ref0 G2_DEC_REG(13, 5, 0x3) +#define vp9_comp_pred_fixed_ref G2_DEC_REG(13, 3, 0x3) +#define vp9_segment_temp_upd_e G2_DEC_REG(13, 2, 0x1) +#define vp9_segment_upd_e G2_DEC_REG(13, 1, 0x1) +#define vp9_segment_e G2_DEC_REG(13, 0, 0x1) + +#define vp9_filt_level G2_DEC_REG(14, 18, 0x3f) +#define vp9_refpic_seg0 G2_DEC_REG(14, 15, 0x7) +#define vp9_skip_seg0 G2_DEC_REG(14, 14, 0x1) +#define vp9_filt_level_seg0 G2_DEC_REG(14, 8, 0x3f) +#define vp9_quant_seg0 G2_DEC_REG(14, 0, 0xff) + +#define vp9_refpic_seg1 G2_DEC_REG(15, 15, 0x7) +#define vp9_skip_seg1 G2_DEC_REG(15, 14, 0x1) +#define vp9_filt_level_seg1 G2_DEC_REG(15, 8, 0x3f) +#define vp9_quant_seg1 G2_DEC_REG(15, 0, 0xff) + +#define vp9_refpic_seg2 G2_DEC_REG(16, 15, 0x7) +#define vp9_skip_seg2 G2_DEC_REG(16, 14, 0x1) +#define vp9_filt_level_seg2 G2_DEC_REG(16, 8, 0x3f) +#define vp9_quant_seg2 G2_DEC_REG(16, 0, 0xff) + +#define vp9_refpic_seg3 G2_DEC_REG(17, 15, 0x7) +#define vp9_skip_seg3 G2_DEC_REG(17, 14, 0x1) +#define vp9_filt_level_seg3 G2_DEC_REG(17, 8, 0x3f) +#define vp9_quant_seg3 G2_DEC_REG(17, 0, 0xff) + +#define vp9_refpic_seg4 G2_DEC_REG(18, 15, 0x7) +#define vp9_skip_seg4 G2_DEC_REG(18, 14, 0x1) +#define vp9_filt_level_seg4 G2_DEC_REG(18, 8, 0x3f) +#define vp9_quant_seg4 G2_DEC_REG(18, 0, 0xff) + +#define vp9_refpic_seg5 G2_DEC_REG(19, 15, 0x7) +#define vp9_skip_seg5 G2_DEC_REG(19, 14, 0x1) +#define vp9_filt_level_seg5 G2_DEC_REG(19, 8, 0x3f) +#define vp9_quant_seg5 G2_DEC_REG(19, 0, 0xff) + #define hevc_cur_poc_00 G2_DEC_REG(46, 24, 0xff) #define hevc_cur_poc_01 G2_DEC_REG(46, 16, 0xff) #define hevc_cur_poc_02 G2_DEC_REG(46, 8, 0xff) @@ -174,6 +229,44 @@ #define hevc_cur_poc_14 G2_DEC_REG(49, 8, 0xff) #define hevc_cur_poc_15 G2_DEC_REG(49, 0, 0xff) +#define vp9_refpic_seg6 G2_DEC_REG(31, 15, 0x7) +#define vp9_skip_seg6 
G2_DEC_REG(31, 14, 0x1) +#define vp9_filt_level_seg6 G2_DEC_REG(31, 8, 0x3f) +#define vp9_quant_seg6 G2_DEC_REG(31, 0, 0xff) + +#define vp9_refpic_seg7 G2_DEC_REG(32, 15, 0x7) +#define vp9_skip_seg7 G2_DEC_REG(32, 14, 0x1) +#define vp9_filt_level_seg7 G2_DEC_REG(32, 8, 0x3f) +#define vp9_quant_seg7 G2_DEC_REG(32, 0, 0xff) + +#define vp9_lref_width G2_DEC_REG(33, 16, 0xffff) +#define vp9_lref_height G2_DEC_REG(33, 0, 0xffff) + +#define vp9_gref_width G2_DEC_REG(34, 16, 0xffff) +#define vp9_gref_height G2_DEC_REG(34, 0, 0xffff) + +#define vp9_aref_width G2_DEC_REG(35, 16, 0xffff) +#define vp9_aref_height G2_DEC_REG(35, 0, 0xffff) + +#define vp9_lref_hor_scale G2_DEC_REG(36, 16, 0xffff) +#define vp9_lref_ver_scale G2_DEC_REG(36, 0, 0xffff) + +#define vp9_gref_hor_scale G2_DEC_REG(37, 16, 0xffff) +#define vp9_gref_ver_scale G2_DEC_REG(37, 0, 0xffff) + +#define vp9_aref_hor_scale G2_DEC_REG(38, 16, 0xffff) +#define vp9_aref_ver_scale G2_DEC_REG(38, 0, 0xffff) + +#define vp9_filt_ref_adj_0 G2_DEC_REG(46, 24, 0x7f) +#define vp9_filt_ref_adj_1 G2_DEC_REG(46, 16, 0x7f) +#define vp9_filt_ref_adj_2 G2_DEC_REG(46, 8, 0x7f) +#define vp9_filt_ref_adj_3 G2_DEC_REG(46, 0, 0x7f) + +#define vp9_filt_mb_adj_0 G2_DEC_REG(47, 24, 0x7f) +#define vp9_filt_mb_adj_1 G2_DEC_REG(47, 16, 0x7f) +#define vp9_filt_mb_adj_2 G2_DEC_REG(47, 8, 0x7f) +#define vp9_filt_mb_adj_3 G2_DEC_REG(47, 0, 0x7f) + #define g2_apf_threshold G2_DEC_REG(55, 0, 0xffff) #define g2_clk_gate_e G2_DEC_REG(58, 16, 0x1) @@ -186,6 +279,8 @@ #define G2_ADDR_DST (G2_SWREG(65)) #define G2_REG_ADDR_REF(i) (G2_SWREG(67) + ((i) * 0x8)) +#define VP9_ADDR_SEGMENT_WRITE (G2_SWREG(79)) +#define VP9_ADDR_SEGMENT_READ (G2_SWREG(81)) #define G2_ADDR_DST_CHR (G2_SWREG(99)) #define G2_REG_CHR_REF(i) (G2_SWREG(101) + ((i) * 0x8)) #define G2_ADDR_DST_MV (G2_SWREG(133)) @@ -193,6 +288,8 @@ #define G2_ADDR_TILE_SIZE (G2_SWREG(167)) #define G2_ADDR_STR (G2_SWREG(169)) #define HEVC_SCALING_LIST (G2_SWREG(171)) +#define VP9_ADDR_CTR 
(G2_SWREG(171)) +#define VP9_ADDR_PROBS (G2_SWREG(173)) #define G2_RASTER_SCAN (G2_SWREG(175)) #define G2_RASTER_SCAN_CHR (G2_SWREG(177)) #define G2_TILE_FILTER (G2_SWREG(179)) diff --git a/drivers/staging/media/hantro/hantro_g2_vp9_dec.c b/drivers/staging/media/hantro/hantro_g2_vp9_dec.c new file mode 100644 index 000000000000..f1b207666fa7 --- /dev/null +++ b/drivers/staging/media/hantro/hantro_g2_vp9_dec.c @@ -0,0 +1,978 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Hantro VP9 codec driver + * + * Copyright (C) 2021 Collabora Ltd. + */ +#include "media/videobuf2-core.h" +#include "media/videobuf2-dma-contig.h" +#include "media/videobuf2-v4l2.h" +#include +#include +#include +#include + +#include "hantro.h" +#include "hantro_vp9.h" +#include "hantro_g2_regs.h" + +#define G2_ALIGN 16 + +enum hantro_ref_frames { + INTRA_FRAME = 0, + LAST_FRAME = 1, + GOLDEN_FRAME = 2, + ALTREF_FRAME = 3, + MAX_REF_FRAMES = 4 +}; + +static int start_prepare_run(struct hantro_ctx *ctx, const struct v4l2_ctrl_vp9_frame **dec_params) +{ + const struct v4l2_ctrl_vp9_compressed_hdr *prob_updates; + struct hantro_vp9_dec_hw_ctx *vp9_ctx = &ctx->vp9_dec; + struct v4l2_ctrl *ctrl; + unsigned int fctx_idx; + + /* v4l2-specific stuff */ + hantro_start_prepare_run(ctx); + + ctrl = v4l2_ctrl_find(&ctx->ctrl_handler, V4L2_CID_STATELESS_VP9_FRAME); + if (WARN_ON(!ctrl)) + return -EINVAL; + *dec_params = ctrl->p_cur.p; + + ctrl = v4l2_ctrl_find(&ctx->ctrl_handler, V4L2_CID_STATELESS_VP9_COMPRESSED_HDR); + if (WARN_ON(!ctrl)) + return -EINVAL; + prob_updates = ctrl->p_cur.p; + vp9_ctx->cur.tx_mode = prob_updates->tx_mode; + + /* + * vp9 stuff + * + * by this point the userspace has done all parts of 6.2 uncompressed_header() + * except this fragment: + * if ( FrameIsIntra || error_resilient_mode ) { + * setup_past_independence ( ) + * if ( frame_type == KEY_FRAME || error_resilient_mode == 1 || + * reset_frame_context == 3 ) { + * for ( i = 0; i < 4; i ++ ) { + * save_probs( i ) + * } + * } else 
if ( reset_frame_context == 2 ) { + * save_probs( frame_context_idx ) + * } + * frame_context_idx = 0 + * } + */ + fctx_idx = v4l2_vp9_reset_frame_ctx(*dec_params, vp9_ctx->frame_context); + vp9_ctx->cur.frame_context_idx = fctx_idx; + + /* 6.1 frame(sz): load_probs() and load_probs2() */ + vp9_ctx->probability_tables = vp9_ctx->frame_context[fctx_idx]; + + /* + * The userspace has also performed 6.3 compressed_header(), but handling the + * probs in a special way. All probs which need updating, except MV-related, + * have been read from the bitstream and translated through inv_map_table[], + * but no 6.3.6 inv_recenter_nonneg(v, m) has been performed. The values passed + * by userspace are either translated values (there are no 0 values in + * inv_map_table[]), or zero to indicate no update. All MV-related probs which need + * updating have been read from the bitstream and (mv_prob << 1) | 1 has been + * performed. The values passed by userspace are either new values + * to replace old ones (the above mentioned shift and bitwise or never result in + * a zero) or zero to indicate no update. + * fw_update_probs() performs actual probs updates or leaves probs as-is + * for values for which a zero was passed from userspace. + */ + v4l2_vp9_fw_update_probs(&vp9_ctx->probability_tables, prob_updates, *dec_params); + + return 0; +} + +static size_t chroma_offset(const struct hantro_ctx *ctx, + const struct v4l2_ctrl_vp9_frame *dec_params) +{ + int bytes_per_pixel = dec_params->bit_depth == 8 ? 
1 : 2; + + return ctx->src_fmt.width * ctx->src_fmt.height * bytes_per_pixel; +} + +static size_t mv_offset(const struct hantro_ctx *ctx, + const struct v4l2_ctrl_vp9_frame *dec_params) +{ + size_t cr_offset = chroma_offset(ctx, dec_params); + + return ALIGN((cr_offset * 3) / 2, G2_ALIGN); +} + +static struct hantro_decoded_buffer * +get_ref_buf(struct hantro_ctx *ctx, struct vb2_v4l2_buffer *dst, u64 timestamp) +{ + struct v4l2_m2m_ctx *m2m_ctx = ctx->fh.m2m_ctx; + struct vb2_queue *cap_q = &m2m_ctx->cap_q_ctx.q; + int buf_idx; + + /* + * If a ref is unused or invalid, address of current destination + * buffer is returned. + */ + buf_idx = vb2_find_timestamp(cap_q, timestamp, 0); + if (buf_idx < 0) + return vb2_to_hantro_decoded_buf(&dst->vb2_buf); + + return vb2_to_hantro_decoded_buf(vb2_get_buffer(cap_q, buf_idx)); +} + +static void update_dec_buf_info(struct hantro_decoded_buffer *buf, + const struct v4l2_ctrl_vp9_frame *dec_params) +{ + buf->vp9.width = dec_params->frame_width_minus_1 + 1; + buf->vp9.height = dec_params->frame_height_minus_1 + 1; + buf->vp9.bit_depth = dec_params->bit_depth; +} + +static void update_ctx_cur_info(struct hantro_vp9_dec_hw_ctx *vp9_ctx, + struct hantro_decoded_buffer *buf, + const struct v4l2_ctrl_vp9_frame *dec_params) +{ + vp9_ctx->cur.valid = true; + vp9_ctx->cur.reference_mode = dec_params->reference_mode; + vp9_ctx->cur.interpolation_filter = dec_params->interpolation_filter; + vp9_ctx->cur.flags = dec_params->flags; + vp9_ctx->cur.timestamp = buf->base.vb.vb2_buf.timestamp; +} + +static void config_output(struct hantro_ctx *ctx, + struct hantro_decoded_buffer *dst, + const struct v4l2_ctrl_vp9_frame *dec_params) +{ + dma_addr_t luma_addr, chroma_addr, mv_addr; + + hantro_reg_write(ctx->dev, &g2_out_dis, 0); + hantro_reg_write(ctx->dev, &g2_output_format, 0); + + luma_addr = vb2_dma_contig_plane_dma_addr(&dst->base.vb.vb2_buf, 0); + hantro_write_addr(ctx->dev, G2_ADDR_DST, luma_addr); + + chroma_addr = luma_addr + 
chroma_offset(ctx, dec_params); + hantro_write_addr(ctx->dev, G2_ADDR_DST_CHR, chroma_addr); + + mv_addr = luma_addr + mv_offset(ctx, dec_params); + hantro_write_addr(ctx->dev, G2_ADDR_DST_MV, mv_addr); +} + +struct hantro_vp9_ref_reg { + const struct hantro_reg width; + const struct hantro_reg height; + const struct hantro_reg hor_scale; + const struct hantro_reg ver_scale; + u32 y_base; + u32 c_base; +}; + +static void config_ref(struct hantro_ctx *ctx, + struct hantro_decoded_buffer *dst, + const struct hantro_vp9_ref_reg *ref_reg, + const struct v4l2_ctrl_vp9_frame *dec_params, + u64 ref_ts) +{ + struct hantro_decoded_buffer *buf; + dma_addr_t luma_addr, chroma_addr; + u32 refw, refh; + + buf = get_ref_buf(ctx, &dst->base.vb, ref_ts); + refw = buf->vp9.width; + refh = buf->vp9.height; + + hantro_reg_write(ctx->dev, &ref_reg->width, refw); + hantro_reg_write(ctx->dev, &ref_reg->height, refh); + + hantro_reg_write(ctx->dev, &ref_reg->hor_scale, (refw << 14) / dst->vp9.width); + hantro_reg_write(ctx->dev, &ref_reg->ver_scale, (refh << 14) / dst->vp9.height); + + luma_addr = vb2_dma_contig_plane_dma_addr(&buf->base.vb.vb2_buf, 0); + hantro_write_addr(ctx->dev, ref_reg->y_base, luma_addr); + + chroma_addr = luma_addr + chroma_offset(ctx, dec_params); + hantro_write_addr(ctx->dev, ref_reg->c_base, chroma_addr); +} + +static void config_ref_registers(struct hantro_ctx *ctx, + const struct v4l2_ctrl_vp9_frame *dec_params, + struct hantro_decoded_buffer *dst, + struct hantro_decoded_buffer *mv_ref) +{ + static const struct hantro_vp9_ref_reg ref_regs[] = { + { + /* Last */ + .width = vp9_lref_width, + .height = vp9_lref_height, + .hor_scale = vp9_lref_hor_scale, + .ver_scale = vp9_lref_ver_scale, + .y_base = G2_REG_ADDR_REF(0), + .c_base = G2_REG_CHR_REF(0), + }, { + /* Golden */ + .width = vp9_gref_width, + .height = vp9_gref_height, + .hor_scale = vp9_gref_hor_scale, + .ver_scale = vp9_gref_ver_scale, + .y_base = G2_REG_ADDR_REF(4), + .c_base = G2_REG_CHR_REF(4), + }, 
{ + /* Altref */ + .width = vp9_aref_width, + .height = vp9_aref_height, + .hor_scale = vp9_aref_hor_scale, + .ver_scale = vp9_aref_ver_scale, + .y_base = G2_REG_ADDR_REF(5), + .c_base = G2_REG_CHR_REF(5), + }, + }; + dma_addr_t mv_addr; + + config_ref(ctx, dst, &ref_regs[0], dec_params, dec_params->last_frame_ts); + config_ref(ctx, dst, &ref_regs[1], dec_params, dec_params->golden_frame_ts); + config_ref(ctx, dst, &ref_regs[2], dec_params, dec_params->alt_frame_ts); + + mv_addr = vb2_dma_contig_plane_dma_addr(&mv_ref->base.vb.vb2_buf, 0) + + mv_offset(ctx, dec_params); + hantro_write_addr(ctx->dev, G2_REG_DMV_REF(0), mv_addr); + + hantro_reg_write(ctx->dev, &vp9_last_sign_bias, + dec_params->ref_frame_sign_bias & V4L2_VP9_SIGN_BIAS_LAST ? 1 : 0); + + hantro_reg_write(ctx->dev, &vp9_gref_sign_bias, + dec_params->ref_frame_sign_bias & V4L2_VP9_SIGN_BIAS_GOLDEN ? 1 : 0); + + hantro_reg_write(ctx->dev, &vp9_aref_sign_bias, + dec_params->ref_frame_sign_bias & V4L2_VP9_SIGN_BIAS_ALT ? 1 : 0); +} + +static void recompute_tile_info(unsigned short *tile_info, unsigned int tiles, unsigned int sbs) +{ + int i; + unsigned int accumulated = 0; + unsigned int next_accumulated; + + for (i = 1; i <= tiles; ++i) { + next_accumulated = i * sbs / tiles; + *tile_info++ = next_accumulated - accumulated; + accumulated = next_accumulated; + } +} + +static void +recompute_tile_rc_info(struct hantro_ctx *ctx, + unsigned int tile_r, unsigned int tile_c, + unsigned int sbs_r, unsigned int sbs_c) +{ + struct hantro_vp9_dec_hw_ctx *vp9_ctx = &ctx->vp9_dec; + + recompute_tile_info(vp9_ctx->tile_r_info, tile_r, sbs_r); + recompute_tile_info(vp9_ctx->tile_c_info, tile_c, sbs_c); + + vp9_ctx->last_tile_r = tile_r; + vp9_ctx->last_tile_c = tile_c; + vp9_ctx->last_sbs_r = sbs_r; + vp9_ctx->last_sbs_c = sbs_c; +} + +static inline unsigned int first_tile_row(unsigned int tile_r, unsigned int sbs_r) +{ + if (tile_r == sbs_r + 1) + return 1; + + if (tile_r == sbs_r + 2) + return 2; + + return 0; +} + 
+static void +fill_tile_info(struct hantro_ctx *ctx, + unsigned int tile_r, unsigned int tile_c, + unsigned int sbs_r, unsigned int sbs_c, + unsigned short *tile_mem) +{ + struct hantro_vp9_dec_hw_ctx *vp9_ctx = &ctx->vp9_dec; + unsigned int i, j; + bool first = true; + + for (i = first_tile_row(tile_r, sbs_r); i < tile_r; ++i) { + unsigned short r_info = vp9_ctx->tile_r_info[i]; + + if (first) { + if (i > 0) + r_info += vp9_ctx->tile_r_info[0]; + if (i == 2) + r_info += vp9_ctx->tile_r_info[1]; + first = false; + } + for (j = 0; j < tile_c; ++j) { + *tile_mem++ = vp9_ctx->tile_c_info[j]; + *tile_mem++ = r_info; + } + } +} + +static void +config_tiles(struct hantro_ctx *ctx, + const struct v4l2_ctrl_vp9_frame *dec_params, + struct hantro_decoded_buffer *dst) +{ + struct hantro_vp9_dec_hw_ctx *vp9_ctx = &ctx->vp9_dec; + struct hantro_aux_buf *misc = &vp9_ctx->misc; + struct hantro_aux_buf *tile_edge = &vp9_ctx->tile_edge; + dma_addr_t addr; + unsigned short *tile_mem; + + addr = misc->dma + vp9_ctx->tile_info_offset; + hantro_write_addr(ctx->dev, G2_ADDR_TILE_SIZE, addr); + + tile_mem = misc->cpu + vp9_ctx->tile_info_offset; + if (dec_params->tile_cols_log2 || dec_params->tile_rows_log2) { + unsigned int tile_r = (1 << dec_params->tile_rows_log2); + unsigned int tile_c = (1 << dec_params->tile_cols_log2); + unsigned int sbs_r = hantro_vp9_num_sbs(dst->vp9.height); + unsigned int sbs_c = hantro_vp9_num_sbs(dst->vp9.width); + + if (tile_r != vp9_ctx->last_tile_r || tile_c != vp9_ctx->last_tile_c || + sbs_r != vp9_ctx->last_sbs_r || sbs_c != vp9_ctx->last_sbs_c) + recompute_tile_rc_info(ctx, tile_r, tile_c, sbs_r, sbs_c); + + fill_tile_info(ctx, tile_r, tile_c, sbs_r, sbs_c, tile_mem); + + hantro_reg_write(ctx->dev, &g2_tile_e, 1); + hantro_reg_write(ctx->dev, &g2_num_tile_cols, tile_c); + hantro_reg_write(ctx->dev, &g2_num_tile_rows, tile_r); + + addr = tile_edge->dma; + hantro_write_addr(ctx->dev, G2_TILE_FILTER, addr); + + addr = tile_edge->dma + 
vp9_ctx->bsd_ctrl_offset; + hantro_write_addr(ctx->dev, G2_TILE_BSD, addr); + } else { + tile_mem[0] = hantro_vp9_num_sbs(dst->vp9.width); + tile_mem[1] = hantro_vp9_num_sbs(dst->vp9.height); + + hantro_reg_write(ctx->dev, &g2_tile_e, 0); + hantro_reg_write(ctx->dev, &g2_num_tile_cols, 1); + hantro_reg_write(ctx->dev, &g2_num_tile_rows, 1); + } +} + +static void +update_feat_and_flag(struct hantro_vp9_dec_hw_ctx *vp9_ctx, + const struct v4l2_vp9_segmentation *seg, + unsigned int feature, + unsigned int segid) +{ + u8 mask = V4L2_VP9_SEGMENT_FEATURE_ENABLED(feature); + + vp9_ctx->feature_data[segid][feature] = seg->feature_data[segid][feature]; + vp9_ctx->feature_enabled[segid] &= ~mask; + vp9_ctx->feature_enabled[segid] |= (seg->feature_enabled[segid] & mask); +} + +static inline s16 clip3(s16 x, s16 y, s16 z) +{ + return (z < x) ? x : (z > y) ? y : z; +} + +static s16 feat_val_clip3(s16 feat_val, s16 feature_data, bool absolute, u8 clip) +{ + if (absolute) + return feature_data; + + return clip3(0, 255, feat_val + feature_data); +} + +static void config_segment(struct hantro_ctx *ctx, const struct v4l2_ctrl_vp9_frame *dec_params) +{ + struct hantro_vp9_dec_hw_ctx *vp9_ctx = &ctx->vp9_dec; + const struct v4l2_vp9_segmentation *seg; + s16 feat_val; + unsigned char feat_id; + unsigned int segid; + bool segment_enabled, absolute, update_data; + + static const struct hantro_reg seg_regs[8][V4L2_VP9_SEG_LVL_MAX] = { + { vp9_quant_seg0, vp9_filt_level_seg0, vp9_refpic_seg0, vp9_skip_seg0 }, + { vp9_quant_seg1, vp9_filt_level_seg1, vp9_refpic_seg1, vp9_skip_seg1 }, + { vp9_quant_seg2, vp9_filt_level_seg2, vp9_refpic_seg2, vp9_skip_seg2 }, + { vp9_quant_seg3, vp9_filt_level_seg3, vp9_refpic_seg3, vp9_skip_seg3 }, + { vp9_quant_seg4, vp9_filt_level_seg4, vp9_refpic_seg4, vp9_skip_seg4 }, + { vp9_quant_seg5, vp9_filt_level_seg5, vp9_refpic_seg5, vp9_skip_seg5 }, + { vp9_quant_seg6, vp9_filt_level_seg6, vp9_refpic_seg6, vp9_skip_seg6 }, + { vp9_quant_seg7, 
vp9_filt_level_seg7, vp9_refpic_seg7, vp9_skip_seg7 }, + }; + + segment_enabled = !!(dec_params->seg.flags & V4L2_VP9_SEGMENTATION_FLAG_ENABLED); + hantro_reg_write(ctx->dev, &vp9_segment_e, segment_enabled); + hantro_reg_write(ctx->dev, &vp9_segment_upd_e, + !!(dec_params->seg.flags & V4L2_VP9_SEGMENTATION_FLAG_UPDATE_MAP)); + hantro_reg_write(ctx->dev, &vp9_segment_temp_upd_e, + !!(dec_params->seg.flags & V4L2_VP9_SEGMENTATION_FLAG_TEMPORAL_UPDATE)); + + seg = &dec_params->seg; + absolute = !!(seg->flags & V4L2_VP9_SEGMENTATION_FLAG_ABS_OR_DELTA_UPDATE); + update_data = !!(seg->flags & V4L2_VP9_SEGMENTATION_FLAG_UPDATE_DATA); + + for (segid = 0; segid < 8; ++segid) { + /* Quantizer segment feature */ + feat_id = V4L2_VP9_SEG_LVL_ALT_Q; + feat_val = dec_params->quant.base_q_idx; + if (segment_enabled) { + if (update_data) + update_feat_and_flag(vp9_ctx, seg, feat_id, segid); + if (v4l2_vp9_seg_feat_enabled(vp9_ctx->feature_enabled, feat_id, segid)) + feat_val = feat_val_clip3(feat_val, + vp9_ctx->feature_data[segid][feat_id], + absolute, 255); + } + hantro_reg_write(ctx->dev, &seg_regs[segid][feat_id], feat_val); + + /* Loop filter segment feature */ + feat_id = V4L2_VP9_SEG_LVL_ALT_L; + feat_val = dec_params->lf.level; + if (segment_enabled) { + if (update_data) + update_feat_and_flag(vp9_ctx, seg, feat_id, segid); + if (v4l2_vp9_seg_feat_enabled(vp9_ctx->feature_enabled, feat_id, segid)) + feat_val = feat_val_clip3(feat_val, + vp9_ctx->feature_data[segid][feat_id], + absolute, 63); + } + hantro_reg_write(ctx->dev, &seg_regs[segid][feat_id], feat_val); + + /* Reference frame segment feature */ + feat_id = V4L2_VP9_SEG_LVL_REF_FRAME; + feat_val = 0; + if (segment_enabled) { + if (update_data) + update_feat_and_flag(vp9_ctx, seg, feat_id, segid); + if (!(dec_params->flags & V4L2_VP9_FRAME_FLAG_KEY_FRAME) && + v4l2_vp9_seg_feat_enabled(vp9_ctx->feature_enabled, feat_id, segid)) + feat_val = vp9_ctx->feature_data[segid][feat_id] + 1; + } + hantro_reg_write(ctx->dev, 
&seg_regs[segid][feat_id], feat_val); + + /* Skip segment feature */ + feat_id = V4L2_VP9_SEG_LVL_SKIP; + feat_val = 0; + if (segment_enabled) { + if (update_data) + update_feat_and_flag(vp9_ctx, seg, feat_id, segid); + feat_val = v4l2_vp9_seg_feat_enabled(vp9_ctx->feature_enabled, + feat_id, segid) ? 1 : 0; + } + hantro_reg_write(ctx->dev, &seg_regs[segid][feat_id], feat_val); + } +} + +static void config_loop_filter(struct hantro_ctx *ctx, const struct v4l2_ctrl_vp9_frame *dec_params) +{ + bool d = dec_params->lf.flags & V4L2_VP9_LOOP_FILTER_FLAG_DELTA_ENABLED; + + hantro_reg_write(ctx->dev, &vp9_filt_level, dec_params->lf.level); + hantro_reg_write(ctx->dev, &g2_out_filtering_dis, dec_params->lf.level == 0); + hantro_reg_write(ctx->dev, &vp9_filt_sharpness, dec_params->lf.sharpness); + + hantro_reg_write(ctx->dev, &vp9_filt_ref_adj_0, d ? dec_params->lf.ref_deltas[0] : 0); + hantro_reg_write(ctx->dev, &vp9_filt_ref_adj_1, d ? dec_params->lf.ref_deltas[1] : 0); + hantro_reg_write(ctx->dev, &vp9_filt_ref_adj_2, d ? dec_params->lf.ref_deltas[2] : 0); + hantro_reg_write(ctx->dev, &vp9_filt_ref_adj_3, d ? dec_params->lf.ref_deltas[3] : 0); + hantro_reg_write(ctx->dev, &vp9_filt_mb_adj_0, d ? dec_params->lf.mode_deltas[0] : 0); + hantro_reg_write(ctx->dev, &vp9_filt_mb_adj_1, d ? 
dec_params->lf.mode_deltas[1] : 0); +} + +static void config_picture_dimensions(struct hantro_ctx *ctx, struct hantro_decoded_buffer *dst) +{ + u32 pic_w_4x4, pic_h_4x4; + + hantro_reg_write(ctx->dev, &g2_pic_width_in_cbs, (dst->vp9.width + 7) / 8); + hantro_reg_write(ctx->dev, &g2_pic_height_in_cbs, (dst->vp9.height + 7) / 8); + pic_w_4x4 = roundup(dst->vp9.width, 8) >> 2; + pic_h_4x4 = roundup(dst->vp9.height, 8) >> 2; + hantro_reg_write(ctx->dev, &g2_pic_width_4x4, pic_w_4x4); + hantro_reg_write(ctx->dev, &g2_pic_height_4x4, pic_h_4x4); +} + +static void +config_bit_depth(struct hantro_ctx *ctx, const struct v4l2_ctrl_vp9_frame *dec_params) +{ + hantro_reg_write(ctx->dev, &g2_bit_depth_y_minus8, dec_params->bit_depth - 8); + hantro_reg_write(ctx->dev, &g2_bit_depth_c_minus8, dec_params->bit_depth - 8); +} + +static inline bool is_lossless(const struct v4l2_vp9_quantization *quant) +{ + return quant->base_q_idx == 0 && quant->delta_q_uv_ac == 0 && + quant->delta_q_uv_dc == 0 && quant->delta_q_y_dc == 0; +} + +static void +config_quant(struct hantro_ctx *ctx, const struct v4l2_ctrl_vp9_frame *dec_params) +{ + hantro_reg_write(ctx->dev, &vp9_qp_delta_y_dc, dec_params->quant.delta_q_y_dc); + hantro_reg_write(ctx->dev, &vp9_qp_delta_ch_dc, dec_params->quant.delta_q_uv_dc); + hantro_reg_write(ctx->dev, &vp9_qp_delta_ch_ac, dec_params->quant.delta_q_uv_ac); + hantro_reg_write(ctx->dev, &vp9_lossless_e, is_lossless(&dec_params->quant)); +} + +static u32 +hantro_interp_filter_from_v4l2(unsigned int interpolation_filter) +{ + switch (interpolation_filter) { + case V4L2_VP9_INTERP_FILTER_EIGHTTAP: + return 0x1; + case V4L2_VP9_INTERP_FILTER_EIGHTTAP_SMOOTH: + return 0; + case V4L2_VP9_INTERP_FILTER_EIGHTTAP_SHARP: + return 0x2; + case V4L2_VP9_INTERP_FILTER_BILINEAR: + return 0x3; + case V4L2_VP9_INTERP_FILTER_SWITCHABLE: + return 0x4; + } + + return 0; +} + +static void +config_others(struct hantro_ctx *ctx, const struct v4l2_ctrl_vp9_frame *dec_params, + bool intra_only, 
bool resolution_change) +{ + struct hantro_vp9_dec_hw_ctx *vp9_ctx = &ctx->vp9_dec; + + hantro_reg_write(ctx->dev, &g2_idr_pic_e, intra_only); + + hantro_reg_write(ctx->dev, &vp9_transform_mode, vp9_ctx->cur.tx_mode); + + hantro_reg_write(ctx->dev, &vp9_mcomp_filt_type, intra_only ? + 0 : hantro_interp_filter_from_v4l2(dec_params->interpolation_filter)); + + hantro_reg_write(ctx->dev, &vp9_high_prec_mv_e, + !!(dec_params->flags & V4L2_VP9_FRAME_FLAG_ALLOW_HIGH_PREC_MV)); + + hantro_reg_write(ctx->dev, &vp9_comp_pred_mode, dec_params->reference_mode); + + hantro_reg_write(ctx->dev, &g2_tempor_mvp_e, + !(dec_params->flags & V4L2_VP9_FRAME_FLAG_ERROR_RESILIENT) && + !(dec_params->flags & V4L2_VP9_FRAME_FLAG_KEY_FRAME) && + !(vp9_ctx->last.flags & V4L2_VP9_FRAME_FLAG_KEY_FRAME) && + !(dec_params->flags & V4L2_VP9_FRAME_FLAG_INTRA_ONLY) && + !resolution_change && + vp9_ctx->last.flags & V4L2_VP9_FRAME_FLAG_SHOW_FRAME + ); + + hantro_reg_write(ctx->dev, &g2_write_mvs_e, + !(dec_params->flags & V4L2_VP9_FRAME_FLAG_KEY_FRAME)); +} + +static void +config_compound_reference(struct hantro_ctx *ctx, + const struct v4l2_ctrl_vp9_frame *dec_params) +{ + u32 comp_fixed_ref, comp_var_ref[2]; + bool last_ref_frame_sign_bias; + bool golden_ref_frame_sign_bias; + bool alt_ref_frame_sign_bias; + bool comp_ref_allowed = 0; + + comp_fixed_ref = 0; + comp_var_ref[0] = 0; + comp_var_ref[1] = 0; + + last_ref_frame_sign_bias = dec_params->ref_frame_sign_bias & V4L2_VP9_SIGN_BIAS_LAST; + golden_ref_frame_sign_bias = dec_params->ref_frame_sign_bias & V4L2_VP9_SIGN_BIAS_GOLDEN; + alt_ref_frame_sign_bias = dec_params->ref_frame_sign_bias & V4L2_VP9_SIGN_BIAS_ALT; + + /* 6.3.12 Frame reference mode syntax */ + comp_ref_allowed |= golden_ref_frame_sign_bias != last_ref_frame_sign_bias; + comp_ref_allowed |= alt_ref_frame_sign_bias != last_ref_frame_sign_bias; + + if (comp_ref_allowed) { + if (last_ref_frame_sign_bias == + golden_ref_frame_sign_bias) { + comp_fixed_ref = ALTREF_FRAME; + 
comp_var_ref[0] = LAST_FRAME; + comp_var_ref[1] = GOLDEN_FRAME; + } else if (last_ref_frame_sign_bias == + alt_ref_frame_sign_bias) { + comp_fixed_ref = GOLDEN_FRAME; + comp_var_ref[0] = LAST_FRAME; + comp_var_ref[1] = ALTREF_FRAME; + } else { + comp_fixed_ref = LAST_FRAME; + comp_var_ref[0] = GOLDEN_FRAME; + comp_var_ref[1] = ALTREF_FRAME; + } + } + + hantro_reg_write(ctx->dev, &vp9_comp_pred_fixed_ref, comp_fixed_ref); + hantro_reg_write(ctx->dev, &vp9_comp_pred_var_ref0, comp_var_ref[0]); + hantro_reg_write(ctx->dev, &vp9_comp_pred_var_ref1, comp_var_ref[1]); +} + +#define INNER_LOOP \ +do { \ + for (m = 0; m < ARRAY_SIZE(adaptive->coef[0][0][0][0]); ++m) { \ + memcpy(adaptive->coef[i][j][k][l][m], \ + probs->coef[i][j][k][l][m], \ + sizeof(probs->coef[i][j][k][l][m])); \ + \ + adaptive->coef[i][j][k][l][m][3] = 0; \ + } \ +} while (0) + +static void config_probs(struct hantro_ctx *ctx, const struct v4l2_ctrl_vp9_frame *dec_params) +{ + struct hantro_vp9_dec_hw_ctx *vp9_ctx = &ctx->vp9_dec; + struct hantro_aux_buf *misc = &vp9_ctx->misc; + struct hantro_g2_all_probs *all_probs = misc->cpu; + struct hantro_g2_probs *adaptive; + struct hantro_g2_mv_probs *mv; + const struct v4l2_vp9_segmentation *seg = &dec_params->seg; + const struct v4l2_vp9_frame_context *probs = &vp9_ctx->probability_tables; + int i, j, k, l, m; + + for (i = 0; i < ARRAY_SIZE(all_probs->kf_y_mode_prob); ++i) + for (j = 0; j < ARRAY_SIZE(all_probs->kf_y_mode_prob[0]); ++j) { + memcpy(all_probs->kf_y_mode_prob[i][j], + v4l2_vp9_kf_y_mode_prob[i][j], + ARRAY_SIZE(all_probs->kf_y_mode_prob[i][j])); + + all_probs->kf_y_mode_prob_tail[i][j][0] = + v4l2_vp9_kf_y_mode_prob[i][j][8]; + } + + memcpy(all_probs->mb_segment_tree_probs, seg->tree_probs, + sizeof(all_probs->mb_segment_tree_probs)); + + memcpy(all_probs->segment_pred_probs, seg->pred_probs, + sizeof(all_probs->segment_pred_probs)); + + for (i = 0; i < ARRAY_SIZE(all_probs->kf_uv_mode_prob); ++i) { + memcpy(all_probs->kf_uv_mode_prob[i], 
v4l2_vp9_kf_uv_mode_prob[i], + ARRAY_SIZE(all_probs->kf_uv_mode_prob[i])); + + all_probs->kf_uv_mode_prob_tail[i][0] = v4l2_vp9_kf_uv_mode_prob[i][8]; + } + + adaptive = &all_probs->probs; + + for (i = 0; i < ARRAY_SIZE(adaptive->inter_mode); ++i) { + memcpy(adaptive->inter_mode[i], probs->inter_mode[i], + sizeof(probs->inter_mode[i])); + + adaptive->inter_mode[i][3] = 0; + } + + memcpy(adaptive->is_inter, probs->is_inter, sizeof(adaptive->is_inter)); + + for (i = 0; i < ARRAY_SIZE(adaptive->uv_mode); ++i) { + memcpy(adaptive->uv_mode[i], probs->uv_mode[i], + sizeof(adaptive->uv_mode[i])); + adaptive->uv_mode_tail[i][0] = probs->uv_mode[i][8]; + } + + memcpy(adaptive->tx8, probs->tx8, sizeof(adaptive->tx8)); + memcpy(adaptive->tx16, probs->tx16, sizeof(adaptive->tx16)); + memcpy(adaptive->tx32, probs->tx32, sizeof(adaptive->tx32)); + + for (i = 0; i < ARRAY_SIZE(adaptive->y_mode); ++i) { + memcpy(adaptive->y_mode[i], probs->y_mode[i], + ARRAY_SIZE(adaptive->y_mode[i])); + + adaptive->y_mode_tail[i][0] = probs->y_mode[i][8]; + } + + for (i = 0; i < ARRAY_SIZE(adaptive->partition[0]); ++i) { + memcpy(adaptive->partition[0][i], v4l2_vp9_kf_partition_probs[i], + sizeof(v4l2_vp9_kf_partition_probs[i])); + + adaptive->partition[0][i][3] = 0; + } + + for (i = 0; i < ARRAY_SIZE(adaptive->partition[1]); ++i) { + memcpy(adaptive->partition[1][i], probs->partition[i], + sizeof(probs->partition[i])); + + adaptive->partition[1][i][3] = 0; + } + + memcpy(adaptive->interp_filter, probs->interp_filter, + sizeof(adaptive->interp_filter)); + + memcpy(adaptive->comp_mode, probs->comp_mode, sizeof(adaptive->comp_mode)); + + memcpy(adaptive->skip, probs->skip, sizeof(adaptive->skip)); + + mv = &adaptive->mv; + + memcpy(mv->joint, probs->mv.joint, sizeof(mv->joint)); + memcpy(mv->sign, probs->mv.sign, sizeof(mv->sign)); + memcpy(mv->class0_bit, probs->mv.class0_bit, sizeof(mv->class0_bit)); + memcpy(mv->fr, probs->mv.fr, sizeof(mv->fr)); + memcpy(mv->class0_hp, probs->mv.class0_hp, 
sizeof(mv->class0_hp)); + memcpy(mv->hp, probs->mv.hp, sizeof(mv->hp)); + memcpy(mv->classes, probs->mv.classes, sizeof(mv->classes)); + memcpy(mv->class0_fr, probs->mv.class0_fr, sizeof(mv->class0_fr)); + memcpy(mv->bits, probs->mv.bits, sizeof(mv->bits)); + + memcpy(adaptive->single_ref, probs->single_ref, sizeof(adaptive->single_ref)); + + memcpy(adaptive->comp_ref, probs->comp_ref, sizeof(adaptive->comp_ref)); + + for (i = 0; i < ARRAY_SIZE(adaptive->coef); ++i) + for (j = 0; j < ARRAY_SIZE(adaptive->coef[0]); ++j) + for (k = 0; k < ARRAY_SIZE(adaptive->coef[0][0]); ++k) + for (l = 0; l < ARRAY_SIZE(adaptive->coef[0][0][0]); ++l) + INNER_LOOP; + + hantro_write_addr(ctx->dev, VP9_ADDR_PROBS, misc->dma); +} + +static void config_counts(struct hantro_ctx *ctx) +{ + struct hantro_vp9_dec_hw_ctx *vp9_dec = &ctx->vp9_dec; + struct hantro_aux_buf *misc = &vp9_dec->misc; + dma_addr_t addr = misc->dma + vp9_dec->ctx_counters_offset; + + hantro_write_addr(ctx->dev, VP9_ADDR_CTR, addr); +} + +static void config_seg_map(struct hantro_ctx *ctx, + const struct v4l2_ctrl_vp9_frame *dec_params, + bool intra_only, bool update_map) +{ + struct hantro_vp9_dec_hw_ctx *vp9_ctx = &ctx->vp9_dec; + struct hantro_aux_buf *segment_map = &vp9_ctx->segment_map; + dma_addr_t addr; + + if (intra_only || + (dec_params->flags & V4L2_VP9_FRAME_FLAG_ERROR_RESILIENT)) { + memset(segment_map->cpu, 0, segment_map->size); + memset(vp9_ctx->feature_data, 0, sizeof(vp9_ctx->feature_data)); + memset(vp9_ctx->feature_enabled, 0, sizeof(vp9_ctx->feature_enabled)); + } + + addr = segment_map->dma + vp9_ctx->active_segment * vp9_ctx->segment_map_size; + hantro_write_addr(ctx->dev, VP9_ADDR_SEGMENT_READ, addr); + + addr = segment_map->dma + (1 - vp9_ctx->active_segment) * vp9_ctx->segment_map_size; + hantro_write_addr(ctx->dev, VP9_ADDR_SEGMENT_WRITE, addr); + + if (update_map) + vp9_ctx->active_segment = 1 - vp9_ctx->active_segment; +} + +static void +config_source(struct hantro_ctx *ctx, const struct 
v4l2_ctrl_vp9_frame *dec_params, + struct vb2_v4l2_buffer *vb2_src) +{ + dma_addr_t stream_base, tmp_addr; + unsigned int headres_size; + u32 src_len, start_bit, src_buf_len; + + headres_size = dec_params->uncompressed_header_size + + dec_params->compressed_header_size; + + stream_base = vb2_dma_contig_plane_dma_addr(&vb2_src->vb2_buf, 0); + hantro_write_addr(ctx->dev, G2_ADDR_STR, stream_base); + + tmp_addr = stream_base + headres_size; + start_bit = (tmp_addr & 0xf) * 8; + hantro_reg_write(ctx->dev, &g2_start_bit, start_bit); + + src_len = vb2_get_plane_payload(&vb2_src->vb2_buf, 0); + src_len += start_bit / 8 - headres_size; + hantro_reg_write(ctx->dev, &g2_stream_len, src_len); + + tmp_addr &= ~0xf; + hantro_reg_write(ctx->dev, &g2_strm_start_offset, tmp_addr - stream_base); + src_buf_len = vb2_plane_size(&vb2_src->vb2_buf, 0); + hantro_reg_write(ctx->dev, &g2_strm_buffer_len, src_buf_len); +} + +static void +config_registers(struct hantro_ctx *ctx, const struct v4l2_ctrl_vp9_frame *dec_params, + struct vb2_v4l2_buffer *vb2_src, struct vb2_v4l2_buffer *vb2_dst) +{ + struct hantro_decoded_buffer *dst, *last, *mv_ref; + struct hantro_vp9_dec_hw_ctx *vp9_ctx = &ctx->vp9_dec; + const struct v4l2_vp9_segmentation *seg; + bool intra_only, resolution_change; + + /* vp9 stuff */ + dst = vb2_to_hantro_decoded_buf(&vb2_dst->vb2_buf); + + if (vp9_ctx->last.valid) + last = get_ref_buf(ctx, &dst->base.vb, vp9_ctx->last.timestamp); + else + last = dst; + + update_dec_buf_info(dst, dec_params); + update_ctx_cur_info(vp9_ctx, dst, dec_params); + seg = &dec_params->seg; + + intra_only = !!(dec_params->flags & + (V4L2_VP9_FRAME_FLAG_KEY_FRAME | + V4L2_VP9_FRAME_FLAG_INTRA_ONLY)); + + if (!intra_only && + !(dec_params->flags & V4L2_VP9_FRAME_FLAG_ERROR_RESILIENT) && + vp9_ctx->last.valid) + mv_ref = last; + else + mv_ref = dst; + + resolution_change = dst->vp9.width != last->vp9.width || + dst->vp9.height != last->vp9.height; + + /* configure basic registers */ + 
hantro_reg_write(ctx->dev, &g2_mode, VP9_DEC_MODE); + hantro_reg_write(ctx->dev, &g2_strm_swap, 0xf); + hantro_reg_write(ctx->dev, &g2_dirmv_swap, 0xf); + hantro_reg_write(ctx->dev, &g2_compress_swap, 0xf); + hantro_reg_write(ctx->dev, &g2_buswidth, BUS_WIDTH_128); + hantro_reg_write(ctx->dev, &g2_max_burst, 16); + hantro_reg_write(ctx->dev, &g2_apf_threshold, 8); + hantro_reg_write(ctx->dev, &g2_ref_compress_bypass, 1); + hantro_reg_write(ctx->dev, &g2_clk_gate_e, 1); + hantro_reg_write(ctx->dev, &g2_max_cb_size, 6); + hantro_reg_write(ctx->dev, &g2_min_cb_size, 3); + + config_output(ctx, dst, dec_params); + + if (!intra_only) + config_ref_registers(ctx, dec_params, dst, mv_ref); + + config_tiles(ctx, dec_params, dst); + config_segment(ctx, dec_params); + config_loop_filter(ctx, dec_params); + config_picture_dimensions(ctx, dst); + config_bit_depth(ctx, dec_params); + config_quant(ctx, dec_params); + config_others(ctx, dec_params, intra_only, resolution_change); + config_compound_reference(ctx, dec_params); + config_probs(ctx, dec_params); + config_counts(ctx); + config_seg_map(ctx, dec_params, intra_only, + seg->flags & V4L2_VP9_SEGMENTATION_FLAG_UPDATE_MAP); + config_source(ctx, dec_params, vb2_src); +} + +int hantro_g2_vp9_dec_run(struct hantro_ctx *ctx) +{ + const struct v4l2_ctrl_vp9_frame *decode_params; + struct vb2_v4l2_buffer *src; + struct vb2_v4l2_buffer *dst; + int ret; + + hantro_g2_check_idle(ctx->dev); + + ret = start_prepare_run(ctx, &decode_params); + if (ret) { + hantro_end_prepare_run(ctx); + return ret; + } + + src = hantro_get_src_buf(ctx); + dst = hantro_get_dst_buf(ctx); + + config_registers(ctx, decode_params, src, dst); + + hantro_end_prepare_run(ctx); + + vdpu_write(ctx->dev, G2_REG_INTERRUPT_DEC_E, G2_REG_INTERRUPT); + + return 0; +} + +#define copy_tx_and_skip(p1, p2) \ +do { \ + memcpy((p1)->tx8, (p2)->tx8, sizeof((p1)->tx8)); \ + memcpy((p1)->tx16, (p2)->tx16, sizeof((p1)->tx16)); \ + memcpy((p1)->tx32, (p2)->tx32, 
sizeof((p1)->tx32)); \ + memcpy((p1)->skip, (p2)->skip, sizeof((p1)->skip)); \ +} while (0) + +void hantro_g2_vp9_dec_done(struct hantro_ctx *ctx) +{ + struct hantro_vp9_dec_hw_ctx *vp9_ctx = &ctx->vp9_dec; + unsigned int fctx_idx; + + if (!(vp9_ctx->cur.flags & V4L2_VP9_FRAME_FLAG_REFRESH_FRAME_CTX)) + goto out_update_last; + + fctx_idx = vp9_ctx->cur.frame_context_idx; + + if (!(vp9_ctx->cur.flags & V4L2_VP9_FRAME_FLAG_PARALLEL_DEC_MODE)) { + /* error_resilient_mode == 0 && frame_parallel_decoding_mode == 0 */ + struct v4l2_vp9_frame_context *probs = &vp9_ctx->probability_tables; + bool frame_is_intra = vp9_ctx->cur.flags & + (V4L2_VP9_FRAME_FLAG_KEY_FRAME | V4L2_VP9_FRAME_FLAG_INTRA_ONLY); + struct tx_and_skip { + u8 tx8[2][1]; + u8 tx16[2][2]; + u8 tx32[2][3]; + u8 skip[3]; + } _tx_skip, *tx_skip = &_tx_skip; + struct v4l2_vp9_frame_symbol_counts *counts; + struct symbol_counts *hantro_cnts; + u32 tx16p[2][4]; + int i; + + /* buffer the forward-updated TX and skip probs */ + if (frame_is_intra) + copy_tx_and_skip(tx_skip, probs); + + /* 6.1.2 refresh_probs(): load_probs() and load_probs2() */ + *probs = vp9_ctx->frame_context[fctx_idx]; + + /* if FrameIsIntra then undo the effect of load_probs2() */ + if (frame_is_intra) + copy_tx_and_skip(probs, tx_skip); + + counts = &vp9_ctx->cnts; + hantro_cnts = vp9_ctx->misc.cpu + vp9_ctx->ctx_counters_offset; + for (i = 0; i < ARRAY_SIZE(tx16p); ++i) { + memcpy(tx16p[i], + hantro_cnts->tx16x16_count[i], + sizeof(hantro_cnts->tx16x16_count[0])); + tx16p[i][3] = 0; + } + counts->tx16p = &tx16p; + + v4l2_vp9_adapt_coef_probs(probs, counts, + !vp9_ctx->last.valid || + vp9_ctx->last.flags & V4L2_VP9_FRAME_FLAG_KEY_FRAME, + frame_is_intra); + + if (!frame_is_intra) { + /* load_probs2() already done */ + u32 mv_mode[7][4]; + + for (i = 0; i < ARRAY_SIZE(mv_mode); ++i) { + mv_mode[i][0] = hantro_cnts->inter_mode_counts[i][1][0]; + mv_mode[i][1] = hantro_cnts->inter_mode_counts[i][2][0]; + mv_mode[i][2] = 
hantro_cnts->inter_mode_counts[i][0][0]; + mv_mode[i][3] = hantro_cnts->inter_mode_counts[i][2][1]; + } + counts->mv_mode = &mv_mode; + v4l2_vp9_adapt_noncoef_probs(&vp9_ctx->probability_tables, counts, + vp9_ctx->cur.reference_mode, + vp9_ctx->cur.interpolation_filter, + vp9_ctx->cur.tx_mode, vp9_ctx->cur.flags); + } + } + + vp9_ctx->frame_context[fctx_idx] = vp9_ctx->probability_tables; + +out_update_last: + vp9_ctx->last = vp9_ctx->cur; +} diff --git a/drivers/staging/media/hantro/hantro_hw.h b/drivers/staging/media/hantro/hantro_hw.h index 42b3f3961f75..2961d399fd60 100644 --- a/drivers/staging/media/hantro/hantro_hw.h +++ b/drivers/staging/media/hantro/hantro_hw.h @@ -12,6 +12,7 @@ #include #include #include +#include #include #define DEC_8190_ALIGN_MASK 0x07U @@ -161,6 +162,50 @@ struct hantro_vp8_dec_hw_ctx { struct hantro_aux_buf prob_tbl; }; +struct hantro_vp9_frame_info { + u32 valid : 1; + u32 frame_context_idx : 2; + u32 reference_mode : 2; + u32 tx_mode : 3; + u32 interpolation_filter : 3; + u32 flags; + u64 timestamp; +}; + +#define MAX_SB_COLS 64 +#define MAX_SB_ROWS 34 + +/** + * struct hantro_vp9_dec_hw_ctx + * + */ +struct hantro_vp9_dec_hw_ctx { + struct hantro_aux_buf tile_edge; + struct hantro_aux_buf segment_map; + struct hantro_aux_buf misc; + struct v4l2_vp9_frame_symbol_counts cnts; + struct v4l2_vp9_frame_context probability_tables; + struct v4l2_vp9_frame_context frame_context[4]; + struct hantro_vp9_frame_info cur; + struct hantro_vp9_frame_info last; + + unsigned int bsd_ctrl_offset; + unsigned int segment_map_size; + unsigned int ctx_counters_offset; + unsigned int tile_info_offset; + + unsigned short tile_r_info[MAX_SB_ROWS]; + unsigned short tile_c_info[MAX_SB_COLS]; + unsigned int last_tile_r; + unsigned int last_tile_c; + unsigned int last_sbs_r; + unsigned int last_sbs_c; + + unsigned int active_segment; + u8 feature_enabled[8]; + s16 feature_data[8][4]; +}; + /** * struct hantro_postproc_ctx * @@ -267,6 +312,24 @@ void 
hantro_hevc_ref_remove_unused(struct hantro_ctx *ctx); size_t hantro_hevc_chroma_offset(const struct v4l2_ctrl_hevc_sps *sps); size_t hantro_hevc_motion_vectors_offset(const struct v4l2_ctrl_hevc_sps *sps); +static inline unsigned short hantro_vp9_num_sbs(unsigned short dimension) +{ + return (dimension + 63) / 64; +} + +static inline size_t +hantro_vp9_mv_size(unsigned int width, unsigned int height) +{ + int num_ctbs; + + /* + * There can be up to (CTBs x 64) number of blocks, + * and the motion vector for each block needs 16 bytes. + */ + num_ctbs = hantro_vp9_num_sbs(width) * hantro_vp9_num_sbs(height); + return (num_ctbs * 64) * 16; +} + static inline size_t hantro_h264_mv_size(unsigned int width, unsigned int height) { @@ -308,6 +371,10 @@ void hantro_vp8_dec_exit(struct hantro_ctx *ctx); void hantro_vp8_prob_update(struct hantro_ctx *ctx, const struct v4l2_ctrl_vp8_frame *hdr); +int hantro_g2_vp9_dec_run(struct hantro_ctx *ctx); +void hantro_g2_vp9_dec_done(struct hantro_ctx *ctx); +int hantro_vp9_dec_init(struct hantro_ctx *ctx); +void hantro_vp9_dec_exit(struct hantro_ctx *ctx); void hantro_g2_check_idle(struct hantro_dev *vpu); #endif /* HANTRO_HW_H_ */ diff --git a/drivers/staging/media/hantro/hantro_v4l2.c b/drivers/staging/media/hantro/hantro_v4l2.c index d1f060c55fed..e4b0645ba6fc 100644 --- a/drivers/staging/media/hantro/hantro_v4l2.c +++ b/drivers/staging/media/hantro/hantro_v4l2.c @@ -299,6 +299,11 @@ static int hantro_try_fmt(const struct hantro_ctx *ctx, pix_mp->plane_fmt[0].sizeimage += hantro_h264_mv_size(pix_mp->width, pix_mp->height); + else if (ctx->vpu_src_fmt->fourcc == V4L2_PIX_FMT_VP9_FRAME && + !hantro_needs_postproc(ctx, fmt)) + pix_mp->plane_fmt[0].sizeimage += + hantro_vp9_mv_size(pix_mp->width, + pix_mp->height); } else if (!pix_mp->plane_fmt[0].sizeimage) { /* * For coded formats the application can specify @@ -407,6 +412,7 @@ hantro_update_requires_request(struct hantro_ctx *ctx, u32 fourcc) case V4L2_PIX_FMT_VP8_FRAME: case 
V4L2_PIX_FMT_H264_SLICE: case V4L2_PIX_FMT_HEVC_SLICE: + case V4L2_PIX_FMT_VP9_FRAME: ctx->fh.m2m_ctx->out_q_ctx.q.requires_requests = true; break; default: diff --git a/drivers/staging/media/hantro/hantro_vp9.c b/drivers/staging/media/hantro/hantro_vp9.c new file mode 100644 index 000000000000..566cd376c097 --- /dev/null +++ b/drivers/staging/media/hantro/hantro_vp9.c @@ -0,0 +1,240 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Hantro VP9 codec driver + * + * Copyright (C) 2021 Collabora Ltd. + */ + +#include +#include + +#include "hantro.h" +#include "hantro_hw.h" +#include "hantro_vp9.h" + +#define POW2(x) (1 << (x)) + +#define MAX_LOG2_TILE_COLUMNS 6 +#define MAX_NUM_TILE_COLS POW2(MAX_LOG2_TILE_COLUMNS) +#define MAX_TILE_COLS 20 +#define MAX_TILE_ROWS 22 + +static size_t hantro_vp9_tile_filter_size(unsigned int height) +{ + u32 h, height32, size; + + h = roundup(height, 8); + + height32 = roundup(h, 64); + size = 24 * height32 * (MAX_NUM_TILE_COLS - 1); /* luma: 8, chroma: 8 + 8 */ + + return size; +} + +static size_t hantro_vp9_bsd_control_size(unsigned int height) +{ + u32 h, height32; + + h = roundup(height, 8); + height32 = roundup(h, 64); + + return 16 * (height32 / 4) * (MAX_NUM_TILE_COLS - 1); +} + +static size_t hantro_vp9_segment_map_size(unsigned int width, unsigned int height) +{ + u32 w, h; + int num_ctbs; + + w = roundup(width, 8); + h = roundup(height, 8); + num_ctbs = ((w + 63) / 64) * ((h + 63) / 64); + + return num_ctbs * 32; +} + +static inline size_t hantro_vp9_prob_tab_size(void) +{ + return roundup(sizeof(struct hantro_g2_all_probs), 16); +} + +static inline size_t hantro_vp9_count_tab_size(void) +{ + return roundup(sizeof(struct symbol_counts), 16); +} + +static inline size_t hantro_vp9_tile_info_size(void) +{ + return roundup((MAX_TILE_COLS * MAX_TILE_ROWS * 4 * sizeof(u16) + 15 + 16) & ~0xf, 16); +} + +static void *get_coeffs_arr(struct symbol_counts *cnts, int i, int j, int k, int l, int m) +{ + if (i == 0) + return 
&cnts->count_coeffs[j][k][l][m]; + + if (i == 1) + return &cnts->count_coeffs8x8[j][k][l][m]; + + if (i == 2) + return &cnts->count_coeffs16x16[j][k][l][m]; + + if (i == 3) + return &cnts->count_coeffs32x32[j][k][l][m]; + + return NULL; +} + +static void *get_eobs1(struct symbol_counts *cnts, int i, int j, int k, int l, int m) +{ + if (i == 0) + return &cnts->count_coeffs[j][k][l][m][3]; + + if (i == 1) + return &cnts->count_coeffs8x8[j][k][l][m][3]; + + if (i == 2) + return &cnts->count_coeffs16x16[j][k][l][m][3]; + + if (i == 3) + return &cnts->count_coeffs32x32[j][k][l][m][3]; + + return NULL; +} + +#define INNER_LOOP \ + do { \ + for (m = 0; m < ARRAY_SIZE(vp9_ctx->cnts.coeff[i][0][0][0]); ++m) { \ + vp9_ctx->cnts.coeff[i][j][k][l][m] = \ + get_coeffs_arr(cnts, i, j, k, l, m); \ + vp9_ctx->cnts.eob[i][j][k][l][m][0] = \ + &cnts->count_eobs[i][j][k][l][m]; \ + vp9_ctx->cnts.eob[i][j][k][l][m][1] = \ + get_eobs1(cnts, i, j, k, l, m); \ + } \ + } while (0) + +static void init_v4l2_vp9_count_tbl(struct hantro_ctx *ctx) +{ + struct hantro_vp9_dec_hw_ctx *vp9_ctx = &ctx->vp9_dec; + struct symbol_counts *cnts = vp9_ctx->misc.cpu + vp9_ctx->ctx_counters_offset; + int i, j, k, l, m; + + vp9_ctx->cnts.partition = &cnts->partition_counts; + vp9_ctx->cnts.skip = &cnts->mbskip_count; + vp9_ctx->cnts.intra_inter = &cnts->intra_inter_count; + vp9_ctx->cnts.tx32p = &cnts->tx32x32_count; + /* + * g2 hardware uses tx16x16_count[2][3], while the api + * expects tx16p[2][4], so this must be explicitly copied + * into vp9_ctx->cnts.tx16p when passing the data to the + * vp9 library function + */ + vp9_ctx->cnts.tx8p = &cnts->tx8x8_count; + + vp9_ctx->cnts.y_mode = &cnts->sb_ymode_counts; + vp9_ctx->cnts.uv_mode = &cnts->uv_mode_counts; + vp9_ctx->cnts.comp = &cnts->comp_inter_count; + vp9_ctx->cnts.comp_ref = &cnts->comp_ref_count; + vp9_ctx->cnts.single_ref = &cnts->single_ref_count; + vp9_ctx->cnts.filter = &cnts->switchable_interp_counts; + vp9_ctx->cnts.mv_joint = 
&cnts->mv_counts.joints; + vp9_ctx->cnts.sign = &cnts->mv_counts.sign; + vp9_ctx->cnts.classes = &cnts->mv_counts.classes; + vp9_ctx->cnts.class0 = &cnts->mv_counts.class0; + vp9_ctx->cnts.bits = &cnts->mv_counts.bits; + vp9_ctx->cnts.class0_fp = &cnts->mv_counts.class0_fp; + vp9_ctx->cnts.fp = &cnts->mv_counts.fp; + vp9_ctx->cnts.class0_hp = &cnts->mv_counts.class0_hp; + vp9_ctx->cnts.hp = &cnts->mv_counts.hp; + + for (i = 0; i < ARRAY_SIZE(vp9_ctx->cnts.coeff); ++i) + for (j = 0; j < ARRAY_SIZE(vp9_ctx->cnts.coeff[i]); ++j) + for (k = 0; k < ARRAY_SIZE(vp9_ctx->cnts.coeff[i][0]); ++k) + for (l = 0; l < ARRAY_SIZE(vp9_ctx->cnts.coeff[i][0][0]); ++l) + INNER_LOOP; +} + +int hantro_vp9_dec_init(struct hantro_ctx *ctx) +{ + struct hantro_dev *vpu = ctx->dev; + const struct hantro_variant *variant = vpu->variant; + struct hantro_vp9_dec_hw_ctx *vp9_dec = &ctx->vp9_dec; + struct hantro_aux_buf *tile_edge = &vp9_dec->tile_edge; + struct hantro_aux_buf *segment_map = &vp9_dec->segment_map; + struct hantro_aux_buf *misc = &vp9_dec->misc; + u32 i, max_width, max_height, size; + + if (variant->num_dec_fmts < 1) + return -EINVAL; + + for (i = 0; i < variant->num_dec_fmts; ++i) + if (variant->dec_fmts[i].fourcc == V4L2_PIX_FMT_VP9_FRAME) + break; + + if (i == variant->num_dec_fmts) + return -EINVAL; + + max_width = vpu->variant->dec_fmts[i].frmsize.max_width; + max_height = vpu->variant->dec_fmts[i].frmsize.max_height; + + size = hantro_vp9_tile_filter_size(max_height); + vp9_dec->bsd_ctrl_offset = size; + size += hantro_vp9_bsd_control_size(max_height); + + tile_edge->cpu = dma_alloc_coherent(vpu->dev, size, &tile_edge->dma, GFP_KERNEL); + if (!tile_edge->cpu) + return -ENOMEM; + + tile_edge->size = size; + memset(tile_edge->cpu, 0, size); + + size = hantro_vp9_segment_map_size(max_width, max_height); + vp9_dec->segment_map_size = size; + size *= 2; /* we need two areas of this size, used alternately */ + + segment_map->cpu = dma_alloc_coherent(vpu->dev, size, 
&segment_map->dma, GFP_KERNEL); + if (!segment_map->cpu) + goto err_segment_map; + + segment_map->size = size; + memset(segment_map->cpu, 0, size); + + size = hantro_vp9_prob_tab_size(); + vp9_dec->ctx_counters_offset = size; + size += hantro_vp9_count_tab_size(); + vp9_dec->tile_info_offset = size; + size += hantro_vp9_tile_info_size(); + + misc->cpu = dma_alloc_coherent(vpu->dev, size, &misc->dma, GFP_KERNEL); + if (!misc->cpu) + goto err_misc; + + misc->size = size; + memset(misc->cpu, 0, size); + + init_v4l2_vp9_count_tbl(ctx); + + return 0; + +err_misc: + dma_free_coherent(vpu->dev, segment_map->size, segment_map->cpu, segment_map->dma); + +err_segment_map: + dma_free_coherent(vpu->dev, tile_edge->size, tile_edge->cpu, tile_edge->dma); + + return -ENOMEM; +} + +void hantro_vp9_dec_exit(struct hantro_ctx *ctx) +{ + struct hantro_dev *vpu = ctx->dev; + struct hantro_vp9_dec_hw_ctx *vp9_dec = &ctx->vp9_dec; + struct hantro_aux_buf *tile_edge = &vp9_dec->tile_edge; + struct hantro_aux_buf *segment_map = &vp9_dec->segment_map; + struct hantro_aux_buf *misc = &vp9_dec->misc; + + dma_free_coherent(vpu->dev, misc->size, misc->cpu, misc->dma); + dma_free_coherent(vpu->dev, segment_map->size, segment_map->cpu, segment_map->dma); + dma_free_coherent(vpu->dev, tile_edge->size, tile_edge->cpu, tile_edge->dma); +} diff --git a/drivers/staging/media/hantro/hantro_vp9.h b/drivers/staging/media/hantro/hantro_vp9.h new file mode 100644 index 000000000000..c7f4bd3ff8dd --- /dev/null +++ b/drivers/staging/media/hantro/hantro_vp9.h @@ -0,0 +1,103 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * Hantro VP9 codec driver + * + * Copyright (C) 2021 Collabora Ltd. 
+ */ + +struct hantro_g2_mv_probs { + u8 joint[3]; + u8 sign[2]; + u8 class0_bit[2][1]; + u8 fr[2][3]; + u8 class0_hp[2]; + u8 hp[2]; + u8 classes[2][10]; + u8 class0_fr[2][2][3]; + u8 bits[2][10]; +}; + +struct hantro_g2_probs { + u8 inter_mode[7][4]; + u8 is_inter[4]; + u8 uv_mode[10][8]; + u8 tx8[2][1]; + u8 tx16[2][2]; + u8 tx32[2][3]; + u8 y_mode_tail[4][1]; + u8 y_mode[4][8]; + u8 partition[2][16][4]; /* [keyframe][][], [inter][][] */ + u8 uv_mode_tail[10][1]; + u8 interp_filter[4][2]; + u8 comp_mode[5]; + u8 skip[3]; + + u8 pad1[1]; + + struct hantro_g2_mv_probs mv; + + u8 single_ref[5][2]; + u8 comp_ref[5]; + + u8 pad2[17]; + + u8 coef[4][2][2][6][6][4]; +}; + +struct hantro_g2_all_probs { + u8 kf_y_mode_prob[10][10][8]; + + u8 kf_y_mode_prob_tail[10][10][1]; + u8 ref_pred_probs[3]; + u8 mb_segment_tree_probs[7]; + u8 segment_pred_probs[3]; + u8 ref_scores[4]; + u8 prob_comppred[2]; + + u8 pad1[9]; + + u8 kf_uv_mode_prob[10][8]; + u8 kf_uv_mode_prob_tail[10][1]; + + u8 pad2[6]; + + struct hantro_g2_probs probs; +}; + +struct mv_counts { + u32 joints[4]; + u32 sign[2][2]; + u32 classes[2][11]; + u32 class0[2][2]; + u32 bits[2][10][2]; + u32 class0_fp[2][2][4]; + u32 fp[2][4]; + u32 class0_hp[2][2]; + u32 hp[2][2]; +}; + +struct symbol_counts { + u32 inter_mode_counts[7][3][2]; + u32 sb_ymode_counts[4][10]; + u32 uv_mode_counts[10][10]; + u32 partition_counts[16][4]; + u32 switchable_interp_counts[4][3]; + u32 intra_inter_count[4][2]; + u32 comp_inter_count[5][2]; + u32 single_ref_count[5][2][2]; + u32 comp_ref_count[5][2]; + u32 tx32x32_count[2][4]; + u32 tx16x16_count[2][3]; + u32 tx8x8_count[2][2]; + u32 mbskip_count[3][2]; + + struct mv_counts mv_counts; + + u32 count_coeffs[2][2][6][6][4]; + u32 count_coeffs8x8[2][2][6][6][4]; + u32 count_coeffs16x16[2][2][6][6][4]; + u32 count_coeffs32x32[2][2][6][6][4]; + + u32 count_eobs[4][2][2][6][6]; +}; + diff --git a/drivers/staging/media/hantro/imx8m_vpu_hw.c b/drivers/staging/media/hantro/imx8m_vpu_hw.c index 
a40b161e5956..455a107ffb02 100644 --- a/drivers/staging/media/hantro/imx8m_vpu_hw.c +++ b/drivers/staging/media/hantro/imx8m_vpu_hw.c @@ -150,6 +150,19 @@ static const struct hantro_fmt imx8m_vpu_g2_dec_fmts[] = { .step_height = MB_DIM, }, }, + { + .fourcc = V4L2_PIX_FMT_VP9_FRAME, + .codec_mode = HANTRO_MODE_VP9_DEC, + .max_depth = 2, + .frmsize = { + .min_width = 48, + .max_width = 3840, + .step_width = MB_DIM, + .min_height = 48, + .max_height = 2160, + .step_height = MB_DIM, + }, + }, }; static irqreturn_t imx8m_vpu_g1_irq(int irq, void *dev_id) @@ -241,6 +254,13 @@ static const struct hantro_codec_ops imx8mq_vpu_g2_codec_ops[] = { .init = hantro_hevc_dec_init, .exit = hantro_hevc_dec_exit, }, + [HANTRO_MODE_VP9_DEC] = { + .run = hantro_g2_vp9_dec_run, + .done = hantro_g2_vp9_dec_done, + .reset = imx8m_vpu_g2_reset, + .init = hantro_vp9_dec_init, + .exit = hantro_vp9_dec_exit, + }, }; /* @@ -281,7 +301,7 @@ const struct hantro_variant imx8mq_vpu_g2_variant = { .dec_offset = 0x0, .dec_fmts = imx8m_vpu_g2_dec_fmts, .num_dec_fmts = ARRAY_SIZE(imx8m_vpu_g2_dec_fmts), - .codec = HANTRO_HEVC_DECODER, + .codec = HANTRO_HEVC_DECODER | HANTRO_VP9_DECODER, .codec_ops = imx8mq_vpu_g2_codec_ops, .init = imx8mq_vpu_hw_init, .runtime_resume = imx8mq_runtime_resume, From patchwork Mon Sep 27 15:19:58 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrzej Pietrasiewicz X-Patchwork-Id: 12520187 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 75C77C433EF for ; Mon, 27 Sep 2021 15:20:32 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 6070261157 for ; Mon, 27 Sep 2021 15:20:32 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand 
id S235179AbhI0PWC (ORCPT ); Mon, 27 Sep 2021 11:22:02 -0400 Received: from bhuna.collabora.co.uk ([46.235.227.227]:54214 "EHLO bhuna.collabora.co.uk" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S235176AbhI0PVx (ORCPT ); Mon, 27 Sep 2021 11:21:53 -0400 Received: from [127.0.0.1] (localhost [127.0.0.1]) (Authenticated sender: andrzej.p) with ESMTPSA id 6B0C31F42E99 From: Andrzej Pietrasiewicz To: linux-media@vger.kernel.org, linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org, linux-rockchip@lists.infradead.org, linux-staging@lists.linux.dev Cc: Andrzej Pietrasiewicz , Benjamin Gaignard , Boris Brezillon , Ezequiel Garcia , Fabio Estevam , Greg Kroah-Hartman , Hans Verkuil , Heiko Stuebner , Jernej Skrabec , Mauro Carvalho Chehab , Nicolas Dufresne , NXP Linux Team , Pengutronix Kernel Team , Philipp Zabel , Sascha Hauer , Shawn Guo , kernel@collabora.com, Ezequiel Garcia Subject: [PATCH v6 10/10] media: hantro: Support NV12 on the G2 core Date: Mon, 27 Sep 2021 17:19:58 +0200 Message-Id: <20210927151958.24426-11-andrzej.p@collabora.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20210927151958.24426-1-andrzej.p@collabora.com> References: <20210927151958.24426-1-andrzej.p@collabora.com> Precedence: bulk List-ID: X-Mailing-List: linux-media@vger.kernel.org From: Ezequiel Garcia The G2 decoder block produces the NV12 4x4 tiled format (NV12_4L4). Enable the G2 post-processor block in order to produce regular NV12. The logic in hantro_postproc.c is leveraged to take care of allocating the extra buffers and configuring the post-processor, which is significantly simpler than the one on the G1.
Signed-off-by: Ezequiel Garcia Signed-off-by: Andrzej Pietrasiewicz --- .../staging/media/hantro/hantro_g2_vp9_dec.c | 6 ++-- drivers/staging/media/hantro/hantro_hw.h | 1 + .../staging/media/hantro/hantro_postproc.c | 31 +++++++++++++++++++ drivers/staging/media/hantro/imx8m_vpu_hw.c | 11 +++++++ 4 files changed, 46 insertions(+), 3 deletions(-) diff --git a/drivers/staging/media/hantro/hantro_g2_vp9_dec.c b/drivers/staging/media/hantro/hantro_g2_vp9_dec.c index f1b207666fa7..c44e668d075a 100644 --- a/drivers/staging/media/hantro/hantro_g2_vp9_dec.c +++ b/drivers/staging/media/hantro/hantro_g2_vp9_dec.c @@ -152,7 +152,7 @@ static void config_output(struct hantro_ctx *ctx, hantro_reg_write(ctx->dev, &g2_out_dis, 0); hantro_reg_write(ctx->dev, &g2_output_format, 0); - luma_addr = vb2_dma_contig_plane_dma_addr(&dst->base.vb.vb2_buf, 0); + luma_addr = hantro_get_dec_buf_addr(ctx, &dst->base.vb.vb2_buf); hantro_write_addr(ctx->dev, G2_ADDR_DST, luma_addr); chroma_addr = luma_addr + chroma_offset(ctx, dec_params); @@ -191,7 +191,7 @@ static void config_ref(struct hantro_ctx *ctx, hantro_reg_write(ctx->dev, &ref_reg->hor_scale, (refw << 14) / dst->vp9.width); hantro_reg_write(ctx->dev, &ref_reg->ver_scale, (refh << 14) / dst->vp9.height); - luma_addr = vb2_dma_contig_plane_dma_addr(&buf->base.vb.vb2_buf, 0); + luma_addr = hantro_get_dec_buf_addr(ctx, &buf->base.vb.vb2_buf); hantro_write_addr(ctx->dev, ref_reg->y_base, luma_addr); chroma_addr = luma_addr + chroma_offset(ctx, dec_params); @@ -236,7 +236,7 @@ static void config_ref_registers(struct hantro_ctx *ctx, config_ref(ctx, dst, &ref_regs[1], dec_params, dec_params->golden_frame_ts); config_ref(ctx, dst, &ref_regs[2], dec_params, dec_params->alt_frame_ts); - mv_addr = vb2_dma_contig_plane_dma_addr(&mv_ref->base.vb.vb2_buf, 0) + + mv_addr = hantro_get_dec_buf_addr(ctx, &mv_ref->base.vb.vb2_buf) + mv_offset(ctx, dec_params); hantro_write_addr(ctx->dev, G2_REG_DMV_REF(0), mv_addr); diff --git 
a/drivers/staging/media/hantro/hantro_hw.h b/drivers/staging/media/hantro/hantro_hw.h index 2961d399fd60..3d4a5dc1e6d5 100644 --- a/drivers/staging/media/hantro/hantro_hw.h +++ b/drivers/staging/media/hantro/hantro_hw.h @@ -274,6 +274,7 @@ extern const struct hantro_variant rk3399_vpu_variant; extern const struct hantro_variant sama5d4_vdec_variant; extern const struct hantro_postproc_ops hantro_g1_postproc_ops; +extern const struct hantro_postproc_ops hantro_g2_postproc_ops; extern const u32 hantro_vp8_dec_mc_filter[8][6]; diff --git a/drivers/staging/media/hantro/hantro_postproc.c b/drivers/staging/media/hantro/hantro_postproc.c index 4549aec08feb..bc94bf46d218 100644 --- a/drivers/staging/media/hantro/hantro_postproc.c +++ b/drivers/staging/media/hantro/hantro_postproc.c @@ -11,6 +11,7 @@ #include "hantro.h" #include "hantro_hw.h" #include "hantro_g1_regs.h" +#include "hantro_g2_regs.h" #define HANTRO_PP_REG_WRITE(vpu, reg_name, val) \ { \ @@ -99,6 +100,21 @@ static void hantro_postproc_g1_enable(struct hantro_ctx *ctx) HANTRO_PP_REG_WRITE(vpu, display_width, ctx->dst_fmt.width); } +static void hantro_postproc_g2_enable(struct hantro_ctx *ctx) +{ + struct hantro_dev *vpu = ctx->dev; + struct vb2_v4l2_buffer *dst_buf; + size_t chroma_offset = ctx->dst_fmt.width * ctx->dst_fmt.height; + dma_addr_t dst_dma; + + dst_buf = hantro_get_dst_buf(ctx); + dst_dma = vb2_dma_contig_plane_dma_addr(&dst_buf->vb2_buf, 0); + + hantro_write_addr(vpu, G2_RASTER_SCAN, dst_dma); + hantro_write_addr(vpu, G2_RASTER_SCAN_CHR, dst_dma + chroma_offset); + hantro_reg_write(vpu, &g2_out_rs_e, 1); +} + void hantro_postproc_free(struct hantro_ctx *ctx) { struct hantro_dev *vpu = ctx->dev; @@ -127,6 +143,9 @@ int hantro_postproc_alloc(struct hantro_ctx *ctx) if (ctx->vpu_src_fmt->fourcc == V4L2_PIX_FMT_H264_SLICE) buf_size += hantro_h264_mv_size(ctx->dst_fmt.width, ctx->dst_fmt.height); + else if (ctx->vpu_src_fmt->fourcc == V4L2_PIX_FMT_VP9_FRAME) + buf_size += 
hantro_vp9_mv_size(ctx->dst_fmt.width, + ctx->dst_fmt.height); for (i = 0; i < num_buffers; ++i) { struct hantro_aux_buf *priv = &ctx->postproc.dec_q[i]; @@ -152,6 +171,13 @@ static void hantro_postproc_g1_disable(struct hantro_ctx *ctx) HANTRO_PP_REG_WRITE_S(vpu, pipeline_en, 0x0); } +static void hantro_postproc_g2_disable(struct hantro_ctx *ctx) +{ + struct hantro_dev *vpu = ctx->dev; + + hantro_reg_write(vpu, &g2_out_rs_e, 0); +} + void hantro_postproc_disable(struct hantro_ctx *ctx) { struct hantro_dev *vpu = ctx->dev; @@ -172,3 +198,8 @@ const struct hantro_postproc_ops hantro_g1_postproc_ops = { .enable = hantro_postproc_g1_enable, .disable = hantro_postproc_g1_disable, }; + +const struct hantro_postproc_ops hantro_g2_postproc_ops = { + .enable = hantro_postproc_g2_enable, + .disable = hantro_postproc_g2_disable, +}; diff --git a/drivers/staging/media/hantro/imx8m_vpu_hw.c b/drivers/staging/media/hantro/imx8m_vpu_hw.c index 455a107ffb02..1a43f6fceef9 100644 --- a/drivers/staging/media/hantro/imx8m_vpu_hw.c +++ b/drivers/staging/media/hantro/imx8m_vpu_hw.c @@ -132,6 +132,14 @@ static const struct hantro_fmt imx8m_vpu_dec_fmts[] = { }, }; +static const struct hantro_fmt imx8m_vpu_g2_postproc_fmts[] = { + { + .fourcc = V4L2_PIX_FMT_NV12, + .codec_mode = HANTRO_MODE_NONE, + .postprocessed = true, + }, +}; + static const struct hantro_fmt imx8m_vpu_g2_dec_fmts[] = { { .fourcc = V4L2_PIX_FMT_NV12_4L4, @@ -301,6 +309,9 @@ const struct hantro_variant imx8mq_vpu_g2_variant = { .dec_offset = 0x0, .dec_fmts = imx8m_vpu_g2_dec_fmts, .num_dec_fmts = ARRAY_SIZE(imx8m_vpu_g2_dec_fmts), + .postproc_fmts = imx8m_vpu_g2_postproc_fmts, + .num_postproc_fmts = ARRAY_SIZE(imx8m_vpu_g2_postproc_fmts), + .postproc_ops = &hantro_g2_postproc_ops, .codec = HANTRO_HEVC_DECODER | HANTRO_VP9_DECODER, .codec_ops = imx8mq_vpu_g2_codec_ops, .init = imx8mq_vpu_hw_init,
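Note (not part of the patch): hantro_postproc_g2_enable() above programs the raster-scan chroma address as dst_dma plus width * height. The following is a rough userspace sketch of that NV12 address arithmetic, assuming 8-bit samples and tightly packed planes (stride equal to width). The names struct nv12_layout and nv12_layout_from_base() are hypothetical and exist only for illustration; they are not hantro or V4L2 symbols.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper type for illustration; not a hantro or V4L2 structure. */
struct nv12_layout {
	uint64_t luma_addr;	/* start of the Y plane */
	uint64_t chroma_addr;	/* start of the interleaved CbCr plane */
	size_t size;		/* total bytes needed for one frame */
};

/*
 * Compute plane addresses for an 8-bit NV12 frame stored contiguously at
 * 'base', assuming stride == width (an assumption of this sketch).
 */
static struct nv12_layout nv12_layout_from_base(uint64_t base,
						unsigned int width,
						unsigned int height)
{
	struct nv12_layout l;
	size_t luma_size;

	/* 4:2:0 subsampling needs even dimensions. */
	assert(width % 2 == 0 && height % 2 == 0);

	luma_size = (size_t)width * height;

	l.luma_addr = base;
	/* Same arithmetic as dst_dma + chroma_offset in the patch. */
	l.chroma_addr = base + luma_size;
	/* The CbCr plane holds half as many bytes as the Y plane. */
	l.size = luma_size + luma_size / 2;

	return l;
}
```

For a 64x48 frame at base 0x1000, the chroma plane would start 3072 bytes after the luma plane under these assumptions. A real driver should take strides and sizes from the negotiated format rather than recomputing them this way.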