From patchwork Thu Dec 2 18:51:42 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Benjamin Gaignard X-Patchwork-Id: 12653291 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 8F303C433EF for ; Thu, 2 Dec 2021 18:52:05 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20210309; h=Sender: Content-Transfer-Encoding:Content-Type:List-Subscribe:List-Help:List-Post: List-Archive:List-Unsubscribe:List-Id:MIME-Version:Message-Id:Date:Subject:Cc :To:From:Reply-To:Content-ID:Content-Description:Resent-Date:Resent-From: Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID:In-Reply-To:References: List-Owner; bh=E2QQP9xF2kpdsAuaf1qWMwbCQYXVr0oxEUmp43KOGWM=; b=0FA9uQrdBRxyNv VhblWmi+nqmmJ0Z+JWt/BS9w5rhxqipDM2X0joSW6Mi7GVGdtnhLxriQy4rSjHNA+XfTt6x792Slm noIq8Hvw9F6CfhnFqiaOWMHVWhn2sKn/pfbbOlG7U/jIrMNKFL+W52VU/3NdQdbklbtCSzP1ZqgWM 9zooSeBhb1OgSw9zikLIGR+7T3gsbfieszMbLjbwz8z0w8uGHnFy8KU0XgFLZbLj1L2qzZby3ytb/ CmLy9zbel4uuBvzcqO39bDv+q/oWPjrz6XwA5VYLEPbZ4PXlaIfKqQfU7JaDltO8GT+95sT7gDt/x 4IpOcMCUkY+f7/4URP6A==; Received: from localhost ([::1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.94.2 #2 (Red Hat Linux)) id 1msrBY-00DIci-7V; Thu, 02 Dec 2021 18:52:00 +0000 Received: from bhuna.collabora.co.uk ([46.235.227.227]) by bombadil.infradead.org with esmtps (Exim 4.94.2 #2 (Red Hat Linux)) id 1msrBR-00DIbV-DF for linux-rockchip@lists.infradead.org; Thu, 02 Dec 2021 18:51:55 +0000 Received: from benjamin-XPS-13-9310.. (unknown [IPv6:2a01:e0a:120:3210:f817:8a6c:e399:d0b]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) (Authenticated sender: benjamin.gaignard) by bhuna.collabora.co.uk (Postfix) with ESMTPSA id 013BF1F41610; Thu, 2 Dec 2021 18:51:49 +0000 (GMT) DKIM-Signature: v=1; a=rsa-sha256; c=simple/simple; d=collabora.com; s=mail; t=1638471110; bh=JmK3+MmTqcbT43Glx6RigkZKrI+vedLNGoPiZIBqnX0=; h=From:To:Cc:Subject:Date:From; b=TOJvG5tZCGqmAmshIyx7upsCS8FZvPfNWxtQHvGOS70lFhhf/jAn6FwjeDLNwRE9C Av010T4XH++/uetz0iObCoR9eowIfKhpM511K03zk4WrpzM9RM5POoNIMLO4wjuDQ7 Ut0T9dCwJZH165RY0TBI4npHfjZsuC6Gm38zjTlG6suDH165p+7VB130tlF+IlDIY2 WjCT5G8IjO6oTZRrkKz3QsXHvAd7Uvp792W+Aud4qLW41WTIWBPnMuSNWMaU96lrpd lwDTli5o57tUGdZjh//xIMZs6ObOUOWz7is5tHYOS6g0DDf522KvTSCH7HrVXggOJV 6dI27lMzUNFBQ== From: Benjamin Gaignard To: ezequiel@vanguardiasur.com.ar, p.zabel@pengutronix.de, mchehab@kernel.org, gregkh@linuxfoundation.org Cc: linux-media@vger.kernel.org, linux-rockchip@lists.infradead.org, linux-staging@lists.linux.dev, linux-kernel@vger.kernel.org, kernel@collabora.com, Benjamin Gaignard Subject: [PATCH v4] media: hantro: Make G2/HEVC use hantro_postproc_ops Date: Thu, 2 Dec 2021 19:51:42 +0100 Message-Id: <20211202185142.886814-1-benjamin.gaignard@collabora.com> X-Mailer: git-send-email 2.30.2 MIME-Version: 1.0 X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20211202_105153_804254_62D124B5 X-CRM114-Status: GOOD ( 21.67 ) X-BeenThere: linux-rockchip@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: Upstream kernel work for Rockchip platforms List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: "Linux-rockchip" Errors-To: linux-rockchip-bounces+linux-rockchip=archiver.kernel.org@lists.infradead.org Use the postprocessor functions introduced by Hantro G2/VP9 codec series and remove duplicated buffer management. This allow Hantro G2/HEVC to produce NV12_4L4 and NV12. Fluster scores are 77/147 for HEVC and 129/303 for VP9 (no regression). Beauty, Jockey and ShakeNDry bitstreams from UVG (http://ultravideo.fi/) set have also been tested. Signed-off-by: Benjamin Gaignard --- v4: - Add fluster VP9 scores. change since v1: - Reword commit message. .../staging/media/hantro/hantro_g2_hevc_dec.c | 25 +++--- drivers/staging/media/hantro/hantro_hevc.c | 79 +++---------------- drivers/staging/media/hantro/hantro_hw.h | 11 +++ .../staging/media/hantro/hantro_postproc.c | 3 + drivers/staging/media/hantro/hantro_v4l2.c | 19 ++--- 5 files changed, 41 insertions(+), 96 deletions(-) diff --git a/drivers/staging/media/hantro/hantro_g2_hevc_dec.c b/drivers/staging/media/hantro/hantro_g2_hevc_dec.c index b35f36109a6f..6ef5430b18eb 100644 --- a/drivers/staging/media/hantro/hantro_g2_hevc_dec.c +++ b/drivers/staging/media/hantro/hantro_g2_hevc_dec.c @@ -341,6 +341,8 @@ static int set_ref(struct hantro_ctx *ctx) const struct v4l2_hevc_dpb_entry *dpb = decode_params->dpb; dma_addr_t luma_addr, chroma_addr, mv_addr = 0; struct hantro_dev *vpu = ctx->dev; + struct vb2_v4l2_buffer *vb2_dst; + struct hantro_decoded_buffer *dst; size_t cr_offset = hantro_hevc_chroma_offset(sps); size_t mv_offset = hantro_hevc_motion_vectors_offset(sps); u32 max_ref_frames; @@ -426,10 +428,15 @@ static int set_ref(struct hantro_ctx *ctx) hantro_write_addr(vpu, G2_REF_MV_ADDR(i), mv_addr); } - luma_addr = hantro_hevc_get_ref_buf(ctx, decode_params->pic_order_cnt_val); + vb2_dst = hantro_get_dst_buf(ctx); + dst = vb2_to_hantro_decoded_buf(&vb2_dst->vb2_buf); + luma_addr = hantro_get_dec_buf_addr(ctx, &dst->base.vb.vb2_buf); if (!luma_addr) return -ENOMEM; + if (hantro_hevc_add_ref_buf(ctx, decode_params->pic_order_cnt_val, luma_addr)) + return -EINVAL; + chroma_addr = luma_addr + cr_offset; mv_addr = luma_addr + mv_offset; @@ -456,16 +463,12 @@ static int set_ref(struct hantro_ctx *ctx) static void set_buffers(struct hantro_ctx *ctx) { - struct vb2_v4l2_buffer *src_buf, *dst_buf; + struct vb2_v4l2_buffer *src_buf; struct hantro_dev *vpu = ctx->dev; - const struct hantro_hevc_dec_ctrls *ctrls = &ctx->hevc_dec.ctrls; - const struct v4l2_ctrl_hevc_sps *sps = ctrls->sps; - size_t cr_offset = hantro_hevc_chroma_offset(sps); - dma_addr_t src_dma, dst_dma; + dma_addr_t src_dma; u32 src_len, src_buf_len; src_buf = hantro_get_src_buf(ctx); - dst_buf = hantro_get_dst_buf(ctx); /* Source (stream) buffer. */ src_dma = vb2_dma_contig_plane_dma_addr(&src_buf->vb2_buf, 0); @@ -478,11 +481,6 @@ static void set_buffers(struct hantro_ctx *ctx) hantro_reg_write(vpu, &g2_strm_start_offset, 0); hantro_reg_write(vpu, &g2_write_mvs_e, 1); - /* Destination (decoded frame) buffer. */ - dst_dma = hantro_get_dec_buf_addr(ctx, &dst_buf->vb2_buf); - - hantro_write_addr(vpu, G2_RS_OUT_LUMA_ADDR, dst_dma); - hantro_write_addr(vpu, G2_RS_OUT_CHROMA_ADDR, dst_dma + cr_offset); hantro_write_addr(vpu, G2_TILE_SIZES_ADDR, ctx->hevc_dec.tile_sizes.dma); hantro_write_addr(vpu, G2_TILE_FILTER_ADDR, ctx->hevc_dec.tile_filter.dma); hantro_write_addr(vpu, G2_TILE_SAO_ADDR, ctx->hevc_dec.tile_sao.dma); @@ -575,9 +573,6 @@ int hantro_g2_hevc_dec_run(struct hantro_ctx *ctx) /* Don't compress buffers */ hantro_reg_write(vpu, &g2_ref_compress_bypass, 1); - /* use NV12 as output format */ - hantro_reg_write(vpu, &g2_out_rs_e, 1); - /* Bus width and max burst */ hantro_reg_write(vpu, &g2_buswidth, BUS_WIDTH_128); hantro_reg_write(vpu, &g2_max_burst, 16); diff --git a/drivers/staging/media/hantro/hantro_hevc.c b/drivers/staging/media/hantro/hantro_hevc.c index ee03123e7704..b49a41d7ae91 100644 --- a/drivers/staging/media/hantro/hantro_hevc.c +++ b/drivers/staging/media/hantro/hantro_hevc.c @@ -44,47 +44,6 @@ size_t hantro_hevc_motion_vectors_offset(const struct v4l2_ctrl_hevc_sps *sps) return ALIGN((cr_offset * 3) / 2, G2_ALIGN); } -static size_t hantro_hevc_mv_size(const struct v4l2_ctrl_hevc_sps *sps) -{ - u32 min_cb_log2_size_y = sps->log2_min_luma_coding_block_size_minus3 + 3; - u32 ctb_log2_size_y = min_cb_log2_size_y + sps->log2_diff_max_min_luma_coding_block_size; - u32 pic_width_in_ctbs_y = (sps->pic_width_in_luma_samples + (1 << ctb_log2_size_y) - 1) - >> ctb_log2_size_y; - u32 pic_height_in_ctbs_y = (sps->pic_height_in_luma_samples + (1 << ctb_log2_size_y) - 1) - >> ctb_log2_size_y; - size_t mv_size; - - mv_size = pic_width_in_ctbs_y * pic_height_in_ctbs_y * - (1 << (2 * (ctb_log2_size_y - 4))) * 16; - - vpu_debug(4, "%dx%d (CTBs) %zu MV bytes\n", - pic_width_in_ctbs_y, pic_height_in_ctbs_y, mv_size); - - return mv_size; -} - -static size_t hantro_hevc_ref_size(struct hantro_ctx *ctx) -{ - const struct hantro_hevc_dec_ctrls *ctrls = &ctx->hevc_dec.ctrls; - const struct v4l2_ctrl_hevc_sps *sps = ctrls->sps; - - return hantro_hevc_motion_vectors_offset(sps) + hantro_hevc_mv_size(sps); -} - -static void hantro_hevc_ref_free(struct hantro_ctx *ctx) -{ - struct hantro_hevc_dec_hw_ctx *hevc_dec = &ctx->hevc_dec; - struct hantro_dev *vpu = ctx->dev; - int i; - - for (i = 0; i < NUM_REF_PICTURES; i++) { - if (hevc_dec->ref_bufs[i].cpu) - dma_free_coherent(vpu->dev, hevc_dec->ref_bufs[i].size, - hevc_dec->ref_bufs[i].cpu, - hevc_dec->ref_bufs[i].dma); - } -} - static void hantro_hevc_ref_init(struct hantro_ctx *ctx) { struct hantro_hevc_dec_hw_ctx *hevc_dec = &ctx->hevc_dec; @@ -108,37 +67,25 @@ dma_addr_t hantro_hevc_get_ref_buf(struct hantro_ctx *ctx, } } - /* Allocate a new reference buffer */ + return 0; +} + +int hantro_hevc_add_ref_buf(struct hantro_ctx *ctx, int poc, dma_addr_t addr) +{ + struct hantro_hevc_dec_hw_ctx *hevc_dec = &ctx->hevc_dec; + int i; + + /* Add a new reference buffer */ for (i = 0; i < NUM_REF_PICTURES; i++) { if (hevc_dec->ref_bufs_poc[i] == UNUSED_REF) { - if (!hevc_dec->ref_bufs[i].cpu) { - struct hantro_dev *vpu = ctx->dev; - - /* - * Allocate the space needed for the raw data + - * motion vector data. Optimizations could be to - * allocate raw data in non coherent memory and only - * clear the motion vector data. - */ - hevc_dec->ref_bufs[i].cpu = - dma_alloc_coherent(vpu->dev, - hantro_hevc_ref_size(ctx), - &hevc_dec->ref_bufs[i].dma, - GFP_KERNEL); - if (!hevc_dec->ref_bufs[i].cpu) - return 0; - - hevc_dec->ref_bufs[i].size = hantro_hevc_ref_size(ctx); - } hevc_dec->ref_bufs_used |= 1 << i; - memset(hevc_dec->ref_bufs[i].cpu, 0, hantro_hevc_ref_size(ctx)); hevc_dec->ref_bufs_poc[i] = poc; - - return hevc_dec->ref_bufs[i].dma; + hevc_dec->ref_bufs[i].dma = addr; + return 0; } } - return 0; + return -EINVAL; } void hantro_hevc_ref_remove_unused(struct hantro_ctx *ctx) @@ -314,8 +261,6 @@ void hantro_hevc_dec_exit(struct hantro_ctx *ctx) hevc_dec->tile_bsd.cpu, hevc_dec->tile_bsd.dma); hevc_dec->tile_bsd.cpu = NULL; - - hantro_hevc_ref_free(ctx); } int hantro_hevc_dec_init(struct hantro_ctx *ctx) diff --git a/drivers/staging/media/hantro/hantro_hw.h b/drivers/staging/media/hantro/hantro_hw.h index dbe51303724b..7286404c32ab 100644 --- a/drivers/staging/media/hantro/hantro_hw.h +++ b/drivers/staging/media/hantro/hantro_hw.h @@ -345,6 +345,7 @@ void hantro_hevc_dec_exit(struct hantro_ctx *ctx); int hantro_g2_hevc_dec_run(struct hantro_ctx *ctx); int hantro_hevc_dec_prepare_run(struct hantro_ctx *ctx); dma_addr_t hantro_hevc_get_ref_buf(struct hantro_ctx *ctx, int poc); +int hantro_hevc_add_ref_buf(struct hantro_ctx *ctx, int poc, dma_addr_t addr); void hantro_hevc_ref_remove_unused(struct hantro_ctx *ctx); size_t hantro_hevc_chroma_offset(const struct v4l2_ctrl_hevc_sps *sps); size_t hantro_hevc_motion_vectors_offset(const struct v4l2_ctrl_hevc_sps *sps); @@ -394,6 +395,16 @@ hantro_h264_mv_size(unsigned int width, unsigned int height) return 64 * MB_WIDTH(width) * MB_WIDTH(height) + 32; } +static inline size_t +hantro_hevc_mv_size(unsigned int width, unsigned int height) +{ + /* + * A CTB can be 64x64, 32x32 or 16x16. + * Allocated memory for the "worse" case: 16x16 + */ + return width * height / 16; +} + int hantro_g1_mpeg2_dec_run(struct hantro_ctx *ctx); int rockchip_vpu2_mpeg2_dec_run(struct hantro_ctx *ctx); void hantro_mpeg2_dec_copy_qtable(u8 *qtable, diff --git a/drivers/staging/media/hantro/hantro_postproc.c b/drivers/staging/media/hantro/hantro_postproc.c index a7774ad4c445..248abe5423f0 100644 --- a/drivers/staging/media/hantro/hantro_postproc.c +++ b/drivers/staging/media/hantro/hantro_postproc.c @@ -146,6 +146,9 @@ int hantro_postproc_alloc(struct hantro_ctx *ctx) else if (ctx->vpu_src_fmt->fourcc == V4L2_PIX_FMT_VP9_FRAME) buf_size += hantro_vp9_mv_size(ctx->dst_fmt.width, ctx->dst_fmt.height); + else if (ctx->vpu_src_fmt->fourcc == V4L2_PIX_FMT_HEVC_SLICE) + buf_size += hantro_hevc_mv_size(ctx->dst_fmt.width, + ctx->dst_fmt.height); for (i = 0; i < num_buffers; ++i) { struct hantro_aux_buf *priv = &ctx->postproc.dec_q[i]; diff --git a/drivers/staging/media/hantro/hantro_v4l2.c b/drivers/staging/media/hantro/hantro_v4l2.c index e4b0645ba6fc..e1fe37afe576 100644 --- a/drivers/staging/media/hantro/hantro_v4l2.c +++ b/drivers/staging/media/hantro/hantro_v4l2.c @@ -150,20 +150,6 @@ static int vidioc_enum_fmt(struct file *file, void *priv, unsigned int num_fmts, i, j = 0; bool skip_mode_none; - /* - * The HEVC decoder on the G2 core needs a little quirk to offer NV12 - * only on the capture side. Once the post-processor logic is used, - * we will be able to expose NV12_4L4 and NV12 as the other cases, - * and therefore remove this quirk. - */ - if (capture && ctx->vpu_src_fmt->fourcc == V4L2_PIX_FMT_HEVC_SLICE) { - if (f->index == 0) { - f->pixelformat = V4L2_PIX_FMT_NV12; - return 0; - } - return -EINVAL; - } - /* * When dealing with an encoder: * - on the capture side we want to filter out all MODE_NONE formats. @@ -304,6 +290,11 @@ static int hantro_try_fmt(const struct hantro_ctx *ctx, pix_mp->plane_fmt[0].sizeimage += hantro_vp9_mv_size(pix_mp->width, pix_mp->height); + else if (ctx->vpu_src_fmt->fourcc == V4L2_PIX_FMT_HEVC_SLICE && + !hantro_needs_postproc(ctx, fmt)) + pix_mp->plane_fmt[0].sizeimage += + hantro_hevc_mv_size(pix_mp->width, + pix_mp->height); } else if (!pix_mp->plane_fmt[0].sizeimage) { /* * For coded formats the application can specify