[v3,0/3] Enable Hantro G1 post-processor

Message ID	20191113175603.24742-1-ezequiel@collabora.com (mailing list archive)
Headers	show Return-Path: <SRS0=zAWw=ZF=vger.kernel.org=linux-media-owner@kernel.org> sender: ezequiel) with ESMTPSA id 168D628A9C8 From: Ezequiel Garcia <ezequiel@collabora.com> To: linux-media@vger.kernel.org Cc: kernel@collabora.com, Tomasz Figa <tfiga@chromium.org>, linux-rockchip@lists.infradead.org, Heiko Stuebner <heiko@sntech.de>, Jonas Karlman <jonas@kwiboo.se>, Philipp Zabel <p.zabel@pengutronix.de>, Boris Brezillon <boris.brezillon@collabora.com>, Chris Healy <cphealy@gmail.com>, linux-kernel@vger.kernel.org, Ezequiel Garcia <ezequiel@collabora.com> Subject: [PATCH v3 0/3] Enable Hantro G1 post-processor Date: Wed, 13 Nov 2019 14:56:00 -0300 Message-Id: <20191113175603.24742-1-ezequiel@collabora.com> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit Sender: linux-media-owner@vger.kernel.org Precedence: bulk
Series	Enable Hantro G1 post-processor \| expand [v3,0/3] Enable Hantro G1 post-processor [v3,1/3] media: hantro: Cleanup format negotiation helpers [v3,2/3] media: hantro: Support color conversion via post-processing [v3,3/3] media: vidioc-enum-fmt.rst: clarify format preference

Ezequiel Garcia Nov. 13, 2019, 5:56 p.m. UTC

Hi all,

The Hantro G1 VPU post-processor block can be pipelined with
the decoder hardware, allowing to perform operations such as
color conversion, scaling, rotation, cropping, among others.

When the post-processor is enabled, the decoder hardware
needs its own set of NV12 buffers (the native decoder format),
and the post-processor is the owner of the CAPTURE buffers,
allocated for the post-processed format.

This way, applications obtain post-processed
(scaled, converted, etc) buffers transparently.

This feature is implemented by exposing the post-processed pixel
formats on ENUM_FMT, ordered as "preferred pixelformat first":

v4l2-ctl -d 1 --list-formats
ioctl: VIDIOC_ENUM_FMT
	Type: Video Capture Multiplanar

	[0]: 'NV12' (Y/CbCr 4:2:0)
	[1]: 'YUYV' (YUYV 4:2:2)

The order of preference in ENUM_FMT can be used as a hint
by applications. This series updates the uAPI specification
accordingly.

When the application sets a pixel format other than NV12,
the post-processor is transparently enabled.

Patch 1 is a cleanups needed to easier integrate the post-processor.
Patch 2 introduces the post-processing support.
Patch 3 updates the uAPI specification.

This is tested on RK3288 platforms with MPEG-2, VP8 and
H264 streams, decoding to YUY2 surfaces. For now, this series
is only adding support for NV12-to-YUY2 conversion.

Applies to media/master.

Future plans
------------

It seems to me that we should start moving this driver to use
regmap-based access to registers. However, such move is out of scope
and not entirely related to this post-processor enablement.

We'll work on that as follow-up patches.

Changelog
---------

Changes v3:

* After discussing with Hans and Tomasz during the media summit
in ELCE, we decided to go back on the MC changes. The MC topology
is now untouched. This means the series is now similar to v1,
except we explicitly use the ENUM_FMT to hint about the post-processed
formats.

Changes v2:

* The decoder->post-processor topology is now exposed
  explicitly and applications need to configure the pipeline.
  By default, the decoder is enabled and the post-processor
  is disabled.

* RGB post-processing output has been dropped. We might
  add this in the future, but for now, it seems it would
  make the code more complex without a use-case in mind.
  RGB is much more memory-consuming so less attractive
  than YUV, and modern GPUs and display controllers support YUV.

* The post-processor implementation still supports RK3288
  only. However, a generic register infrastructure is introduced
  to make addition of other variants such as RK3399 really easy.

Ezequiel Garcia (3):
  media: hantro: Cleanup format negotiation helpers
  media: hantro: Support color conversion via post-processing
  media: vidioc-enum-fmt.rst: clarify format preference

 .../media/uapi/v4l/vidioc-enum-fmt.rst        |   4 +-
 drivers/staging/media/hantro/Makefile         |   1 +
 drivers/staging/media/hantro/hantro.h         |  64 +++++++-
 drivers/staging/media/hantro/hantro_drv.c     |   8 +-
 .../staging/media/hantro/hantro_g1_h264_dec.c |   2 +-
 .../media/hantro/hantro_g1_mpeg2_dec.c        |   2 +-
 drivers/staging/media/hantro/hantro_g1_regs.h |  53 +++++++
 .../staging/media/hantro/hantro_g1_vp8_dec.c  |   2 +-
 drivers/staging/media/hantro/hantro_h264.c    |   6 +-
 drivers/staging/media/hantro/hantro_hw.h      |  13 ++
 .../staging/media/hantro/hantro_postproc.c    | 141 ++++++++++++++++++
 drivers/staging/media/hantro/hantro_v4l2.c    | 105 ++++++++-----
 drivers/staging/media/hantro/rk3288_vpu_hw.c  |  10 ++
 13 files changed, 366 insertions(+), 45 deletions(-)
 create mode 100644 drivers/staging/media/hantro/hantro_postproc.c

Nicolas Dufresne Feb. 9, 2020, 7:52 p.m. UTC | #1

Le mercredi 13 novembre 2019 à 14:56 -0300, Ezequiel Garcia a écrit :
> Hi all,
> 
> The Hantro G1 VPU post-processor block can be pipelined with
> the decoder hardware, allowing to perform operations such as
> color conversion, scaling, rotation, cropping, among others.
> 
> When the post-processor is enabled, the decoder hardware
> needs its own set of NV12 buffers (the native decoder format),
> and the post-processor is the owner of the CAPTURE buffers,
> allocated for the post-processed format.
> 
> This way, applications obtain post-processed
> (scaled, converted, etc) buffers transparently.
> 
> This feature is implemented by exposing the post-processed pixel
> formats on ENUM_FMT, ordered as "preferred pixelformat first":
> 
> v4l2-ctl -d 1 --list-formats
> ioctl: VIDIOC_ENUM_FMT
> 	Type: Video Capture Multiplanar
> 
> 	[0]: 'NV12' (Y/CbCr 4:2:0)
> 	[1]: 'YUYV' (YUYV 4:2:2)
> 
> The order of preference in ENUM_FMT can be used as a hint
> by applications. This series updates the uAPI specification
> accordingly.

As I'm implementing this, I realize that there may me a gap in being
able to implement both IPP and non-IPP support in a generic framework.
Unlike the above comment, we for non-IPP decoder we cannot naively pick
the first format. In fact we parse the chroma and depth information
from the headers (like pps from H264), and we pick a matching pixel
format. This way, if we have a 10bit stream, and our IP supports 10bit,
we will pick a 10bit pixel formats, otherwise decoding will just fail.

None of this information is passed to the driver prior to the first
Request being made, so there is no way (as of current spec) that the
driver can validate this in try_fmt ahead of time. Unless I set picture
parameters without a request_fd for that purpose. If this is the way,
then we should document this.

Is this the intended way to negotiation IPP functions with the driver ?

> 
> When the application sets a pixel format other than NV12,
> the post-processor is transparently enabled.
> 
> Patch 1 is a cleanups needed to easier integrate the post-processor.
> Patch 2 introduces the post-processing support.
> Patch 3 updates the uAPI specification.
> 
> This is tested on RK3288 platforms with MPEG-2, VP8 and
> H264 streams, decoding to YUY2 surfaces. For now, this series
> is only adding support for NV12-to-YUY2 conversion.
> 
> Applies to media/master.
> 
> Future plans
> ------------
> 
> It seems to me that we should start moving this driver to use
> regmap-based access to registers. However, such move is out of scope
> and not entirely related to this post-processor enablement.
> 
> We'll work on that as follow-up patches.
> 
> Changelog
> ---------
> 
> Changes v3:
> 
> * After discussing with Hans and Tomasz during the media summit
> in ELCE, we decided to go back on the MC changes. The MC topology
> is now untouched. This means the series is now similar to v1,
> except we explicitly use the ENUM_FMT to hint about the post-processed
> formats.
> 
> Changes v2:
> 
> * The decoder->post-processor topology is now exposed
>   explicitly and applications need to configure the pipeline.
>   By default, the decoder is enabled and the post-processor
>   is disabled.
> 
> * RGB post-processing output has been dropped. We might
>   add this in the future, but for now, it seems it would
>   make the code more complex without a use-case in mind.
>   RGB is much more memory-consuming so less attractive
>   than YUV, and modern GPUs and display controllers support YUV.
> 
> * The post-processor implementation still supports RK3288
>   only. However, a generic register infrastructure is introduced
>   to make addition of other variants such as RK3399 really easy.
> 
> Ezequiel Garcia (3):
>   media: hantro: Cleanup format negotiation helpers
>   media: hantro: Support color conversion via post-processing
>   media: vidioc-enum-fmt.rst: clarify format preference
> 
>  .../media/uapi/v4l/vidioc-enum-fmt.rst        |   4 +-
>  drivers/staging/media/hantro/Makefile         |   1 +
>  drivers/staging/media/hantro/hantro.h         |  64 +++++++-
>  drivers/staging/media/hantro/hantro_drv.c     |   8 +-
>  .../staging/media/hantro/hantro_g1_h264_dec.c |   2 +-
>  .../media/hantro/hantro_g1_mpeg2_dec.c        |   2 +-
>  drivers/staging/media/hantro/hantro_g1_regs.h |  53 +++++++
>  .../staging/media/hantro/hantro_g1_vp8_dec.c  |   2 +-
>  drivers/staging/media/hantro/hantro_h264.c    |   6 +-
>  drivers/staging/media/hantro/hantro_hw.h      |  13 ++
>  .../staging/media/hantro/hantro_postproc.c    | 141 ++++++++++++++++++
>  drivers/staging/media/hantro/hantro_v4l2.c    | 105 ++++++++-----
>  drivers/staging/media/hantro/rk3288_vpu_hw.c  |  10 ++
>  13 files changed, 366 insertions(+), 45 deletions(-)
>  create mode 100644 drivers/staging/media/hantro/hantro_postproc.c
>

Tomasz Figa Feb. 10, 2020, 2:45 a.m. UTC | #2

On Mon, Feb 10, 2020 at 4:52 AM Nicolas Dufresne <nicolas@ndufresne.ca> wrote:
>
> Le mercredi 13 novembre 2019 à 14:56 -0300, Ezequiel Garcia a écrit :
> > Hi all,
> >
> > The Hantro G1 VPU post-processor block can be pipelined with
> > the decoder hardware, allowing to perform operations such as
> > color conversion, scaling, rotation, cropping, among others.
> >
> > When the post-processor is enabled, the decoder hardware
> > needs its own set of NV12 buffers (the native decoder format),
> > and the post-processor is the owner of the CAPTURE buffers,
> > allocated for the post-processed format.
> >
> > This way, applications obtain post-processed
> > (scaled, converted, etc) buffers transparently.
> >
> > This feature is implemented by exposing the post-processed pixel
> > formats on ENUM_FMT, ordered as "preferred pixelformat first":
> >
> > v4l2-ctl -d 1 --list-formats
> > ioctl: VIDIOC_ENUM_FMT
> >       Type: Video Capture Multiplanar
> >
> >       [0]: 'NV12' (Y/CbCr 4:2:0)
> >       [1]: 'YUYV' (YUYV 4:2:2)
> >
> > The order of preference in ENUM_FMT can be used as a hint
> > by applications. This series updates the uAPI specification
> > accordingly.
>
> As I'm implementing this, I realize that there may me a gap in being
> able to implement both IPP and non-IPP support in a generic framework.
> Unlike the above comment, we for non-IPP decoder we cannot naively pick
> the first format. In fact we parse the chroma and depth information
> from the headers (like pps from H264), and we pick a matching pixel
> format. This way, if we have a 10bit stream, and our IP supports 10bit,
> we will pick a 10bit pixel formats, otherwise decoding will just fail.
>
> None of this information is passed to the driver prior to the first
> Request being made, so there is no way (as of current spec) that the
> driver can validate this in try_fmt ahead of time. Unless I set picture
> parameters without a request_fd for that purpose. If this is the way,
> then we should document this.

+Alexandre Courbot

It was suggested in the very early RFC stage, but it looks like it
didn't make it to the final spec.
https://patchwork.kernel.org/patch/10583233/#22209555

>
> Is this the intended way to negotiation IPP functions with the driver ?
>

In theory, if the userspace knows whether the stream is 4:2:0 or 4:2:2
and 8-bit or 10-bit, it can still select the first format from the top
that matches these properties.

That's not how format handling in V4L2 works, though. ENUM_FMT is
expected to return a list of valid formats and if we forget about the
image processor for a moment, a stateless decoder would always return
any possible format, including ones invalid for the stream.

Now back to the image processor, if it handles conversions from any to
any format listed by ENUM_FMT, we kind of regain the V4L2 compliance,
but if the conversions are limited, the above requirement still
doesn't hold and we're not implementing V4L2 correctly.

Perhaps we can still amend the spec and require controls that
determine the stream properties to be set before starting the
streaming? I can imagine it could also help the driver filter out some
unsupported streams early, before allocating buffers and attempting to
decode.

Best regards,
Tomasz

> >
> > When the application sets a pixel format other than NV12,
> > the post-processor is transparently enabled.
> >
> > Patch 1 is a cleanups needed to easier integrate the post-processor.
> > Patch 2 introduces the post-processing support.
> > Patch 3 updates the uAPI specification.
> >
> > This is tested on RK3288 platforms with MPEG-2, VP8 and
> > H264 streams, decoding to YUY2 surfaces. For now, this series
> > is only adding support for NV12-to-YUY2 conversion.
> >
> > Applies to media/master.
> >
> > Future plans
> > ------------
> >
> > It seems to me that we should start moving this driver to use
> > regmap-based access to registers. However, such move is out of scope
> > and not entirely related to this post-processor enablement.
> >
> > We'll work on that as follow-up patches.
> >
> > Changelog
> > ---------
> >
> > Changes v3:
> >
> > * After discussing with Hans and Tomasz during the media summit
> > in ELCE, we decided to go back on the MC changes. The MC topology
> > is now untouched. This means the series is now similar to v1,
> > except we explicitly use the ENUM_FMT to hint about the post-processed
> > formats.
> >
> > Changes v2:
> >
> > * The decoder->post-processor topology is now exposed
> >   explicitly and applications need to configure the pipeline.
> >   By default, the decoder is enabled and the post-processor
> >   is disabled.
> >
> > * RGB post-processing output has been dropped. We might
> >   add this in the future, but for now, it seems it would
> >   make the code more complex without a use-case in mind.
> >   RGB is much more memory-consuming so less attractive
> >   than YUV, and modern GPUs and display controllers support YUV.
> >
> > * The post-processor implementation still supports RK3288
> >   only. However, a generic register infrastructure is introduced
> >   to make addition of other variants such as RK3399 really easy.
> >
> > Ezequiel Garcia (3):
> >   media: hantro: Cleanup format negotiation helpers
> >   media: hantro: Support color conversion via post-processing
> >   media: vidioc-enum-fmt.rst: clarify format preference
> >
> >  .../media/uapi/v4l/vidioc-enum-fmt.rst        |   4 +-
> >  drivers/staging/media/hantro/Makefile         |   1 +
> >  drivers/staging/media/hantro/hantro.h         |  64 +++++++-
> >  drivers/staging/media/hantro/hantro_drv.c     |   8 +-
> >  .../staging/media/hantro/hantro_g1_h264_dec.c |   2 +-
> >  .../media/hantro/hantro_g1_mpeg2_dec.c        |   2 +-
> >  drivers/staging/media/hantro/hantro_g1_regs.h |  53 +++++++
> >  .../staging/media/hantro/hantro_g1_vp8_dec.c  |   2 +-
> >  drivers/staging/media/hantro/hantro_h264.c    |   6 +-
> >  drivers/staging/media/hantro/hantro_hw.h      |  13 ++
> >  .../staging/media/hantro/hantro_postproc.c    | 141 ++++++++++++++++++
> >  drivers/staging/media/hantro/hantro_v4l2.c    | 105 ++++++++-----
> >  drivers/staging/media/hantro/rk3288_vpu_hw.c  |  10 ++
> >  13 files changed, 366 insertions(+), 45 deletions(-)
> >  create mode 100644 drivers/staging/media/hantro/hantro_postproc.c
> >
>

Nicolas Dufresne Feb. 11, 2020, 4:16 a.m. UTC | #3

Le lundi 10 février 2020 à 11:45 +0900, Tomasz Figa a écrit :
> On Mon, Feb 10, 2020 at 4:52 AM Nicolas Dufresne <nicolas@ndufresne.ca> wrote:
> > Le mercredi 13 novembre 2019 à 14:56 -0300, Ezequiel Garcia a écrit :
> > > Hi all,
> > > 
> > > The Hantro G1 VPU post-processor block can be pipelined with
> > > the decoder hardware, allowing to perform operations such as
> > > color conversion, scaling, rotation, cropping, among others.
> > > 
> > > When the post-processor is enabled, the decoder hardware
> > > needs its own set of NV12 buffers (the native decoder format),
> > > and the post-processor is the owner of the CAPTURE buffers,
> > > allocated for the post-processed format.
> > > 
> > > This way, applications obtain post-processed
> > > (scaled, converted, etc) buffers transparently.
> > > 
> > > This feature is implemented by exposing the post-processed pixel
> > > formats on ENUM_FMT, ordered as "preferred pixelformat first":
> > > 
> > > v4l2-ctl -d 1 --list-formats
> > > ioctl: VIDIOC_ENUM_FMT
> > >       Type: Video Capture Multiplanar
> > > 
> > >       [0]: 'NV12' (Y/CbCr 4:2:0)
> > >       [1]: 'YUYV' (YUYV 4:2:2)
> > > 
> > > The order of preference in ENUM_FMT can be used as a hint
> > > by applications. This series updates the uAPI specification
> > > accordingly.
> > 
> > As I'm implementing this, I realize that there may me a gap in being
> > able to implement both IPP and non-IPP support in a generic framework.
> > Unlike the above comment, we for non-IPP decoder we cannot naively pick
> > the first format. In fact we parse the chroma and depth information
> > from the headers (like pps from H264), and we pick a matching pixel
> > format. This way, if we have a 10bit stream, and our IP supports 10bit,
> > we will pick a 10bit pixel formats, otherwise decoding will just fail.
> > 
> > None of this information is passed to the driver prior to the first
> > Request being made, so there is no way (as of current spec) that the
> > driver can validate this in try_fmt ahead of time. Unless I set picture
> > parameters without a request_fd for that purpose. If this is the way,
> > then we should document this.
> 
> +Alexandre Courbot
> 
> It was suggested in the very early RFC stage, but it looks like it
> didn't make it to the final spec.
> https://patchwork.kernel.org/patch/10583233/#22209555

Ok, maybe we should revive it, it would fill that gap. Again, only an
issue if you have a post processor. I'm still surprised we didn't
expose the IPP functions through the topology, it would make so much
sense to me, and I can make better code with that knowledge.

I found while coding this, that even if it's more difficult,
classification of device by looking at the topology and the entity
functions is much nicer, less of a guess.

Though, we lack some documentation (or clarification) for how to
properly handle formats, size and selection in order to configure the
IPP. Ezequiel was saying that we don't implement selection in Hanto, so
I guess the scaling is a bit ambiguous then in regard to coded/display
sizes. Though we pass a size to the OUTPUT side, so the driver can
always control a little bit.

> 
> > Is this the intended way to negotiation IPP functions with the driver ?
> > 
> 
> In theory, if the userspace knows whether the stream is 4:2:0 or 4:2:2
> and 8-bit or 10-bit, it can still select the first format from the top
> that matches these properties.

That's exactly what I do, I have a map of the formats and their
chroma/depth, and take the first one that match. If that fails, I just
fallback to the first one. It's something you need to do anyway, as we
prefer the native format first (even if there is an IPP).

> 
> That's not how format handling in V4L2 works, though. ENUM_FMT is
> expected to return a list of valid formats and if we forget about the
> image processor for a moment, a stateless decoder would always return
> any possible format, including ones invalid for the stream.
> 
> Now back to the image processor, if it handles conversions from any to
> any format listed by ENUM_FMT, we kind of regain the V4L2 compliance,
> but if the conversions are limited, the above requirement still
> doesn't hold and we're not implementing V4L2 correctly.
> 
> Perhaps we can still amend the spec and require controls that
> determine the stream properties to be set before starting the
> streaming? I can imagine it could also help the driver filter out some
> unsupported streams early, before allocating buffers and attempting to
> decode.

I think it would make sense, so just to make sure, for H264 we could
set the V4L2_CID_MPEG_VIDEO_H264_SPS along with the OUTPUT format,
prior to CAPTURE enumeration.

> 
> Best regards,
> Tomasz
> 
> > > When the application sets a pixel format other than NV12,
> > > the post-processor is transparently enabled.
> > > 
> > > Patch 1 is a cleanups needed to easier integrate the post-processor.
> > > Patch 2 introduces the post-processing support.
> > > Patch 3 updates the uAPI specification.
> > > 
> > > This is tested on RK3288 platforms with MPEG-2, VP8 and
> > > H264 streams, decoding to YUY2 surfaces. For now, this series
> > > is only adding support for NV12-to-YUY2 conversion.
> > > 
> > > Applies to media/master.
> > > 
> > > Future plans
> > > ------------
> > > 
> > > It seems to me that we should start moving this driver to use
> > > regmap-based access to registers. However, such move is out of scope
> > > and not entirely related to this post-processor enablement.
> > > 
> > > We'll work on that as follow-up patches.
> > > 
> > > Changelog
> > > ---------
> > > 
> > > Changes v3:
> > > 
> > > * After discussing with Hans and Tomasz during the media summit
> > > in ELCE, we decided to go back on the MC changes. The MC topology
> > > is now untouched. This means the series is now similar to v1,
> > > except we explicitly use the ENUM_FMT to hint about the post-processed
> > > formats.
> > > 
> > > Changes v2:
> > > 
> > > * The decoder->post-processor topology is now exposed
> > >   explicitly and applications need to configure the pipeline.
> > >   By default, the decoder is enabled and the post-processor
> > >   is disabled.
> > > 
> > > * RGB post-processing output has been dropped. We might
> > >   add this in the future, but for now, it seems it would
> > >   make the code more complex without a use-case in mind.
> > >   RGB is much more memory-consuming so less attractive
> > >   than YUV, and modern GPUs and display controllers support YUV.
> > > 
> > > * The post-processor implementation still supports RK3288
> > >   only. However, a generic register infrastructure is introduced
> > >   to make addition of other variants such as RK3399 really easy.
> > > 
> > > Ezequiel Garcia (3):
> > >   media: hantro: Cleanup format negotiation helpers
> > >   media: hantro: Support color conversion via post-processing
> > >   media: vidioc-enum-fmt.rst: clarify format preference
> > > 
> > >  .../media/uapi/v4l/vidioc-enum-fmt.rst        |   4 +-
> > >  drivers/staging/media/hantro/Makefile         |   1 +
> > >  drivers/staging/media/hantro/hantro.h         |  64 +++++++-
> > >  drivers/staging/media/hantro/hantro_drv.c     |   8 +-
> > >  .../staging/media/hantro/hantro_g1_h264_dec.c |   2 +-
> > >  .../media/hantro/hantro_g1_mpeg2_dec.c        |   2 +-
> > >  drivers/staging/media/hantro/hantro_g1_regs.h |  53 +++++++
> > >  .../staging/media/hantro/hantro_g1_vp8_dec.c  |   2 +-
> > >  drivers/staging/media/hantro/hantro_h264.c    |   6 +-
> > >  drivers/staging/media/hantro/hantro_hw.h      |  13 ++
> > >  .../staging/media/hantro/hantro_postproc.c    | 141 ++++++++++++++++++
> > >  drivers/staging/media/hantro/hantro_v4l2.c    | 105 ++++++++-----
> > >  drivers/staging/media/hantro/rk3288_vpu_hw.c  |  10 ++
> > >  13 files changed, 366 insertions(+), 45 deletions(-)
> > >  create mode 100644 drivers/staging/media/hantro/hantro_postproc.c
> > >

Nicolas Dufresne Feb. 11, 2020, 4:22 p.m. UTC | #4

On lun, 2020-02-10 at 23:16 -0500, Nicolas Dufresne wrote:
> Le lundi 10 février 2020 à 11:45 +0900, Tomasz Figa a écrit :
> > On Mon, Feb 10, 2020 at 4:52 AM Nicolas Dufresne <nicolas@ndufresne.ca>
> > wrote:
> > > Le mercredi 13 novembre 2019 à 14:56 -0300, Ezequiel Garcia a écrit :
> > > > Hi all,
> > > > 
> > > > The Hantro G1 VPU post-processor block can be pipelined with
> > > > the decoder hardware, allowing to perform operations such as
> > > > color conversion, scaling, rotation, cropping, among others.
> > > > 
> > > > When the post-processor is enabled, the decoder hardware
> > > > needs its own set of NV12 buffers (the native decoder format),
> > > > and the post-processor is the owner of the CAPTURE buffers,
> > > > allocated for the post-processed format.
> > > > 
> > > > This way, applications obtain post-processed
> > > > (scaled, converted, etc) buffers transparently.
> > > > 
> > > > This feature is implemented by exposing the post-processed pixel
> > > > formats on ENUM_FMT, ordered as "preferred pixelformat first":
> > > > 
> > > > v4l2-ctl -d 1 --list-formats
> > > > ioctl: VIDIOC_ENUM_FMT
> > > >       Type: Video Capture Multiplanar
> > > > 
> > > >       [0]: 'NV12' (Y/CbCr 4:2:0)
> > > >       [1]: 'YUYV' (YUYV 4:2:2)
> > > > 
> > > > The order of preference in ENUM_FMT can be used as a hint
> > > > by applications. This series updates the uAPI specification
> > > > accordingly.
> > > 
> > > As I'm implementing this, I realize that there may me a gap in being
> > > able to implement both IPP and non-IPP support in a generic framework.
> > > Unlike the above comment, we for non-IPP decoder we cannot naively pick
> > > the first format. In fact we parse the chroma and depth information
> > > from the headers (like pps from H264), and we pick a matching pixel
> > > format. This way, if we have a 10bit stream, and our IP supports 10bit,
> > > we will pick a 10bit pixel formats, otherwise decoding will just fail.
> > > 
> > > None of this information is passed to the driver prior to the first
> > > Request being made, so there is no way (as of current spec) that the
> > > driver can validate this in try_fmt ahead of time. Unless I set picture
> > > parameters without a request_fd for that purpose. If this is the way,
> > > then we should document this.
> > 
> > +Alexandre Courbot
> > 
> > It was suggested in the very early RFC stage, but it looks like it
> > didn't make it to the final spec.
> > https://patchwork.kernel.org/patch/10583233/#22209555
> 
> Ok, maybe we should revive it, it would fill that gap. Again, only an
> issue if you have a post processor. I'm still surprised we didn't
> expose the IPP functions through the topology, it would make so much
> sense to me, and I can make better code with that knowledge.
> 
> I found while coding this, that even if it's more difficult,
> classification of device by looking at the topology and the entity
> functions is much nicer, less of a guess.
> 
> Though, we lack some documentation (or clarification) for how to
> properly handle formats, size and selection in order to configure the
> IPP. Ezequiel was saying that we don't implement selection in Hanto, so
> I guess the scaling is a bit ambiguous then in regard to coded/display
> sizes. Though we pass a size to the OUTPUT side, so the driver can
> always control a little bit.

Re-reading the "initialization process", it's actually still there:

"Call VIDIOC_S_EXT_CTRLS() to set all the controls (parsed headers, etc.)
required by the OUTPUT format to enumerate the CAPTURE formats."

And then it suggest to use G_FMT to retreive the optimal format as selected by
the driver. So I guess this is a false alarm. And we probably don't need to play
with the subsampling to filter the formats, as the driver is expected to do so.

So what could be improve, is have in the per-codec documentation the list of
controls that are "required by the OUTPUT format to enumerate the CAPTURE
formats.". Could just be a comment in the control doc so say, "this control is
needed to allow enumerationg of the CAPTURE formats".

> 
> > > Is this the intended way to negotiation IPP functions with the driver ?
> > > 
> > 
> > In theory, if the userspace knows whether the stream is 4:2:0 or 4:2:2
> > and 8-bit or 10-bit, it can still select the first format from the top
> > that matches these properties.
> 
> That's exactly what I do, I have a map of the formats and their
> chroma/depth, and take the first one that match. If that fails, I just
> fallback to the first one. It's something you need to do anyway, as we
> prefer the native format first (even if there is an IPP).
> 
> > That's not how format handling in V4L2 works, though. ENUM_FMT is
> > expected to return a list of valid formats and if we forget about the
> > image processor for a moment, a stateless decoder would always return
> > any possible format, including ones invalid for the stream.
> > 
> > Now back to the image processor, if it handles conversions from any to
> > any format listed by ENUM_FMT, we kind of regain the V4L2 compliance,
> > but if the conversions are limited, the above requirement still
> > doesn't hold and we're not implementing V4L2 correctly.
> > 
> > Perhaps we can still amend the spec and require controls that
> > determine the stream properties to be set before starting the
> > streaming? I can imagine it could also help the driver filter out some
> > unsupported streams early, before allocating buffers and attempting to
> > decode.
> 
> I think it would make sense, so just to make sure, for H264 we could
> set the V4L2_CID_MPEG_VIDEO_H264_SPS along with the OUTPUT format,
> prior to CAPTURE enumeration.
> 
> > Best regards,
> > Tomasz
> > 
> > > > When the application sets a pixel format other than NV12,
> > > > the post-processor is transparently enabled.
> > > > 
> > > > Patch 1 is a cleanups needed to easier integrate the post-processor.
> > > > Patch 2 introduces the post-processing support.
> > > > Patch 3 updates the uAPI specification.
> > > > 
> > > > This is tested on RK3288 platforms with MPEG-2, VP8 and
> > > > H264 streams, decoding to YUY2 surfaces. For now, this series
> > > > is only adding support for NV12-to-YUY2 conversion.
> > > > 
> > > > Applies to media/master.
> > > > 
> > > > Future plans
> > > > ------------
> > > > 
> > > > It seems to me that we should start moving this driver to use
> > > > regmap-based access to registers. However, such move is out of scope
> > > > and not entirely related to this post-processor enablement.
> > > > 
> > > > We'll work on that as follow-up patches.
> > > > 
> > > > Changelog
> > > > ---------
> > > > 
> > > > Changes v3:
> > > > 
> > > > * After discussing with Hans and Tomasz during the media summit
> > > > in ELCE, we decided to go back on the MC changes. The MC topology
> > > > is now untouched. This means the series is now similar to v1,
> > > > except we explicitly use the ENUM_FMT to hint about the post-processed
> > > > formats.
> > > > 
> > > > Changes v2:
> > > > 
> > > > * The decoder->post-processor topology is now exposed
> > > >   explicitly and applications need to configure the pipeline.
> > > >   By default, the decoder is enabled and the post-processor
> > > >   is disabled.
> > > > 
> > > > * RGB post-processing output has been dropped. We might
> > > >   add this in the future, but for now, it seems it would
> > > >   make the code more complex without a use-case in mind.
> > > >   RGB is much more memory-consuming so less attractive
> > > >   than YUV, and modern GPUs and display controllers support YUV.
> > > > 
> > > > * The post-processor implementation still supports RK3288
> > > >   only. However, a generic register infrastructure is introduced
> > > >   to make addition of other variants such as RK3399 really easy.
> > > > 
> > > > Ezequiel Garcia (3):
> > > >   media: hantro: Cleanup format negotiation helpers
> > > >   media: hantro: Support color conversion via post-processing
> > > >   media: vidioc-enum-fmt.rst: clarify format preference
> > > > 
> > > >  .../media/uapi/v4l/vidioc-enum-fmt.rst        |   4 +-
> > > >  drivers/staging/media/hantro/Makefile         |   1 +
> > > >  drivers/staging/media/hantro/hantro.h         |  64 +++++++-
> > > >  drivers/staging/media/hantro/hantro_drv.c     |   8 +-
> > > >  .../staging/media/hantro/hantro_g1_h264_dec.c |   2 +-
> > > >  .../media/hantro/hantro_g1_mpeg2_dec.c        |   2 +-
> > > >  drivers/staging/media/hantro/hantro_g1_regs.h |  53 +++++++
> > > >  .../staging/media/hantro/hantro_g1_vp8_dec.c  |   2 +-
> > > >  drivers/staging/media/hantro/hantro_h264.c    |   6 +-
> > > >  drivers/staging/media/hantro/hantro_hw.h      |  13 ++
> > > >  .../staging/media/hantro/hantro_postproc.c    | 141 ++++++++++++++++++
> > > >  drivers/staging/media/hantro/hantro_v4l2.c    | 105 ++++++++-----
> > > >  drivers/staging/media/hantro/rk3288_vpu_hw.c  |  10 ++
> > > >  13 files changed, 366 insertions(+), 45 deletions(-)
> > > >  create mode 100644 drivers/staging/media/hantro/hantro_postproc.c
> > > >

Tomasz Figa Feb. 12, 2020, 7:06 a.m. UTC | #5

On Wed, Feb 12, 2020 at 1:22 AM Nicolas Dufresne <nicolas@ndufresne.ca> wrote:
>
> On lun, 2020-02-10 at 23:16 -0500, Nicolas Dufresne wrote:
> > Le lundi 10 février 2020 à 11:45 +0900, Tomasz Figa a écrit :
> > > On Mon, Feb 10, 2020 at 4:52 AM Nicolas Dufresne <nicolas@ndufresne.ca>
> > > wrote:
> > > > Le mercredi 13 novembre 2019 à 14:56 -0300, Ezequiel Garcia a écrit :
> > > > > Hi all,
> > > > >
> > > > > The Hantro G1 VPU post-processor block can be pipelined with
> > > > > the decoder hardware, allowing to perform operations such as
> > > > > color conversion, scaling, rotation, cropping, among others.
> > > > >
> > > > > When the post-processor is enabled, the decoder hardware
> > > > > needs its own set of NV12 buffers (the native decoder format),
> > > > > and the post-processor is the owner of the CAPTURE buffers,
> > > > > allocated for the post-processed format.
> > > > >
> > > > > This way, applications obtain post-processed
> > > > > (scaled, converted, etc) buffers transparently.
> > > > >
> > > > > This feature is implemented by exposing the post-processed pixel
> > > > > formats on ENUM_FMT, ordered as "preferred pixelformat first":
> > > > >
> > > > > v4l2-ctl -d 1 --list-formats
> > > > > ioctl: VIDIOC_ENUM_FMT
> > > > >       Type: Video Capture Multiplanar
> > > > >
> > > > >       [0]: 'NV12' (Y/CbCr 4:2:0)
> > > > >       [1]: 'YUYV' (YUYV 4:2:2)
> > > > >
> > > > > The order of preference in ENUM_FMT can be used as a hint
> > > > > by applications. This series updates the uAPI specification
> > > > > accordingly.
> > > >
> > > > As I'm implementing this, I realize that there may me a gap in being
> > > > able to implement both IPP and non-IPP support in a generic framework.
> > > > Unlike the above comment, we for non-IPP decoder we cannot naively pick
> > > > the first format. In fact we parse the chroma and depth information
> > > > from the headers (like pps from H264), and we pick a matching pixel
> > > > format. This way, if we have a 10bit stream, and our IP supports 10bit,
> > > > we will pick a 10bit pixel formats, otherwise decoding will just fail.
> > > >
> > > > None of this information is passed to the driver prior to the first
> > > > Request being made, so there is no way (as of current spec) that the
> > > > driver can validate this in try_fmt ahead of time. Unless I set picture
> > > > parameters without a request_fd for that purpose. If this is the way,
> > > > then we should document this.
> > >
> > > +Alexandre Courbot
> > >
> > > It was suggested in the very early RFC stage, but it looks like it
> > > didn't make it to the final spec.
> > > https://patchwork.kernel.org/patch/10583233/#22209555
> >
> > Ok, maybe we should revive it, it would fill that gap. Again, only an
> > issue if you have a post processor. I'm still surprised we didn't
> > expose the IPP functions through the topology, it would make so much
> > sense to me, and I can make better code with that knowledge.
> >
> > I found while coding this, that even if it's more difficult,
> > classification of device by looking at the topology and the entity
> > functions is much nicer, less of a guess.
> >
> > Though, we lack some documentation (or clarification) for how to
> > properly handle formats, size and selection in order to configure the
> > IPP. Ezequiel was saying that we don't implement selection in Hanto, so
> > I guess the scaling is a bit ambiguous then in regard to coded/display
> > sizes. Though we pass a size to the OUTPUT side, so the driver can
> > always control a little bit.
>
> Re-reading the "initialization process", it's actually still there:
>
> "Call VIDIOC_S_EXT_CTRLS() to set all the controls (parsed headers, etc.)
> required by the OUTPUT format to enumerate the CAPTURE formats."
>

Oh, so it did make it to the final spec. Sorry, my bad. Should have
checked better.

> And then it suggest to use G_FMT to retreive the optimal format as selected by
> the driver. So I guess this is a false alarm. And we probably don't need to play
> with the subsampling to filter the formats, as the driver is expected to do so.
>

The question that this raises then is whether the merged drivers do
so. Hantro G1 supports only 8-bit YUV 4:2:0, so there is no problem,
although the driver doesn't seem to validate that in the headers. I'd
also say the CAPTURE resolution should be determined by the headers,
rather than the OUTPUT queue, consistently to the stateful decoders.
Resolution changes should be also signaled if controls are set to
headers that contain different resolutions than previously set.

> So what could be improve, is have in the per-codec documentation the list of
> controls that are "required by the OUTPUT format to enumerate the CAPTURE
> formats.". Could just be a comment in the control doc so say, "this control is
> needed to allow enumerationg of the CAPTURE formats".
>

I wouldn't limit this to the format enumeration. The drivers may also
need to check for things like profile, some stream features or even
just whether the resolution doesn't exceed the maximum supported by
the decoder.

Best regards,
Tomasz

> >
> > > > Is this the intended way to negotiation IPP functions with the driver ?
> > > >
> > >
> > > In theory, if the userspace knows whether the stream is 4:2:0 or 4:2:2
> > > and 8-bit or 10-bit, it can still select the first format from the top
> > > that matches these properties.
> >
> > That's exactly what I do, I have a map of the formats and their
> > chroma/depth, and take the first one that match. If that fails, I just
> > fallback to the first one. It's something you need to do anyway, as we
> > prefer the native format first (even if there is an IPP).
> >
> > > That's not how format handling in V4L2 works, though. ENUM_FMT is
> > > expected to return a list of valid formats and if we forget about the
> > > image processor for a moment, a stateless decoder would always return
> > > any possible format, including ones invalid for the stream.
> > >
> > > Now back to the image processor, if it handles conversions from any to
> > > any format listed by ENUM_FMT, we kind of regain the V4L2 compliance,
> > > but if the conversions are limited, the above requirement still
> > > doesn't hold and we're not implementing V4L2 correctly.
> > >
> > > Perhaps we can still amend the spec and require controls that
> > > determine the stream properties to be set before starting the
> > > streaming? I can imagine it could also help the driver filter out some
> > > unsupported streams early, before allocating buffers and attempting to
> > > decode.
> >
> > I think it would make sense, so just to make sure, for H264 we could
> > set the V4L2_CID_MPEG_VIDEO_H264_SPS along with the OUTPUT format,
> > prior to CAPTURE enumeration.
> >
> > > Best regards,
> > > Tomasz
> > >
> > > > > When the application sets a pixel format other than NV12,
> > > > > the post-processor is transparently enabled.
> > > > >
> > > > > Patch 1 is a cleanups needed to easier integrate the post-processor.
> > > > > Patch 2 introduces the post-processing support.
> > > > > Patch 3 updates the uAPI specification.
> > > > >
> > > > > This is tested on RK3288 platforms with MPEG-2, VP8 and
> > > > > H264 streams, decoding to YUY2 surfaces. For now, this series
> > > > > is only adding support for NV12-to-YUY2 conversion.
> > > > >
> > > > > Applies to media/master.
> > > > >
> > > > > Future plans
> > > > > ------------
> > > > >
> > > > > It seems to me that we should start moving this driver to use
> > > > > regmap-based access to registers. However, such move is out of scope
> > > > > and not entirely related to this post-processor enablement.
> > > > >
> > > > > We'll work on that as follow-up patches.
> > > > >
> > > > > Changelog
> > > > > ---------
> > > > >
> > > > > Changes v3:
> > > > >
> > > > > * After discussing with Hans and Tomasz during the media summit
> > > > > in ELCE, we decided to go back on the MC changes. The MC topology
> > > > > is now untouched. This means the series is now similar to v1,
> > > > > except we explicitly use the ENUM_FMT to hint about the post-processed
> > > > > formats.
> > > > >
> > > > > Changes v2:
> > > > >
> > > > > * The decoder->post-processor topology is now exposed
> > > > >   explicitly and applications need to configure the pipeline.
> > > > >   By default, the decoder is enabled and the post-processor
> > > > >   is disabled.
> > > > >
> > > > > * RGB post-processing output has been dropped. We might
> > > > >   add this in the future, but for now, it seems it would
> > > > >   make the code more complex without a use-case in mind.
> > > > >   RGB is much more memory-consuming so less attractive
> > > > >   than YUV, and modern GPUs and display controllers support YUV.
> > > > >
> > > > > * The post-processor implementation still supports RK3288
> > > > >   only. However, a generic register infrastructure is introduced
> > > > >   to make addition of other variants such as RK3399 really easy.
> > > > >
> > > > > Ezequiel Garcia (3):
> > > > >   media: hantro: Cleanup format negotiation helpers
> > > > >   media: hantro: Support color conversion via post-processing
> > > > >   media: vidioc-enum-fmt.rst: clarify format preference
> > > > >
> > > > >  .../media/uapi/v4l/vidioc-enum-fmt.rst        |   4 +-
> > > > >  drivers/staging/media/hantro/Makefile         |   1 +
> > > > >  drivers/staging/media/hantro/hantro.h         |  64 +++++++-
> > > > >  drivers/staging/media/hantro/hantro_drv.c     |   8 +-
> > > > >  .../staging/media/hantro/hantro_g1_h264_dec.c |   2 +-
> > > > >  .../media/hantro/hantro_g1_mpeg2_dec.c        |   2 +-
> > > > >  drivers/staging/media/hantro/hantro_g1_regs.h |  53 +++++++
> > > > >  .../staging/media/hantro/hantro_g1_vp8_dec.c  |   2 +-
> > > > >  drivers/staging/media/hantro/hantro_h264.c    |   6 +-
> > > > >  drivers/staging/media/hantro/hantro_hw.h      |  13 ++
> > > > >  .../staging/media/hantro/hantro_postproc.c    | 141 ++++++++++++++++++
> > > > >  drivers/staging/media/hantro/hantro_v4l2.c    | 105 ++++++++-----
> > > > >  drivers/staging/media/hantro/rk3288_vpu_hw.c  |  10 ++
> > > > >  13 files changed, 366 insertions(+), 45 deletions(-)
> > > > >  create mode 100644 drivers/staging/media/hantro/hantro_postproc.c
> > > > >
>

Nicolas Dufresne Feb. 19, 2020, 9:06 p.m. UTC | #6

On mer, 2020-02-12 at 16:06 +0900, Tomasz Figa wrote:
> On Wed, Feb 12, 2020 at 1:22 AM Nicolas Dufresne <nicolas@ndufresne.ca> wrote:
> > On lun, 2020-02-10 at 23:16 -0500, Nicolas Dufresne wrote:
> > > Le lundi 10 février 2020 à 11:45 +0900, Tomasz Figa a écrit :
> > > > On Mon, Feb 10, 2020 at 4:52 AM Nicolas Dufresne <nicolas@ndufresne.ca>
> > > > wrote:
> > > > > Le mercredi 13 novembre 2019 à 14:56 -0300, Ezequiel Garcia a écrit :
> > > > > > Hi all,
> > > > > > 
> > > > > > The Hantro G1 VPU post-processor block can be pipelined with
> > > > > > the decoder hardware, allowing to perform operations such as
> > > > > > color conversion, scaling, rotation, cropping, among others.
> > > > > > 
> > > > > > When the post-processor is enabled, the decoder hardware
> > > > > > needs its own set of NV12 buffers (the native decoder format),
> > > > > > and the post-processor is the owner of the CAPTURE buffers,
> > > > > > allocated for the post-processed format.
> > > > > > 
> > > > > > This way, applications obtain post-processed
> > > > > > (scaled, converted, etc) buffers transparently.
> > > > > > 
> > > > > > This feature is implemented by exposing the post-processed pixel
> > > > > > formats on ENUM_FMT, ordered as "preferred pixelformat first":
> > > > > > 
> > > > > > v4l2-ctl -d 1 --list-formats
> > > > > > ioctl: VIDIOC_ENUM_FMT
> > > > > >       Type: Video Capture Multiplanar
> > > > > > 
> > > > > >       [0]: 'NV12' (Y/CbCr 4:2:0)
> > > > > >       [1]: 'YUYV' (YUYV 4:2:2)
> > > > > > 
> > > > > > The order of preference in ENUM_FMT can be used as a hint
> > > > > > by applications. This series updates the uAPI specification
> > > > > > accordingly.
> > > > > 
> > > > > As I'm implementing this, I realize that there may me a gap in being
> > > > > able to implement both IPP and non-IPP support in a generic framework.
> > > > > Unlike the above comment, we for non-IPP decoder we cannot naively
> > > > > pick
> > > > > the first format. In fact we parse the chroma and depth information
> > > > > from the headers (like pps from H264), and we pick a matching pixel
> > > > > format. This way, if we have a 10bit stream, and our IP supports
> > > > > 10bit,
> > > > > we will pick a 10bit pixel formats, otherwise decoding will just fail.
> > > > > 
> > > > > None of this information is passed to the driver prior to the first
> > > > > Request being made, so there is no way (as of current spec) that the
> > > > > driver can validate this in try_fmt ahead of time. Unless I set
> > > > > picture
> > > > > parameters without a request_fd for that purpose. If this is the way,
> > > > > then we should document this.
> > > > 
> > > > +Alexandre Courbot
> > > > 
> > > > It was suggested in the very early RFC stage, but it looks like it
> > > > didn't make it to the final spec.
> > > > https://patchwork.kernel.org/patch/10583233/#22209555
> > > 
> > > Ok, maybe we should revive it, it would fill that gap. Again, only an
> > > issue if you have a post processor. I'm still surprised we didn't
> > > expose the IPP functions through the topology, it would make so much
> > > sense to me, and I can make better code with that knowledge.
> > > 
> > > I found while coding this, that even if it's more difficult,
> > > classification of device by looking at the topology and the entity
> > > functions is much nicer, less of a guess.
> > > 
> > > Though, we lack some documentation (or clarification) for how to
> > > properly handle formats, size and selection in order to configure the
> > > IPP. Ezequiel was saying that we don't implement selection in Hanto, so
> > > I guess the scaling is a bit ambiguous then in regard to coded/display
> > > sizes. Though we pass a size to the OUTPUT side, so the driver can
> > > always control a little bit.
> > 
> > Re-reading the "initialization process", it's actually still there:
> > 
> > "Call VIDIOC_S_EXT_CTRLS() to set all the controls (parsed headers, etc.)
> > required by the OUTPUT format to enumerate the CAPTURE formats."
> > 
> 
> Oh, so it did make it to the final spec. Sorry, my bad. Should have
> checked better.
> 
> > And then it suggest to use G_FMT to retreive the optimal format as selected
> > by
> > the driver. So I guess this is a false alarm. And we probably don't need to
> > play
> > with the subsampling to filter the formats, as the driver is expected to do
> > so.
> > 
> 
> The question that this raises then is whether the merged drivers do
> so. Hantro G1 supports only 8-bit YUV 4:2:0, so there is no problem,
> although the driver doesn't seem to validate that in the headers. I'd
> also say the CAPTURE resolution should be determined by the headers,
> rather than the OUTPUT queue, consistently to the stateful decoders.
> Resolution changes should be also signaled if controls are set to
> headers that contain different resolutions than previously set.
> 
> > So what could be improve, is have in the per-codec documentation the list of
> > controls that are "required by the OUTPUT format to enumerate the CAPTURE
> > formats.". Could just be a comment in the control doc so say, "this control
> > is
> > needed to allow enumerationg of the CAPTURE formats".
> > 
> 
> I wouldn't limit this to the format enumeration. The drivers may also
> need to check for things like profile, some stream features or even
> just whether the resolution doesn't exceed the maximum supported by
> the decoder.

Yes, I came to similar conclusion. As an example, an H264 should validate the
SPS (and I think for H264 that is sufficient ?) based on it's hardware specific
constraint. Using the SPS for format and size numeration is also what makes the
most sense when there is an IPP. An IPP could support crop, and the driver could
select the default cropping to matcht the display resolution. Driver should be
enhanced, but it seems to be part of normal refine.

To come back to the spec, shall we document a return value for the case where
the control is supported, but the stream isn't supported ?

> 
> Best regards,
> Tomasz
> 
> > > > > Is this the intended way to negotiation IPP functions with the driver
> > > > > ?
> > > > > 
> > > > 
> > > > In theory, if the userspace knows whether the stream is 4:2:0 or 4:2:2
> > > > and 8-bit or 10-bit, it can still select the first format from the top
> > > > that matches these properties.
> > > 
> > > That's exactly what I do, I have a map of the formats and their
> > > chroma/depth, and take the first one that match. If that fails, I just
> > > fallback to the first one. It's something you need to do anyway, as we
> > > prefer the native format first (even if there is an IPP).
> > > 
> > > > That's not how format handling in V4L2 works, though. ENUM_FMT is
> > > > expected to return a list of valid formats and if we forget about the
> > > > image processor for a moment, a stateless decoder would always return
> > > > any possible format, including ones invalid for the stream.
> > > > 
> > > > Now back to the image processor, if it handles conversions from any to
> > > > any format listed by ENUM_FMT, we kind of regain the V4L2 compliance,
> > > > but if the conversions are limited, the above requirement still
> > > > doesn't hold and we're not implementing V4L2 correctly.
> > > > 
> > > > Perhaps we can still amend the spec and require controls that
> > > > determine the stream properties to be set before starting the
> > > > streaming? I can imagine it could also help the driver filter out some
> > > > unsupported streams early, before allocating buffers and attempting to
> > > > decode.
> > > 
> > > I think it would make sense, so just to make sure, for H264 we could
> > > set the V4L2_CID_MPEG_VIDEO_H264_SPS along with the OUTPUT format,
> > > prior to CAPTURE enumeration.
> > > 
> > > > Best regards,
> > > > Tomasz
> > > > 
> > > > > > When the application sets a pixel format other than NV12,
> > > > > > the post-processor is transparently enabled.
> > > > > > 
> > > > > > Patch 1 is a cleanups needed to easier integrate the post-processor.
> > > > > > Patch 2 introduces the post-processing support.
> > > > > > Patch 3 updates the uAPI specification.
> > > > > > 
> > > > > > This is tested on RK3288 platforms with MPEG-2, VP8 and
> > > > > > H264 streams, decoding to YUY2 surfaces. For now, this series
> > > > > > is only adding support for NV12-to-YUY2 conversion.
> > > > > > 
> > > > > > Applies to media/master.
> > > > > > 
> > > > > > Future plans
> > > > > > ------------
> > > > > > 
> > > > > > It seems to me that we should start moving this driver to use
> > > > > > regmap-based access to registers. However, such move is out of scope
> > > > > > and not entirely related to this post-processor enablement.
> > > > > > 
> > > > > > We'll work on that as follow-up patches.
> > > > > > 
> > > > > > Changelog
> > > > > > ---------
> > > > > > 
> > > > > > Changes v3:
> > > > > > 
> > > > > > * After discussing with Hans and Tomasz during the media summit
> > > > > > in ELCE, we decided to go back on the MC changes. The MC topology
> > > > > > is now untouched. This means the series is now similar to v1,
> > > > > > except we explicitly use the ENUM_FMT to hint about the post-
> > > > > > processed
> > > > > > formats.
> > > > > > 
> > > > > > Changes v2:
> > > > > > 
> > > > > > * The decoder->post-processor topology is now exposed
> > > > > >   explicitly and applications need to configure the pipeline.
> > > > > >   By default, the decoder is enabled and the post-processor
> > > > > >   is disabled.
> > > > > > 
> > > > > > * RGB post-processing output has been dropped. We might
> > > > > >   add this in the future, but for now, it seems it would
> > > > > >   make the code more complex without a use-case in mind.
> > > > > >   RGB is much more memory-consuming so less attractive
> > > > > >   than YUV, and modern GPUs and display controllers support YUV.
> > > > > > 
> > > > > > * The post-processor implementation still supports RK3288
> > > > > >   only. However, a generic register infrastructure is introduced
> > > > > >   to make addition of other variants such as RK3399 really easy.
> > > > > > 
> > > > > > Ezequiel Garcia (3):
> > > > > >   media: hantro: Cleanup format negotiation helpers
> > > > > >   media: hantro: Support color conversion via post-processing
> > > > > >   media: vidioc-enum-fmt.rst: clarify format preference
> > > > > > 
> > > > > >  .../media/uapi/v4l/vidioc-enum-fmt.rst        |   4 +-
> > > > > >  drivers/staging/media/hantro/Makefile         |   1 +
> > > > > >  drivers/staging/media/hantro/hantro.h         |  64 +++++++-
> > > > > >  drivers/staging/media/hantro/hantro_drv.c     |   8 +-
> > > > > >  .../staging/media/hantro/hantro_g1_h264_dec.c |   2 +-
> > > > > >  .../media/hantro/hantro_g1_mpeg2_dec.c        |   2 +-
> > > > > >  drivers/staging/media/hantro/hantro_g1_regs.h |  53 +++++++
> > > > > >  .../staging/media/hantro/hantro_g1_vp8_dec.c  |   2 +-
> > > > > >  drivers/staging/media/hantro/hantro_h264.c    |   6 +-
> > > > > >  drivers/staging/media/hantro/hantro_hw.h      |  13 ++
> > > > > >  .../staging/media/hantro/hantro_postproc.c    | 141
> > > > > > ++++++++++++++++++
> > > > > >  drivers/staging/media/hantro/hantro_v4l2.c    | 105 ++++++++-----
> > > > > >  drivers/staging/media/hantro/rk3288_vpu_hw.c  |  10 ++
> > > > > >  13 files changed, 366 insertions(+), 45 deletions(-)
> > > > > >  create mode 100644 drivers/staging/media/hantro/hantro_postproc.c
> > > > > >

Tomasz Figa March 9, 2020, 11:11 a.m. UTC | #7

On Thu, Feb 20, 2020 at 6:06 AM Nicolas Dufresne <nicolas@ndufresne.ca> wrote:
>
> On mer, 2020-02-12 at 16:06 +0900, Tomasz Figa wrote:
> > On Wed, Feb 12, 2020 at 1:22 AM Nicolas Dufresne <nicolas@ndufresne.ca> wrote:
> > > On lun, 2020-02-10 at 23:16 -0500, Nicolas Dufresne wrote:
> > > > Le lundi 10 février 2020 à 11:45 +0900, Tomasz Figa a écrit :
> > > > > On Mon, Feb 10, 2020 at 4:52 AM Nicolas Dufresne <nicolas@ndufresne.ca>
> > > > > wrote:
> > > > > > Le mercredi 13 novembre 2019 à 14:56 -0300, Ezequiel Garcia a écrit :
> > > > > > > Hi all,
> > > > > > >
> > > > > > > The Hantro G1 VPU post-processor block can be pipelined with
> > > > > > > the decoder hardware, allowing to perform operations such as
> > > > > > > color conversion, scaling, rotation, cropping, among others.
> > > > > > >
> > > > > > > When the post-processor is enabled, the decoder hardware
> > > > > > > needs its own set of NV12 buffers (the native decoder format),
> > > > > > > and the post-processor is the owner of the CAPTURE buffers,
> > > > > > > allocated for the post-processed format.
> > > > > > >
> > > > > > > This way, applications obtain post-processed
> > > > > > > (scaled, converted, etc) buffers transparently.
> > > > > > >
> > > > > > > This feature is implemented by exposing the post-processed pixel
> > > > > > > formats on ENUM_FMT, ordered as "preferred pixelformat first":
> > > > > > >
> > > > > > > v4l2-ctl -d 1 --list-formats
> > > > > > > ioctl: VIDIOC_ENUM_FMT
> > > > > > >       Type: Video Capture Multiplanar
> > > > > > >
> > > > > > >       [0]: 'NV12' (Y/CbCr 4:2:0)
> > > > > > >       [1]: 'YUYV' (YUYV 4:2:2)
> > > > > > >
> > > > > > > The order of preference in ENUM_FMT can be used as a hint
> > > > > > > by applications. This series updates the uAPI specification
> > > > > > > accordingly.
> > > > > >
> > > > > > As I'm implementing this, I realize that there may me a gap in being
> > > > > > able to implement both IPP and non-IPP support in a generic framework.
> > > > > > Unlike the above comment, we for non-IPP decoder we cannot naively
> > > > > > pick
> > > > > > the first format. In fact we parse the chroma and depth information
> > > > > > from the headers (like pps from H264), and we pick a matching pixel
> > > > > > format. This way, if we have a 10bit stream, and our IP supports
> > > > > > 10bit,
> > > > > > we will pick a 10bit pixel formats, otherwise decoding will just fail.
> > > > > >
> > > > > > None of this information is passed to the driver prior to the first
> > > > > > Request being made, so there is no way (as of current spec) that the
> > > > > > driver can validate this in try_fmt ahead of time. Unless I set
> > > > > > picture
> > > > > > parameters without a request_fd for that purpose. If this is the way,
> > > > > > then we should document this.
> > > > >
> > > > > +Alexandre Courbot
> > > > >
> > > > > It was suggested in the very early RFC stage, but it looks like it
> > > > > didn't make it to the final spec.
> > > > > https://patchwork.kernel.org/patch/10583233/#22209555
> > > >
> > > > Ok, maybe we should revive it, it would fill that gap. Again, only an
> > > > issue if you have a post processor. I'm still surprised we didn't
> > > > expose the IPP functions through the topology, it would make so much
> > > > sense to me, and I can make better code with that knowledge.
> > > >
> > > > I found while coding this, that even if it's more difficult,
> > > > classification of device by looking at the topology and the entity
> > > > functions is much nicer, less of a guess.
> > > >
> > > > Though, we lack some documentation (or clarification) for how to
> > > > properly handle formats, size and selection in order to configure the
> > > > IPP. Ezequiel was saying that we don't implement selection in Hanto, so
> > > > I guess the scaling is a bit ambiguous then in regard to coded/display
> > > > sizes. Though we pass a size to the OUTPUT side, so the driver can
> > > > always control a little bit.
> > >
> > > Re-reading the "initialization process", it's actually still there:
> > >
> > > "Call VIDIOC_S_EXT_CTRLS() to set all the controls (parsed headers, etc.)
> > > required by the OUTPUT format to enumerate the CAPTURE formats."
> > >
> >
> > Oh, so it did make it to the final spec. Sorry, my bad. Should have
> > checked better.
> >
> > > And then it suggest to use G_FMT to retreive the optimal format as selected
> > > by
> > > the driver. So I guess this is a false alarm. And we probably don't need to
> > > play
> > > with the subsampling to filter the formats, as the driver is expected to do
> > > so.
> > >
> >
> > The question that this raises then is whether the merged drivers do
> > so. Hantro G1 supports only 8-bit YUV 4:2:0, so there is no problem,
> > although the driver doesn't seem to validate that in the headers. I'd
> > also say the CAPTURE resolution should be determined by the headers,
> > rather than the OUTPUT queue, consistently to the stateful decoders.
> > Resolution changes should be also signaled if controls are set to
> > headers that contain different resolutions than previously set.
> >
> > > So what could be improve, is have in the per-codec documentation the list of
> > > controls that are "required by the OUTPUT format to enumerate the CAPTURE
> > > formats.". Could just be a comment in the control doc so say, "this control
> > > is
> > > needed to allow enumerationg of the CAPTURE formats".
> > >
> >
> > I wouldn't limit this to the format enumeration. The drivers may also
> > need to check for things like profile, some stream features or even
> > just whether the resolution doesn't exceed the maximum supported by
> > the decoder.
>
> Yes, I came to similar conclusion. As an example, an H264 should validate the
> SPS (and I think for H264 that is sufficient ?) based on it's hardware specific
> constraint. Using the SPS for format and size numeration is also what makes the
> most sense when there is an IPP. An IPP could support crop, and the driver could
> select the default cropping to matcht the display resolution. Driver should be
> enhanced, but it seems to be part of normal refine.
>
> To come back to the spec, shall we document a return value for the case where
> the control is supported, but the stream isn't supported ?
>

I think the answer is yes. :) I'll leave the details to others, though.

Best regards,
Tomasz

> >
> > Best regards,
> > Tomasz
> >
> > > > > > Is this the intended way to negotiation IPP functions with the driver
> > > > > > ?
> > > > > >
> > > > >
> > > > > In theory, if the userspace knows whether the stream is 4:2:0 or 4:2:2
> > > > > and 8-bit or 10-bit, it can still select the first format from the top
> > > > > that matches these properties.
> > > >
> > > > That's exactly what I do, I have a map of the formats and their
> > > > chroma/depth, and take the first one that match. If that fails, I just
> > > > fallback to the first one. It's something you need to do anyway, as we
> > > > prefer the native format first (even if there is an IPP).
> > > >
> > > > > That's not how format handling in V4L2 works, though. ENUM_FMT is
> > > > > expected to return a list of valid formats and if we forget about the
> > > > > image processor for a moment, a stateless decoder would always return
> > > > > any possible format, including ones invalid for the stream.
> > > > >
> > > > > Now back to the image processor, if it handles conversions from any to
> > > > > any format listed by ENUM_FMT, we kind of regain the V4L2 compliance,
> > > > > but if the conversions are limited, the above requirement still
> > > > > doesn't hold and we're not implementing V4L2 correctly.
> > > > >
> > > > > Perhaps we can still amend the spec and require controls that
> > > > > determine the stream properties to be set before starting the
> > > > > streaming? I can imagine it could also help the driver filter out some
> > > > > unsupported streams early, before allocating buffers and attempting to
> > > > > decode.
> > > >
> > > > I think it would make sense, so just to make sure, for H264 we could
> > > > set the V4L2_CID_MPEG_VIDEO_H264_SPS along with the OUTPUT format,
> > > > prior to CAPTURE enumeration.
> > > >
> > > > > Best regards,
> > > > > Tomasz
> > > > >
> > > > > > > When the application sets a pixel format other than NV12,
> > > > > > > the post-processor is transparently enabled.
> > > > > > >
> > > > > > > Patch 1 is a cleanups needed to easier integrate the post-processor.
> > > > > > > Patch 2 introduces the post-processing support.
> > > > > > > Patch 3 updates the uAPI specification.
> > > > > > >
> > > > > > > This is tested on RK3288 platforms with MPEG-2, VP8 and
> > > > > > > H264 streams, decoding to YUY2 surfaces. For now, this series
> > > > > > > is only adding support for NV12-to-YUY2 conversion.
> > > > > > >
> > > > > > > Applies to media/master.
> > > > > > >
> > > > > > > Future plans
> > > > > > > ------------
> > > > > > >
> > > > > > > It seems to me that we should start moving this driver to use
> > > > > > > regmap-based access to registers. However, such move is out of scope
> > > > > > > and not entirely related to this post-processor enablement.
> > > > > > >
> > > > > > > We'll work on that as follow-up patches.
> > > > > > >
> > > > > > > Changelog
> > > > > > > ---------
> > > > > > >
> > > > > > > Changes v3:
> > > > > > >
> > > > > > > * After discussing with Hans and Tomasz during the media summit
> > > > > > > in ELCE, we decided to go back on the MC changes. The MC topology
> > > > > > > is now untouched. This means the series is now similar to v1,
> > > > > > > except we explicitly use the ENUM_FMT to hint about the post-
> > > > > > > processed
> > > > > > > formats.
> > > > > > >
> > > > > > > Changes v2:
> > > > > > >
> > > > > > > * The decoder->post-processor topology is now exposed
> > > > > > >   explicitly and applications need to configure the pipeline.
> > > > > > >   By default, the decoder is enabled and the post-processor
> > > > > > >   is disabled.
> > > > > > >
> > > > > > > * RGB post-processing output has been dropped. We might
> > > > > > >   add this in the future, but for now, it seems it would
> > > > > > >   make the code more complex without a use-case in mind.
> > > > > > >   RGB is much more memory-consuming so less attractive
> > > > > > >   than YUV, and modern GPUs and display controllers support YUV.
> > > > > > >
> > > > > > > * The post-processor implementation still supports RK3288
> > > > > > >   only. However, a generic register infrastructure is introduced
> > > > > > >   to make addition of other variants such as RK3399 really easy.
> > > > > > >
> > > > > > > Ezequiel Garcia (3):
> > > > > > >   media: hantro: Cleanup format negotiation helpers
> > > > > > >   media: hantro: Support color conversion via post-processing
> > > > > > >   media: vidioc-enum-fmt.rst: clarify format preference
> > > > > > >
> > > > > > >  .../media/uapi/v4l/vidioc-enum-fmt.rst        |   4 +-
> > > > > > >  drivers/staging/media/hantro/Makefile         |   1 +
> > > > > > >  drivers/staging/media/hantro/hantro.h         |  64 +++++++-
> > > > > > >  drivers/staging/media/hantro/hantro_drv.c     |   8 +-
> > > > > > >  .../staging/media/hantro/hantro_g1_h264_dec.c |   2 +-
> > > > > > >  .../media/hantro/hantro_g1_mpeg2_dec.c        |   2 +-
> > > > > > >  drivers/staging/media/hantro/hantro_g1_regs.h |  53 +++++++
> > > > > > >  .../staging/media/hantro/hantro_g1_vp8_dec.c  |   2 +-
> > > > > > >  drivers/staging/media/hantro/hantro_h264.c    |   6 +-
> > > > > > >  drivers/staging/media/hantro/hantro_hw.h      |  13 ++
> > > > > > >  .../staging/media/hantro/hantro_postproc.c    | 141
> > > > > > > ++++++++++++++++++
> > > > > > >  drivers/staging/media/hantro/hantro_v4l2.c    | 105 ++++++++-----
> > > > > > >  drivers/staging/media/hantro/rk3288_vpu_hw.c  |  10 ++
> > > > > > >  13 files changed, 366 insertions(+), 45 deletions(-)
> > > > > > >  create mode 100644 drivers/staging/media/hantro/hantro_postproc.c
> > > > > > >
>

[v3,0/3] Enable Hantro G1 post-processor

Message

Comments