
[RFC,2/2] media: docs-rst: Add encoder UAPI specification to Codec Interfaces

Message ID 20180605103328.176255-3-tfiga@chromium.org (mailing list archive)
State New, archived

Commit Message

Tomasz Figa June 5, 2018, 10:33 a.m. UTC
Due to the complexity of the video encoding process, V4L2 drivers for
stateful encoder hardware require specific sequences of V4L2 API calls
to be followed. These include capability enumeration, initialization,
encoding, encode parameter changes and flush.

Specifics of the above have been discussed during Media Workshops at
LinuxCon Europe 2012 in Barcelona and then later Embedded Linux
Conference Europe 2014 in Düsseldorf. The de facto Codec API that
originated at those events was later implemented by the drivers we already
have merged in mainline, such as s5p-mfc or mtk-vcodec.

The only thing missing was the real specification included as a part of
Linux Media documentation. Fix it now and document the encoder part of
the Codec API.

Signed-off-by: Tomasz Figa <tfiga@chromium.org>
---
 Documentation/media/uapi/v4l/dev-codec.rst | 313 +++++++++++++++++++++
 1 file changed, 313 insertions(+)

Comments

Philipp Zabel June 5, 2018, 11:53 a.m. UTC | #1
On Tue, 2018-06-05 at 19:33 +0900, Tomasz Figa wrote:
> Due to the complexity of the video encoding process, V4L2 drivers for
> stateful encoder hardware require specific sequences of V4L2 API calls
> to be followed. These include capability enumeration, initialization,
> encoding, encode parameter changes and flush.
> 
> Specifics of the above have been discussed during Media Workshops at
> LinuxCon Europe 2012 in Barcelona and then later Embedded Linux
> Conference Europe 2014 in Düsseldorf. The de facto Codec API that
> originated at those events was later implemented by the drivers we already
> have merged in mainline, such as s5p-mfc or mtk-vcodec.
> 
> The only thing missing was the real specification included as a part of
> Linux Media documentation. Fix it now and document the encoder part of
> the Codec API.
> 
> Signed-off-by: Tomasz Figa <tfiga@chromium.org>
> ---
>  Documentation/media/uapi/v4l/dev-codec.rst | 313 +++++++++++++++++++++
>  1 file changed, 313 insertions(+)
> 
> diff --git a/Documentation/media/uapi/v4l/dev-codec.rst b/Documentation/media/uapi/v4l/dev-codec.rst
> index 0483b10c205e..325a51bb09df 100644
> --- a/Documentation/media/uapi/v4l/dev-codec.rst
> +++ b/Documentation/media/uapi/v4l/dev-codec.rst
> @@ -805,3 +805,316 @@ of the driver.
>  To summarize, setting formats and allocation must always start with the
>  OUTPUT queue and the OUTPUT queue is the master that governs the set of
>  supported formats for the CAPTURE queue.
> +
> +Encoder
> +=======
> +
> +Querying capabilities
> +---------------------
> +
> +1. To enumerate the set of coded formats supported by the driver, the
> +   client uses :c:func:`VIDIOC_ENUM_FMT` for CAPTURE. The driver must always
> +   return the full set of supported formats, irrespective of the
> +   format set on the OUTPUT queue.
> +
> +2. To enumerate the set of supported raw formats, the client uses
> +   :c:func:`VIDIOC_ENUM_FMT` for OUTPUT queue. The driver must return only
> +   the formats supported for the format currently set on the
> +   CAPTURE queue.
> +   In order to enumerate raw formats supported by a given coded
> +   format, the client must first set that coded format on the
> +   CAPTURE queue and then enumerate the OUTPUT queue.
> +
> +3. The client may use :c:func:`VIDIOC_ENUM_FRAMESIZES` to detect supported
> +   resolutions for a given format, passing its fourcc in
> +   :c:type:`v4l2_frmsizeenum` ``pixel_format``.
> +
> +   a. Values returned from :c:func:`VIDIOC_ENUM_FRAMESIZES` for coded formats
> +      must be maximums for given coded format for all supported raw
> +      formats.
> +
> +   b. Values returned from :c:func:`VIDIOC_ENUM_FRAMESIZES` for raw formats must
> +      be maximums for given raw format for all supported coded
> +      formats.
> +
> +   c. The client should derive the supported resolution for a
> +      combination of coded+raw format by calculating the
> +      intersection of resolutions returned from calls to
> +      :c:func:`VIDIOC_ENUM_FRAMESIZES` for the given coded and raw formats.
> +
> +4. Supported profiles and levels for given format, if applicable, may be
> +   queried using their respective controls via :c:func:`VIDIOC_QUERYCTRL`.
> +
> +5. The client may use :c:func:`VIDIOC_ENUM_FRAMEINTERVALS` to enumerate the
> +   maximum frame rates supported by the driver/hardware for a given
> +   format+resolution combination.
> +
> +6. Any additional encoder capabilities may be discovered by querying
> +   their respective controls.
> +
> +.. note::
> +
> +   Full format enumeration requires enumerating all raw formats
> +   on the OUTPUT queue for all possible (enumerated) coded formats on
> +   CAPTURE queue (setting each format on the CAPTURE queue before each
> +   enumeration on the OUTPUT queue).
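The intersection rule in step 3.c above can be sketched as follows. `struct size_range` is a hypothetical stand-in for the stepwise data reported in `v4l2_frmsizeenum`, and assuming a common step for both ranges is a simplification:

```c
#include <assert.h>

/* Hypothetical stand-in for the stepwise range reported by
 * VIDIOC_ENUM_FRAMESIZES: dimensions from min to max in steps of step. */
struct size_range {
	unsigned int min, max, step;
};

/* Step 3.c: the resolutions usable for a coded+raw format pair are
 * those valid for both formats, i.e. the intersection of the two
 * ranges. For simplicity, both ranges are assumed to share a step. */
struct size_range intersect_range(struct size_range coded,
				  struct size_range raw)
{
	struct size_range r;

	r.min = coded.min > raw.min ? coded.min : raw.min;
	r.max = coded.max < raw.max ? coded.max : raw.max;
	r.step = coded.step > raw.step ? coded.step : raw.step;
	return r;
}
```

With made-up ranges of 64..4096 for the coded format and 128..1920 for the raw format, the usable widths for the pair are 128..1920.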
> +
> +Initialization
> +--------------
> +
> +1. (optional) Enumerate supported formats and resolutions. See
> +   capability enumeration.
> +
> +2. Set a coded format on the CAPTURE queue via :c:func:`VIDIOC_S_FMT`
> +
> +   a. Required fields:
> +
> +      i.  type = CAPTURE
> +
> +      ii. fmt.pix_mp.pixelformat set to a coded format to be produced
> +
> +   b. Return values:
> +
> +      i.  EINVAL: unsupported format.
> +
> +      ii. Others: per spec
> +
> +   c. Return fields:
> +
> +      i. fmt.pix_mp.width, fmt.pix_mp.height should be 0.
> +
> +   .. note::
> +
> +      After a coded format is set, the set of raw formats
> +      supported as source on the OUTPUT queue may change.

So setting CAPTURE potentially also changes OUTPUT format?

If the encoded stream supports colorimetry information, should that
information be taken from the CAPTURE queue?

> +3. (optional) Enumerate supported OUTPUT formats (raw formats for
> +   source) for the selected coded format via :c:func:`VIDIOC_ENUM_FMT`.
> +
> +   a. Required fields:
> +
> +      i.  type = OUTPUT
> +
> +      ii. index = per spec
> +
> +   b. Return values: per spec
> +
> +   c. Return fields:
> +
> +      i. pixelformat: raw format supported for the coded format
> +         currently selected on the CAPTURE queue.
> +
> +4. Set a raw format on the OUTPUT queue and visible resolution for the
> +   source raw frames via :c:func:`VIDIOC_S_FMT` on the OUTPUT queue.

Isn't this optional? If S_FMT(CAP) already sets OUTPUT to a valid
format, just G_FMT(OUT) should be valid here as well.

> +
> +   a. Required fields:
> +
> +      i.   type = OUTPUT
> +
> +      ii.  fmt.pix_mp.pixelformat = raw format to be used as source of
> +           encode
> +
> +      iii. fmt.pix_mp.width, fmt.pix_mp.height = input resolution
> +           for the source raw frames

These are specific to multiplanar drivers. The same should apply to
singleplanar drivers.

> +
> +      iv.  num_planes: set to number of planes for pixelformat.
> +
> +      v.   For each plane p = [0, num_planes-1]:
> +           plane_fmt[p].sizeimage, plane_fmt[p].bytesperline: as
> +           per spec for input resolution.
> +
> +   b. Return values: as per spec.
> +
> +   c. Return fields:
> +
> +      i.  fmt.pix_mp.width, fmt.pix_mp.height = may be adjusted by
> +          driver to match alignment requirements, as required by the
> +          currently selected formats.
> +
> +      ii. For each plane p = [0, num_planes-1]:
> +          plane_fmt[p].sizeimage, plane_fmt[p].bytesperline: as
> +          per spec for the adjusted input resolution.
> +
> +   d. Setting the input resolution will reset visible resolution to the
> +      adjusted input resolution rounded up to the closest visible
> +      resolution supported by the driver. Similarly, coded size will
> +      be reset to input resolution rounded up to the closest coded
> +      resolution supported by the driver (typically a multiple of
> +      macroblock size).
> +
> +5. (optional) Set visible size for the stream metadata via
> +   :c:func:`VIDIOC_S_SELECTION` on the OUTPUT queue.
> +
> +   a. Required fields:
> +
> +      i.   type = OUTPUT
> +
> +      ii.  target = ``V4L2_SEL_TGT_CROP``
> +
> +      iii. r.left, r.top, r.width, r.height: visible rectangle; this
> +           must fit within coded resolution returned from
> +           :c:func:`VIDIOC_S_FMT`.
> +
> +   b. Return values: as per spec.
> +
> +   c. Return fields:
> +
> +      i. r.left, r.top, r.width, r.height: visible rectangle adjusted by
> +         the driver to match internal constraints.
> +
> +   d. This resolution must be used as the visible resolution in the
> +      stream metadata.
> +
> +   .. note::
> +
> +      The driver might not support arbitrary values of the
> +      crop rectangle and will adjust it to the closest supported
> +      one.
> +
> +6. Allocate buffers for both OUTPUT and CAPTURE queues via
> +   :c:func:`VIDIOC_REQBUFS`. This may be performed in any order.
> +
> +   a. Required fields:
> +
> +      i.   count = n, where n > 0.
> +
> +      ii.  type = OUTPUT or CAPTURE
> +
> +      iii. memory = as per spec
> +
> +   b. Return values: Per spec.
> +
> +   c. Return fields:
> +
> +      i. count: adjusted to allocated number of buffers
> +
> +   d. The driver must adjust count so that it is not lower than the
> +      minimum number of buffers required for the given format. The
> +      client must check this value after the ioctl returns to get the
> +      number of buffers actually allocated.
> +
> +   .. note::
> +
> +      Passing count = 1 is useful for letting the driver choose the
> +      minimum according to the selected format/hardware
> +      requirements.
> +
> +   .. note::
> +
> +      To allocate more than minimum number of buffers (for pipeline
> +      depth), use G_CTRL(``V4L2_CID_MIN_BUFFERS_FOR_OUTPUT``) or
> +      G_CTRL(``V4L2_CID_MIN_BUFFERS_FOR_CAPTURE``), respectively,
> +      to get the minimum number of buffers required by the
> +      driver/format, and pass the obtained value plus the number of
> +      additional buffers needed in count field to :c:func:`VIDIOC_REQBUFS`.
> +
> +7. Begin streaming on both OUTPUT and CAPTURE queues via
> +   :c:func:`VIDIOC_STREAMON`. This may be performed in any order.

Actual encoding starts once both queues are streaming and stops as soon
as the first queue receives STREAMOFF?
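The size adjustment described in step 4.d of the initialization sequence above amounts to rounding the input resolution up to the driver's coded-size alignment. A minimal sketch, assuming a macroblock size of 16 (the actual alignment is driver/hardware specific):

```c
#include <assert.h>

/* Step 4.d: the driver rounds the coded size up to its alignment
 * requirement, typically the macroblock size. The value 16 used in
 * the test below is an assumption, not mandated by the spec. */
unsigned int round_up_coded(unsigned int dim, unsigned int align)
{
	return (dim + align - 1) / align * align;
}
```

For a 1920x1080 input with 16-pixel macroblocks, the width is already aligned while the height rounds up to 1088.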

> +Encoding
> +--------
> +
> +This state is reached after a successful initialization sequence. In
> +this state, client queues and dequeues buffers to both queues via
> +:c:func:`VIDIOC_QBUF` and :c:func:`VIDIOC_DQBUF`, as per spec.
> +
> +Both queues operate independently. The client may queue and dequeue
> +buffers to queues in any order and at any rate, also at a rate different
> +for each queue. The client may queue buffers within the same queue in
> +any order (V4L2 index-wise). It is recommended for the client to operate
> +the queues independently for best performance.
> +
> +Source OUTPUT buffers must contain full raw frames in the selected
> +OUTPUT format, exactly one frame per buffer.
> +
> +Encoding parameter changes
> +--------------------------
> +
> +The client is allowed to use :c:func:`VIDIOC_S_CTRL` to change encoder
> +parameters at any time. The driver must apply the new setting starting
> +at the next frame queued to it.
> +
> +This specifically means that if the driver maintains a queue of buffers
> +to be encoded and at the time of the call to :c:func:`VIDIOC_S_CTRL` not all the
> +buffers in the queue are processed yet, the driver must not apply the
> +change immediately, but schedule it for when the next buffer queued
> +after the :c:func:`VIDIOC_S_CTRL` starts being processed.

Does this mean that hardware that doesn't support changing parameters at
runtime at all must stop streaming and restart streaming internally with
every parameter change? Or is it acceptable to not allow the controls to
be changed during streaming?
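The per-frame application rule quoted above can be modeled as simple bookkeeping: a control value set via VIDIOC_S_CTRL is only scheduled, and becomes active when the next buffer queued after the call starts being processed. This sketch models the bookkeeping only, not a real driver; the bitrate control is used as an arbitrary example:

```c
#include <assert.h>

/* A value set with VIDIOC_S_CTRL while buffers are still queued must
 * not affect those buffers; it takes effect from the next buffer
 * queued after the call. */
struct enc_params {
	int active_bitrate;	/* applied to buffers queued before S_CTRL */
	int pending_bitrate;	/* applied from the next queued buffer on */
};

void s_ctrl_bitrate(struct enc_params *p, int bitrate)
{
	p->pending_bitrate = bitrate;	/* scheduled, not applied yet */
}

/* Called when a buffer queued after the last S_CTRL is processed. */
void apply_pending(struct enc_params *p)
{
	p->active_bitrate = p->pending_bitrate;
}
```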

> +Flush
> +-----
> +
> +Flush is the process of draining the CAPTURE queue of any remaining
> +buffers. After the flush sequence is complete, the client has received
> +all encoded frames for all OUTPUT buffers queued before the sequence was
> +started.
> +
> +1. Begin flush by issuing :c:func:`VIDIOC_ENCODER_CMD`.
> +
> +   a. Required fields:
> +
> +      i. cmd = ``V4L2_ENC_CMD_STOP``
> +
> +2. The driver must process and encode as normal all OUTPUT buffers
> +   queued by the client before the :c:func:`VIDIOC_ENCODER_CMD` was issued.
> +
> +3. Once all OUTPUT buffers queued before ``V4L2_ENC_CMD_STOP`` are
> +   processed:
> +
> +   a. Once all encoded frames (if any) are ready to be dequeued on the
> +      CAPTURE queue, the driver must send a ``V4L2_EVENT_EOS``. The
> +      driver must also set ``V4L2_BUF_FLAG_LAST`` in
> +      :c:type:`v4l2_buffer` ``flags`` field on the buffer on the CAPTURE queue
> +      containing the last frame (if any) produced as a result of
> +      processing the OUTPUT buffers queued before
> +      ``V4L2_ENC_CMD_STOP``. If no more frames are left to be
> +      returned at the point of handling ``V4L2_ENC_CMD_STOP``, the
> +      driver must return an empty buffer (with
> +      :c:type:`v4l2_buffer` ``bytesused`` = 0) as the last buffer with
> +      ``V4L2_BUF_FLAG_LAST`` set instead.
> +      Any attempts to dequeue more buffers beyond the buffer
> +      marked with ``V4L2_BUF_FLAG_LAST`` will result in a -EPIPE
> +      error from :c:func:`VIDIOC_DQBUF`.
> +
> +4. At this point, encoding is paused and the driver will accept, but
> +   not process, any newly queued OUTPUT buffers until the client issues
> +   ``V4L2_ENC_CMD_START`` or :c:func:`VIDIOC_STREAMON`.
> +
> +Once the flush sequence is initiated, the client needs to drive it to
> +completion, as described by the above steps, unless it aborts the
> +process by issuing :c:func:`VIDIOC_STREAMOFF` on OUTPUT queue. The client is not
> +allowed to issue ``V4L2_ENC_CMD_START`` or ``V4L2_ENC_CMD_STOP`` again
> +while the flush sequence is in progress.
> +
> +Issuing :c:func:`VIDIOC_STREAMON` on OUTPUT queue will implicitly restart
> +encoding.

Only if CAPTURE is already streaming?

> +:c:func:`VIDIOC_STREAMON` and :c:func:`VIDIOC_STREAMOFF` on CAPTURE queue will
> +not affect the flush sequence, allowing the client to change CAPTURE
> +buffer set if needed.
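The client-side drain loop implied by the flush sequence above can be simulated without a device: dequeue until the buffer marked with ``V4L2_BUF_FLAG_LAST`` is seen, after which further dequeues fail with -EPIPE. The queue contents and flag value below only mirror the real UAPI for illustration:

```c
#include <assert.h>
#include <errno.h>

#define BUF_FLAG_LAST 0x00100000	/* mirrors V4L2_BUF_FLAG_LAST */

struct fake_buf {
	unsigned int flags;
	unsigned int bytesused;
};

/* Simulated DQBUF honoring the flush rules: once the buffer carrying
 * FLAG_LAST has been dequeued, further dequeues fail with -EPIPE. */
int fake_dqbuf(const struct fake_buf *bufs, int n, int *pos,
	       struct fake_buf *out)
{
	if (*pos > 0 && (bufs[*pos - 1].flags & BUF_FLAG_LAST))
		return -EPIPE;
	if (*pos >= n)
		return -EAGAIN;	/* nothing ready; real code would poll() */
	*out = bufs[(*pos)++];
	return 0;
}
```

If no frames were pending when ``V4L2_ENC_CMD_STOP`` was handled, the driver would instead return an empty buffer (``bytesused`` = 0) with the flag set.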
> +
> +Commit points
> +-------------
> +
> +Setting formats and allocating buffers triggers changes in the behavior
> +of the driver.
> +
> +1. Setting format on CAPTURE queue may change the set of formats
> +   supported/advertised on the OUTPUT queue. It also must change the
> +   format currently selected on OUTPUT queue if it is not supported
> +   by the newly selected CAPTURE format to a supported one.

Should TRY_FMT on the OUTPUT queue only return formats that can be
transformed into the currently set format on the capture queue?
(That is, after setting colorimetry on the CAPTURE queue, will
TRY_FMT(OUT) always return that colorimetry?)

> +2. Enumerating formats on OUTPUT queue must only return OUTPUT formats
> +   supported for the CAPTURE format currently set.
> +
> +3. Setting/changing format on OUTPUT queue does not change formats
> +   available on CAPTURE queue. An attempt to set OUTPUT format that
> +   is not supported for the currently selected CAPTURE format must
> +   result in an error (-EINVAL) from :c:func:`VIDIOC_S_FMT`.

Same as for decoding, is this limited to pixel format? Why isn't the
pixel format corrected to a supported choice? What about
width/height/colorimetry?

> +4. Enumerating formats on CAPTURE queue always returns a full set of
> +   supported coded formats, irrespective of the current format
> +   selected on OUTPUT queue.
> +
> +5. After allocating buffers on a queue, it is not possible to change
> +   format on it.
> +
> +In summary, the CAPTURE (coded format) queue is the master that governs
> +the set of supported formats for the OUTPUT queue.
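Commit-point rules 2 and 3 above can be sketched as a membership check: the CAPTURE (coded) format governs which OUTPUT (raw) formats are accepted, and an unsupported OUTPUT format is rejected with -EINVAL rather than corrected. The format lists here are invented for illustration:

```c
#include <errno.h>
#include <string.h>

/* Made-up raw-format list for one coded format; a real driver derives
 * this from hardware capabilities and the current CAPTURE format. */
static const char *const supported_raw[] = { "NV12", "YUV420" };

/* Rule 3: S_FMT(OUTPUT) with a raw format not supported for the
 * currently selected coded format must fail with -EINVAL. */
int s_fmt_output(const char *raw)
{
	size_t i;

	for (i = 0; i < sizeof(supported_raw) / sizeof(*supported_raw); i++)
		if (!strcmp(raw, supported_raw[i]))
			return 0;
	return -EINVAL;
}
```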

regards
Philipp
Tomasz Figa June 5, 2018, 12:31 p.m. UTC | #2
Hi Philipp,

Thanks for review!

On Tue, Jun 5, 2018 at 8:53 PM Philipp Zabel <p.zabel@pengutronix.de> wrote:
>
> On Tue, 2018-06-05 at 19:33 +0900, Tomasz Figa wrote:
> > Due to the complexity of the video encoding process, V4L2 drivers for
> > stateful encoder hardware require specific sequences of V4L2 API calls
> > to be followed. These include capability enumeration, initialization,
> > encoding, encode parameter changes and flush.
> >
> > Specifics of the above have been discussed during Media Workshops at
> > LinuxCon Europe 2012 in Barcelona and then later Embedded Linux
> > Conference Europe 2014 in Düsseldorf. The de facto Codec API that
> > originated at those events was later implemented by the drivers we already
> > have merged in mainline, such as s5p-mfc or mtk-vcodec.
> >
> > The only thing missing was the real specification included as a part of
> > Linux Media documentation. Fix it now and document the encoder part of
> > the Codec API.
> >
> > Signed-off-by: Tomasz Figa <tfiga@chromium.org>
> > ---
> >  Documentation/media/uapi/v4l/dev-codec.rst | 313 +++++++++++++++++++++
> >  1 file changed, 313 insertions(+)
> >
> > diff --git a/Documentation/media/uapi/v4l/dev-codec.rst b/Documentation/media/uapi/v4l/dev-codec.rst
> > index 0483b10c205e..325a51bb09df 100644
> > --- a/Documentation/media/uapi/v4l/dev-codec.rst
> > +++ b/Documentation/media/uapi/v4l/dev-codec.rst
> > @@ -805,3 +805,316 @@ of the driver.
> >  To summarize, setting formats and allocation must always start with the
> >  OUTPUT queue and the OUTPUT queue is the master that governs the set of
> >  supported formats for the CAPTURE queue.
> > +
> > +Encoder
> > +=======
> > +
> > +Querying capabilities
> > +---------------------
> > +
> > +1. To enumerate the set of coded formats supported by the driver, the
> > +   client uses :c:func:`VIDIOC_ENUM_FMT` for CAPTURE. The driver must always
> > +   return the full set of supported formats, irrespective of the
> > +   format set on the OUTPUT queue.
> > +
> > +2. To enumerate the set of supported raw formats, the client uses
> > +   :c:func:`VIDIOC_ENUM_FMT` for OUTPUT queue. The driver must return only
> > +   the formats supported for the format currently set on the
> > +   CAPTURE queue.
> > +   In order to enumerate raw formats supported by a given coded
> > +   format, the client must first set that coded format on the
> > +   CAPTURE queue and then enumerate the OUTPUT queue.
> > +
> > +3. The client may use :c:func:`VIDIOC_ENUM_FRAMESIZES` to detect supported
> > +   resolutions for a given format, passing its fourcc in
> > +   :c:type:`v4l2_frmsizeenum` ``pixel_format``.
> > +
> > +   a. Values returned from :c:func:`VIDIOC_ENUM_FRAMESIZES` for coded formats
> > +      must be maximums for given coded format for all supported raw
> > +      formats.
> > +
> > +   b. Values returned from :c:func:`VIDIOC_ENUM_FRAMESIZES` for raw formats must
> > +      be maximums for given raw format for all supported coded
> > +      formats.
> > +
> > +   c. The client should derive the supported resolution for a
> > +      combination of coded+raw format by calculating the
> > +      intersection of resolutions returned from calls to
> > +      :c:func:`VIDIOC_ENUM_FRAMESIZES` for the given coded and raw formats.
> > +
> > +4. Supported profiles and levels for given format, if applicable, may be
> > +   queried using their respective controls via :c:func:`VIDIOC_QUERYCTRL`.
> > +
> > +5. The client may use :c:func:`VIDIOC_ENUM_FRAMEINTERVALS` to enumerate maximum
> > +   supported framerates by the driver/hardware for a given
> > +   format+resolution combination.
> > +
> > +6. Any additional encoder capabilities may be discovered by querying
> > +   their respective controls.
> > +
> > +.. note::
> > +
> > +   Full format enumeration requires enumerating all raw formats
> > +   on the OUTPUT queue for all possible (enumerated) coded formats on
> > +   CAPTURE queue (setting each format on the CAPTURE queue before each
> > +   enumeration on the OUTPUT queue).
> > +
> > +Initialization
> > +--------------
> > +
> > +1. (optional) Enumerate supported formats and resolutions. See
> > +   capability enumeration.
> > +
> > +2. Set a coded format on the CAPTURE queue via :c:func:`VIDIOC_S_FMT`
> > +
> > +   a. Required fields:
> > +
> > +      i.  type = CAPTURE
> > +
> > +      ii. fmt.pix_mp.pixelformat set to a coded format to be produced
> > +
> > +   b. Return values:
> > +
> > +      i.  EINVAL: unsupported format.
> > +
> > +      ii. Others: per spec
> > +
> > +   c. Return fields:
> > +
> > +      i. fmt.pix_mp.width, fmt.pix_mp.height should be 0.
> > +
> > +   .. note::
> > +
> > +      After a coded format is set, the set of raw formats
> > +      supported as source on the OUTPUT queue may change.
>
> So setting CAPTURE potentially also changes OUTPUT format?

Yes, but at this point userspace hasn't yet set the desired format.

> If the encoded stream supports colorimetry information, should that
> information be taken from the CAPTURE queue?

What's colorimetry? Is it something that is included in
v4l2_pix_format(_mplane)? Is it something that can vary between raw
input and encoded output?

>
> > +3. (optional) Enumerate supported OUTPUT formats (raw formats for
> > +   source) for the selected coded format via :c:func:`VIDIOC_ENUM_FMT`.
> > +
> > +   a. Required fields:
> > +
> > +      i.  type = OUTPUT
> > +
> > +      ii. index = per spec
> > +
> > +   b. Return values: per spec
> > +
> > +   c. Return fields:
> > +
> > +      i. pixelformat: raw format supported for the coded format
> > +         currently selected on the CAPTURE queue.
> > +
> > +4. Set a raw format on the OUTPUT queue and visible resolution for the
> > +   source raw frames via :c:func:`VIDIOC_S_FMT` on the OUTPUT queue.
>
> Isn't this optional? If S_FMT(CAP) already sets OUTPUT to a valid
> format, just G_FMT(OUT) should be valid here as well.

Technically it would be valid indeed, but that would be unlikely what
the client needs, given that it probably already has some existing raw
frames (at certain resolution) to encode.

>
> > +
> > +   a. Required fields:
> > +
> > +      i.   type = OUTPUT
> > +
> > +      ii.  fmt.pix_mp.pixelformat = raw format to be used as source of
> > +           encode
> > +
> > +      iii. fmt.pix_mp.width, fmt.pix_mp.height = input resolution
> > +           for the source raw frames
>
> These are specific to multiplanar drivers. The same should apply to
> singleplanar drivers.

Right. In general I'd be interested in getting some suggestions in how
to write this kind of descriptions nicely and consistent with other
kernel documentation.

>
> > +
> > +      iv.  num_planes: set to number of planes for pixelformat.
> > +
> > +      v.   For each plane p = [0, num_planes-1]:
> > +           plane_fmt[p].sizeimage, plane_fmt[p].bytesperline: as
> > +           per spec for input resolution.
> > +
> > +   b. Return values: as per spec.
> > +
> > +   c. Return fields:
> > +
> > +      i.  fmt.pix_mp.width, fmt.pix_mp.height = may be adjusted by
> > +          driver to match alignment requirements, as required by the
> > +          currently selected formats.
> > +
> > +      ii. For each plane p = [0, num_planes-1]:
> > +          plane_fmt[p].sizeimage, plane_fmt[p].bytesperline: as
> > +          per spec for the adjusted input resolution.
> > +
> > +   d. Setting the input resolution will reset visible resolution to the
> > +      adjusted input resolution rounded up to the closest visible
> > +      resolution supported by the driver. Similarly, coded size will
> > +      be reset to input resolution rounded up to the closest coded
> > +      resolution supported by the driver (typically a multiple of
> > +      macroblock size).
> > +
> > +5. (optional) Set visible size for the stream metadata via
> > +   :c:func:`VIDIOC_S_SELECTION` on the OUTPUT queue.
> > +
> > +   a. Required fields:
> > +
> > +      i.   type = OUTPUT
> > +
> > +      ii.  target = ``V4L2_SEL_TGT_CROP``
> > +
> > +      iii. r.left, r.top, r.width, r.height: visible rectangle; this
> > +           must fit within coded resolution returned from
> > +           :c:func:`VIDIOC_S_FMT`.
> > +
> > +   b. Return values: as per spec.
> > +
> > +   c. Return fields:
> > +
> > +      i. r.left, r.top, r.width, r.height: visible rectangle adjusted by
> > +         the driver to match internal constraints.
> > +
> > +   d. This resolution must be used as the visible resolution in the
> > +      stream metadata.
> > +
> > +   .. note::
> > +
> > +      The driver might not support arbitrary values of the
> > +      crop rectangle and will adjust it to the closest supported
> > +      one.
> > +
> > +6. Allocate buffers for both OUTPUT and CAPTURE queues via
> > +   :c:func:`VIDIOC_REQBUFS`. This may be performed in any order.
> > +
> > +   a. Required fields:
> > +
> > +      i.   count = n, where n > 0.
> > +
> > +      ii.  type = OUTPUT or CAPTURE
> > +
> > +      iii. memory = as per spec
> > +
> > +   b. Return values: Per spec.
> > +
> > +   c. Return fields:
> > +
> > +      i. count: adjusted to allocated number of buffers
> > +
> > +   d. The driver must adjust count to minimum of required number of
> > +      buffers for given format and count passed. The client must
> > +      check this value after the ioctl returns to get the number of
> > +      buffers actually allocated.
> > +
> > +   .. note::
> > +
> > +      Passing count = 1 is useful for letting the driver choose the
> > +      minimum according to the selected format/hardware
> > +      requirements.
> > +
> > +   .. note::
> > +
> > +      To allocate more than minimum number of buffers (for pipeline
> > +      depth), use G_CTRL(``V4L2_CID_MIN_BUFFERS_FOR_OUTPUT``) or
> > +      G_CTRL(``V4L2_CID_MIN_BUFFERS_FOR_CAPTURE``), respectively,
> > +      to get the minimum number of buffers required by the
> > +      driver/format, and pass the obtained value plus the number of
> > +      additional buffers needed in count field to :c:func:`VIDIOC_REQBUFS`.
> > +
> > +7. Begin streaming on both OUTPUT and CAPTURE queues via
> > +   :c:func:`VIDIOC_STREAMON`. This may be performed in any order.
>
> Actual encoding starts once both queues are streaming

I think that's the only thing possible with vb2, since it gives
buffers to the driver when streaming starts on given queue.

> and stops as soon
> as the first queue receives STREAMOFF?

Given that STREAMOFF is supposed to drop all the buffers from the
queue, it should be so +/- finishing what's already queued to the
hardware, if it cannot be cancelled.

I guess we should say this more explicitly.

>
> > +Encoding
> > +--------
> > +
> > +This state is reached after a successful initialization sequence. In
> > +this state, client queues and dequeues buffers to both queues via
> > +:c:func:`VIDIOC_QBUF` and :c:func:`VIDIOC_DQBUF`, as per spec.
> > +
> > +Both queues operate independently. The client may queue and dequeue
> > +buffers to queues in any order and at any rate, also at a rate different
> > +for each queue. The client may queue buffers within the same queue in
> > +any order (V4L2 index-wise). It is recommended for the client to operate
> > +the queues independently for best performance.
> > +
> > +Source OUTPUT buffers must contain full raw frames in the selected
> > +OUTPUT format, exactly one frame per buffer.
> > +
> > +Encoding parameter changes
> > +--------------------------
> > +
> > +The client is allowed to use :c:func:`VIDIOC_S_CTRL` to change encoder
> > +parameters at any time. The driver must apply the new setting starting
> > +at the next frame queued to it.
> > +
> > +This specifically means that if the driver maintains a queue of buffers
> > +to be encoded and at the time of the call to :c:func:`VIDIOC_S_CTRL` not all the
> > +buffers in the queue are processed yet, the driver must not apply the
> > +change immediately, but schedule it for when the next buffer queued
> > +after the :c:func:`VIDIOC_S_CTRL` starts being processed.
>
> Does this mean that hardware that doesn't support changing parameters at
> runtime at all must stop streaming and restart streaming internally with
> every parameter change? Or is it acceptable to not allow the controls to
> be changed during streaming?

That's a good question. I'd be leaning towards the latter (not allow),
as to keep kernel code simple, but maybe we could have others
(especially Pawel) comment on this.

>
> > +Flush
> > +-----
> > +
> > +Flush is the process of draining the CAPTURE queue of any remaining
> > +buffers. After the flush sequence is complete, the client has received
> > +all encoded frames for all OUTPUT buffers queued before the sequence was
> > +started.
> > +
> > +1. Begin flush by issuing :c:func:`VIDIOC_ENCODER_CMD`.
> > +
> > +   a. Required fields:
> > +
> > +      i. cmd = ``V4L2_ENC_CMD_STOP``
> > +
> > +2. The driver must process and encode as normal all OUTPUT buffers
> > +   queued by the client before the :c:func:`VIDIOC_ENCODER_CMD` was issued.
> > +
> > +3. Once all OUTPUT buffers queued before ``V4L2_ENC_CMD_STOP`` are
> > +   processed:
> > +
> > +   a. Once all encoded frames (if any) are ready to be dequeued on the
> > +      CAPTURE queue, the driver must send a ``V4L2_EVENT_EOS``. The
> > +      driver must also set ``V4L2_BUF_FLAG_LAST`` in
> > +      :c:type:`v4l2_buffer` ``flags`` field on the buffer on the CAPTURE queue
> > +      containing the last frame (if any) produced as a result of
> > +      processing the OUTPUT buffers queued before
> > +      ``V4L2_ENC_CMD_STOP``. If no more frames are left to be
> > +      returned at the point of handling ``V4L2_ENC_CMD_STOP``, the
> > +      driver must return an empty buffer (with
> > +      :c:type:`v4l2_buffer` ``bytesused`` = 0) as the last buffer with
> > +      ``V4L2_BUF_FLAG_LAST`` set instead.
> > +      Any attempts to dequeue more buffers beyond the buffer
> > +      marked with ``V4L2_BUF_FLAG_LAST`` will result in a -EPIPE
> > +      error from :c:func:`VIDIOC_DQBUF`.
> > +
> > +4. At this point, encoding is paused and the driver will accept, but not
> > +   process any newly queued OUTPUT buffers until the client issues
> > +   ``V4L2_ENC_CMD_START`` or :c:func:`VIDIOC_STREAMON`.
> > +
> > +Once the flush sequence is initiated, the client needs to drive it to
> > +completion, as described by the above steps, unless it aborts the
> > +process by issuing :c:func:`VIDIOC_STREAMOFF` on OUTPUT queue. The client is not
> > +allowed to issue ``V4L2_ENC_CMD_START`` or ``V4L2_ENC_CMD_STOP`` again
> > +while the flush sequence is in progress.
> > +
> > +Issuing :c:func:`VIDIOC_STREAMON` on OUTPUT queue will implicitly restart
> > +encoding.
>
> Only if CAPTURE is already streaming?

Yes, I'd say so, to be consistent with initial streaming start. I
guess we should state this explicitly.

>
> > +:c:func:`VIDIOC_STREAMON` and :c:func:`VIDIOC_STREAMOFF` on CAPTURE queue will
> > +not affect the flush sequence, allowing the client to change CAPTURE
> > +buffer set if needed.
> > +
> > +Commit points
> > +-------------
> > +
> > +Setting formats and allocating buffers triggers changes in the behavior
> > +of the driver.
> > +
> > +1. Setting format on CAPTURE queue may change the set of formats
> > +   supported/advertised on the OUTPUT queue. It also must change the
> > +   format currently selected on OUTPUT queue if it is not supported
> > +   by the newly selected CAPTURE format to a supported one.
>
> Should TRY_FMT on the OUTPUT queue only return formats that can be
> transformed into the currently set format on the capture queue?
> (That is, after setting colorimetry on the CAPTURE queue, will
> TRY_FMT(OUT) always return that colorimetry?)

Yes, that's my understanding. This way we avoid the "negotiation
hell" that would have both queues fighting with each other if
userspace keeps setting incompatible settings.

>
> > +2. Enumerating formats on OUTPUT queue must only return OUTPUT formats
> > +   supported for the CAPTURE format currently set.
> > +
> > +3. Setting/changing format on OUTPUT queue does not change formats
> > +   available on CAPTURE queue. An attempt to set OUTPUT format that
> > +   is not supported for the currently selected CAPTURE format must
> > +   result in an error (-EINVAL) from :c:func:`VIDIOC_S_FMT`.
>
> Same as for decoding, is this limited to pixel format? Why isn't the
> pixel format corrected to a supported choice? What about
> width/height/colorimetry?

Width/height/colorimetry (do you mean color space?) are part of
v4l2_pix_format(_mplane). I believe that's what this point was about.

I'd say that we should have one master queue, which would enforce the
constraints, and the two points above mark the OUTPUT queue as such.
This way we avoid the "negotiation hell" I mentioned above and we can
be sure that the driver commits to some format on a given queue, e.g.

S_FMT(OUTPUT, o_0)
o_1 = G_FMT(OUTPUT)
S_FMT(CAPTURE, c_0)
c_1 = G_FMT(CAPTURE)

At this point we can be sure that OUTPUT queue will operate with
exactly format o_1 and CAPTURE queue with exactly c_1.

Best regards,
Tomasz
Philipp Zabel June 5, 2018, 2:22 p.m. UTC | #3
On Tue, 2018-06-05 at 21:31 +0900, Tomasz Figa wrote:
[...]
> +Initialization
> > > +--------------
> > > +
> > > +1. (optional) Enumerate supported formats and resolutions. See
> > > +   capability enumeration.
> > > +
> > > +2. Set a coded format on the CAPTURE queue via :c:func:`VIDIOC_S_FMT`
> > > +
> > > +   a. Required fields:
> > > +
> > > +      i.  type = CAPTURE
> > > +
> > > +      ii. fmt.pix_mp.pixelformat set to a coded format to be produced
> > > +
> > > +   b. Return values:
> > > +
> > > +      i.  EINVAL: unsupported format.
> > > +
> > > +      ii. Others: per spec
> > > +
> > > +   c. Return fields:
> > > +
> > > +      i. fmt.pix_mp.width, fmt.pix_mp.height should be 0.
> > > +
> > > +   .. note::
> > > +
> > > +      After a coded format is set, the set of raw formats
> > > +      supported as source on the OUTPUT queue may change.
> > 
> > So setting CAPTURE potentially also changes OUTPUT format?
> 
> Yes, but at this point userspace hasn't yet set the desired format.
> 
> > If the encoded stream supports colorimetry information, should that
> > information be taken from the CAPTURE queue?
> 
> What's colorimetry? Is it something that is included in
> v4l2_pix_format(_mplane)? Is it something that can vary between raw
> input and encoded output?

FTR, yes, I meant the colorspace, ycbcr_enc, quantization, and xfer_func
fields of the v4l2_pix_format(_mplane) structs. GStreamer uses the term
"colorimetry" to pull these fields together into a single parameter.

The codecs usually don't care at all about this information, except some
streams (such as h.264 in the VUI parameters section of the SPS header)
may optionally contain a representation of these fields, so it may be
desirable to let encoders write the configured colorimetry or to let
decoders return the detected colorimetry via G_FMT(CAP) after a source
change event.

I think it could be useful to enforce the same colorimetry on CAPTURE
and OUTPUT queue if the hardware doesn't do any colorspace conversion.

> > > +3. (optional) Enumerate supported OUTPUT formats (raw formats for
> > > +   source) for the selected coded format via :c:func:`VIDIOC_ENUM_FMT`.
> > > +
> > > +   a. Required fields:
> > > +
> > > +      i.  type = OUTPUT
> > > +
> > > +      ii. index = per spec
> > > +
> > > +   b. Return values: per spec
> > > +
> > > +   c. Return fields:
> > > +
> > > +      i. pixelformat: raw format supported for the coded format
> > > +         currently selected on the OUTPUT queue.
> > > +
> > > +4. Set a raw format on the OUTPUT queue and visible resolution for the
> > > +   source raw frames via :c:func:`VIDIOC_S_FMT` on the OUTPUT queue.
> > 
> > Isn't this optional? If S_FMT(CAP) already sets OUTPUT to a valid
> > format, just G_FMT(OUT) should be valid here as well.
> 
> Technically it would be valid indeed, but that would be unlikely what
> the client needs, given that it probably already has some existing raw
> frames (at certain resolution) to encode.

Maybe add a clarifying note that G_FMT is acceptable as an alternative?
We don't have to put this front and center if it is not the expected use
case, but it would still be nice to have it documented as valid use.

This could be part of a still ongoing negotiation process if the source
is a scaler or some frame generator that can create frames of any size.

> > > +
> > > +   a. Required fields:
> > > +
> > > +      i.   type = OUTPUT
> > > +
> > > +      ii.  fmt.pix_mp.pixelformat = raw format to be used as source of
> > > +           encode
> > > +
> > > +      iii. fmt.pix_mp.width, fmt.pix_mp.height = input resolution
> > > +           for the source raw frames
> > 
> > These are specific to multiplanar drivers. The same should apply to
> > singleplanar drivers.
> 
> Right. In general I'd be interested in getting some suggestions in how
> to write this kind of descriptions nicely and consistent with other
> kernel documentation.

Maybe just:

	a. Required fields:

	   i.   type = OUTPUT or OUTPUT_MPLANE

	   ii.  fmt.pix.pixelformat or fmt.pix_mp.pixelformat = ...

           iii. fmt.pix.width, fmt.pix.height or fmt.pix_mp.width,
                fmt.pix_mp.height = ...


[...]
> > > +7. Begin streaming on both OUTPUT and CAPTURE queues via
> > > +   :c:func:`VIDIOC_STREAMON`. This may be performed in any order.
> > 
> > Actual encoding starts once both queues are streaming
> 
> I think that's the only thing possible with vb2, since it gives
> buffers to the driver when streaming starts on given queue.
>
> > and stops as soon
> > as the first queue receives STREAMOFF?
> 
> Given that STREAMOFF is supposed to drop all the buffers from the
> queue, it should be so +/- finishing what's already queued to the
> hardware, if it cannot be cancelled.

Oh, right.

> I guess we should say this more explicitly.
> 
[...]
> > > +Encoding parameter changes
> > > +--------------------------
> > > +
> > > +The client is allowed to use :c:func:`VIDIOC_S_CTRL` to change encoder
> > > +parameters at any time. The driver must apply the new setting starting
> > > +at the next frame queued to it.
> > > +
> > > +This specifically means that if the driver maintains a queue of buffers
> > > +to be encoded and at the time of the call to :c:func:`VIDIOC_S_CTRL` not all the
> > > +buffers in the queue are processed yet, the driver must not apply the
> > > +change immediately, but schedule it for when the next buffer queued
> > > +after the :c:func:`VIDIOC_S_CTRL` starts being processed.
> > 
> > Does this mean that hardware that doesn't support changing parameters at
> > runtime at all must stop streaming and restart streaming internally with
> > every parameter change? Or is it acceptable to not allow the controls to
> > be changed during streaming?
> 
> That's a good question. I'd be leaning towards the latter (not allow),
> as to keep kernel code simple, but maybe we could have others
> (especially Pawel) comment on this.

Same here.

[...]
> > > +2. Enumerating formats on OUTPUT queue must only return OUTPUT formats
> > > +   supported for the CAPTURE format currently set.
> > > +
> > > +3. Setting/changing format on OUTPUT queue does not change formats
> > > +   available on CAPTURE queue. An attempt to set OUTPUT format that
> > > +   is not supported for the currently selected CAPTURE format must
> > > +   result in an error (-EINVAL) from :c:func:`VIDIOC_S_FMT`.
> > 
> > Same as for decoding, is this limited to pixel format? Why isn't the
> > pixel format corrected to a supported choice? What about
> > width/height/colorimetry?
> 
> Width/height/colorimetry(Do you mean color space?) is a part of
> v4l2_pix_format(_mplane). I believe that's what this point was about.

Yes. My question was more about whether this should return -EINVAL or
whether TRY_FMT/S_FMT should change the parameters to valid values.

> I'd say that we should have 1 master queue, which would enforce the
> constraints and the 2 points above mark the OUTPUT queue as such. This
> way we avoid the "negotiation" hell as I mentioned above and we can be
> sure that the driver commits to some format on given queue, e.g.
> 
> S_FMT(OUTPUT, o_0)
> o_1 = G_FMT(OUTPUT)
> S_FMT(CAPTURE, c_0)
> c_1 = G_FMT(CAPTURE)
> 
> At this point we can be sure that OUTPUT queue will operate with
> exactly format o_1 and CAPTURE queue with exactly c_1.

Agreed.

regards
Philipp
Tomasz Figa June 6, 2018, 9:17 a.m. UTC | #4
On Tue, Jun 5, 2018 at 11:23 PM Philipp Zabel <p.zabel@pengutronix.de> wrote:
>
> On Tue, 2018-06-05 at 21:31 +0900, Tomasz Figa wrote:
> [...]
> > +Initialization
> > > > +--------------
> > > > +
> > > > +1. (optional) Enumerate supported formats and resolutions. See
> > > > +   capability enumeration.
> > > > +
> > > > +2. Set a coded format on the CAPTURE queue via :c:func:`VIDIOC_S_FMT`
> > > > +
> > > > +   a. Required fields:
> > > > +
> > > > +      i.  type = CAPTURE
> > > > +
> > > > +      ii. fmt.pix_mp.pixelformat set to a coded format to be produced
> > > > +
> > > > +   b. Return values:
> > > > +
> > > > +      i.  EINVAL: unsupported format.
> > > > +
> > > > +      ii. Others: per spec
> > > > +
> > > > +   c. Return fields:
> > > > +
> > > > +      i. fmt.pix_mp.width, fmt.pix_mp.height should be 0.
> > > > +
> > > > +   .. note::
> > > > +
> > > > +      After a coded format is set, the set of raw formats
> > > > +      supported as source on the OUTPUT queue may change.
> > >
> > > So setting CAPTURE potentially also changes OUTPUT format?
> >
> > Yes, but at this point userspace hasn't yet set the desired format.
> >
> > > If the encoded stream supports colorimetry information, should that
> > > information be taken from the CAPTURE queue?
> >
> > What's colorimetry? Is it something that is included in
> > v4l2_pix_format(_mplane)? Is it something that can vary between raw
> > input and encoded output?
>
> FTR, yes, I meant the colorspace, ycbcr_enc, quantization, and xfer_func
> fields of the v4l2_pix_format(_mplane) structs. GStreamer uses the term
> "colorimetry" to pull these fields together into a single parameter.
>
> The codecs usually don't care at all about this information, except some
> streams (such as h.264 in the VUI parameters section of the SPS header)
> may optionally contain a representation of these fields, so it may be
> desirable to let encoders write the configured colorimetry or to let
> decoders return the detected colorimetry via G_FMT(CAP) after a source
> change event.
>
> I think it could be useful to enforce the same colorimetry on CAPTURE
> and OUTPUT queue if the hardware doesn't do any colorspace conversion.

After thinking a bit more on this, I guess it wouldn't overly
complicate things if we require that the values from OUTPUT queue are
copied to CAPTURE queue, if the stream doesn't include such
information or the hardware just can't parse them. Also, userspace
that can't parse them wouldn't have to do anything, as the colorspace
default on OUTPUT would be V4L2_COLORSPACE_DEFAULT and if hardware
can't parse it either, it would just be propagated to CAPTURE.

>
> > > > +3. (optional) Enumerate supported OUTPUT formats (raw formats for
> > > > +   source) for the selected coded format via :c:func:`VIDIOC_ENUM_FMT`.
> > > > +
> > > > +   a. Required fields:
> > > > +
> > > > +      i.  type = OUTPUT
> > > > +
> > > > +      ii. index = per spec
> > > > +
> > > > +   b. Return values: per spec
> > > > +
> > > > +   c. Return fields:
> > > > +
> > > > +      i. pixelformat: raw format supported for the coded format
> > > > +         currently selected on the OUTPUT queue.
> > > > +
> > > > +4. Set a raw format on the OUTPUT queue and visible resolution for the
> > > > +   source raw frames via :c:func:`VIDIOC_S_FMT` on the OUTPUT queue.
> > >
> > > Isn't this optional? If S_FMT(CAP) already sets OUTPUT to a valid
> > > format, just G_FMT(OUT) should be valid here as well.
> >
> > Technically it would be valid indeed, but that would be unlikely what
> > the client needs, given that it probably already has some existing raw
> > frames (at certain resolution) to encode.
>
> Maybe add a clarifying note that G_FMT is acceptable as an alternative?
> We don't have to put this front and center if it is not the expected use
> case, but it would still be nice to have it documented as valid use.
>
> This could be part of a still ongoing negotiation process if the source
> is a scaler or some frame generator that can create frames of any size.
>

I guess it wouldn't hurt to say so, with a clear annotation that there
is no expectation that the default values are practically usable. For
example, the input resolution could be set to the minimum supported
resolution by default.

> > > > +
> > > > +   a. Required fields:
> > > > +
> > > > +      i.   type = OUTPUT
> > > > +
> > > > +      ii.  fmt.pix_mp.pixelformat = raw format to be used as source of
> > > > +           encode
> > > > +
> > > > +      iii. fmt.pix_mp.width, fmt.pix_mp.height = input resolution
> > > > +           for the source raw frames
> > >
> > > These are specific to multiplanar drivers. The same should apply to
> > > singleplanar drivers.
> >
> > Right. In general I'd be interested in getting some suggestions in how
> > to write this kind of descriptions nicely and consistent with other
> > kernel documentation.
>
> Maybe just:
>
>         a. Required fields:
>
>            i.   type = OUTPUT or OUTPUT_MPLANE
>
>            ii.  fmt.pix.pixelformat or fmt.pix_mp.pixelformat = ...
>
>            iii. fmt.pix.width, fmt.pix.height or fmt.pix_mp.width,
>                 fmt.pix_mp.height = ...
>

Ack.

>
> [...]
> > > > +7. Begin streaming on both OUTPUT and CAPTURE queues via
> > > > +   :c:func:`VIDIOC_STREAMON`. This may be performed in any order.
> > >
> > > Actual encoding starts once both queues are streaming
> >
> > I think that's the only thing possible with vb2, since it gives
> > buffers to the driver when streaming starts on given queue.
> >
> > > and stops as soon
> > > as the first queue receives STREAMOFF?
> >
> > Given that STREAMOFF is supposed to drop all the buffers from the
> > queue, it should be so +/- finishing what's already queued to the
> > hardware, if it cannot be cancelled.
>
> Oh, right.
>
> > I guess we should say this more explicitly.
> >
> [...]
> > > > +Encoding parameter changes
> > > > +--------------------------
> > > > +
> > > > +The client is allowed to use :c:func:`VIDIOC_S_CTRL` to change encoder
> > > > +parameters at any time. The driver must apply the new setting starting
> > > > +at the next frame queued to it.
> > > > +
> > > > +This specifically means that if the driver maintains a queue of buffers
> > > > +to be encoded and at the time of the call to :c:func:`VIDIOC_S_CTRL` not all the
> > > > +buffers in the queue are processed yet, the driver must not apply the
> > > > +change immediately, but schedule it for when the next buffer queued
> > > > +after the :c:func:`VIDIOC_S_CTRL` starts being processed.
> > >
> > > Does this mean that hardware that doesn't support changing parameters at
> > > runtime at all must stop streaming and restart streaming internally with
> > > every parameter change? Or is it acceptable to not allow the controls to
> > > be changed during streaming?
> >
> > That's a good question. I'd be leaning towards the latter (not allow),
> > as to keep kernel code simple, but maybe we could have others
> > (especially Pawel) comment on this.
>
> Same here.

Same as where? :)

>
> [...]
> > > > +2. Enumerating formats on OUTPUT queue must only return OUTPUT formats
> > > > +   supported for the CAPTURE format currently set.
> > > > +
> > > > +3. Setting/changing format on OUTPUT queue does not change formats
> > > > +   available on CAPTURE queue. An attempt to set OUTPUT format that
> > > > +   is not supported for the currently selected CAPTURE format must
> > > > +   result in an error (-EINVAL) from :c:func:`VIDIOC_S_FMT`.
> > >
> > > Same as for decoding, is this limited to pixel format? Why isn't the
> > > pixel format corrected to a supported choice? What about
> > > width/height/colorimetry?
> >
> > Width/height/colorimetry(Do you mean color space?) is a part of
> > v4l2_pix_format(_mplane). I believe that's what this point was about.
>
> Yes. My question was more about whether this should return -EINVAL or
> whether TRY_FMT/S_FMT should change the parameters to valid values.

As per the standard semantics of TRY_/S_FMT, they should adjust the
format on the given queue. We only require that the state of the other
queue is left intact.

Best regards,
Tomasz
Philipp Zabel June 6, 2018, 9:40 a.m. UTC | #5
On Wed, 2018-06-06 at 18:17 +0900, Tomasz Figa wrote:
> On Tue, Jun 5, 2018 at 11:23 PM Philipp Zabel <p.zabel@pengutronix.de> wrote:
> > 
> > On Tue, 2018-06-05 at 21:31 +0900, Tomasz Figa wrote:
> > [...]
> > > +Initialization
> > > > > +--------------
> > > > > +
> > > > > +1. (optional) Enumerate supported formats and resolutions. See
> > > > > +   capability enumeration.
> > > > > +
> > > > > +2. Set a coded format on the CAPTURE queue via :c:func:`VIDIOC_S_FMT`
> > > > > +
> > > > > +   a. Required fields:
> > > > > +
> > > > > +      i.  type = CAPTURE
> > > > > +
> > > > > +      ii. fmt.pix_mp.pixelformat set to a coded format to be produced
> > > > > +
> > > > > +   b. Return values:
> > > > > +
> > > > > +      i.  EINVAL: unsupported format.
> > > > > +
> > > > > +      ii. Others: per spec
> > > > > +
> > > > > +   c. Return fields:
> > > > > +
> > > > > +      i. fmt.pix_mp.width, fmt.pix_mp.height should be 0.
> > > > > +
> > > > > +   .. note::
> > > > > +
> > > > > +      After a coded format is set, the set of raw formats
> > > > > +      supported as source on the OUTPUT queue may change.
> > > > 
> > > > So setting CAPTURE potentially also changes OUTPUT format?
> > > 
> > > Yes, but at this point userspace hasn't yet set the desired format.
> > > 
> > > > If the encoded stream supports colorimetry information, should that
> > > > information be taken from the CAPTURE queue?
> > > 
> > > What's colorimetry? Is it something that is included in
> > > v4l2_pix_format(_mplane)? Is it something that can vary between raw
> > > input and encoded output?
> > 
> > FTR, yes, I meant the colorspace, ycbcr_enc, quantization, and xfer_func
> > fields of the v4l2_pix_format(_mplane) structs. GStreamer uses the term
> > "colorimetry" to pull these fields together into a single parameter.
> > 
> > The codecs usually don't care at all about this information, except some
> > streams (such as h.264 in the VUI parameters section of the SPS header)
> > may optionally contain a representation of these fields, so it may be
> > desirable to let encoders write the configured colorimetry or to let
> > decoders return the detected colorimetry via G_FMT(CAP) after a source
> > change event.
> > 
> > I think it could be useful to enforce the same colorimetry on CAPTURE
> > and OUTPUT queue if the hardware doesn't do any colorspace conversion.
> 
> After thinking a bit more on this, I guess it wouldn't overly
> complicate things if we require that the values from OUTPUT queue are
> copied to CAPTURE queue, if the stream doesn't include such
> information or the hardware just can't parse them.

And for encoders it would be copied from CAPTURE queue to OUTPUT queue?

> Also, userspace
> that can't parse them wouldn't have to do anything, as the colorspace
> default on OUTPUT would be V4L2_COLORSPACE_DEFAULT and if hardware
> can't parse it either, it would just be propagated to CAPTURE.

I wonder if this wouldn't change the meaning of V4L2_COLORSPACE_DEFAULT?
Documentation/media/uapi/v4l/colorspaces-defs.rst states:

      - The default colorspace. This can be used by applications to let
        the driver fill in the colorspace.

This sounds to me like it is intended to be used by the application
only, like V4L2_FIELD_ANY. If we let decoders return
V4L2_COLORSPACE_DEFAULT on the CAPTURE queue to indicate they have no
idea about colorspace, it should be mentioned explicitly and maybe
clarify in colorspaces-defs.rst as well.

[...]
> > > > > +Encoding parameter changes
> > > > > +--------------------------
> > > > > +
> > > > > +The client is allowed to use :c:func:`VIDIOC_S_CTRL` to change encoder
> > > > > +parameters at any time. The driver must apply the new setting starting
> > > > > +at the next frame queued to it.
> > > > > +
> > > > > +This specifically means that if the driver maintains a queue of buffers
> > > > > +to be encoded and at the time of the call to :c:func:`VIDIOC_S_CTRL` not all the
> > > > > +buffers in the queue are processed yet, the driver must not apply the
> > > > > +change immediately, but schedule it for when the next buffer queued
> > > > > +after the :c:func:`VIDIOC_S_CTRL` starts being processed.
> > > > 
> > > > Does this mean that hardware that doesn't support changing parameters at
> > > > runtime at all must stop streaming and restart streaming internally with
> > > > every parameter change? Or is it acceptable to not allow the controls to
> > > > be changed during streaming?
> > > 
> > > That's a good question. I'd be leaning towards the latter (not allow),
> > > as to keep kernel code simple, but maybe we could have others
> > > (especially Pawel) comment on this.
> > 
> > Same here.
> 
> Same as where? :)

I'd be leaning towards the latter (not allow) as well.

> > [...]
> > > > > +2. Enumerating formats on OUTPUT queue must only return OUTPUT formats
> > > > > +   supported for the CAPTURE format currently set.
> > > > > +
> > > > > +3. Setting/changing format on OUTPUT queue does not change formats
> > > > > +   available on CAPTURE queue. An attempt to set OUTPUT format that
> > > > > +   is not supported for the currently selected CAPTURE format must
> > > > > +   result in an error (-EINVAL) from :c:func:`VIDIOC_S_FMT`.
> > > > 
> > > > Same as for decoding, is this limited to pixel format? Why isn't the
> > > > pixel format corrected to a supported choice? What about
> > > > width/height/colorimetry?
> > > 
> > > Width/height/colorimetry(Do you mean color space?) is a part of
> > > v4l2_pix_format(_mplane). I believe that's what this point was about.
> > 
> > Yes. My question was more about whether this should return -EINVAL or
> > whether TRY_FMT/S_FMT should change the parameters to valid values.
> 
> As per the standard semantics of TRY_/S_FMT, they should adjust the
> format on given queue. We only require that the state on other queue
> is left intact.

This contradicts 3. above, which says S_FMT(OUT) should instead return
-EINVAL if the format doesn't match.

regards
Philipp
Tomasz Figa June 6, 2018, 10:37 a.m. UTC | #6
On Wed, Jun 6, 2018 at 6:40 PM Philipp Zabel <p.zabel@pengutronix.de> wrote:
>
> On Wed, 2018-06-06 at 18:17 +0900, Tomasz Figa wrote:
> > On Tue, Jun 5, 2018 at 11:23 PM Philipp Zabel <p.zabel@pengutronix.de> wrote:
> > >
> > > On Tue, 2018-06-05 at 21:31 +0900, Tomasz Figa wrote:
> > > [...]
> > > > +Initialization
> > > > > > +--------------
> > > > > > +
> > > > > > +1. (optional) Enumerate supported formats and resolutions. See
> > > > > > +   capability enumeration.
> > > > > > +
> > > > > > +2. Set a coded format on the CAPTURE queue via :c:func:`VIDIOC_S_FMT`
> > > > > > +
> > > > > > +   a. Required fields:
> > > > > > +
> > > > > > +      i.  type = CAPTURE
> > > > > > +
> > > > > > +      ii. fmt.pix_mp.pixelformat set to a coded format to be produced
> > > > > > +
> > > > > > +   b. Return values:
> > > > > > +
> > > > > > +      i.  EINVAL: unsupported format.
> > > > > > +
> > > > > > +      ii. Others: per spec
> > > > > > +
> > > > > > +   c. Return fields:
> > > > > > +
> > > > > > +      i. fmt.pix_mp.width, fmt.pix_mp.height should be 0.
> > > > > > +
> > > > > > +   .. note::
> > > > > > +
> > > > > > +      After a coded format is set, the set of raw formats
> > > > > > +      supported as source on the OUTPUT queue may change.
> > > > >
> > > > > So setting CAPTURE potentially also changes OUTPUT format?
> > > >
> > > > Yes, but at this point userspace hasn't yet set the desired format.
> > > >
> > > > > If the encoded stream supports colorimetry information, should that
> > > > > information be taken from the CAPTURE queue?
> > > >
> > > > What's colorimetry? Is it something that is included in
> > > > v4l2_pix_format(_mplane)? Is it something that can vary between raw
> > > > input and encoded output?
> > >
> > > FTR, yes, I meant the colorspace, ycbcr_enc, quantization, and xfer_func
> > > fields of the v4l2_pix_format(_mplane) structs. GStreamer uses the term
> > > "colorimetry" to pull these fields together into a single parameter.
> > >
> > > The codecs usually don't care at all about this information, except some
> > > streams (such as h.264 in the VUI parameters section of the SPS header)
> > > may optionally contain a representation of these fields, so it may be
> > > desirable to let encoders write the configured colorimetry or to let
> > > decoders return the detected colorimetry via G_FMT(CAP) after a source
> > > change event.
> > >
> > > I think it could be useful to enforce the same colorimetry on CAPTURE
> > > and OUTPUT queue if the hardware doesn't do any colorspace conversion.
> >
> > After thinking a bit more on this, I guess it wouldn't overly
> > complicate things if we require that the values from OUTPUT queue are
> > copied to CAPTURE queue, if the stream doesn't include such
> > information or the hardware just can't parse them.
>
> And for encoders it would be copied from CAPTURE queue to OUTPUT queue?
>

I guess it would be from OUTPUT to CAPTURE for encoders as well, since
the colorimetry of OUTPUT is ultimately defined by the raw frames that
userspace is going to be feeding to the encoder.

> > Also, userspace
> > that can't parse them wouldn't have to do anything, as the colorspace
> > default on OUTPUT would be V4L2_COLORSPACE_DEFAULT and if hardware
> > can't parse it either, it would just be propagated to CAPTURE.
>
> I wonder if this wouldn't change the meaning of V4L2_COLORSPACE_DEFAULT?
> Documentation/media/uapi/v4l/colorspaces-defs.rst states:
>
>       - The default colorspace. This can be used by applications to let
>         the driver fill in the colorspace.
>
> This sounds to me like it is intended to be used by the application
> only, like V4L2_FIELD_ANY. If we let decoders return
> V4L2_COLORSPACE_DEFAULT on the CAPTURE queue to indicate they have no
> idea about colorspace, it should be mentioned explicitly and maybe
> clarify in colorspaces-defs.rst as well.

Yes, it would change it slightly (in a non-contradicting way) and we
need to update the description indeed.

>
> [...]
> > > > > > +Encoding parameter changes
> > > > > > +--------------------------
> > > > > > +
> > > > > > +The client is allowed to use :c:func:`VIDIOC_S_CTRL` to change encoder
> > > > > > +parameters at any time. The driver must apply the new setting starting
> > > > > > +at the next frame queued to it.
> > > > > > +
> > > > > > +This specifically means that if the driver maintains a queue of buffers
> > > > > > +to be encoded and at the time of the call to :c:func:`VIDIOC_S_CTRL` not all the
> > > > > > +buffers in the queue are processed yet, the driver must not apply the
> > > > > > +change immediately, but schedule it for when the next buffer queued
> > > > > > +after the :c:func:`VIDIOC_S_CTRL` starts being processed.
> > > > >
> > > > > Does this mean that hardware that doesn't support changing parameters at
> > > > > runtime at all must stop streaming and restart streaming internally with
> > > > > every parameter change? Or is it acceptable to not allow the controls to
> > > > > be changed during streaming?
> > > >
> > > > That's a good question. I'd be leaning towards the latter (not allow),
> > > > as to keep kernel code simple, but maybe we could have others
> > > > (especially Pawel) comment on this.
> > >
> > > Same here.
> >
> > Same as where? :)
>
> I'd be leaning towards the latter (not allow) as well.

Ack. Thanks for clarifying.

>
> > > [...]
> > > > > > +2. Enumerating formats on OUTPUT queue must only return OUTPUT formats
> > > > > > +   supported for the CAPTURE format currently set.
> > > > > > +
> > > > > > +3. Setting/changing format on OUTPUT queue does not change formats
> > > > > > +   available on CAPTURE queue. An attempt to set OUTPUT format that
> > > > > > +   is not supported for the currently selected CAPTURE format must
> > > > > > +   result in an error (-EINVAL) from :c:func:`VIDIOC_S_FMT`.
> > > > >
> > > > > Same as for decoding, is this limited to pixel format? Why isn't the
> > > > > pixel format corrected to a supported choice? What about
> > > > > width/height/colorimetry?
> > > >
> > > > Width/height/colorimetry(Do you mean color space?) is a part of
> > > > v4l2_pix_format(_mplane). I believe that's what this point was about.
> > >
> > > Yes. My question was more about whether this should return -EINVAL or
> > > whether TRY_FMT/S_FMT should change the parameters to valid values.
> >
> > As per the standard semantics of TRY_/S_FMT, they should adjust the
> > format on given queue. We only require that the state on other queue
> > is left intact.
>
> This contradicts 3. above, which says S_FMT(OUT) should instead return
> -EINVAL if the format doesn't match.

Right. That point needs to be fixed.

Best regards,
Tomasz
Hans Verkuil June 7, 2018, 7:27 a.m. UTC | #7
On 06/06/2018 12:37 PM, Tomasz Figa wrote:
> On Wed, Jun 6, 2018 at 6:40 PM Philipp Zabel <p.zabel@pengutronix.de> wrote:
>>
>> On Wed, 2018-06-06 at 18:17 +0900, Tomasz Figa wrote:
>>> On Tue, Jun 5, 2018 at 11:23 PM Philipp Zabel <p.zabel@pengutronix.de> wrote:
>>>>
>>>> On Tue, 2018-06-05 at 21:31 +0900, Tomasz Figa wrote:
>>>> [...]
>>>>> +Initialization
>>>>>>> +--------------
>>>>>>> +
>>>>>>> +1. (optional) Enumerate supported formats and resolutions. See
>>>>>>> +   capability enumeration.
>>>>>>> +
>>>>>>> +2. Set a coded format on the CAPTURE queue via :c:func:`VIDIOC_S_FMT`
>>>>>>> +
>>>>>>> +   a. Required fields:
>>>>>>> +
>>>>>>> +      i.  type = CAPTURE
>>>>>>> +
>>>>>>> +      ii. fmt.pix_mp.pixelformat set to a coded format to be produced
>>>>>>> +
>>>>>>> +   b. Return values:
>>>>>>> +
>>>>>>> +      i.  EINVAL: unsupported format.
>>>>>>> +
>>>>>>> +      ii. Others: per spec
>>>>>>> +
>>>>>>> +   c. Return fields:
>>>>>>> +
>>>>>>> +      i. fmt.pix_mp.width, fmt.pix_mp.height should be 0.
>>>>>>> +
>>>>>>> +   .. note::
>>>>>>> +
>>>>>>> +      After a coded format is set, the set of raw formats
>>>>>>> +      supported as source on the OUTPUT queue may change.
>>>>>>
>>>>>> So setting CAPTURE potentially also changes OUTPUT format?
>>>>>
>>>>> Yes, but at this point userspace hasn't yet set the desired format.
>>>>>
>>>>>> If the encoded stream supports colorimetry information, should that
>>>>>> information be taken from the CAPTURE queue?
>>>>>
>>>>> What's colorimetry? Is it something that is included in
>>>>> v4l2_pix_format(_mplane)? Is it something that can vary between raw
>>>>> input and encoded output?
>>>>
>>>> FTR, yes, I meant the colorspace, ycbcr_enc, quantization, and xfer_func
>>>> fields of the v4l2_pix_format(_mplane) structs. GStreamer uses the term
>>>> "colorimetry" to pull these fields together into a single parameter.
>>>>
>>>> The codecs usually don't care at all about this information, except some
>>>> streams (such as h.264 in the VUI parameters section of the SPS header)
>>>> may optionally contain a representation of these fields, so it may be
>>>> desirable to let encoders write the configured colorimetry or to let
>>>> decoders return the detected colorimetry via G_FMT(CAP) after a source
>>>> change event.
>>>>
>>>> I think it could be useful to enforce the same colorimetry on CAPTURE
>>>> and OUTPUT queue if the hardware doesn't do any colorspace conversion.
>>>
>>> After thinking a bit more on this, I guess it wouldn't overly
>>> complicate things if we require that the values from OUTPUT queue are
>>> copied to CAPTURE queue, if the stream doesn't include such
>>> information or the hardware just can't parse them.
>>
>> And for encoders it would be copied from CAPTURE queue to OUTPUT queue?
>>
> 
> I guess it would be from OUTPUT to CAPTURE for encoders as well, since
> the colorimetry of OUTPUT is ultimately defined by the raw frames that
> userspace is going to be feeding to the encoder.

Correct. All mem2mem drivers should just copy the colorimetry from the
output buffers to the capture buffers, unless the decoder hardware is able to
extract that data from the stream, in which case it can overwrite it for
the capture buffer.

Currently colorspace converters are not supported since the V4L2 API does
not provide a way to let userspace define colorimetry for the capture queue.
I have a patch to add a new v4l2_format flag for that since forever, but
since we do not have any drivers that can do this in the kernel it has never
been upstreamed.

What is supported is basic RGB <-> YUV conversions since that's selected through
the provided pixelformat.

Regards,

	Hans
Hans Verkuil June 7, 2018, 9:21 a.m. UTC | #8
On 06/05/2018 12:33 PM, Tomasz Figa wrote:
> Due to complexity of the video encoding process, the V4L2 drivers of
> stateful encoder hardware require specific sequencies of V4L2 API calls

sequences

> to be followed. These include capability enumeration, initialization,
> encoding, encode parameters change and flush.
> 
> Specifics of the above have been discussed during Media Workshops at
> LinuxCon Europe 2012 in Barcelona and then later Embedded Linux
> Conference Europe 2014 in Düsseldorf. The de facto Codec API that
> originated at those events was later implemented by the drivers we already
> have merged in mainline, such as s5p-mfc or mtk-vcodec.
> 
> The only thing missing was the real specification included as a part of
> Linux Media documentation. Fix it now and document the encoder part of
> the Codec API.
> 
> Signed-off-by: Tomasz Figa <tfiga@chromium.org>
> ---
>  Documentation/media/uapi/v4l/dev-codec.rst | 313 +++++++++++++++++++++
>  1 file changed, 313 insertions(+)
> 
> diff --git a/Documentation/media/uapi/v4l/dev-codec.rst b/Documentation/media/uapi/v4l/dev-codec.rst
> index 0483b10c205e..325a51bb09df 100644
> --- a/Documentation/media/uapi/v4l/dev-codec.rst
> +++ b/Documentation/media/uapi/v4l/dev-codec.rst
> @@ -805,3 +805,316 @@ of the driver.
>  To summarize, setting formats and allocation must always start with the
>  OUTPUT queue and the OUTPUT queue is the master that governs the set of
>  supported formats for the CAPTURE queue.
> +
> +Encoder
> +=======
> +
> +Querying capabilities
> +---------------------
> +
> +1. To enumerate the set of coded formats supported by the driver, the
> +   client uses :c:func:`VIDIOC_ENUM_FMT` for CAPTURE. The driver must always
> +   return the full set of supported formats, irrespective of the
> +   format set on the OUTPUT queue.
> +
> +2. To enumerate the set of supported raw formats, the client uses
> +   :c:func:`VIDIOC_ENUM_FMT` for OUTPUT queue. The driver must return only
> +   the formats supported for the format currently set on the
> +   CAPTURE queue.
> +   In order to enumerate raw formats supported by a given coded
> +   format, the client must first set that coded format on the
> +   CAPTURE queue and then enumerate the OUTPUT queue.
> +
> +3. The client may use :c:func:`VIDIOC_ENUM_FRAMESIZES` to detect supported
> +   resolutions for a given format, passing its fourcc in
> +   :c:type:`v4l2_frmsizeenum` ``pixel_format``.
> +
> +   a. Values returned from :c:func:`VIDIOC_ENUM_FRAMESIZES` for coded formats
> +      must be maximums for given coded format for all supported raw
> +      formats.
> +
> +   b. Values returned from :c:func:`VIDIOC_ENUM_FRAMESIZES` for raw formats must
> +      be maximums for given raw format for all supported coded
> +      formats.
> +
> +   c. The client should derive the supported resolution for a
> +      combination of coded+raw format by calculating the
> +      intersection of resolutions returned from calls to
> +      :c:func:`VIDIOC_ENUM_FRAMESIZES` for the given coded and raw formats.
> +
> +4. Supported profiles and levels for given format, if applicable, may be
> +   queried using their respective controls via :c:func:`VIDIOC_QUERYCTRL`.
> +
> +5. The client may use :c:func:`VIDIOC_ENUM_FRAMEINTERVALS` to enumerate maximum
> +   supported framerates by the driver/hardware for a given
> +   format+resolution combination.
> +
> +6. Any additional encoder capabilities may be discovered by querying
> +   their respective controls.
> +
> +.. note::
> +
> +   Full format enumeration requires enumerating all raw formats
> +   on the OUTPUT queue for all possible (enumerated) coded formats on
> +   CAPTURE queue (setting each format on the CAPTURE queue before each
> +   enumeration on the OUTPUT queue).
> +
> +Initialization
> +--------------
> +
> +1. (optional) Enumerate supported formats and resolutions. See
> +   capability enumeration.
> +
> +2. Set a coded format on the CAPTURE queue via :c:func:`VIDIOC_S_FMT`
> +
> +   a. Required fields:
> +
> +      i.  type = CAPTURE
> +
> +      ii. fmt.pix_mp.pixelformat set to a coded format to be produced
> +
> +   b. Return values:
> +
> +      i.  EINVAL: unsupported format.

I'm still not sure about returning an error in this case.

And what should TRY_FMT do?

Do you know what current codecs do? Return EINVAL or replace with a supported format?

It would be nice to standardize on one rule or another.

The spec says that it should always return a valid format, but not all drivers adhere
to that. Perhaps we need to add a flag to let the driver signal the behavior of S_FMT
to userspace.

This is a long-standing issue with S_FMT, actually.

> +
> +      ii. Others: per spec
> +
> +   c. Return fields:
> +
> +      i. fmt.pix_mp.width, fmt.pix_mp.height should be 0.
> +
> +   .. note::
> +
> +      After a coded format is set, the set of raw formats
> +      supported as source on the OUTPUT queue may change.
> +
> +3. (optional) Enumerate supported OUTPUT formats (raw formats for
> +   source) for the selected coded format via :c:func:`VIDIOC_ENUM_FMT`.
> +
> +   a. Required fields:
> +
> +      i.  type = OUTPUT
> +
> +      ii. index = per spec
> +
> +   b. Return values: per spec
> +
> +   c. Return fields:
> +
> +      i. pixelformat: raw format supported for the coded format
> +         currently selected on the OUTPUT queue.
> +
> +4. Set a raw format on the OUTPUT queue and visible resolution for the
> +   source raw frames via :c:func:`VIDIOC_S_FMT` on the OUTPUT queue.
> +
> +   a. Required fields:
> +
> +      i.   type = OUTPUT
> +
> +      ii.  fmt.pix_mp.pixelformat = raw format to be used as source of
> +           encode
> +
> +      iii. fmt.pix_mp.width, fmt.pix_mp.height = input resolution
> +           for the source raw frames
> +
> +      iv.  num_planes: set to number of planes for pixelformat.
> +
> +      v.   For each plane p = [0, num_planes-1]:
> +           plane_fmt[p].sizeimage, plane_fmt[p].bytesperline: as
> +           per spec for input resolution.
> +
> +   b. Return values: as per spec.
> +
> +   c. Return fields:
> +
> +      i.  fmt.pix_mp.width, fmt.pix_mp.height = may be adjusted by
> +          driver to match alignment requirements, as required by the
> +          currently selected formats.
> +
> +      ii. For each plane p = [0, num_planes-1]:
> +          plane_fmt[p].sizeimage, plane_fmt[p].bytesperline: as
> +          per spec for the adjusted input resolution.
> +
> +   d. Setting the input resolution will reset visible resolution to the
> +      adjusted input resolution rounded up to the closest visible
> +      resolution supported by the driver. Similarly, coded size will
> +      be reset to input resolution rounded up to the closest coded
> +      resolution supported by the driver (typically a multiple of
> +      macroblock size).
> +
> +5. (optional) Set visible size for the stream metadata via

What exactly do you mean with 'stream metadata'? Definitely something for
the glossary.

> +   :c:func:`VIDIOC_S_SELECTION` on the OUTPUT queue.
> +
> +   a. Required fields:
> +
> +      i.   type = OUTPUT
> +
> +      ii.  target = ``V4L2_SEL_TGT_CROP``
> +
> +      iii. r.left, r.top, r.width, r.height: visible rectangle; this
> +           must fit within coded resolution returned from

from -> by

> +           :c:func:`VIDIOC_S_FMT`.
> +
> +   b. Return values: as per spec.
> +
> +   c. Return fields:
> +
> +      i. r.left, r.top, r.width, r.height: visible rectangle adjusted by
> +         the driver to match internal constraints.
> +
> +   d. This resolution must be used as the visible resolution in the
> +      stream metadata.
> +
> +   .. note::
> +
> +      The driver might not support arbitrary values of the
> +      crop rectangle and will adjust it to the closest supported
> +      one.
> +
> +6. Allocate buffers for both OUTPUT and CAPTURE queues via
> +   :c:func:`VIDIOC_REQBUFS`. This may be performed in any order.
> +
> +   a. Required fields:
> +
> +      i.   count = n, where n > 0.
> +
> +      ii.  type = OUTPUT or CAPTURE
> +
> +      iii. memory = as per spec
> +
> +   b. Return values: Per spec.
> +
> +   c. Return fields:
> +
> +      i. count: adjusted to allocated number of buffers
> +
> +   d. The driver must adjust count to minimum of required number of
> +      buffers for given format and count passed. The client must
> +      check this value after the ioctl returns to get the number of
> +      buffers actually allocated.
> +
> +   .. note::
> +
> +      Passing count = 1 is useful for letting the driver choose the
> +      minimum according to the selected format/hardware
> +      requirements.
> +
> +   .. note::
> +
> +      To allocate more than minimum number of buffers (for pipeline
> +      depth), use G_CTRL(``V4L2_CID_MIN_BUFFERS_FOR_OUTPUT``) or
> +      G_CTRL(``V4L2_CID_MIN_BUFFERS_FOR_CAPTURE``), respectively,
> +      to get the minimum number of buffers required by the
> +      driver/format, and pass the obtained value plus the number of
> +      additional buffers needed in count field to :c:func:`VIDIOC_REQBUFS`.
> +
> +7. Begin streaming on both OUTPUT and CAPTURE queues via
> +   :c:func:`VIDIOC_STREAMON`. This may be performed in any order.
> +
> +Encoding
> +--------
> +
> +This state is reached after a successful initialization sequence. In
> +this state, client queues and dequeues buffers to both queues via
> +:c:func:`VIDIOC_QBUF` and :c:func:`VIDIOC_DQBUF`, as per spec.
> +
> +Both queues operate independently. The client may queue and dequeue
> +buffers to queues in any order and at any rate, also at a rate different
> +for each queue. The client may queue buffers within the same queue in
> +any order (V4L2 index-wise).

I'd drop the whole 'in any order' in the text above. This has always been
the case, and I think it is only confusing. I think what you really want
to say is that both queues operate independently and quite possibly at
different rates. So clients should operate them independently as well.

> + It is recommended for the client to operate
> +the queues independently for best performance.
> +
> +Source OUTPUT buffers must contain full raw frames in the selected
> +OUTPUT format, exactly one frame per buffer.
> +
> +Encoding parameter changes
> +--------------------------
> +
> +The client is allowed to use :c:func:`VIDIOC_S_CTRL` to change encoder
> +parameters at any time. The driver must apply the new setting starting
> +at the next frame queued to it.
> +
> +This specifically means that if the driver maintains a queue of buffers
> +to be encoded and at the time of the call to :c:func:`VIDIOC_S_CTRL` not all the
> +buffers in the queue are processed yet, the driver must not apply the
> +change immediately, but schedule it for when the next buffer queued
> +after the :c:func:`VIDIOC_S_CTRL` starts being processed.

Is this what drivers do today? I thought it was applied immediately?
This sounds like something for which you need the Request API.

> +
> +Flush
> +-----
> +
> +Flush is the process of draining the CAPTURE queue of any remaining
> +buffers. After the flush sequence is complete, the client has received
> +all encoded frames for all OUTPUT buffers queued before the sequence was
> +started.
> +
> +1. Begin flush by issuing :c:func:`VIDIOC_ENCODER_CMD`.
> +
> +   a. Required fields:
> +
> +      i. cmd = ``V4L2_ENC_CMD_STOP``
> +
> +2. The driver must process and encode as normal all OUTPUT buffers
> +   queued by the client before the :c:func:`VIDIOC_ENCODER_CMD` was issued.

Note: TRY_ENCODER_CMD should also be supported, likely via a standard helper
in v4l2-mem2mem.c.

> +
> +3. Once all OUTPUT buffers queued before ``V4L2_ENC_CMD_STOP`` are
> +   processed:
> +
> +   a. Once all decoded frames (if any) are ready to be dequeued on the
> +      CAPTURE queue, the driver must send a ``V4L2_EVENT_EOS``. The
> +      driver must also set ``V4L2_BUF_FLAG_LAST`` in
> +      :c:type:`v4l2_buffer` ``flags`` field on the buffer on the CAPTURE queue
> +      containing the last frame (if any) produced as a result of
> +      processing the OUTPUT buffers queued before
> +      ``V4L2_ENC_CMD_STOP``. If no more frames are left to be
> +      returned at the point of handling ``V4L2_ENC_CMD_STOP``, the
> +      driver must return an empty buffer (with
> +      :c:type:`v4l2_buffer` ``bytesused`` = 0) as the last buffer with
> +      ``V4L2_BUF_FLAG_LAST`` set instead.
> +      Any attempts to dequeue more buffers beyond the buffer
> +      marked with ``V4L2_BUF_FLAG_LAST`` will result in a -EPIPE
> +      error from :c:func:`VIDIOC_DQBUF`.
> +
> +4. At this point, encoding is paused and the driver will accept, but not
> +   process any newly queued OUTPUT buffers until the client issues
> +   ``V4L2_ENC_CMD_START`` or :c:func:`VIDIOC_STREAMON`.

STREAMON on which queue? Shouldn't there be a STREAMOFF first?

> +
> +Once the flush sequence is initiated, the client needs to drive it to
> +completion, as described by the above steps, unless it aborts the
> +process by issuing :c:func:`VIDIOC_STREAMOFF` on OUTPUT queue. The client is not
> +allowed to issue ``V4L2_ENC_CMD_START`` or ``V4L2_ENC_CMD_STOP`` again
> +while the flush sequence is in progress.
> +
> +Issuing :c:func:`VIDIOC_STREAMON` on OUTPUT queue will implicitly restart
> +encoding.

This feels wrong. Calling STREAMON on a queue that is already streaming does nothing
according to the spec. I think you should either call CMD_START or STREAMOFF/ON on
the OUTPUT queue. Of course, calling STREAMOFF first will dequeue any queued OUTPUT
buffers that were queued since ENC_CMD_STOP was called. But that's normal behavior
for STREAMOFF.

> + :c:func:`VIDIOC_STREAMON` and :c:func:`VIDIOC_STREAMOFF` on CAPTURE queue will
> +not affect the flush sequence, allowing the client to change CAPTURE
> +buffer set if needed.
> +
> +Commit points
> +-------------
> +
> +Setting formats and allocating buffers triggers changes in the behavior
> +of the driver.
> +
> +1. Setting format on CAPTURE queue may change the set of formats
> +   supported/advertised on the OUTPUT queue. It also must change the
> +   format currently selected on OUTPUT queue if it is not supported
> +   by the newly selected CAPTURE format to a supported one.
> +
> +2. Enumerating formats on OUTPUT queue must only return OUTPUT formats
> +   supported for the CAPTURE format currently set.
> +
> +3. Setting/changing format on OUTPUT queue does not change formats
> +   available on CAPTURE queue. An attempt to set OUTPUT format that
> +   is not supported for the currently selected CAPTURE format must
> +   result in an error (-EINVAL) from :c:func:`VIDIOC_S_FMT`.
> +
> +4. Enumerating formats on CAPTURE queue always returns a full set of
> +   supported coded formats, irrespective of the current format
> +   selected on OUTPUT queue.
> +
> +5. After allocating buffers on a queue, it is not possible to change
> +   format on it.
> +
> +In summary, the CAPTURE (coded format) queue is the master that governs
> +the set of supported formats for the OUTPUT queue.
> 

Regards,

	Hans
Philipp Zabel June 7, 2018, 10:32 a.m. UTC | #9
On Thu, 2018-06-07 at 09:27 +0200, Hans Verkuil wrote:
[...]
> > > > > I think it could be useful to enforce the same colorimetry on CAPTURE
> > > > > and OUTPUT queue if the hardware doesn't do any colorspace conversion.
> > > > 
> > > > After thinking a bit more on this, I guess it wouldn't overly
> > > > complicate things if we require that the values from OUTPUT queue are
> > > > copied to CAPTURE queue, if the stream doesn't include such
> > > > information or the hardware just can't parse them.
> > > 
> > > And for encoders it would be copied from CAPTURE queue to OUTPUT queue?
> > > 
> > 
> > I guess it would be from OUTPUT to CAPTURE for encoders as well, since
> > the colorimetry of OUTPUT is ultimately defined by the raw frames that
> > userspace is going to be feeding to the encoder.
> 
> Correct. All mem2mem drivers should just copy the colorimetry from the
> output buffers to the capture buffers, unless the decoder hardware is able to
> extract that data from the stream, in which case it can overwrite it for
> the capture buffer.
> 
> Currently colorspace converters are not supported since the V4L2 API does
> not provide a way to let userspace define colorimetry for the capture queue.

Oh, I never realized this limitation [1] ...

 "Image colorspace, from enum v4l2_colorspace. This information
  supplements the pixelformat and must be set by the driver for capture
  streams and by the application for output streams, see Colorspaces."

[1] https://linuxtv.org/downloads/v4l-dvb-apis-new/uapi/v4l/pixfmt-v4l2.html

It's just a bit unintuitive that the initialization sequence requires
setting S_FMT(CAP) first and then S_FMT(OUT), but with colorspace there is
information that flows the opposite way.

> I have a patch to add a new v4l2_format flag for that since forever, but
> since we do not have any drivers that can do this in the kernel it has never
> been upstreamed.

Has this patch been posted some time? I think we could add a mem2mem
device to imx-media with support for linear transformations.

regards
Philipp
Philipp Zabel June 7, 2018, 10:39 a.m. UTC | #10
On Thu, 2018-06-07 at 11:21 +0200, Hans Verkuil wrote:
[...]
> > +Encoder
> > +=======
[...]
> > +Initialization
> > +--------------
> > +
> > +1. (optional) Enumerate supported formats and resolutions. See
> > +   capability enumeration.
> > +
> > +2. Set a coded format on the CAPTURE queue via :c:func:`VIDIOC_S_FMT`
> > +
> > +   a. Required fields:
> > +
> > +      i.  type = CAPTURE
> > +
> > +      ii. fmt.pix_mp.pixelformat set to a coded format to be produced
> > +
> > +   b. Return values:
> > +
> > +      i.  EINVAL: unsupported format.
> 
> I'm still not sure about returning an error in this case.
>
> And what should TRY_FMT do?

Also the documentation currently states in [1]:

  Drivers should not return an error code unless the type field is
  invalid, this is a mechanism to fathom device capabilities and to
  approach parameters acceptable for both the application and driver.

[1] https://linuxtv.org/downloads/v4l-dvb-apis-new/uapi/v4l/vidioc-g-fmt.html

> Do you know what current codecs do? Return EINVAL or replace with a supported format?

At least coda replaces incorrect pixelformat with a supported format.

> It would be nice to standardize on one rule or another.
> 
> The spec says that it should always return a valid format, but not all drivers adhere
> to that. Perhaps we need to add a flag to let the driver signal the behavior of S_FMT
> to userspace.
> 
> This is a long-standing issue with S_FMT, actually.
> 
[...]
> > +Encoding parameter changes
> > +--------------------------
> > +
> > +The client is allowed to use :c:func:`VIDIOC_S_CTRL` to change encoder
> > +parameters at any time. The driver must apply the new setting starting
> > +at the next frame queued to it.
> > +
> > +This specifically means that if the driver maintains a queue of buffers
> > +to be encoded and at the time of the call to :c:func:`VIDIOC_S_CTRL` not all the
> > +buffers in the queue are processed yet, the driver must not apply the
> > +change immediately, but schedule it for when the next buffer queued
> > +after the :c:func:`VIDIOC_S_CTRL` starts being processed.
> 
> Is this what drivers do today? I thought it was applied immediately?
> This sounds like something for which you need the Request API.

coda currently doesn't support dynamically changing controls at all.

> > +
> > +Flush
> > +-----
> > +
> > +Flush is the process of draining the CAPTURE queue of any remaining
> > +buffers. After the flush sequence is complete, the client has received
> > +all encoded frames for all OUTPUT buffers queued before the sequence was
> > +started.
> > +
> > +1. Begin flush by issuing :c:func:`VIDIOC_ENCODER_CMD`.
> > +
> > +   a. Required fields:
> > +
> > +      i. cmd = ``V4L2_ENC_CMD_STOP``
> > +
> > +2. The driver must process and encode as normal all OUTPUT buffers
> > +   queued by the client before the :c:func:`VIDIOC_ENCODER_CMD` was issued.
> 
> Note: TRY_ENCODER_CMD should also be supported, likely via a standard helper
> in v4l2-mem2mem.c.

TRY_ENCODER_CMD can be used to check whether the hardware supports
things like V4L2_ENC_CMD_STOP_AT_GOP_END; I don't think this will be the
same for all codecs.

regards
Philipp
Hans Verkuil June 7, 2018, 10:54 a.m. UTC | #11
On 06/07/18 12:32, Philipp Zabel wrote:
> On Thu, 2018-06-07 at 09:27 +0200, Hans Verkuil wrote:
> [...]
>>>>>> I think it could be useful to enforce the same colorimetry on CAPTURE
>>>>>> and OUTPUT queue if the hardware doesn't do any colorspace conversion.
>>>>>
>>>>> After thinking a bit more on this, I guess it wouldn't overly
>>>>> complicate things if we require that the values from OUTPUT queue are
>>>>> copied to CAPTURE queue, if the stream doesn't include such
>>>>> information or the hardware just can't parse them.
>>>>
>>>> And for encoders it would be copied from CAPTURE queue to OUTPUT queue?
>>>>
>>>
>>> I guess it would be from OUTPUT to CAPTURE for encoders as well, since
>>> the colorimetry of OUTPUT is ultimately defined by the raw frames that
>>> userspace is going to be feeding to the encoder.
>>
>> Correct. All mem2mem drivers should just copy the colorimetry from the
>> output buffers to the capture buffers, unless the decoder hardware is able to
>> extract that data from the stream, in which case it can overwrite it for
>> the capture buffer.
>>
>> Currently colorspace converters are not supported since the V4L2 API does
>> not provide a way to let userspace define colorimetry for the capture queue.
> 
> Oh, I never realized this limitation [1] ...
> 
>  "Image colorspace, from enum v4l2_colorspace. This information
>   supplements the pixelformat and must be set by the driver for capture
>   streams and by the application for output streams, see Colorspaces."
> 
> [1] https://linuxtv.org/downloads/v4l-dvb-apis-new/uapi/v4l/pixfmt-v4l2.html
> 
> It's just a bit unintuitive that the initialization sequence requires
> setting S_FMT(CAP) first and then S_FMT(OUT), but with colorspace there is
> information that flows the opposite way.
> 
>> I have a patch to add a new v4l2_format flag for that since forever, but
>> since we do not have any drivers that can do this in the kernel it has never
>> been upstreamed.
> 
> Has this patch been posted some time? I think we could add a mem2mem
> device to imx-media with support for linear transformations.

I don't believe it's ever been posted.

It's here:

https://git.linuxtv.org/hverkuil/media_tree.git/commit/?h=csc&id=d0e588c1a36604538e16f24cad3444c84f5da73e

Regards,

	Hans
Philipp Zabel June 7, 2018, 11:02 a.m. UTC | #12
On Thu, 2018-06-07 at 12:54 +0200, Hans Verkuil wrote:
[...]
> > > I have a patch to add a new v4l2_format flag for that since forever, but
> > > since we do not have any drivers that can do this in the kernel it has never
> > > been upstreamed.
> > 
> > Has this patch been posted some time? I think we could add a mem2mem
> > device to imx-media with support for linear transformations.
> 
> I don't believe it's ever been posted.
> 
> It's here:
> 
> https://git.linuxtv.org/hverkuil/media_tree.git/commit/?h=csc&id=d0e588c1a36604538e16f24cad3444c84f5da73e

Thanks!

regards
Philipp
Tomasz Figa June 11, 2018, 7:49 a.m. UTC | #13
Hi Hans,

On Thu, Jun 7, 2018 at 6:21 PM Hans Verkuil <hverkuil@xs4all.nl> wrote:
>
> On 06/05/2018 12:33 PM, Tomasz Figa wrote:
[snip]
> > +Initialization
> > +--------------
> > +
> > +1. (optional) Enumerate supported formats and resolutions. See
> > +   capability enumeration.
> > +
> > +2. Set a coded format on the CAPTURE queue via :c:func:`VIDIOC_S_FMT`
> > +
> > +   a. Required fields:
> > +
> > +      i.  type = CAPTURE
> > +
> > +      ii. fmt.pix_mp.pixelformat set to a coded format to be produced
> > +
> > +   b. Return values:
> > +
> > +      i.  EINVAL: unsupported format.
>
> I'm still not sure about returning an error in this case.
>
> And what should TRY_FMT do?
>
> Do you know what current codecs do? Return EINVAL or replace with a supported format?
>

s5p-mfc returns -EINVAL, while mtk-vcodec and coda seem to fall back
to current format.

> It would be nice to standardize on one rule or another.
>
> The spec says that it should always return a valid format, but not all drivers adhere
> to that. Perhaps we need to add a flag to let the driver signal the behavior of S_FMT
> to userspace.
>
> This is a long-standing issue with S_FMT, actually.
>

Agreed. I'd personally prefer agreeing on one pattern to simplify
things. I generally don't like the "negotiation hell" that the
fallback introduces, but with the general documentation clearly
stating such behavior, I'd be worried that returning an error actually
breaks some userspace.

[snip]
> > +Encoding
> > +--------
> > +
> > +This state is reached after a successful initialization sequence. In
> > +this state, client queues and dequeues buffers to both queues via
> > +:c:func:`VIDIOC_QBUF` and :c:func:`VIDIOC_DQBUF`, as per spec.
> > +
> > +Both queues operate independently. The client may queue and dequeue
> > +buffers to queues in any order and at any rate, also at a rate different
> > +for each queue. The client may queue buffers within the same queue in
> > +any order (V4L2 index-wise).
>
> I'd drop the whole 'in any order' in the text above. This has always been
> the case, and I think it is only confusing. I think what you really want
> to say is that both queues operate independently and quite possibly at
> different rates. So clients should operate them independently as well.

I think there are several different "in any order" notions in play:

1) the order of buffers being dequeued not matching the order of
queuing the buffers in the same queue (e.g. encoder keeping some
buffers as reference framebuffers)

2) possible difference in order of queuing raw frames to encoder
OUTPUT and dequeuing encoded bitstream from encoder CAPTURE,

3) the order of handling the queue/dequeue operations on both queues,
i.e. dequeue OUTPUT, queue OUTPUT, dequeue CAPTURE, queue CAPTURE,

4) the order of queuing buffers (indices) within the queue being up to
the client - this has always been the case indeed. The extra bit here
is that this keeps being true, even with having 2 queues.

I believe the text above refers to 3) and 4). I guess we can drop 4)
indeed and we actually may want to clearly state 1) and 2).

By the way, do we already have some place in the documentation that
mentions the bits that have always been the case? We could refer to it
instead.

>
> > + It is recommended for the client to operate
> > +the queues independently for best performance.
> > +
> > +Source OUTPUT buffers must contain full raw frames in the selected
> > +OUTPUT format, exactly one frame per buffer.
> > +
> > +Encoding parameter changes
> > +--------------------------
> > +
> > +The client is allowed to use :c:func:`VIDIOC_S_CTRL` to change encoder
> > +parameters at any time. The driver must apply the new setting starting
> > +at the next frame queued to it.
> > +
> > +This specifically means that if the driver maintains a queue of buffers
> > +to be encoded and at the time of the call to :c:func:`VIDIOC_S_CTRL` not all the
> > +buffers in the queue are processed yet, the driver must not apply the
> > +change immediately, but schedule it for when the next buffer queued
> > +after the :c:func:`VIDIOC_S_CTRL` starts being processed.
>
> Is this what drivers do today? I thought it was applied immediately?
> This sounds like something for which you need the Request API.

mtk-vcodec seems to implement the above, while s5p-mfc, coda, venus
don't seem to be doing so.

> > +
> > +Flush
> > +-----
> > +
> > +Flush is the process of draining the CAPTURE queue of any remaining
> > +buffers. After the flush sequence is complete, the client has received
> > +all encoded frames for all OUTPUT buffers queued before the sequence was
> > +started.
> > +
> > +1. Begin flush by issuing :c:func:`VIDIOC_ENCODER_CMD`.
> > +
> > +   a. Required fields:
> > +
> > +      i. cmd = ``V4L2_ENC_CMD_STOP``
> > +
> > +2. The driver must process and encode as normal all OUTPUT buffers
> > +   queued by the client before the :c:func:`VIDIOC_ENCODER_CMD` was issued.
>
> Note: TRY_ENCODER_CMD should also be supported, likely via a standard helper
> in v4l2-mem2mem.c.

Ack.

>
> > +
> > +3. Once all OUTPUT buffers queued before ``V4L2_ENC_CMD_STOP`` are
> > +   processed:
> > +
> > +   a. Once all decoded frames (if any) are ready to be dequeued on the
> > +      CAPTURE queue, the driver must send a ``V4L2_EVENT_EOS``. The
> > +      driver must also set ``V4L2_BUF_FLAG_LAST`` in
> > +      :c:type:`v4l2_buffer` ``flags`` field on the buffer on the CAPTURE queue
> > +      containing the last frame (if any) produced as a result of
> > +      processing the OUTPUT buffers queued before
> > +      ``V4L2_ENC_CMD_STOP``. If no more frames are left to be
> > +      returned at the point of handling ``V4L2_ENC_CMD_STOP``, the
> > +      driver must return an empty buffer (with
> > +      :c:type:`v4l2_buffer` ``bytesused`` = 0) as the last buffer with
> > +      ``V4L2_BUF_FLAG_LAST`` set instead.
> > +      Any attempts to dequeue more buffers beyond the buffer
> > +      marked with ``V4L2_BUF_FLAG_LAST`` will result in a -EPIPE
> > +      error from :c:func:`VIDIOC_DQBUF`.
> > +
> > +4. At this point, encoding is paused and the driver will accept, but not
> > +   process any newly queued OUTPUT buffers until the client issues
> > +   ``V4L2_ENC_CMD_START`` or :c:func:`VIDIOC_STREAMON`.
>
> STREAMON on which queue? Shouldn't there be a STREAMOFF first?
>

Yes, STREAMOFF is implied here, since it's not possible to STREAMON on
an already streaming queue. It is mentioned only because the general
encoder command documentation states that STREAMON includes an
implicit START command.

> > +
> > +Once the flush sequence is initiated, the client needs to drive it to
> > +completion, as described by the above steps, unless it aborts the
> > +process by issuing :c:func:`VIDIOC_STREAMOFF` on OUTPUT queue. The client is not
> > +allowed to issue ``V4L2_ENC_CMD_START`` or ``V4L2_ENC_CMD_STOP`` again
> > +while the flush sequence is in progress.
> > +
> > +Issuing :c:func:`VIDIOC_STREAMON` on OUTPUT queue will implicitly restart
> > +encoding.
>
> This feels wrong. Calling STREAMON on a queue that is already streaming does nothing
> according to the spec.

That would contradict
https://linuxtv.org/downloads/v4l-dvb-apis/uapi/v4l/vidioc-encoder-cmd.html,
which says that

  "A read() or VIDIOC_STREAMON call sends an implicit START command to
the encoder if it has not been started yet."

> I think you should either call CMD_START or STREAMOFF/ON on
> the OUTPUT queue. Of course, calling STREAMOFF first will dequeue any queued OUTPUT
> buffers that were queued since ENC_CMD_STOP was called. But that's normal behavior
> for STREAMOFF.

I think it would be kind of inconsistent for userspace, since STREAMON
would return 0 in both cases, but it would issue the implicit START
only if STREAMOFF was called before. Perhaps it could be made saner by
saying that "STREAMOFF resets the stop condition and thus encoding
would continue normally after a matching STREAMON".

Best regards,
Tomasz

diff --git a/Documentation/media/uapi/v4l/dev-codec.rst b/Documentation/media/uapi/v4l/dev-codec.rst
index 0483b10c205e..325a51bb09df 100644
--- a/Documentation/media/uapi/v4l/dev-codec.rst
+++ b/Documentation/media/uapi/v4l/dev-codec.rst
@@ -805,3 +805,316 @@  of the driver.
 To summarize, setting formats and allocation must always start with the
 OUTPUT queue and the OUTPUT queue is the master that governs the set of
 supported formats for the CAPTURE queue.
+
+Encoder
+=======
+
+Querying capabilities
+---------------------
+
+1. To enumerate the set of coded formats supported by the driver, the
+   client uses :c:func:`VIDIOC_ENUM_FMT` on the CAPTURE queue. The driver
+   must always return the full set of supported formats, irrespective of
+   the format currently set on the OUTPUT queue.
+
+2. To enumerate the set of supported raw formats, the client uses
+   :c:func:`VIDIOC_ENUM_FMT` on the OUTPUT queue. The driver must return
+   only the formats supported for the format currently set on the
+   CAPTURE queue.
+   In order to enumerate the raw formats supported by a given coded
+   format, the client must first set that coded format on the
+   CAPTURE queue and then enumerate the OUTPUT queue.
+
+3. The client may use :c:func:`VIDIOC_ENUM_FRAMESIZES` to detect supported
+   resolutions for a given format, passing its fourcc in
+   :c:type:`v4l2_frmsizeenum` ``pixel_format``.
+
+   a. Values returned from :c:func:`VIDIOC_ENUM_FRAMESIZES` for a coded
+      format must be maximums for that coded format across all supported
+      raw formats.
+
+   b. Values returned from :c:func:`VIDIOC_ENUM_FRAMESIZES` for a raw
+      format must be maximums for that raw format across all supported
+      coded formats.
+
+   c. The client should derive the supported resolutions for a
+      combination of coded and raw format by calculating the
+      intersection of the resolutions returned from calls to
+      :c:func:`VIDIOC_ENUM_FRAMESIZES` for the given coded and raw formats.
+
+4. Supported profiles and levels for a given format, if applicable, may be
+   queried using their respective controls via :c:func:`VIDIOC_QUERYCTRL`.
+
+5. The client may use :c:func:`VIDIOC_ENUM_FRAMEINTERVALS` to enumerate
+   the maximum frame rates supported by the driver/hardware for a given
+   format and resolution combination.
+
+6. Any additional encoder capabilities may be discovered by querying
+   their respective controls.
+
+.. note::
+
+   Full format enumeration requires enumerating all raw formats
+   on the OUTPUT queue for all possible (enumerated) coded formats on
+   the CAPTURE queue (setting each format on the CAPTURE queue before
+   each enumeration on the OUTPUT queue).
+
+Initialization
+--------------
+
+1. (optional) Enumerate supported formats and resolutions. See
+   capability enumeration.
+
+2. Set a coded format on the CAPTURE queue via :c:func:`VIDIOC_S_FMT`
+
+   a. Required fields:
+
+      i.  type = CAPTURE
+
+      ii. fmt.pix_mp.pixelformat set to a coded format to be produced
+
+   b. Return values:
+
+      i.  EINVAL: unsupported format.
+
+      ii. Others: per spec
+
+   c. Return fields:
+
+      i. fmt.pix_mp.width, fmt.pix_mp.height should be 0.
+
+   .. note::
+
+      After a coded format is set, the set of raw formats
+      supported as source on the OUTPUT queue may change.
+
+3. (optional) Enumerate supported OUTPUT formats (raw formats for
+   source) for the selected coded format via :c:func:`VIDIOC_ENUM_FMT`.
+
+   a. Required fields:
+
+      i.  type = OUTPUT
+
+      ii. index = per spec
+
+   b. Return values: per spec
+
+   c. Return fields:
+
+      i. pixelformat: raw format supported for the coded format
+         currently selected on the OUTPUT queue.
+
+4. Set a raw format on the OUTPUT queue and visible resolution for the
+   source raw frames via :c:func:`VIDIOC_S_FMT` on the OUTPUT queue.
+
+   a. Required fields:
+
+      i.   type = OUTPUT
+
+      ii.  fmt.pix_mp.pixelformat = raw format to be used as source of
+           encode
+
+      iii. fmt.pix_mp.width, fmt.pix_mp.height = input resolution
+           for the source raw frames
+
+      iv.  num_planes: set to number of planes for pixelformat.
+
+      v.   For each plane p = [0, num_planes-1]:
+           plane_fmt[p].sizeimage, plane_fmt[p].bytesperline: as
+           per spec for input resolution.
+
+   b. Return values: as per spec.
+
+   c. Return fields:
+
+      i.  fmt.pix_mp.width, fmt.pix_mp.height = may be adjusted by
+          driver to match alignment requirements, as required by the
+          currently selected formats.
+
+      ii. For each plane p = [0, num_planes-1]:
+          plane_fmt[p].sizeimage, plane_fmt[p].bytesperline: as
+          per spec for the adjusted input resolution.
+
+   d. Setting the input resolution will reset visible resolution to the
+      adjusted input resolution rounded up to the closest visible
+      resolution supported by the driver. Similarly, coded size will
+      be reset to input resolution rounded up to the closest coded
+      resolution supported by the driver (typically a multiple of
+      macroblock size).
+
+5. (optional) Set visible size for the stream metadata via
+   :c:func:`VIDIOC_S_SELECTION` on the OUTPUT queue.
+
+   a. Required fields:
+
+      i.   type = OUTPUT
+
+      ii.  target = ``V4L2_SEL_TGT_CROP``
+
+      iii. r.left, r.top, r.width, r.height: visible rectangle; this
+           must fit within coded resolution returned from
+           :c:func:`VIDIOC_S_FMT`.
+
+   b. Return values: as per spec.
+
+   c. Return fields:
+
+      i. r.left, r.top, r.width, r.height: visible rectangle adjusted by
+         the driver to match internal constraints.
+
+   d. This resolution must be used as the visible resolution in the
+      stream metadata.
+
+   .. note::
+
+      The driver might not support arbitrary values of the
+      crop rectangle and will adjust it to the closest supported
+      one.
+
+6. Allocate buffers for both OUTPUT and CAPTURE queues via
+   :c:func:`VIDIOC_REQBUFS`. This may be performed in any order.
+
+   a. Required fields:
+
+      i.   count = n, where n > 0.
+
+      ii.  type = OUTPUT or CAPTURE
+
+      iii. memory = as per spec
+
+   b. Return values: Per spec.
+
+   c. Return fields:
+
+      i. count: adjusted to allocated number of buffers
+
+   d. The driver must adjust count to at least the minimum number of
+      buffers required for the given format. The client must check this
+      value after the ioctl returns to get the number of buffers
+      actually allocated.
+
+   .. note::
+
+      Passing count = 1 is useful for letting the driver choose the
+      minimum according to the selected format/hardware
+      requirements.
+
+   .. note::
+
+      To allocate more than the minimum number of buffers (for pipeline
+      depth), use G_CTRL(``V4L2_CID_MIN_BUFFERS_FOR_OUTPUT``) or
+      G_CTRL(``V4L2_CID_MIN_BUFFERS_FOR_CAPTURE``), respectively,
+      to get the minimum number of buffers required by the
+      driver/format, and pass the obtained value plus the number of
+      additional buffers needed in the count field to :c:func:`VIDIOC_REQBUFS`.
+
+7. Begin streaming on both OUTPUT and CAPTURE queues via
+   :c:func:`VIDIOC_STREAMON`. This may be performed in any order.
+
+Encoding
+--------
+
+This state is reached after a successful initialization sequence. In
+this state, the client queues and dequeues buffers on both queues via
+:c:func:`VIDIOC_QBUF` and :c:func:`VIDIOC_DQBUF`, as per spec.
+
+Both queues operate independently. The client may queue and dequeue
+buffers on each queue in any order and at any rate, including a
+different rate for each queue. The client may queue buffers within the
+same queue in any order (V4L2 index-wise). Operating the queues
+independently is recommended for best performance.
+
+Source OUTPUT buffers must contain full raw frames in the selected
+OUTPUT format, exactly one frame per buffer.
+
+Encoding parameter changes
+--------------------------
+
+The client is allowed to use :c:func:`VIDIOC_S_CTRL` to change encoder
+parameters at any time. The driver must apply the new setting starting
+at the next frame queued to it.
+
+This specifically means that if the driver maintains a queue of buffers
+to be encoded and at the time of the call to :c:func:`VIDIOC_S_CTRL` not all the
+buffers in the queue are processed yet, the driver must not apply the
+change immediately, but schedule it for when the next buffer queued
+after the :c:func:`VIDIOC_S_CTRL` starts being processed.
+
+Flush
+-----
+
+Flush is the process of draining the CAPTURE queue of any remaining
+buffers. After the flush sequence is complete, the client has received
+all encoded frames for all OUTPUT buffers queued before the sequence was
+started.
+
+1. Begin flush by issuing :c:func:`VIDIOC_ENCODER_CMD`.
+
+   a. Required fields:
+
+      i. cmd = ``V4L2_ENC_CMD_STOP``
+
+2. The driver must process and encode as normal all OUTPUT buffers
+   queued by the client before the :c:func:`VIDIOC_ENCODER_CMD` was issued.
+
+3. Once all OUTPUT buffers queued before ``V4L2_ENC_CMD_STOP`` are
+   processed:
+
+   a. Once all encoded frames (if any) are ready to be dequeued on the
+      CAPTURE queue, the driver must send a ``V4L2_EVENT_EOS``. The
+      driver must also set ``V4L2_BUF_FLAG_LAST`` in
+      :c:type:`v4l2_buffer` ``flags`` field on the buffer on the CAPTURE queue
+      containing the last frame (if any) produced as a result of
+      processing the OUTPUT buffers queued before
+      ``V4L2_ENC_CMD_STOP``. If no more frames are left to be
+      returned at the point of handling ``V4L2_ENC_CMD_STOP``, the
+      driver must return an empty buffer (with
+      :c:type:`v4l2_buffer` ``bytesused`` = 0) as the last buffer with
+      ``V4L2_BUF_FLAG_LAST`` set instead.
+      Any attempts to dequeue more buffers beyond the buffer
+      marked with ``V4L2_BUF_FLAG_LAST`` will result in a -EPIPE
+      error from :c:func:`VIDIOC_DQBUF`.
+
+4. At this point, encoding is paused and the driver will accept, but not
+   process, any newly queued OUTPUT buffers until the client issues
+   ``V4L2_ENC_CMD_START`` or :c:func:`VIDIOC_STREAMON`.
+
+Once the flush sequence is initiated, the client needs to drive it to
+completion, as described by the steps above, unless it aborts the
+process by issuing :c:func:`VIDIOC_STREAMOFF` on the OUTPUT queue. The
+client is not allowed to issue ``V4L2_ENC_CMD_START`` or
+``V4L2_ENC_CMD_STOP`` again while the flush sequence is in progress.
+
+Issuing :c:func:`VIDIOC_STREAMON` on the OUTPUT queue will implicitly
+restart encoding. :c:func:`VIDIOC_STREAMON` and :c:func:`VIDIOC_STREAMOFF`
+on the CAPTURE queue will not affect the flush sequence, allowing the
+client to change the CAPTURE buffer set if needed.
+
+Commit points
+-------------
+
+Setting formats and allocating buffers triggers changes in the behavior
+of the driver.
+
+1. Setting the format on the CAPTURE queue may change the set of formats
+   supported/advertised on the OUTPUT queue. It also must change the
+   format currently selected on the OUTPUT queue, if that format is not
+   supported by the newly selected CAPTURE format, to a supported one.
+
+2. Enumerating formats on OUTPUT queue must only return OUTPUT formats
+   supported for the CAPTURE format currently set.
+
+3. Setting/changing the format on the OUTPUT queue does not change the
+   formats available on the CAPTURE queue. An attempt to set an OUTPUT
+   format that is not supported for the currently selected CAPTURE
+   format must result in an error (-EINVAL) from :c:func:`VIDIOC_S_FMT`.
+
+4. Enumerating formats on the CAPTURE queue always returns the full set
+   of supported coded formats, irrespective of the format currently
+   selected on the OUTPUT queue.
+
+5. After allocating buffers on a queue, it is not possible to change
+   format on it.
+
+In summary, the CAPTURE (coded format) queue is the master that governs
+the set of supported formats for the OUTPUT queue.