
[1/2] media: docs-rst: Document memory-to-memory video decoder interface

Message ID 20180724140621.59624-2-tfiga@chromium.org (mailing list archive)
State New, archived
Series Document memory-to-memory video codec interfaces

Commit Message

Tomasz Figa July 24, 2018, 2:06 p.m. UTC
Due to complexity of the video decoding process, the V4L2 drivers of
stateful decoder hardware require specific sequences of V4L2 API calls
to be followed. These include capability enumeration, initialization,
decoding, seek, pause, dynamic resolution change, drain and end of
stream.

Specifics of the above have been discussed during Media Workshops at
LinuxCon Europe 2012 in Barcelona and then later Embedded Linux
Conference Europe 2014 in Düsseldorf. The de facto Codec API that
originated at those events was later implemented by the drivers we already
have merged in mainline, such as s5p-mfc or coda.

The only thing missing was the real specification included as a part of
Linux Media documentation. Fix it now and document the decoder part of
the Codec API.

Signed-off-by: Tomasz Figa <tfiga@chromium.org>
---
 Documentation/media/uapi/v4l/dev-decoder.rst | 872 +++++++++++++++++++
 Documentation/media/uapi/v4l/devices.rst     |   1 +
 Documentation/media/uapi/v4l/v4l2.rst        |  10 +-
 3 files changed, 882 insertions(+), 1 deletion(-)
 create mode 100644 Documentation/media/uapi/v4l/dev-decoder.rst

Comments

Hans Verkuil July 25, 2018, 11:58 a.m. UTC | #1
Hi Tomasz,

Many, many thanks for working on this! It's a great document and when done
it will be very useful indeed.

Review comments follow...

On 24/07/18 16:06, Tomasz Figa wrote:
> Due to complexity of the video decoding process, the V4L2 drivers of
> stateful decoder hardware require specific sequences of V4L2 API calls
> to be followed. These include capability enumeration, initialization,
> decoding, seek, pause, dynamic resolution change, drain and end of
> stream.
> 
> Specifics of the above have been discussed during Media Workshops at
> LinuxCon Europe 2012 in Barcelona and then later Embedded Linux
> Conference Europe 2014 in Düsseldorf. The de facto Codec API that
> originated at those events was later implemented by the drivers we already
> have merged in mainline, such as s5p-mfc or coda.
> 
> The only thing missing was the real specification included as a part of
> Linux Media documentation. Fix it now and document the decoder part of
> the Codec API.
> 
> Signed-off-by: Tomasz Figa <tfiga@chromium.org>
> ---
>  Documentation/media/uapi/v4l/dev-decoder.rst | 872 +++++++++++++++++++
>  Documentation/media/uapi/v4l/devices.rst     |   1 +
>  Documentation/media/uapi/v4l/v4l2.rst        |  10 +-
>  3 files changed, 882 insertions(+), 1 deletion(-)
>  create mode 100644 Documentation/media/uapi/v4l/dev-decoder.rst
> 
> diff --git a/Documentation/media/uapi/v4l/dev-decoder.rst b/Documentation/media/uapi/v4l/dev-decoder.rst
> new file mode 100644
> index 000000000000..f55d34d2f860
> --- /dev/null
> +++ b/Documentation/media/uapi/v4l/dev-decoder.rst
> @@ -0,0 +1,872 @@
> +.. -*- coding: utf-8; mode: rst -*-
> +
> +.. _decoder:
> +
> +****************************************
> +Memory-to-memory Video Decoder Interface
> +****************************************
> +
> +Input data to a video decoder are buffers containing unprocessed video
> +stream (e.g. Annex-B H.264/HEVC stream, raw VP8/9 stream). The driver is
> +expected not to require any additional information from the client to
> +process these buffers. Output data are raw video frames returned in display
> +order.
> +
> +Performing software parsing, processing etc. of the stream in the driver
> +in order to support this interface is strongly discouraged. In case such
> +operations are needed, use of the Stateless Video Decoder Interface (in
> +development) is strongly advised.
> +
> +Conventions and notation used in this document
> +==============================================
> +
> +1. The general V4L2 API rules apply if not specified in this document
> +   otherwise.
> +
> +2. The meaning of words “must”, “may”, “should”, etc. is as per RFC
> +   2119.
> +
> +3. All steps not marked “optional” are required.
> +
> +4. :c:func:`VIDIOC_G_EXT_CTRLS`, :c:func:`VIDIOC_S_EXT_CTRLS` may be used
> +   interchangeably with :c:func:`VIDIOC_G_CTRL`, :c:func:`VIDIOC_S_CTRL`,
> +   unless specified otherwise.
> +
> +5. Single-plane API (see spec) and applicable structures may be used
> +   interchangeably with Multi-plane API, unless specified otherwise,
> +   depending on driver capabilities and following the general V4L2
> +   guidelines.
> +
> +6. i = [a..b]: sequence of integers from a to b, inclusive, i.e. i =
> +   [0..2]: i = 0, 1, 2.
> +
> +7. For ``OUTPUT`` buffer A, A’ represents a buffer on the ``CAPTURE`` queue
> +   containing data (decoded frame/stream) that resulted from processing
> +   buffer A.
> +
> +Glossary
> +========
> +
> +CAPTURE
> +   the destination buffer queue; the queue of buffers containing decoded
> +   frames; ``V4L2_BUF_TYPE_VIDEO_CAPTURE`` or
> +   ``V4L2_BUF_TYPE_VIDEO_CAPTURE_MPLANE``; data are captured from the
> +   hardware into ``CAPTURE`` buffers
> +
> +client
> +   application client communicating with the driver implementing this API
> +
> +coded format
> +   encoded/compressed video bitstream format (e.g. H.264, VP8, etc.); see
> +   also: raw format
> +
> +coded height
> +   height for given coded resolution
> +
> +coded resolution
> +   stream resolution in pixels aligned to codec and hardware requirements;
> +   typically visible resolution rounded up to full macroblocks;
> +   see also: visible resolution
> +
> +coded width
> +   width for given coded resolution
> +
> +decode order
> +   the order in which frames are decoded; may differ from display order if
> +   coded format includes a feature of frame reordering; ``OUTPUT`` buffers
> +   must be queued by the client in decode order
> +
> +destination
> +   data resulting from the decode process; ``CAPTURE``
> +
> +display order
> +   the order in which frames must be displayed; ``CAPTURE`` buffers must be
> +   returned by the driver in display order
> +
> +DPB
> +   Decoded Picture Buffer; a H.264 term for a buffer that stores a picture

a H.264 -> an H.264

> +   that is encoded or decoded and available for reference in further
> +   decode/encode steps.
> +
> +EOS
> +   end of stream
> +
> +IDR
> +   a type of a keyframe in H.264-encoded stream, which clears the list of
> +   earlier reference frames (DPBs)

You do not actually say what IDR stands for. Can you add that?

> +
> +keyframe
> +   an encoded frame that does not reference frames decoded earlier, i.e.
> +   can be decoded fully on its own.
> +
> +OUTPUT
> +   the source buffer queue; the queue of buffers containing encoded
> +   bitstream; ``V4L2_BUF_TYPE_VIDEO_OUTPUT`` or
> +   ``V4L2_BUF_TYPE_VIDEO_OUTPUT_MPLANE``; the hardware is fed with data
> +   from ``OUTPUT`` buffers
> +
> +PPS
> +   Picture Parameter Set; a type of metadata entity in H.264 bitstream
> +
> +raw format
> +   uncompressed format containing raw pixel data (e.g. YUV, RGB formats)
> +
> +resume point
> +   a point in the bitstream from which decoding may start/continue, without
> +   any previous state/data present, e.g.: a keyframe (VP8/VP9) or
> +   SPS/PPS/IDR sequence (H.264); a resume point is required to start decode
> +   of a new stream, or to resume decoding after a seek
> +
> +source
> +   data fed to the decoder; ``OUTPUT``
> +
> +SPS
> +   Sequence Parameter Set; a type of metadata entity in H.264 bitstream
> +
> +visible height
> +   height for given visible resolution; display height
> +
> +visible resolution
> +   stream resolution of the visible picture, in pixels, to be used for
> +   display purposes; must be smaller than or equal to the coded
> +   display resolution
> +
> +visible width
> +   width for given visible resolution; display width
> +
> +Querying capabilities
> +=====================
> +
> +1. To enumerate the set of coded formats supported by the driver, the
> +   client may call :c:func:`VIDIOC_ENUM_FMT` on ``OUTPUT``.
> +
> +   * The driver must always return the full set of supported formats,
> +     irrespective of the format set on the ``CAPTURE``.
> +
> +2. To enumerate the set of supported raw formats, the client may call
> +   :c:func:`VIDIOC_ENUM_FMT` on ``CAPTURE``.
> +
> +   * The driver must return only the formats supported for the format
> +     currently active on ``OUTPUT``.
> +
> +   * In order to enumerate raw formats supported by a given coded format,
> +     the client must first set that coded format on ``OUTPUT`` and then
> +     enumerate the ``CAPTURE`` queue.
> +
> +3. The client may use :c:func:`VIDIOC_ENUM_FRAMESIZES` to detect supported
> +   resolutions for a given format, passing desired pixel format in
> +   :c:type:`v4l2_frmsizeenum` ``pixel_format``.
> +
> +   * Values returned by :c:func:`VIDIOC_ENUM_FRAMESIZES` on ``OUTPUT``
> +     must include all possible coded resolutions supported by the decoder
> +     for given coded pixel format.

This is confusing. Since VIDIOC_ENUM_FRAMESIZES does not have a buffer type
argument you cannot say 'on OUTPUT'. I would remove 'on OUTPUT' entirely.

> +
> +   * Values returned by :c:func:`VIDIOC_ENUM_FRAMESIZES` on ``CAPTURE``

Ditto for 'on CAPTURE'

> +     must include all possible frame buffer resolutions supported by the
> +     decoder for given raw pixel format and coded format currently set on
> +     ``OUTPUT``.
> +
> +    .. note::
> +
> +       The client may derive the supported resolution range for a
> +       combination of coded and raw format by setting width and height of
> +       ``OUTPUT`` format to 0 and calculating the intersection of
> +       resolutions returned from calls to :c:func:`VIDIOC_ENUM_FRAMESIZES`
> +       for the given coded and raw formats.

So if the output format is set to 1280x720, then ENUM_FRAMESIZES would just
return 1280x720 as the resolution. If the output format is set to 0x0, then
it returns the full range it is capable of.

Correct?

If so, then I think this needs to be a bit more explicit. I had to think about
it a bit.

Note that the v4l2_pix_format/S_FMT documentation needs to be updated as well
since we never allowed 0x0 before.

What if you set the format to 0x0 but the stream does not have meta data with
the resolution? How does userspace know if 0x0 is allowed or not? If this is
specific to the chosen coded pixel format, should we add a new flag for those
formats indicating that the coded data contains resolution information?

That way userspace knows if 0x0 can be used, and the driver can reject 0x0
for formats that do not support it.

> +
> +4. Supported profiles and levels for given format, if applicable, may be
> +   queried using their respective controls via :c:func:`VIDIOC_QUERYCTRL`.
> +
> +Initialization
> +==============
> +
> +1. *[optional]* Enumerate supported ``OUTPUT`` formats and resolutions. See
> +   capability enumeration.

capability enumeration. -> 'Querying capabilities' above.

> +
> +2. Set the coded format on ``OUTPUT`` via :c:func:`VIDIOC_S_FMT`
> +
> +   * **Required fields:**
> +
> +     ``type``
> +         a ``V4L2_BUF_TYPE_*`` enum appropriate for ``OUTPUT``
> +
> +     ``pixelformat``
> +         a coded pixel format
> +
> +     ``width``, ``height``
> +         required only if they cannot be parsed from the stream for the given
> +         coded format; optional otherwise - set to zero to ignore
> +
> +     other fields
> +         follow standard semantics
> +
> +   * For coded formats including stream resolution information, if width
> +     and height are set to non-zero values, the driver will propagate the
> +     resolution to ``CAPTURE`` and signal a source change event
> +     instantly. However, after the decoder is done parsing the
> +     information embedded in the stream, it will update ``CAPTURE``
> +     format with new values and signal a source change event again, if
> +     the values do not match.
> +
> +   .. note::
> +
> +      Changing ``OUTPUT`` format may change currently set ``CAPTURE``

change -> change the

> +      format. The driver will derive a new ``CAPTURE`` format from

from -> from the

> +      ``OUTPUT`` format being set, including resolution, colorimetry
> +      parameters, etc. If the client needs a specific ``CAPTURE`` format,
> +      it must adjust it afterwards.
> +
> +3.  *[optional]* Get minimum number of buffers required for ``OUTPUT``
> +    queue via :c:func:`VIDIOC_G_CTRL`. This is useful if client intends to

client -> the client

> +    use more buffers than minimum required by hardware/format.

than -> than the

> +
> +    * **Required fields:**
> +
> +      ``id``
> +          set to ``V4L2_CID_MIN_BUFFERS_FOR_OUTPUT``
> +
> +    * **Return fields:**
> +
> +      ``value``
> +          required number of ``OUTPUT`` buffers for the currently set
> +          format
> +
> +4.  Allocate source (bitstream) buffers via :c:func:`VIDIOC_REQBUFS` on
> +    ``OUTPUT``.
> +
> +    * **Required fields:**
> +
> +      ``count``
> +          requested number of buffers to allocate; greater than zero
> +
> +      ``type``
> +          a ``V4L2_BUF_TYPE_*`` enum appropriate for ``OUTPUT``
> +
> +      ``memory``
> +          follows standard semantics
> +
> +      ``sizeimage``
> +          follows standard semantics; the client is free to choose any
> +          suitable size, however, it may be subject to change by the
> +          driver
> +
> +    * **Return fields:**
> +
> +      ``count``
> +          actual number of buffers allocated
> +
> +    * The driver must adjust count to minimum of required number of
> +      ``OUTPUT`` buffers for given format and count passed. The client must
> +      check this value after the ioctl returns to get the number of
> +      buffers allocated.
> +
> +    .. note::
> +
> +       To allocate more than minimum number of buffers (for pipeline
> +       depth), use G_CTRL(``V4L2_CID_MIN_BUFFERS_FOR_OUTPUT``) to
> +       get minimum number of buffers required by the driver/format,
> +       and pass the obtained value plus the number of additional
> +       buffers needed in count to :c:func:`VIDIOC_REQBUFS`.
> +
> +5.  Start streaming on ``OUTPUT`` queue via :c:func:`VIDIOC_STREAMON`.
> +
> +6.  This step only applies to coded formats that contain resolution
> +    information in the stream. Continue queuing/dequeuing bitstream
> +    buffers to/from the ``OUTPUT`` queue via :c:func:`VIDIOC_QBUF` and
> +    :c:func:`VIDIOC_DQBUF`. The driver must keep processing and returning
> +    each buffer to the client until the metadata required to configure
> +    the ``CAPTURE`` queue is found. This is indicated by the driver sending
> +    a ``V4L2_EVENT_SOURCE_CHANGE`` event with
> +    ``V4L2_EVENT_SRC_CH_RESOLUTION`` source change type. There is no
> +    requirement to pass enough data for this to occur in the first buffer
> +    and the driver must be able to process any number of buffers.
> +
> +    * If data in a buffer that triggers the event is required to decode
> +      the first frame, the driver must not return it to the client,
> +      but must retain it for further decoding.
> +
> +    * If the client set width and height of ``OUTPUT`` format to 0, calling
> +      :c:func:`VIDIOC_G_FMT` on the ``CAPTURE`` queue will return -EPERM,
> +      until the driver configures ``CAPTURE`` format according to stream
> +      metadata.
> +
> +    * If the client subscribes to ``V4L2_EVENT_SOURCE_CHANGE`` events and
> +      the event is signaled, the decoding process will not continue until
> +      it is acknowledged by either (re-)starting streaming on ``CAPTURE``,
> +      or via :c:func:`VIDIOC_DECODER_CMD` with ``V4L2_DEC_CMD_START``
> +      command.
> +
> +    .. note::
> +
> +       No decoded frames are produced during this phase.
> +
> +7.  This step only applies to coded formats that contain resolution
> +    information in the stream.
> +    Receive and handle ``V4L2_EVENT_SOURCE_CHANGE`` from the driver
> +    via :c:func:`VIDIOC_DQEVENT`. The driver must send this event once
> +    enough data is obtained from the stream to allocate ``CAPTURE``
> +    buffers and to begin producing decoded frames.
> +
> +    * **Required fields:**
> +
> +      ``type``
> +          set to ``V4L2_EVENT_SOURCE_CHANGE``
> +
> +    * **Return fields:**
> +
> +      ``u.src_change.changes``
> +          set to ``V4L2_EVENT_SRC_CH_RESOLUTION``
> +
> +    * Any client query issued after the driver queues the event must return
> +      values applying to the just parsed stream, including queue formats,
> +      selection rectangles and controls.
> +
> +8.  Call :c:func:`VIDIOC_G_FMT` for ``CAPTURE`` queue to get format for the
> +    destination buffers parsed/decoded from the bitstream.
> +
> +    * **Required fields:**
> +
> +      ``type``
> +          a ``V4L2_BUF_TYPE_*`` enum appropriate for ``CAPTURE``
> +
> +    * **Return fields:**
> +
> +      ``width``, ``height``
> +          frame buffer resolution for the decoded frames
> +
> +      ``pixelformat``
> +          pixel format for decoded frames
> +
> +      ``num_planes`` (for _MPLANE ``type`` only)
> +          number of planes for pixelformat
> +
> +      ``sizeimage``, ``bytesperline``
> +          as per standard semantics; matching frame buffer format
> +
> +    .. note::
> +
> +       The value of ``pixelformat`` may be any pixel format supported and
> +       must be supported for current stream, based on the information
> +       parsed from the stream and hardware capabilities. It is suggested
> +       that driver chooses the preferred/optimal format for given
> +       configuration. For example, a YUV format may be preferred over an
> +       RGB format, if additional conversion step would be required.
> +
> +9.  *[optional]* Enumerate ``CAPTURE`` formats via
> +    :c:func:`VIDIOC_ENUM_FMT` on ``CAPTURE`` queue. Once the stream
> +    information is parsed and known, the client may use this ioctl to
> +    discover which raw formats are supported for given stream and select one
> +    of them via :c:func:`VIDIOC_S_FMT`.
> +
> +    .. note::
> +
> +       The driver will return only formats supported for the current stream
> +       parsed in this initialization sequence, even if more formats may be
> +       supported by the driver in general.
> +
> +       For example, a driver/hardware may support YUV and RGB formats for
> +       resolutions 1920x1088 and lower, but only YUV for higher
> +       resolutions (due to hardware limitations). After parsing
> +       a resolution of 1920x1088 or lower, :c:func:`VIDIOC_ENUM_FMT` may
> +       return a set of YUV and RGB pixel formats, but after parsing
> +       resolution higher than 1920x1088, the driver will not return RGB,
> +       unsupported for this resolution.
> +
> +       However, subsequent resolution change event triggered after
> +       discovering a resolution change within the same stream may switch
> +       the stream into a lower resolution and :c:func:`VIDIOC_ENUM_FMT`
> +       would return RGB formats again in that case.
> +
> +10.  *[optional]* Choose a different ``CAPTURE`` format than suggested via
> +     :c:func:`VIDIOC_S_FMT` on ``CAPTURE`` queue. It is possible for the
> +     client to choose a different format than selected/suggested by the
> +     driver in :c:func:`VIDIOC_G_FMT`.
> +
> +     * **Required fields:**
> +
> +       ``type``
> +           a ``V4L2_BUF_TYPE_*`` enum appropriate for ``CAPTURE``
> +
> +       ``pixelformat``
> +           a raw pixel format
> +
> +     .. note::
> +
> +        Calling :c:func:`VIDIOC_ENUM_FMT` to discover currently available
> +        formats after receiving ``V4L2_EVENT_SOURCE_CHANGE`` is useful to
> +        find out a set of allowed formats for given configuration, but not
> +        required, if the client can accept the defaults.
> +
> +11. *[optional]* Acquire visible resolution via
> +    :c:func:`VIDIOC_G_SELECTION`.
> +
> +    * **Required fields:**
> +
> +      ``type``
> +          a ``V4L2_BUF_TYPE_*`` enum appropriate for ``CAPTURE``
> +
> +      ``target``
> +          set to ``V4L2_SEL_TGT_COMPOSE``
> +
> +    * **Return fields:**
> +
> +      ``r.left``, ``r.top``, ``r.width``, ``r.height``
> +          visible rectangle; this must fit within frame buffer resolution
> +          returned by :c:func:`VIDIOC_G_FMT`.
> +
> +    * The driver must expose following selection targets on ``CAPTURE``:
> +
> +      ``V4L2_SEL_TGT_CROP_BOUNDS``
> +          corresponds to coded resolution of the stream
> +
> +      ``V4L2_SEL_TGT_CROP_DEFAULT``
> +          a rectangle covering the part of the frame buffer that contains
> +          meaningful picture data (visible area); width and height will be
> +          equal to visible resolution of the stream
> +
> +      ``V4L2_SEL_TGT_CROP``
> +          rectangle within coded resolution to be output to ``CAPTURE``;
> +          defaults to ``V4L2_SEL_TGT_CROP_DEFAULT``; read-only on hardware
> +          without additional compose/scaling capabilities
> +
> +      ``V4L2_SEL_TGT_COMPOSE_BOUNDS``
> +          maximum rectangle within ``CAPTURE`` buffer, which the cropped
> +          frame can be output into; equal to ``V4L2_SEL_TGT_CROP``, if the
> +          hardware does not support compose/scaling
> +
> +      ``V4L2_SEL_TGT_COMPOSE_DEFAULT``
> +          equal to ``V4L2_SEL_TGT_CROP``
> +
> +      ``V4L2_SEL_TGT_COMPOSE``
> +          rectangle inside ``OUTPUT`` buffer into which the cropped frame
> +          is output; defaults to ``V4L2_SEL_TGT_COMPOSE_DEFAULT``;
> +          read-only on hardware without additional compose/scaling
> +          capabilities
> +
> +      ``V4L2_SEL_TGT_COMPOSE_PADDED``
> +          rectangle inside ``OUTPUT`` buffer which is overwritten by the
> +          hardware; equal to ``V4L2_SEL_TGT_COMPOSE``, if the hardware
> +          does not write padding pixels
> +
> +12. *[optional]* Get minimum number of buffers required for ``CAPTURE``
> +    queue via :c:func:`VIDIOC_G_CTRL`. This is useful if the client
> +    intends to use more buffers than the minimum required by
> +    hardware/format.
> +
> +    * **Required fields:**
> +
> +      ``id``
> +          set to ``V4L2_CID_MIN_BUFFERS_FOR_CAPTURE``
> +
> +    * **Return fields:**
> +
> +      ``value``
> +          minimum number of buffers required to decode the stream parsed in
> +          this initialization sequence.
> +
> +    .. note::
> +
> +       Note that the minimum number of buffers must be at least the number
> +       required to successfully decode the current stream. This may for
> +       example be the required DPB size for an H.264 stream given the
> +       parsed stream configuration (resolution, level).
> +
> +13. Allocate destination (raw format) buffers via :c:func:`VIDIOC_REQBUFS`
> +    on the ``CAPTURE`` queue.
> +
> +    * **Required fields:**
> +
> +      ``count``
> +          requested number of buffers to allocate; greater than zero
> +
> +      ``type``
> +          a ``V4L2_BUF_TYPE_*`` enum appropriate for ``CAPTURE``
> +
> +      ``memory``
> +          follows standard semantics
> +
> +    * **Return fields:**
> +
> +      ``count``
> +          adjusted to allocated number of buffers
> +
> +    * The driver must adjust count to minimum of required number of
> +      destination buffers for given format and stream configuration and the
> +      count passed. The client must check this value after the ioctl
> +      returns to get the number of buffers allocated.
> +
> +    .. note::
> +
> +       To allocate more than minimum number of buffers (for pipeline
> +       depth), use G_CTRL(``V4L2_CID_MIN_BUFFERS_FOR_CAPTURE``) to
> +       get minimum number of buffers required, and pass the obtained value
> +       plus the number of additional buffers needed in count to
> +       :c:func:`VIDIOC_REQBUFS`.


I think we should mention here the option of using VIDIOC_CREATE_BUFS in order
to allocate buffers larger than the current CAPTURE format in order to accommodate
future resolution changes.

> +
> +14. Call :c:func:`VIDIOC_STREAMON` to initiate decoding frames.
> +
> +Decoding
> +========
> +
> +This state is reached after a successful initialization sequence. In this
> +state, client queues and dequeues buffers to both queues via
> +:c:func:`VIDIOC_QBUF` and :c:func:`VIDIOC_DQBUF`, following standard
> +semantics.
> +
> +Both queues operate independently, following standard behavior of V4L2
> +buffer queues and memory-to-memory devices. In addition, the order of
> +decoded frames dequeued from ``CAPTURE`` queue may differ from the order of
> +queuing coded frames to ``OUTPUT`` queue, due to properties of selected
> +coded format, e.g. frame reordering. The client must not assume any direct
> +relationship between ``CAPTURE`` and ``OUTPUT`` buffers, other than
> +reported by :c:type:`v4l2_buffer` ``timestamp`` field.

Is there a relationship between capture and output buffers w.r.t. the timestamp
field? I am not aware that there is one.

> +
> +The contents of source ``OUTPUT`` buffers depend on active coded pixel
> +format and might be affected by codec-specific extended controls, as stated
> +in documentation of each format individually.

in -> in the
each format individually -> each format

> +
> +The client must not assume any direct relationship between ``CAPTURE``
> +and ``OUTPUT`` buffers and any specific timing of buffers becoming
> +available to dequeue. Specifically:
> +
> +* a buffer queued to ``OUTPUT`` may result in no buffers being produced
> +  on ``CAPTURE`` (e.g. if it does not contain encoded data, or if only
> +  metadata syntax structures are present in it),
> +
> +* a buffer queued to ``OUTPUT`` may result in more than 1 buffer produced
> +  on ``CAPTURE`` (if the encoded data contained more than one frame, or if
> +  returning a decoded frame allowed the driver to return a frame that
> +  preceded it in decode, but succeeded it in display order),
> +
> +* a buffer queued to ``OUTPUT`` may result in a buffer being produced on
> +  ``CAPTURE`` later into decode process, and/or after processing further
> +  ``OUTPUT`` buffers, or be returned out of order, e.g. if display
> +  reordering is used,
> +
> +* buffers may become available on the ``CAPTURE`` queue without additional
> +  buffers queued to ``OUTPUT`` (e.g. during drain or ``EOS``), because of
> +  ``OUTPUT`` buffers queued in the past, whose decoding results become
> +  available only later, due to specifics of the decoding process.
> +
> +Seek
> +====
> +
> +Seek is controlled by the ``OUTPUT`` queue, as it is the source of
> +bitstream data. ``CAPTURE`` queue remains unchanged/unaffected.
> +
> +1. Stop the ``OUTPUT`` queue to begin the seek sequence via
> +   :c:func:`VIDIOC_STREAMOFF`.
> +
> +   * **Required fields:**
> +
> +     ``type``
> +         a ``V4L2_BUF_TYPE_*`` enum appropriate for ``OUTPUT``
> +
> +   * The driver must drop all the pending ``OUTPUT`` buffers and they are
> +     treated as returned to the client (following standard semantics).
> +
> +2. Restart the ``OUTPUT`` queue via :c:func:`VIDIOC_STREAMON`
> +
> +   * **Required fields:**
> +
> +     ``type``
> +         a ``V4L2_BUF_TYPE_*`` enum appropriate for ``OUTPUT``
> +
> +   * The driver must be put in a state after seek and be ready to

"put in a state"???

> +     accept new source bitstream buffers.
> +
> +3. Start queuing buffers to ``OUTPUT`` queue containing stream data after
> +   the seek until a suitable resume point is found.
> +
> +   .. note::
> +
> +      There is no requirement to begin queuing stream starting exactly from
> +      a resume point (e.g. SPS or a keyframe). The driver must handle any
> +      data queued and must keep processing the queued buffers until it
> +      finds a suitable resume point. While looking for a resume point, the
> +      driver processes ``OUTPUT`` buffers and returns them to the client
> +      without producing any decoded frames.
> +
> +      For hardware known to be mishandling seeks to a non-resume point,
> +      e.g. by returning corrupted decoded frames, the driver must be able
> +      to handle such seeks without a crash or any fatal decode error.
> +
> +4. After a resume point is found, the driver will start returning
> +   ``CAPTURE`` buffers with decoded frames.
> +
> +   * There is no precise specification of when the ``CAPTURE`` queue
> +     will start producing buffers containing decoded data from buffers
> +     queued after the seek, as it operates independently from the
> +     ``OUTPUT`` queue.
> +
> +     * The driver is allowed to and may return a number of remaining

I'd drop 'is allowed to and'.

> +       ``CAPTURE`` buffers containing decoded frames from before the seek
> +       after the seek sequence (STREAMOFF-STREAMON) is performed.
> +
> +     * The driver is also allowed to and may not return all decoded frames

Ditto.

> +       queued but not decode before the seek sequence was initiated. For

Very confusing sentence. I think you mean this:

	  The driver may not return all decoded frames that were ready for
	  dequeueing from before the seek sequence was initiated.

Is this really true? Once decoded frames are marked as buffer_done by the
driver there is no reason for them to be removed. Or you mean something else
here, e.g. the frames are decoded, but the buffers not yet given back to vb2.

> +       example, given an ``OUTPUT`` queue sequence: QBUF(A), QBUF(B),
> +       STREAMOFF(OUT), STREAMON(OUT), QBUF(G), QBUF(H), any of the
> +       following results on the ``CAPTURE`` queue is allowed: {A’, B’, G’,
> +       H’}, {A’, G’, H’}, {G’, H’}.
> +
> +   .. note::
> +
> +      To achieve instantaneous seek, the client may restart streaming on
> +      ``CAPTURE`` queue to discard decoded, but not yet dequeued buffers.
> +
> +Pause
> +=====
> +
> +In order to pause, the client should just cease queuing buffers onto the
> +``OUTPUT`` queue. This is different from the general V4L2 API definition of
> +pause, which involves calling :c:func:`VIDIOC_STREAMOFF` on the queue.
> +Without source bitstream data, there is no data to process and the hardware
> +remains idle.
> +
> +Conversely, using :c:func:`VIDIOC_STREAMOFF` on ``OUTPUT`` queue indicates
> +a seek, which
> +
> +1. drops all ``OUTPUT`` buffers in flight and
> +2. after a subsequent :c:func:`VIDIOC_STREAMON`, will look for and only
> +   continue from a resume point.
> +
> +This is usually undesirable for pause. The STREAMOFF-STREAMON sequence is
> +intended for seeking.
> +
> +Similarly, ``CAPTURE`` queue should remain streaming as well, as the

the ``CAPTURE`` queue

(add 'the')

> +STREAMOFF-STREAMON sequence on it is intended solely for changing buffer
> +sets.

'changing buffer sets': not clear what is meant by this. It's certainly not
'solely' since it can also be used to achieve an instantaneous seek.

> +
> +Dynamic resolution change
> +=========================
> +
> +A video decoder implementing this interface must support dynamic resolution
> +change, for streams, which include resolution metadata in the bitstream.

I think the commas can be removed from this sentence. I would also replace
'which' by 'that'.

> +When the decoder encounters a resolution change in the stream, the dynamic
> +resolution change sequence is started.
> +
> +1.  After encountering a resolution change in the stream, the driver must
> +    first process and decode all remaining buffers from before the
> +    resolution change point.
> +
> +2.  After all buffers containing decoded frames from before the resolution
> +    change point are ready to be dequeued on the ``CAPTURE`` queue, the
> +    driver sends a ``V4L2_EVENT_SOURCE_CHANGE`` event for source change
> +    type ``V4L2_EVENT_SRC_CH_RESOLUTION``.
> +
> +    * The last buffer from before the change must be marked with
> +      :c:type:`v4l2_buffer` ``flags`` flag ``V4L2_BUF_FLAG_LAST`` as in the

spurious 'as'?

> +      drain sequence. The last buffer might be empty (with
> +      :c:type:`v4l2_buffer` ``bytesused`` = 0) and must be ignored by the
> +      client, since it does not contain any decoded frame.

any -> a

> +
> +    * Any client query issued after the driver queues the event must return
> +      values applying to the stream after the resolution change, including
> +      queue formats, selection rectangles and controls.
> +
> +    * If the client subscribes to ``V4L2_EVENT_SOURCE_CHANGE`` events and
> +      the event is signaled, the decoding process will not continue until
> +      it is acknowledged by either (re-)starting streaming on ``CAPTURE``,
> +      or via :c:func:`VIDIOC_DECODER_CMD` with ``V4L2_DEC_CMD_START``
> +      command.

With (re-)starting streaming you mean a STREAMOFF/ON pair on the CAPTURE queue,
right?

> +
> +    .. note::
> +
> +       Any attempts to dequeue more buffers beyond the buffer marked
> +       with ``V4L2_BUF_FLAG_LAST`` will result in a -EPIPE error from
> +       :c:func:`VIDIOC_DQBUF`.
> +
> +3.  The client calls :c:func:`VIDIOC_G_FMT` for ``CAPTURE`` to get the new
> +    format information. This is identical to calling :c:func:`VIDIOC_G_FMT`
> +    after ``V4L2_EVENT_SRC_CH_RESOLUTION`` in the initialization sequence
> +    and should be handled similarly.
> +
> +    .. note::
> +
> +       It is allowed for the driver not to support the same pixel format as
> +       previously used (before the resolution change) for the new
> +       resolution. The driver must select a default supported pixel format,
> +       return it, if queried using :c:func:`VIDIOC_G_FMT`, and the client
> +       must take note of it.
> +
> +4.  The client acquires visible resolution as in initialization sequence.
> +
> +5.  *[optional]* The client is allowed to enumerate available formats and
> +    select a different one than currently chosen (returned via
> +    :c:func:`VIDIOC_G_FMT)`. This is identical to a corresponding step in
> +    the initialization sequence.
> +
> +6.  *[optional]* The client acquires minimum number of buffers as in
> +    initialization sequence.

It's an optional step, but what might happen if you ignore it or if the control
does not exist?

You also should mention that this is the min number of CAPTURE buffers.

I wonder if we should make these min buffer controls required. It might be easier
that way.

> +7.  If all the following conditions are met, the client may resume the
> +    decoding instantly, by using :c:func:`VIDIOC_DECODER_CMD` with
> +    ``V4L2_DEC_CMD_START`` command, as in case of resuming after the drain
> +    sequence:
> +
> +    * ``sizeimage`` of new format is less than or equal to the size of
> +      currently allocated buffers,
> +
> +    * the number of buffers currently allocated is greater than or equal to
> +      the minimum number of buffers acquired in step 6.

You might want to mention that if there are insufficient buffers, then
VIDIOC_CREATE_BUFS can be used to add more buffers.

> +
> +    In such case, the remaining steps do not apply.
> +
> +    However, if the client intends to change the buffer set, to lower
> +    memory usage or for any other reasons, it may be achieved by following
> +    the steps below.
> +
> +8.  After dequeuing all remaining buffers from the ``CAPTURE`` queue, the
> +    client must call :c:func:`VIDIOC_STREAMOFF` on the ``CAPTURE`` queue.
> +    The ``OUTPUT`` queue must remain streaming (calling STREAMOFF on it
> +    would trigger a seek).
> +
> +9.  The client frees the buffers on the ``CAPTURE`` queue using
> +    :c:func:`VIDIOC_REQBUFS`.
> +
> +    * **Required fields:**
> +
> +      ``count``
> +          set to 0
> +
> +      ``type``
> +          a ``V4L2_BUF_TYPE_*`` enum appropriate for ``CAPTURE``
> +
> +      ``memory``
> +          follows standard semantics
> +
> +10. The client allocates a new set of buffers for the ``CAPTURE`` queue via
> +    :c:func:`VIDIOC_REQBUFS`. This is identical to a corresponding step in
> +    the initialization sequence.
> +
> +11. The client resumes decoding by issuing :c:func:`VIDIOC_STREAMON` on the
> +    ``CAPTURE`` queue.
> +
> +During the resolution change sequence, the ``OUTPUT`` queue must remain
> +streaming. Calling :c:func:`VIDIOC_STREAMOFF` on ``OUTPUT`` queue would
> +initiate a seek.
> +
> +The ``OUTPUT`` queue operates separately from the ``CAPTURE`` queue for the
> +duration of the entire resolution change sequence. It is allowed (and
> +recommended for best performance and simplicity) for the client to keep
> +queuing/dequeuing buffers from/to ``OUTPUT`` queue even while processing
> +this sequence.
> +
> +.. note::
> +
> +   It is also possible for this sequence to be triggered without a change
> +   in coded resolution, if a different number of ``CAPTURE`` buffers is
> +   required in order to continue decoding the stream or the visible
> +   resolution changes.
> +
> +Drain
> +=====
> +
> +To ensure that all queued ``OUTPUT`` buffers have been processed and
> +related ``CAPTURE`` buffers output to the client, the following drain
> +sequence may be followed. After the drain sequence is complete, the client
> +has received all decoded frames for all ``OUTPUT`` buffers queued before
> +the sequence was started.
> +
> +1. Begin drain by issuing :c:func:`VIDIOC_DECODER_CMD`.
> +
> +   * **Required fields:**
> +
> +     ``cmd``
> +         set to ``V4L2_DEC_CMD_STOP``
> +
> +     ``flags``
> +         set to 0
> +
> +     ``pts``
> +         set to 0
> +
> +2. The driver must process and decode as normal all ``OUTPUT`` buffers
> +   queued by the client before the :c:func:`VIDIOC_DECODER_CMD` was issued.
> +   Any operations triggered as a result of processing these buffers
> +   (including the initialization and resolution change sequences) must be
> +   processed as normal by both the driver and the client before proceeding
> +   with the drain sequence.
> +
> +3. Once all ``OUTPUT`` buffers queued before ``V4L2_DEC_CMD_STOP`` are
> +   processed:
> +
> +   * If the ``CAPTURE`` queue is streaming, once all decoded frames (if
> +     any) are ready to be dequeued on the ``CAPTURE`` queue, the driver
> +     must send a ``V4L2_EVENT_EOS``. The driver must also set
> +     ``V4L2_BUF_FLAG_LAST`` in :c:type:`v4l2_buffer` ``flags`` field on the
> +     buffer on the ``CAPTURE`` queue containing the last frame (if any)
> +     produced as a result of processing the ``OUTPUT`` buffers queued
> +     before ``V4L2_DEC_CMD_STOP``. If no more frames are left to be
> +     returned at the point of handling ``V4L2_DEC_CMD_STOP``, the driver
> +     must return an empty buffer (with :c:type:`v4l2_buffer`
> +     ``bytesused`` = 0) as the last buffer with ``V4L2_BUF_FLAG_LAST`` set
> +     instead. Any attempts to dequeue more buffers beyond the buffer marked
> +     with ``V4L2_BUF_FLAG_LAST`` will result in a -EPIPE error from
> +     :c:func:`VIDIOC_DQBUF`.
> +
> +   * If the ``CAPTURE`` queue is NOT streaming, no action is necessary for
> +     ``CAPTURE`` queue and the driver must send a ``V4L2_EVENT_EOS``
> +     immediately after all ``OUTPUT`` buffers in question have been
> +     processed.
> +
> +4. At this point, decoding is paused and the driver will accept, but not
> +   process any newly queued ``OUTPUT`` buffers until the client issues
> +   ``V4L2_DEC_CMD_START`` or restarts streaming on any queue.
> +
> +* Once the drain sequence is initiated, the client needs to drive it to
> +  completion, as described by the above steps, unless it aborts the process
> +  by issuing :c:func:`VIDIOC_STREAMOFF` on ``OUTPUT`` queue.  The client
> +  is not allowed to issue ``V4L2_DEC_CMD_START`` or ``V4L2_DEC_CMD_STOP``
> +  again while the drain sequence is in progress and they will fail with
> +  -EBUSY error code if attempted.
> +
> +* Restarting streaming on ``OUTPUT`` queue will implicitly end the paused
> +  state and reinitialize the decoder (similarly to the seek sequence).
> +  Restarting ``CAPTURE`` queue will not affect an in-progress drain
> +  sequence.
> +
> +* The drivers must also implement :c:func:`VIDIOC_TRY_DECODER_CMD`, as a
> +  way to let the client query the availability of decoder commands.
> +
> +End of stream
> +=============
> +
> +If the decoder encounters an end of stream marking in the stream, the
> +driver must send a ``V4L2_EVENT_EOS`` event to the client after all frames
> +are decoded and ready to be dequeued on the ``CAPTURE`` queue, with the
> +:c:type:`v4l2_buffer` ``flags`` set to ``V4L2_BUF_FLAG_LAST``. This
> +behavior is identical to the drain sequence triggered by the client via
> +``V4L2_DEC_CMD_STOP``.
> +
> +Commit points
> +=============
> +
> +Setting formats and allocating buffers triggers changes in the behavior
> +of the driver.
> +
> +1. Setting format on ``OUTPUT`` queue may change the set of formats

Setting -> Setting the

> +   supported/advertised on the ``CAPTURE`` queue. In particular, it also
> +   means that ``CAPTURE`` format may be reset and the client must not

that -> that the

> +   rely on the previously set format being preserved.
> +
> +2. Enumerating formats on ``CAPTURE`` queue must only return formats
> +   supported for the ``OUTPUT`` format currently set.
> +
> +3. Setting/changing format on ``CAPTURE`` queue does not change formats

format -> the format

> +   available on ``OUTPUT`` queue. An attempt to set ``CAPTURE`` format that

set -> set a

> +   is not supported for the currently selected ``OUTPUT`` format must
> +   result in the driver adjusting the requested format to an acceptable
> +   one.
> +
> +4. Enumerating formats on ``OUTPUT`` queue always returns the full set of
> +   supported coded formats, irrespective of the current ``CAPTURE``
> +   format.
> +
> +5. After allocating buffers on the ``OUTPUT`` queue, it is not possible to
> +   change format on it.

format -> the format

> +
> +To summarize, setting formats and allocation must always start with the
> +``OUTPUT`` queue and the ``OUTPUT`` queue is the master that governs the
> +set of supported formats for the ``CAPTURE`` queue.
> diff --git a/Documentation/media/uapi/v4l/devices.rst b/Documentation/media/uapi/v4l/devices.rst
> index fb7f8c26cf09..12d43fe711cf 100644
> --- a/Documentation/media/uapi/v4l/devices.rst
> +++ b/Documentation/media/uapi/v4l/devices.rst
> @@ -15,6 +15,7 @@ Interfaces
>      dev-output
>      dev-osd
>      dev-codec
> +    dev-decoder
>      dev-effect
>      dev-raw-vbi
>      dev-sliced-vbi
> diff --git a/Documentation/media/uapi/v4l/v4l2.rst b/Documentation/media/uapi/v4l/v4l2.rst
> index b89e5621ae69..65dc096199ad 100644
> --- a/Documentation/media/uapi/v4l/v4l2.rst
> +++ b/Documentation/media/uapi/v4l/v4l2.rst
> @@ -53,6 +53,10 @@ Authors, in alphabetical order:
>  
>    - Original author of the V4L2 API and documentation.
>  
> +- Figa, Tomasz <tfiga@chromium.org>
> +
> +  - Documented the memory-to-memory decoder interface.
> +
>  - H Schimek, Michael <mschimek@gmx.at>
>  
>    - Original author of the V4L2 API and documentation.
> @@ -61,6 +65,10 @@ Authors, in alphabetical order:
>  
>    - Documented the Digital Video timings API.
>  
> +- Osciak, Pawel <posciak@chromium.org>
> +
> +  - Documented the memory-to-memory decoder interface.
> +
>  - Osciak, Pawel <pawel@osciak.com>
>  
>    - Designed and documented the multi-planar API.
> @@ -85,7 +93,7 @@ Authors, in alphabetical order:
>  
>    - Designed and documented the VIDIOC_LOG_STATUS ioctl, the extended control ioctls, major parts of the sliced VBI API, the MPEG encoder and decoder APIs and the DV Timings API.
>  
> -**Copyright** |copy| 1999-2016: Bill Dirks, Michael H. Schimek, Hans Verkuil, Martin Rubli, Andy Walls, Muralidharan Karicheri, Mauro Carvalho Chehab, Pawel Osciak, Sakari Ailus & Antti Palosaari.
> +**Copyright** |copy| 1999-2018: Bill Dirks, Michael H. Schimek, Hans Verkuil, Martin Rubli, Andy Walls, Muralidharan Karicheri, Mauro Carvalho Chehab, Pawel Osciak, Sakari Ailus & Antti Palosaari, Tomasz Figa
>  
>  Except when explicitly stated as GPL, programming examples within this
>  part can be used and distributed without restrictions.
> 

Regards,

	Hans
Tomasz Figa July 26, 2018, 10:20 a.m. UTC | #2
Hi Hans,

On Wed, Jul 25, 2018 at 8:59 PM Hans Verkuil <hverkuil@xs4all.nl> wrote:
>
> Hi Tomasz,
>
> Many, many thanks for working on this! It's a great document and when done
> it will be very useful indeed.
>
> Review comments follow...

Thanks for review!

>
> On 24/07/18 16:06, Tomasz Figa wrote:
[snip]
> > +DPB
> > +   Decoded Picture Buffer; a H.264 term for a buffer that stores a picture
>
> a H.264 -> an H.264
>

Ack.

> > +   that is encoded or decoded and available for reference in further
> > +   decode/encode steps.
> > +
> > +EOS
> > +   end of stream
> > +
> > +IDR
> > +   a type of a keyframe in H.264-encoded stream, which clears the list of
> > +   earlier reference frames (DPBs)
>
> You do not actually say what IDR stands for. Can you add that?
>

Ack.

[snip]
> > +3. The client may use :c:func:`VIDIOC_ENUM_FRAMESIZES` to detect supported
> > +   resolutions for a given format, passing desired pixel format in
> > +   :c:type:`v4l2_frmsizeenum` ``pixel_format``.
> > +
> > +   * Values returned by :c:func:`VIDIOC_ENUM_FRAMESIZES` on ``OUTPUT``
> > +     must include all possible coded resolutions supported by the decoder
> > +     for given coded pixel format.
>
> This is confusing. Since VIDIOC_ENUM_FRAMESIZES does not have a buffer type
> argument you cannot say 'on OUTPUT'. I would remove 'on OUTPUT' entirely.
>
> > +
> > +   * Values returned by :c:func:`VIDIOC_ENUM_FRAMESIZES` on ``CAPTURE``
>
> Ditto for 'on CAPTURE'
>

You're right. I didn't notice that the "type" field in
v4l2_frmsizeenum was not the buffer type, but the type of the range.
Thanks for spotting this.

> > +     must include all possible frame buffer resolutions supported by the
> > +     decoder for given raw pixel format and coded format currently set on
> > +     ``OUTPUT``.
> > +
> > +    .. note::
> > +
> > +       The client may derive the supported resolution range for a
> > +       combination of coded and raw format by setting width and height of
> > +       ``OUTPUT`` format to 0 and calculating the intersection of
> > +       resolutions returned from calls to :c:func:`VIDIOC_ENUM_FRAMESIZES`
> > +       for the given coded and raw formats.
>
> So if the output format is set to 1280x720, then ENUM_FRAMESIZES would just
> return 1280x720 as the resolution. If the output format is set to 0x0, then
> it returns the full range it is capable of.
>
> Correct?
>
> If so, then I think this needs to be a bit more explicit. I had to think about
> it a bit.
>
> Note that the v4l2_pix_format/S_FMT documentation needs to be updated as well
> since we never allowed 0x0 before.

Is there any text that disallows this? I couldn't spot any. Generally
there are already drivers that return 0x0 for coded formats (s5p-mfc)
and it's not even strange, because in such a case the buffer contains
just a sequence of bytes, not a 2D picture.

> What if you set the format to 0x0 but the stream does not have meta data with
> the resolution? How does userspace know if 0x0 is allowed or not? If this is
> specific to the chosen coded pixel format, should be add a new flag for those
> formats indicating that the coded data contains resolution information?

Yes, this would definitely be on a per-format basis. Not sure what you
mean by a flag, though? E.g. if the format is set to H264, then it's
bound to include resolution information. If the format doesn't include
it, then userspace is already aware of this fact, because it needs to
get this from some other source (e.g. container).

>
> That way userspace knows if 0x0 can be used, and the driver can reject 0x0
> for formats that do not support it.

As above, but I might be misunderstanding your suggestion.

>
> > +
> > +4. Supported profiles and levels for given format, if applicable, may be
> > +   queried using their respective controls via :c:func:`VIDIOC_QUERYCTRL`.
> > +
> > +Initialization
> > +==============
> > +
> > +1. *[optional]* Enumerate supported ``OUTPUT`` formats and resolutions. See
> > +   capability enumeration.
>
> capability enumeration. -> 'Querying capabilities' above.
>

Ack.

> > +
> > +2. Set the coded format on ``OUTPUT`` via :c:func:`VIDIOC_S_FMT`
> > +
> > +   * **Required fields:**
> > +
> > +     ``type``
> > +         a ``V4L2_BUF_TYPE_*`` enum appropriate for ``OUTPUT``
> > +
> > +     ``pixelformat``
> > +         a coded pixel format
> > +
> > +     ``width``, ``height``
> > +         required only if cannot be parsed from the stream for the given
> > +         coded format; optional otherwise - set to zero to ignore
> > +
> > +     other fields
> > +         follow standard semantics
> > +
> > +   * For coded formats including stream resolution information, if width
> > +     and height are set to non-zero values, the driver will propagate the
> > +     resolution to ``CAPTURE`` and signal a source change event
> > +     instantly. However, after the decoder is done parsing the
> > +     information embedded in the stream, it will update ``CAPTURE``
> > +     format with new values and signal a source change event again, if
> > +     the values do not match.
> > +
> > +   .. note::
> > +
> > +      Changing ``OUTPUT`` format may change currently set ``CAPTURE``
>
> change -> change the

Ack.

>
> > +      format. The driver will derive a new ``CAPTURE`` format from
>
> from -> from the

Ack.

>
> > +      ``OUTPUT`` format being set, including resolution, colorimetry
> > +      parameters, etc. If the client needs a specific ``CAPTURE`` format,
> > +      it must adjust it afterwards.
> > +
> > +3.  *[optional]* Get minimum number of buffers required for ``OUTPUT``
> > +    queue via :c:func:`VIDIOC_G_CTRL`. This is useful if client intends to
>
> client -> the client

Ack.

>
> > +    use more buffers than minimum required by hardware/format.
>
> than -> than the

Ack.

[snip]
> > +13. Allocate destination (raw format) buffers via :c:func:`VIDIOC_REQBUFS`
> > +    on the ``CAPTURE`` queue.
> > +
> > +    * **Required fields:**
> > +
> > +      ``count``
> > +          requested number of buffers to allocate; greater than zero
> > +
> > +      ``type``
> > +          a ``V4L2_BUF_TYPE_*`` enum appropriate for ``CAPTURE``
> > +
> > +      ``memory``
> > +          follows standard semantics
> > +
> > +    * **Return fields:**
> > +
> > +      ``count``
> > +          adjusted to allocated number of buffers
> > +
> > +    * The driver must adjust count to minimum of required number of
> > +      destination buffers for given format and stream configuration and the
> > +      count passed. The client must check this value after the ioctl
> > +      returns to get the number of buffers allocated.
> > +
> > +    .. note::
> > +
> > +       To allocate more than minimum number of buffers (for pipeline
> > +       depth), use G_CTRL(``V4L2_CID_MIN_BUFFERS_FOR_CAPTURE``) to
> > +       get minimum number of buffers required, and pass the obtained value
> > +       plus the number of additional buffers needed in count to
> > +       :c:func:`VIDIOC_REQBUFS`.
>
>
> I think we should mention here the option of using VIDIOC_CREATE_BUFS in order
> to allocate buffers larger than the current CAPTURE format in order to accommodate
> future resolution changes.

Ack.

>
> > +
> > +14. Call :c:func:`VIDIOC_STREAMON` to initiate decoding frames.
> > +
> > +Decoding
> > +========
> > +
> > +This state is reached after a successful initialization sequence. In this
> > +state, client queues and dequeues buffers to both queues via
> > +:c:func:`VIDIOC_QBUF` and :c:func:`VIDIOC_DQBUF`, following standard
> > +semantics.
> > +
> > +Both queues operate independently, following standard behavior of V4L2
> > +buffer queues and memory-to-memory devices. In addition, the order of
> > +decoded frames dequeued from ``CAPTURE`` queue may differ from the order of
> > +queuing coded frames to ``OUTPUT`` queue, due to properties of selected
> > +coded format, e.g. frame reordering. The client must not assume any direct
> > +relationship between ``CAPTURE`` and ``OUTPUT`` buffers, other than
> > +reported by :c:type:`v4l2_buffer` ``timestamp`` field.
>
> Is there a relationship between capture and output buffers w.r.t. the timestamp
> field? I am not aware that there is one.

I believe the decoder was expected to copy the timestamp of the
matching OUTPUT buffer to the respective CAPTURE buffer. Both s5p-mfc
and coda seem to implement it this way. I guess it might be a good
idea to specify this more explicitly.

>
> > +
> > +The contents of source ``OUTPUT`` buffers depend on active coded pixel
> > +format and might be affected by codec-specific extended controls, as stated
> > +in documentation of each format individually.
>
> in -> in the
> each format individually -> each format
>

Ack.

[snip]
> > +2. Restart the ``OUTPUT`` queue via :c:func:`VIDIOC_STREAMON`
> > +
> > +   * **Required fields:**
> > +
> > +     ``type``
> > +         a ``V4L2_BUF_TYPE_*`` enum appropriate for ``OUTPUT``
> > +
> > +   * The driver must be put in a state after seek and be ready to
>
> "put in a state"???
>

I'm not sure what this was supposed to be. I guess just "The driver
must start accepting new source bitstream buffers after the call
returns." would be enough.

> > +     accept new source bitstream buffers.
> > +
> > +3. Start queuing buffers to ``OUTPUT`` queue containing stream data after
> > +   the seek until a suitable resume point is found.
> > +
> > +   .. note::
> > +
> > +      There is no requirement to begin queuing stream starting exactly from
> > +      a resume point (e.g. SPS or a keyframe). The driver must handle any
> > +      data queued and must keep processing the queued buffers until it
> > +      finds a suitable resume point. While looking for a resume point, the
> > +      driver processes ``OUTPUT`` buffers and returns them to the client
> > +      without producing any decoded frames.
> > +
> > +      For hardware known to be mishandling seeks to a non-resume point,
> > +      e.g. by returning corrupted decoded frames, the driver must be able
> > +      to handle such seeks without a crash or any fatal decode error.
> > +
> > +4. After a resume point is found, the driver will start returning
> > +   ``CAPTURE`` buffers with decoded frames.
> > +
> > +   * There is no precise specification for ``CAPTURE`` queue of when it
> > +     will start producing buffers containing decoded data from buffers
> > +     queued after the seek, as it operates independently
> > +     from ``OUTPUT`` queue.
> > +
> > +     * The driver is allowed to and may return a number of remaining
>
> I'd drop 'is allowed to and'.
>

Ack.

> > +       ``CAPTURE`` buffers containing decoded frames from before the seek
> > +       after the seek sequence (STREAMOFF-STREAMON) is performed.
> > +
> > +     * The driver is also allowed to and may not return all decoded frames
>
> Ditto.

Ack.

>
> > +       queued but not decode before the seek sequence was initiated. For
>
> Very confusing sentence. I think you mean this:
>
>           The driver may not return all decoded frames that where ready for
>           dequeueing from before the seek sequence was initiated.
>
> Is this really true? Once decoded frames are marked as buffer_done by the
> driver there is no reason for them to be removed. Or you mean something else
> here, e.g. the frames are decoded, but the buffers not yet given back to vb2.
>

Exactly "the frames are decoded, but the buffers not yet given back to
vb2", for example, if reordering takes place. However, if one stops
streaming before dequeuing all buffers, they are implicitly returned
(reset to the state after REQBUFS) and can't be dequeued anymore, so
the frames are lost, even if the driver returned them. I guess the
sentence was really unfortunate indeed.

> > +       example, given an ``OUTPUT`` queue sequence: QBUF(A), QBUF(B),
> > +       STREAMOFF(OUT), STREAMON(OUT), QBUF(G), QBUF(H), any of the
> > +       following results on the ``CAPTURE`` queue is allowed: {A’, B’, G’,
> > +       H’}, {A’, G’, H’}, {G’, H’}.
> > +
> > +   .. note::
> > +
> > +      To achieve instantaneous seek, the client may restart streaming on
> > +      ``CAPTURE`` queue to discard decoded, but not yet dequeued buffers.
> > +
> > +Pause
> > +=====
> > +
> > +In order to pause, the client should just cease queuing buffers onto the
> > +``OUTPUT`` queue. This is different from the general V4L2 API definition of
> > +pause, which involves calling :c:func:`VIDIOC_STREAMOFF` on the queue.
> > +Without source bitstream data, there is no data to process and the hardware
> > +remains idle.
> > +
> > +Conversely, using :c:func:`VIDIOC_STREAMOFF` on ``OUTPUT`` queue indicates
> > +a seek, which
> > +
> > +1. drops all ``OUTPUT`` buffers in flight and
> > +2. after a subsequent :c:func:`VIDIOC_STREAMON`, will look for and only
> > +   continue from a resume point.
> > +
> > +This is usually undesirable for pause. The STREAMOFF-STREAMON sequence is
> > +intended for seeking.
> > +
> > +Similarly, ``CAPTURE`` queue should remain streaming as well, as the
>
> the ``CAPTURE`` queue
>
> (add 'the')
>

Ack.

> > +STREAMOFF-STREAMON sequence on it is intended solely for changing buffer
> > +sets.
>
> 'changing buffer sets': not clear what is meant by this. It's certainly not
> 'solely' since it can also be used to achieve an instantaneous seek.
>

To be honest, I'm not sure whether there is even a need to include
this whole section. It's obvious that if you stop feeding a mem2mem
device, it will pause. Moreover, other sections imply various
behaviors triggered by STREAMOFF/STREAMON/DECODER_CMD/etc., so it
should be quite clear that they are different from a simple pause.
What do you think?

> > +
> > +Dynamic resolution change
> > +=========================
> > +
> > +A video decoder implementing this interface must support dynamic resolution
> > +change, for streams, which include resolution metadata in the bitstream.
>
> I think the commas can be removed from this sentence. I would also replace
> 'which' by 'that'.
>

Ack.

> > +When the decoder encounters a resolution change in the stream, the dynamic
> > +resolution change sequence is started.
> > +
> > +1.  After encountering a resolution change in the stream, the driver must
> > +    first process and decode all remaining buffers from before the
> > +    resolution change point.
> > +
> > +2.  After all buffers containing decoded frames from before the resolution
> > +    change point are ready to be dequeued on the ``CAPTURE`` queue, the
> > +    driver sends a ``V4L2_EVENT_SOURCE_CHANGE`` event for source change
> > +    type ``V4L2_EVENT_SRC_CH_RESOLUTION``.
> > +
> > +    * The last buffer from before the change must be marked with
> > +      :c:type:`v4l2_buffer` ``flags`` flag ``V4L2_BUF_FLAG_LAST`` as in the
>
> spurious 'as'?
>

It should be:

    * The last buffer from before the change must be marked with
      the ``V4L2_BUF_FLAG_LAST`` flag in :c:type:`v4l2_buffer` ``flags`` field,
      similarly to the

> > +      drain sequence. The last buffer might be empty (with
> > +      :c:type:`v4l2_buffer` ``bytesused`` = 0) and must be ignored by the
> > +      client, since it does not contain any decoded frame.
>
> any -> a
>

Ack.

> > +
> > +    * Any client query issued after the driver queues the event must return
> > +      values applying to the stream after the resolution change, including
> > +      queue formats, selection rectangles and controls.
> > +
> > +    * If the client subscribes to ``V4L2_EVENT_SOURCE_CHANGE`` events and
> > +      the event is signaled, the decoding process will not continue until
> > +      it is acknowledged by either (re-)starting streaming on ``CAPTURE``,
> > +      or via :c:func:`VIDIOC_DECODER_CMD` with ``V4L2_DEC_CMD_START``
> > +      command.
>
> With (re-)starting streaming you mean a STREAMOFF/ON pair on the CAPTURE queue,
> right?
>

Right. I guess it might be better to just state that explicitly.

> > +
> > +    .. note::
> > +
> > +       Any attempts to dequeue more buffers beyond the buffer marked
> > +       with ``V4L2_BUF_FLAG_LAST`` will result in a -EPIPE error from
> > +       :c:func:`VIDIOC_DQBUF`.
> > +
> > +3.  The client calls :c:func:`VIDIOC_G_FMT` for ``CAPTURE`` to get the new
> > +    format information. This is identical to calling :c:func:`VIDIOC_G_FMT`
> > +    after ``V4L2_EVENT_SRC_CH_RESOLUTION`` in the initialization sequence
> > +    and should be handled similarly.
> > +
> > +    .. note::
> > +
> > +       It is allowed for the driver not to support the same pixel format as
> > +       previously used (before the resolution change) for the new
> > +       resolution. The driver must select a default supported pixel format,
> > +       return it, if queried using :c:func:`VIDIOC_G_FMT`, and the client
> > +       must take note of it.
> > +
> > +4.  The client acquires visible resolution as in initialization sequence.
> > +
> > +5.  *[optional]* The client is allowed to enumerate available formats and
> > +    select a different one than currently chosen (returned via
> > +    :c:func:`VIDIOC_G_FMT)`. This is identical to a corresponding step in
> > +    the initialization sequence.
> > +
> > +6.  *[optional]* The client acquires minimum number of buffers as in
> > +    initialization sequence.
>
> It's an optional step, but what might happen if you ignore it or if the control
> does not exist?

REQBUFS is supposed to clamp the requested number of buffers to the
[min, max] range anyway.

>
> You also should mention that this is the min number of CAPTURE buffers.
>
> I wonder if we should make these min buffer controls required. It might be easier
> that way.

Agreed. Although userspace is still free to ignore it, because REQBUFS
would do the right thing anyway.

>
> > +7.  If all the following conditions are met, the client may resume the
> > +    decoding instantly, by using :c:func:`VIDIOC_DECODER_CMD` with
> > +    ``V4L2_DEC_CMD_START`` command, as in case of resuming after the drain
> > +    sequence:
> > +
> > +    * ``sizeimage`` of new format is less than or equal to the size of
> > +      currently allocated buffers,
> > +
> > +    * the number of buffers currently allocated is greater than or equal to
> > +      the minimum number of buffers acquired in step 6.
>
> You might want to mention that if there are insufficient buffers, then
> VIDIOC_CREATE_BUFS can be used to add more buffers.
>

This might be a bit tricky, since at least s5p-mfc and coda can only
work on a fixed buffer set and one would need to fully reinitialize
the decoding to add one more buffer, which would effectively be the
full resolution change sequence, as below, just with REQBUFS(0),
REQBUFS(N) replaced with CREATE_BUFS.

We should mention CREATE_BUFS as an alternative to steps 9 and 10, though.

> > +
> > +    In such case, the remaining steps do not apply.
> > +
> > +    However, if the client intends to change the buffer set, to lower
> > +    memory usage or for any other reasons, it may be achieved by following
> > +    the steps below.
> > +
> > +8.  After dequeuing all remaining buffers from the ``CAPTURE`` queue, the
> > +    client must call :c:func:`VIDIOC_STREAMOFF` on the ``CAPTURE`` queue.
> > +    The ``OUTPUT`` queue must remain streaming (calling STREAMOFF on it
> > +    would trigger a seek).
> > +
> > +9.  The client frees the buffers on the ``CAPTURE`` queue using
> > +    :c:func:`VIDIOC_REQBUFS`.
> > +
> > +    * **Required fields:**
> > +
> > +      ``count``
> > +          set to 0
> > +
> > +      ``type``
> > +          a ``V4L2_BUF_TYPE_*`` enum appropriate for ``CAPTURE``
> > +
> > +      ``memory``
> > +          follows standard semantics
> > +
> > +10. The client allocates a new set of buffers for the ``CAPTURE`` queue via
> > +    :c:func:`VIDIOC_REQBUFS`. This is identical to a corresponding step in
> > +    the initialization sequence.
[snip]
> > +
> > +Commit points
> > +=============
> > +
> > +Setting formats and allocating buffers triggers changes in the behavior
> > +of the driver.
> > +
> > +1. Setting format on ``OUTPUT`` queue may change the set of formats
>
> Setting -> Setting the
>

Ack.

> > +   supported/advertised on the ``CAPTURE`` queue. In particular, it also
> > +   means that ``CAPTURE`` format may be reset and the client must not
>
> that -> that the
>

Ack.

> > +   rely on the previously set format being preserved.
> > +
> > +2. Enumerating formats on ``CAPTURE`` queue must only return formats
> > +   supported for the ``OUTPUT`` format currently set.
> > +
> > +3. Setting/changing format on ``CAPTURE`` queue does not change formats
>
> format -> the format
>

Ack.

> > +   available on ``OUTPUT`` queue. An attempt to set ``CAPTURE`` format that
>
> set -> set a
>

Ack.

> > +   is not supported for the currently selected ``OUTPUT`` format must
> > +   result in the driver adjusting the requested format to an acceptable
> > +   one.
> > +
> > +4. Enumerating formats on ``OUTPUT`` queue always returns the full set of
> > +   supported coded formats, irrespective of the current ``CAPTURE``
> > +   format.
> > +
> > +5. After allocating buffers on the ``OUTPUT`` queue, it is not possible to
> > +   change format on it.
>
> format -> the format
>

Ack.

Best regards,
Tomasz
Philipp Zabel July 26, 2018, 10:36 a.m. UTC | #3
On Thu, 2018-07-26 at 19:20 +0900, Tomasz Figa wrote:
[...]
> > You might want to mention that if there are insufficient buffers, then
> > VIDIOC_CREATE_BUFS can be used to add more buffers.
> > 
> 
> This might be a bit tricky, since at least s5p-mfc and coda can only
> work on a fixed buffer set and one would need to fully reinitialize
> the decoding to add one more buffer, which would effectively be the
> full resolution change sequence, as below, just with REQBUFS(0),
> REQBUFS(N) replaced with CREATE_BUFS.

The coda driver supports CREATE_BUFS on the decoder CAPTURE queue.

The firmware indeed needs a fixed frame buffer set, but these buffers
are internal only and in a coda specific tiling format. The content of
finished internal buffers is copied / detiled into the external CAPTURE
buffers, so those can be added at will.

regards
Philipp
Hans Verkuil July 26, 2018, 10:57 a.m. UTC | #4
On 26/07/18 12:20, Tomasz Figa wrote:
> Hi Hans,
> 
> On Wed, Jul 25, 2018 at 8:59 PM Hans Verkuil <hverkuil@xs4all.nl> wrote:
>>
>> Hi Tomasz,
>>
>> Many, many thanks for working on this! It's a great document and when done
>> it will be very useful indeed.
>>
>> Review comments follow...
> 
> Thanks for review!
> 
>>
>> On 24/07/18 16:06, Tomasz Figa wrote:

> [snip]

>> Note that the v4l2_pix_format/S_FMT documentation needs to be updated as well
>> since we never allowed 0x0 before.
> 
> Is there any text that disallows this? I couldn't spot any. Generally
> there are already drivers which return 0x0 for coded formats (s5p-mfc)
> and it's not even strange, because in such case, the buffer contains
> just a sequence of bytes, not a 2D picture.

All non-m2m devices will always have non-zero width/height values. Only with
m2m devices do we see this.

This was probably never documented since before m2m appeared it was 'obvious'.

This definitely needs to be documented, though.

> 
>> What if you set the format to 0x0 but the stream does not have meta data with
>> the resolution? How does userspace know if 0x0 is allowed or not? If this is
>> specific to the chosen coded pixel format, should be add a new flag for those
>> formats indicating that the coded data contains resolution information?
> 
> Yes, this would definitely be on a per-format basis. Not sure what you
> mean by a flag, though? E.g. if the format is set to H264, then it's
> bound to include resolution information. If the format doesn't include
> it, then userspace is already aware of this fact, because it needs to
> get this from some other source (e.g. container).
> 
>>
>> That way userspace knows if 0x0 can be used, and the driver can reject 0x0
>> for formats that do not support it.
> 
> As above, but I might be misunderstanding your suggestion.

So my question is: is this tied to the pixel format, or should we make it
explicit with a flag like V4L2_FMT_FLAG_CAN_DECODE_WXH.

The advantage of a flag is that you don't need a switch on the format to
know whether or not 0x0 is allowed. And the flag can just be set in
v4l2-ioctls.c.

>>> +STREAMOFF-STREAMON sequence on it is intended solely for changing buffer
>>> +sets.
>>
>> 'changing buffer sets': not clear what is meant by this. It's certainly not
>> 'solely' since it can also be used to achieve an instantaneous seek.
>>
> 
> To be honest, I'm not sure whether there is even a need to include
> this whole section. It's obvious that if you stop feeding a mem2mem
> device, it will pause. Moreover, other sections imply various
> behaviors triggered by STREAMOFF/STREAMON/DECODER_CMD/etc., so it
> should be quite clear that they are different from a simple pause.
> What do you think?

Yes, I'd drop this last sentence ('Similarly...sets').

>>> +2.  After all buffers containing decoded frames from before the resolution
>>> +    change point are ready to be dequeued on the ``CAPTURE`` queue, the
>>> +    driver sends a ``V4L2_EVENT_SOURCE_CHANGE`` event for source change
>>> +    type ``V4L2_EVENT_SRC_CH_RESOLUTION``.
>>> +
>>> +    * The last buffer from before the change must be marked with
>>> +      :c:type:`v4l2_buffer` ``flags`` flag ``V4L2_BUF_FLAG_LAST`` as in the
>>
>> spurious 'as'?
>>
> 
> It should be:
> 
>     * The last buffer from before the change must be marked with
>       the ``V4L2_BUF_FLAG_LAST`` flag in :c:type:`v4l2_buffer` ``flags`` field,
>       similarly to the

Ah, OK. Now I get it.

>> I wonder if we should make these min buffer controls required. It might be easier
>> that way.
> 
> Agreed. Although userspace is still free to ignore it, because REQBUFS
> would do the right thing anyway.

It's never been entirely clear to me what the purpose of those min buffers controls
is. REQBUFS ensures that the number of buffers is at least the minimum needed to
make the HW work. So why would you need these controls? It only makes sense if they
return something different from REQBUFS.

> 
>>
>>> +7.  If all the following conditions are met, the client may resume the
>>> +    decoding instantly, by using :c:func:`VIDIOC_DECODER_CMD` with
>>> +    ``V4L2_DEC_CMD_START`` command, as in case of resuming after the drain
>>> +    sequence:
>>> +
>>> +    * ``sizeimage`` of new format is less than or equal to the size of
>>> +      currently allocated buffers,
>>> +
>>> +    * the number of buffers currently allocated is greater than or equal to
>>> +      the minimum number of buffers acquired in step 6.
>>
>> You might want to mention that if there are insufficient buffers, then
>> VIDIOC_CREATE_BUFS can be used to add more buffers.
>>
> 
> This might be a bit tricky, since at least s5p-mfc and coda can only
> work on a fixed buffer set and one would need to fully reinitialize
> the decoding to add one more buffer, which would effectively be the
> full resolution change sequence, as below, just with REQBUFS(0),
> REQBUFS(N) replaced with CREATE_BUFS.

What happens today in those drivers if you try to call CREATE_BUFS?

Regards,

	Hans
Hans Verkuil July 30, 2018, 12:52 p.m. UTC | #5
On 07/24/2018 04:06 PM, Tomasz Figa wrote:
> Due to complexity of the video decoding process, the V4L2 drivers of
> stateful decoder hardware require specific sequences of V4L2 API calls
> to be followed. These include capability enumeration, initialization,
> decoding, seek, pause, dynamic resolution change, drain and end of
> stream.
> 
> Specifics of the above have been discussed during Media Workshops at
> LinuxCon Europe 2012 in Barcelona and then later Embedded Linux
> Conference Europe 2014 in Düsseldorf. The de facto Codec API that
> originated at those events was later implemented by the drivers we already
> have merged in mainline, such as s5p-mfc or coda.
> 
> The only thing missing was the real specification included as a part of
> Linux Media documentation. Fix it now and document the decoder part of
> the Codec API.
> 
> Signed-off-by: Tomasz Figa <tfiga@chromium.org>
> ---
>  Documentation/media/uapi/v4l/dev-decoder.rst | 872 +++++++++++++++++++
>  Documentation/media/uapi/v4l/devices.rst     |   1 +
>  Documentation/media/uapi/v4l/v4l2.rst        |  10 +-
>  3 files changed, 882 insertions(+), 1 deletion(-)
>  create mode 100644 Documentation/media/uapi/v4l/dev-decoder.rst
> 
> diff --git a/Documentation/media/uapi/v4l/dev-decoder.rst b/Documentation/media/uapi/v4l/dev-decoder.rst
> new file mode 100644
> index 000000000000..f55d34d2f860
> --- /dev/null
> +++ b/Documentation/media/uapi/v4l/dev-decoder.rst
> @@ -0,0 +1,872 @@

<snip>

> +6.  This step only applies to coded formats that contain resolution
> +    information in the stream. Continue queuing/dequeuing bitstream
> +    buffers to/from the ``OUTPUT`` queue via :c:func:`VIDIOC_QBUF` and
> +    :c:func:`VIDIOC_DQBUF`. The driver must keep processing and returning
> +    each buffer to the client until required metadata to configure the
> +    ``CAPTURE`` queue are found. This is indicated by the driver sending
> +    a ``V4L2_EVENT_SOURCE_CHANGE`` event with
> +    ``V4L2_EVENT_SRC_CH_RESOLUTION`` source change type. There is no
> +    requirement to pass enough data for this to occur in the first buffer
> +    and the driver must be able to process any number.
> +
> +    * If data in a buffer that triggers the event is required to decode
> +      the first frame, the driver must not return it to the client,
> +      but must retain it for further decoding.
> +
> +    * If the client set width and height of ``OUTPUT`` format to 0, calling
> +      :c:func:`VIDIOC_G_FMT` on the ``CAPTURE`` queue will return -EPERM,
> +      until the driver configures ``CAPTURE`` format according to stream
> +      metadata.

What about calling TRY/S_FMT on the capture queue: will this also return -EPERM?
I assume so.

Regards,

	Hans
Tomasz Figa Aug. 7, 2018, 6:55 a.m. UTC | #6
On Thu, Jul 26, 2018 at 7:36 PM Philipp Zabel <p.zabel@pengutronix.de> wrote:
>
> On Thu, 2018-07-26 at 19:20 +0900, Tomasz Figa wrote:
> [...]
> > > You might want to mention that if there are insufficient buffers, then
> > > VIDIOC_CREATE_BUFS can be used to add more buffers.
> > >
> >
> > This might be a bit tricky, since at least s5p-mfc and coda can only
> > work on a fixed buffer set and one would need to fully reinitialize
> > the decoding to add one more buffer, which would effectively be the
> > full resolution change sequence, as below, just with REQBUFS(0),
> > REQBUFS(N) replaced with CREATE_BUFS.
>
> The coda driver supports CREATE_BUFS on the decoder CAPTURE queue.
>
> The firmware indeed needs a fixed frame buffer set, but these buffers
> are internal only and in a coda specific tiling format. The content of
> finished internal buffers is copied / detiled into the external CAPTURE
> buffers, so those can be added at will.

Thanks for clarifying. I forgot about that internal copy indeed.

Best regards,
Tomasz
Tomasz Figa Aug. 7, 2018, 7:05 a.m. UTC | #7
On Thu, Jul 26, 2018 at 7:57 PM Hans Verkuil <hverkuil@xs4all.nl> wrote:
>
> On 26/07/18 12:20, Tomasz Figa wrote:
> > Hi Hans,
> >
> > On Wed, Jul 25, 2018 at 8:59 PM Hans Verkuil <hverkuil@xs4all.nl> wrote:
> >>
> >> Hi Tomasz,
> >>
> >> Many, many thanks for working on this! It's a great document and when done
> >> it will be very useful indeed.
> >>
> >> Review comments follow...
> >
> > Thanks for review!
> >
> >>
> >> On 24/07/18 16:06, Tomasz Figa wrote:
>
> > [snip]
>
> >> Note that the v4l2_pix_format/S_FMT documentation needs to be updated as well
> >> since we never allowed 0x0 before.
> >
> > Is there any text that disallows this? I couldn't spot any. Generally
> > there are already drivers which return 0x0 for coded formats (s5p-mfc)
> > and it's not even strange, because in such case, the buffer contains
> > just a sequence of bytes, not a 2D picture.
>
> All non-m2m devices will always have non-zero width/height values. Only with
> m2m devices do we see this.
>
> This was probably never documented since before m2m appeared it was 'obvious'.
>
> This definitely needs to be documented, though.
>

Fair enough. Let me try to add a note there.

> >
> >> What if you set the format to 0x0 but the stream does not have meta data with
> >> the resolution? How does userspace know if 0x0 is allowed or not? If this is
> >> specific to the chosen coded pixel format, should be add a new flag for those
> >> formats indicating that the coded data contains resolution information?
> >
> > Yes, this would definitely be on a per-format basis. Not sure what you
> > mean by a flag, though? E.g. if the format is set to H264, then it's
> > bound to include resolution information. If the format doesn't include
> > it, then userspace is already aware of this fact, because it needs to
> > get this from some other source (e.g. container).
> >
> >>
> >> That way userspace knows if 0x0 can be used, and the driver can reject 0x0
> >> for formats that do not support it.
> >
> > As above, but I might be misunderstanding your suggestion.
>
> So my question is: is this tied to the pixel format, or should we make it
> explicit with a flag like V4L2_FMT_FLAG_CAN_DECODE_WXH.
>
> The advantage of a flag is that you don't need a switch on the format to
> know whether or not 0x0 is allowed. And the flag can just be set in
> v4l2-ioctls.c.

As far as my understanding goes, what data is included in the stream
is definitely specified by the format. For example, an H.264 elementary
stream will always include that data as part of the SPS.

However, having such a flag internally, not exposed to userspace, could
indeed be useful to avoid every driver having such a switch. That
wouldn't belong in this documentation, though, since it would be just
kernel API.

>
> >>> +STREAMOFF-STREAMON sequence on it is intended solely for changing buffer
> >>> +sets.
> >>
> >> 'changing buffer sets': not clear what is meant by this. It's certainly not
> >> 'solely' since it can also be used to achieve an instantaneous seek.
> >>
> >
> > To be honest, I'm not sure whether there is even a need to include
> > this whole section. It's obvious that if you stop feeding a mem2mem
> > device, it will pause. Moreover, other sections imply various
> > behaviors triggered by STREAMOFF/STREAMON/DECODER_CMD/etc., so it
> > should be quite clear that they are different from a simple pause.
> > What do you think?
>
> Yes, I'd drop this last sentence ('Similarly...sets').
>

Ack.

> >>> +2.  After all buffers containing decoded frames from before the resolution
> >>> +    change point are ready to be dequeued on the ``CAPTURE`` queue, the
> >>> +    driver sends a ``V4L2_EVENT_SOURCE_CHANGE`` event for source change
> >>> +    type ``V4L2_EVENT_SRC_CH_RESOLUTION``.
> >>> +
> >>> +    * The last buffer from before the change must be marked with
> >>> +      :c:type:`v4l2_buffer` ``flags`` flag ``V4L2_BUF_FLAG_LAST`` as in the
> >>
> >> spurious 'as'?
> >>
> >
> > It should be:
> >
> >     * The last buffer from before the change must be marked with
> >       the ``V4L2_BUF_FLAG_LAST`` flag in :c:type:`v4l2_buffer` ``flags`` field,
> >       similarly to the
>
> Ah, OK. Now I get it.
>
> >> I wonder if we should make these min buffer controls required. It might be easier
> >> that way.
> >
> > Agreed. Although userspace is still free to ignore it, because REQBUFS
> > would do the right thing anyway.
>
> It's never been entirely clear to me what the purpose of those min buffers controls
> is. REQBUFS ensures that the number of buffers is at least the minimum needed to
> make the HW work. So why would you need these controls? It only makes sense if they
> return something different from REQBUFS.
>

The purpose of those controls is to let the client allocate a number
of buffers bigger than the minimum, without the need to allocate the
minimum number of buffers first (just to learn the number), free them
and then allocate a bigger number again.

> >
> >>
> >>> +7.  If all the following conditions are met, the client may resume the
> >>> +    decoding instantly, by using :c:func:`VIDIOC_DECODER_CMD` with
> >>> +    ``V4L2_DEC_CMD_START`` command, as in case of resuming after the drain
> >>> +    sequence:
> >>> +
> >>> +    * ``sizeimage`` of new format is less than or equal to the size of
> >>> +      currently allocated buffers,
> >>> +
> >>> +    * the number of buffers currently allocated is greater than or equal to
> >>> +      the minimum number of buffers acquired in step 6.
> >>
> >> You might want to mention that if there are insufficient buffers, then
> >> VIDIOC_CREATE_BUFS can be used to add more buffers.
> >>
> >
> > This might be a bit tricky, since at least s5p-mfc and coda can only
> > work on a fixed buffer set and one would need to fully reinitialize
> > the decoding to add one more buffer, which would effectively be the
> > full resolution change sequence, as below, just with REQBUFS(0),
> > REQBUFS(N) replaced with CREATE_BUFS.
>
> What happens today in those drivers if you try to call CREATE_BUFS?

s5p-mfc doesn't set the .vidioc_create_bufs pointer in its
v4l2_ioctl_ops, so I suppose that would be -ENOTTY?

Best regards,
Tomasz
Tomasz Figa Aug. 7, 2018, 7:08 a.m. UTC | #8
On Mon, Jul 30, 2018 at 9:52 PM Hans Verkuil <hverkuil@xs4all.nl> wrote:
>
> On 07/24/2018 04:06 PM, Tomasz Figa wrote:
> > Due to complexity of the video decoding process, the V4L2 drivers of
> > stateful decoder hardware require specific sequences of V4L2 API calls
> > to be followed. These include capability enumeration, initialization,
> > decoding, seek, pause, dynamic resolution change, drain and end of
> > stream.
> >
> > Specifics of the above have been discussed during Media Workshops at
> > LinuxCon Europe 2012 in Barcelona and then later Embedded Linux
> > Conference Europe 2014 in Düsseldorf. The de facto Codec API that
> > originated at those events was later implemented by the drivers we already
> > have merged in mainline, such as s5p-mfc or coda.
> >
> > The only thing missing was the real specification included as a part of
> > Linux Media documentation. Fix it now and document the decoder part of
> > the Codec API.
> >
> > Signed-off-by: Tomasz Figa <tfiga@chromium.org>
> > ---
> >  Documentation/media/uapi/v4l/dev-decoder.rst | 872 +++++++++++++++++++
> >  Documentation/media/uapi/v4l/devices.rst     |   1 +
> >  Documentation/media/uapi/v4l/v4l2.rst        |  10 +-
> >  3 files changed, 882 insertions(+), 1 deletion(-)
> >  create mode 100644 Documentation/media/uapi/v4l/dev-decoder.rst
> >
> > diff --git a/Documentation/media/uapi/v4l/dev-decoder.rst b/Documentation/media/uapi/v4l/dev-decoder.rst
> > new file mode 100644
> > index 000000000000..f55d34d2f860
> > --- /dev/null
> > +++ b/Documentation/media/uapi/v4l/dev-decoder.rst
> > @@ -0,0 +1,872 @@
>
> <snip>
>
> > +6.  This step only applies to coded formats that contain resolution
> > +    information in the stream. Continue queuing/dequeuing bitstream
> > +    buffers to/from the ``OUTPUT`` queue via :c:func:`VIDIOC_QBUF` and
> > +    :c:func:`VIDIOC_DQBUF`. The driver must keep processing and returning
> > +    each buffer to the client until required metadata to configure the
> > +    ``CAPTURE`` queue are found. This is indicated by the driver sending
> > +    a ``V4L2_EVENT_SOURCE_CHANGE`` event with
> > +    ``V4L2_EVENT_SRC_CH_RESOLUTION`` source change type. There is no
> > +    requirement to pass enough data for this to occur in the first buffer
> > +    and the driver must be able to process any number.
> > +
> > +    * If data in a buffer that triggers the event is required to decode
> > +      the first frame, the driver must not return it to the client,
> > +      but must retain it for further decoding.
> > +
> > +    * If the client set width and height of ``OUTPUT`` format to 0, calling
> > +      :c:func:`VIDIOC_G_FMT` on the ``CAPTURE`` queue will return -EPERM,
> > +      until the driver configures ``CAPTURE`` format according to stream
> > +      metadata.
>
> What about calling TRY/S_FMT on the capture queue: will this also return -EPERM?
> I assume so.

We should make it so indeed, to make things consistent.

On another note, I don't really like this -EPERM here, as one could
just see that the format is 0x0 and know that it's not valid. This is
only needed for legacy userspace that doesn't handle the source change
event in initial stream parsing and just checks whether G_FMT returns
an error instead.

Nicolas, for more insight here.

Best regards,
Tomasz
Hans Verkuil Aug. 7, 2018, 7:13 a.m. UTC | #9
On 07/26/2018 12:20 PM, Tomasz Figa wrote:
> Hi Hans,
> 
> On Wed, Jul 25, 2018 at 8:59 PM Hans Verkuil <hverkuil@xs4all.nl> wrote:
>>> +
>>> +14. Call :c:func:`VIDIOC_STREAMON` to initiate decoding frames.
>>> +
>>> +Decoding
>>> +========
>>> +
>>> +This state is reached after a successful initialization sequence. In this
>>> +state, client queues and dequeues buffers to both queues via
>>> +:c:func:`VIDIOC_QBUF` and :c:func:`VIDIOC_DQBUF`, following standard
>>> +semantics.
>>> +
>>> +Both queues operate independently, following standard behavior of V4L2
>>> +buffer queues and memory-to-memory devices. In addition, the order of
>>> +decoded frames dequeued from ``CAPTURE`` queue may differ from the order of
>>> +queuing coded frames to ``OUTPUT`` queue, due to properties of selected
>>> +coded format, e.g. frame reordering. The client must not assume any direct
>>> +relationship between ``CAPTURE`` and ``OUTPUT`` buffers, other than
>>> +reported by :c:type:`v4l2_buffer` ``timestamp`` field.
>>
>> Is there a relationship between capture and output buffers w.r.t. the timestamp
>> field? I am not aware that there is one.
> 
> I believe the decoder was expected to copy the timestamp of matching
> OUTPUT buffer to respective CAPTURE buffer. Both s5p-mfc and coda seem
> to be implementing it this way. I guess it might be a good idea to
> specify this more explicitly.

What about an output buffer producing multiple capture buffers? Or the case
where the encoded bitstream of a frame starts at one output buffer and ends
at another? What happens if you have B frames and the order of the capture
buffers is different from the output buffers?

In other words, for codecs there is no clear 1-to-1 relationship between an
output buffer and a capture buffer. And we never defined what the 'copy timestamp'
behavior should be in that case or if it even makes sense.

Regards,

	Hans
Hans Verkuil Aug. 7, 2018, 7:37 a.m. UTC | #10
On 08/07/2018 09:05 AM, Tomasz Figa wrote:
> On Thu, Jul 26, 2018 at 7:57 PM Hans Verkuil <hverkuil@xs4all.nl> wrote:
>>>> What if you set the format to 0x0 but the stream does not have meta data with
>>>> the resolution? How does userspace know if 0x0 is allowed or not? If this is
>>>> specific to the chosen coded pixel format, should be add a new flag for those
>>>> formats indicating that the coded data contains resolution information?
>>>
>>> Yes, this would definitely be on a per-format basis. Not sure what you
>>> mean by a flag, though? E.g. if the format is set to H264, then it's
>>> bound to include resolution information. If the format doesn't include
>>> it, then userspace is already aware of this fact, because it needs to
>>> get this from some other source (e.g. container).
>>>
>>>>
>>>> That way userspace knows if 0x0 can be used, and the driver can reject 0x0
>>>> for formats that do not support it.
>>>
>>> As above, but I might be misunderstanding your suggestion.
>>
>> So my question is: is this tied to the pixel format, or should we make it
>> explicit with a flag like V4L2_FMT_FLAG_CAN_DECODE_WXH.
>>
>> The advantage of a flag is that you don't need a switch on the format to
>> know whether or not 0x0 is allowed. And the flag can just be set in
>> v4l2-ioctls.c.
> 
> As far as my understanding goes, what data is included in the stream
> is definitely specified by format. For example, a H264 elementary
> stream will always include those data as a part of SPS.
> 
> However, having such flag internally, not exposed to userspace, could
> indeed be useful to avoid all drivers have such switch. That wouldn't
> belong to this documentation, though, since it would be just kernel
> API.

Why would you keep this internally only?

>>>> I wonder if we should make these min buffer controls required. It might be easier
>>>> that way.
>>>
>>> Agreed. Although userspace is still free to ignore it, because REQBUFS
>>> would do the right thing anyway.
>>
>> It's never been entirely clear to me what the purpose of those min buffers controls
>> is. REQBUFS ensures that the number of buffers is at least the minimum needed to
>> make the HW work. So why would you need these controls? It only makes sense if they
>> return something different from REQBUFS.
>>
> 
> The purpose of those controls is to let the client allocate a number
> of buffers bigger than minimum, without the need to allocate the
> minimum number of buffers first (to just learn the number), free them
> and then allocate a bigger number again.

I don't feel this is particularly useful. One problem with the minimum number
of buffers as used in the kernel is that it is often the minimum number of
buffers required to make the hardware work, but it may not be optimal. E.g.
quite a few capture drivers set the minimum to 2, which is enough for the
hardware, but it will likely lead to dropped frames. You really need 3
(one is being DMAed, one is queued and linked into the DMA engine and one is
being processed by userspace).

I would actually prefer this to be the recommended minimum number of buffers,
which is >= the minimum REQBUFS uses.

I.e., if you use this number and you have no special requirements, then you'll
get good performance.

> 
>>>
>>>>
>>>>> +7.  If all the following conditions are met, the client may resume the
>>>>> +    decoding instantly, by using :c:func:`VIDIOC_DECODER_CMD` with
>>>>> +    ``V4L2_DEC_CMD_START`` command, as in case of resuming after the drain
>>>>> +    sequence:
>>>>> +
>>>>> +    * ``sizeimage`` of new format is less than or equal to the size of
>>>>> +      currently allocated buffers,
>>>>> +
>>>>> +    * the number of buffers currently allocated is greater than or equal to
>>>>> +      the minimum number of buffers acquired in step 6.
>>>>
>>>> You might want to mention that if there are insufficient buffers, then
>>>> VIDIOC_CREATE_BUFS can be used to add more buffers.
>>>>
>>>
>>> This might be a bit tricky, since at least s5p-mfc and coda can only
>>> work on a fixed buffer set and one would need to fully reinitialize
>>> the decoding to add one more buffer, which would effectively be the
>>> full resolution change sequence, as below, just with REQBUFS(0),
>>> REQBUFS(N) replaced with CREATE_BUFS.
>>
>> What happens today in those drivers if you try to call CREATE_BUFS?
> 
> s5p-mfc doesn't set the .vidioc_create_bufs pointer in its
> v4l2_ioctl_ops, so I suppose that would be -ENOTTY?

Correct for s5p-mfc.

Regards,

	Hans
Maxime Jourdan Aug. 7, 2018, 7:11 p.m. UTC | #11
2018-08-07 9:13 GMT+02:00 Hans Verkuil <hverkuil@xs4all.nl>:
> On 07/26/2018 12:20 PM, Tomasz Figa wrote:
>> Hi Hans,
>>
>> On Wed, Jul 25, 2018 at 8:59 PM Hans Verkuil <hverkuil@xs4all.nl> wrote:
>>>> +
>>>> +14. Call :c:func:`VIDIOC_STREAMON` to initiate decoding frames.
>>>> +
>>>> +Decoding
>>>> +========
>>>> +
>>>> +This state is reached after a successful initialization sequence. In this
>>>> +state, client queues and dequeues buffers to both queues via
>>>> +:c:func:`VIDIOC_QBUF` and :c:func:`VIDIOC_DQBUF`, following standard
>>>> +semantics.
>>>> +
>>>> +Both queues operate independently, following standard behavior of V4L2
>>>> +buffer queues and memory-to-memory devices. In addition, the order of
>>>> +decoded frames dequeued from ``CAPTURE`` queue may differ from the order of
>>>> +queuing coded frames to ``OUTPUT`` queue, due to properties of selected
>>>> +coded format, e.g. frame reordering. The client must not assume any direct
>>>> +relationship between ``CAPTURE`` and ``OUTPUT`` buffers, other than
>>>> +reported by :c:type:`v4l2_buffer` ``timestamp`` field.
>>>
>>> Is there a relationship between capture and output buffers w.r.t. the timestamp
>>> field? I am not aware that there is one.
>>
>> I believe the decoder was expected to copy the timestamp of matching
>> OUTPUT buffer to respective CAPTURE buffer. Both s5p-mfc and coda seem
>> to be implementing it this way. I guess it might be a good idea to
>> specify this more explicitly.
>
> What about an output buffer producing multiple capture buffers? Or the case
> where the encoded bitstream of a frame starts at one output buffer and ends
> at another? What happens if you have B frames and the order of the capture
> buffers is different from the output buffers?
>
> In other words, for codecs there is no clear 1-to-1 relationship between an
> output buffer and a capture buffer. And we never defined what the 'copy timestamp'
> behavior should be in that case or if it even makes sense.
>
> Regards,
>
>         Hans

As it is done right now in userspace (FFmpeg, GStreamer) and most (if
not all?) drivers, it's a 1:1 mapping between OUTPUT and CAPTURE
buffers. The only thing that changes is the ordering, since OUTPUT
buffers are in decoding order while CAPTURE buffers are in presentation
order.

This almost always implies some timestamping kung-fu to match the
OUTPUT timestamps with the corresponding CAPTURE timestamps. It's
often done indirectly by the firmware on some platforms (rpi comes to
mind iirc).

The current design also implies one video packet per OUTPUT buffer.
If a video packet is too big to fit in a buffer, FFmpeg will truncate
the packet to the maximum buffer size and discard the remaining data,
while GStreamer will abort the decoding. This is unfortunately one of
the shortcomings of having fixed-size buffers. And if packets were
split across multiple buffers, some drivers in their current state
wouldn't be able to handle the resulting timestamping issues and/or
x:1 OUTPUT:CAPTURE buffer mapping.

Maxime
Tomasz Figa Aug. 8, 2018, 2:46 a.m. UTC | #12
Hi Maxime,

On Tue, Aug 7, 2018 at 5:32 AM Maxime Jourdan <maxi.jourdan@wanadoo.fr> wrote:
>
> Hi Tomasz,
>
> Sorry for sending this email only to you, I subscribed to linux-media
> after you posted this and I'm not sure how to respond to everybody.
>

No worries. Let me reply with other recipients added back. Thanks for
your comments.

> I'm currently developing a V4L2 M2M decoder driver for Amlogic SoCs so
> my comments are somewhat biased towards it
> (https://github.com/Elyotna/linux)
>
> > +Seek
> > +====
> > +
> > +Seek is controlled by the ``OUTPUT`` queue, as it is the source of
> > +bitstream data. ``CAPTURE`` queue remains unchanged/unaffected.
> > +
> > +1. Stop the ``OUTPUT`` queue to begin the seek sequence via
> > +   :c:func:`VIDIOC_STREAMOFF`.
> > +
> > +   * **Required fields:**
> > +
> > +     ``type``
> > +         a ``V4L2_BUF_TYPE_*`` enum appropriate for ``OUTPUT``
> > +
> > +   * The driver must drop all the pending ``OUTPUT`` buffers and they are
> > +     treated as returned to the client (following standard semantics).
> > +
> > +2. Restart the ``OUTPUT`` queue via :c:func:`VIDIOC_STREAMON`
> > +
> > +   * **Required fields:**
> > +
> > +     ``type``
> > +         a ``V4L2_BUF_TYPE_*`` enum appropriate for ``OUTPUT``
> > +
> > +   * The driver must be put in a state after seek and be ready to
> > +     accept new source bitstream buffers.
> > +
> > +3. Start queuing buffers to ``OUTPUT`` queue containing stream data after
> > +   the seek until a suitable resume point is found.
> > +
> > +   .. note::
> > +
> > +      There is no requirement to begin queuing stream starting exactly from
> > +      a resume point (e.g. SPS or a keyframe). The driver must handle any
> > +      data queued and must keep processing the queued buffers until it
> > +      finds a suitable resume point. While looking for a resume point, the
> > +      driver processes ``OUTPUT`` buffers and returns them to the client
> > +      without producing any decoded frames.
> > +
> > +      For hardware known to be mishandling seeks to a non-resume point,
> > +      e.g. by returning corrupted decoded frames, the driver must be able
> > +      to handle such seeks without a crash or any fatal decode error.
>
> This is unfortunately my case, apart from parsing the bitstream
> manually - which is a no-no -, there is no way to know when I'll be
> writing in an IDR frame to the HW bitstream parser. I think it would
> be much preferable that the client starts sending in an IDR frame for
> sure.

Most of the hardware that has upstream drivers deals with this
correctly and there is existing user space that relies on this, so we
cannot simply add such a requirement. However, when sending your driver
upstream, feel free to include a patch that adds a read-only control
that tells the user space that it needs to do seeks to resume points.
Obviously this will work only with user space aware of this
requirement, but I don't think we can do anything better here.

>
> > +4. After a resume point is found, the driver will start returning
> > +   ``CAPTURE`` buffers with decoded frames.
> > +
> > +   * There is no precise specification for ``CAPTURE`` queue of when it
> > +     will start producing buffers containing decoded data from buffers
> > +     queued after the seek, as it operates independently
> > +     from ``OUTPUT`` queue.
> > +
> > +     * The driver is allowed to and may return a number of remaining
> > +       ``CAPTURE`` buffers containing decoded frames from before the seek
> > +       after the seek sequence (STREAMOFF-STREAMON) is performed.
> > +
> > +     * The driver is also allowed to and may not return all decoded frames
> > +       queued but not decode before the seek sequence was initiated. For
> > +       example, given an ``OUTPUT`` queue sequence: QBUF(A), QBUF(B),
> > +       STREAMOFF(OUT), STREAMON(OUT), QBUF(G), QBUF(H), any of the
> > +       following results on the ``CAPTURE`` queue is allowed: {A’, B’, G’,
> > +       H’}, {A’, G’, H’}, {G’, H’}.
> > +
> > +   .. note::
> > +
> > +      To achieve instantaneous seek, the client may restart streaming on
> > +      ``CAPTURE`` queue to discard decoded, but not yet dequeued buffers.
>
> Overall, I think Drain followed by V4L2_DEC_CMD_START is a more
> applicable scenario for seeking.
> Heck, simply starting to queue buffers at the seek - starting with an
> IDR - without doing any kind of streamon/off or cmd_start(stop) will
> do the trick.

Why do you think so?

For a seek, as expected by a typical device user, the result should be
discarding anything already queued and starting to decode new frames
as soon as possible.

Actually, this section doesn't describe any specific sequence, just
possible ways to do a seek using existing primitives.

Best regards,
Tomasz
Tomasz Figa Aug. 8, 2018, 2:55 a.m. UTC | #13
On Tue, Aug 7, 2018 at 4:37 PM Hans Verkuil <hverkuil@xs4all.nl> wrote:
>
> On 08/07/2018 09:05 AM, Tomasz Figa wrote:
> > On Thu, Jul 26, 2018 at 7:57 PM Hans Verkuil <hverkuil@xs4all.nl> wrote:
> >>>> What if you set the format to 0x0 but the stream does not have meta data with
> >>>> the resolution? How does userspace know if 0x0 is allowed or not? If this is
> >>>> specific to the chosen coded pixel format, should be add a new flag for those
> >>>> formats indicating that the coded data contains resolution information?
> >>>
> >>> Yes, this would definitely be on a per-format basis. Not sure what you
> >>> mean by a flag, though? E.g. if the format is set to H264, then it's
> >>> bound to include resolution information. If the format doesn't include
> >>> it, then userspace is already aware of this fact, because it needs to
> >>> get this from some other source (e.g. container).
> >>>
> >>>>
> >>>> That way userspace knows if 0x0 can be used, and the driver can reject 0x0
> >>>> for formats that do not support it.
> >>>
> >>> As above, but I might be misunderstanding your suggestion.
> >>
> >> So my question is: is this tied to the pixel format, or should we make it
> >> explicit with a flag like V4L2_FMT_FLAG_CAN_DECODE_WXH.
> >>
> >> The advantage of a flag is that you don't need a switch on the format to
> >> know whether or not 0x0 is allowed. And the flag can just be set in
> >> v4l2-ioctls.c.
> >
> > As far as my understanding goes, what data is included in the stream
> > is definitely specified by format. For example, a H264 elementary
> > stream will always include those data as a part of SPS.
> >
> > However, having such flag internally, not exposed to userspace, could
> > indeed be useful to avoid all drivers have such switch. That wouldn't
> > belong to this documentation, though, since it would be just kernel
> > API.
>
> Why would you keep this internally only?
>

Well, either keep it internal or make it read-only for the user space,
since the behavior is already defined by the selected pixel format.

> >>>> I wonder if we should make these min buffer controls required. It might be easier
> >>>> that way.
> >>>
> >>> Agreed. Although userspace is still free to ignore it, because REQBUFS
> >>> would do the right thing anyway.
> >>
> >> It's never been entirely clear to me what the purpose of those min buffers controls
> >> is. REQBUFS ensures that the number of buffers is at least the minimum needed to
> >> make the HW work. So why would you need these controls? It only makes sense if they
> >> return something different from REQBUFS.
> >>
> >
> > The purpose of those controls is to let the client allocate a number
> > of buffers bigger than minimum, without the need to allocate the
> > minimum number of buffers first (to just learn the number), free them
> > and then allocate a bigger number again.
>
> I don't feel this is particularly useful. One problem with the minimum number
> of buffers as used in the kernel is that it is often the minimum number of
> buffers required to make the hardware work, but it may not be optimal. E.g.
> quite a few capture drivers set the minimum to 2, which is enough for the
> hardware, but it will likely lead to dropped frames. You really need 3
> (one is being DMAed, one is queued and linked into the DMA engine and one is
> being processed by userspace).
>
> I would actually prefer this to be the recommended minimum number of buffers,
> which is >= the minimum REQBUFS uses.
>
> I.e., if you use this number and you have no special requirements, then you'll
> get good performance.

I guess we could make it so. It would make existing user space request
more buffers than it did under the original meaning, but I guess it
shouldn't be a big problem.

>
> >
> >>>
> >>>>
> >>>>> +7.  If all the following conditions are met, the client may resume the
> >>>>> +    decoding instantly, by using :c:func:`VIDIOC_DECODER_CMD` with
> >>>>> +    ``V4L2_DEC_CMD_START`` command, as in case of resuming after the drain
> >>>>> +    sequence:
> >>>>> +
> >>>>> +    * ``sizeimage`` of new format is less than or equal to the size of
> >>>>> +      currently allocated buffers,
> >>>>> +
> >>>>> +    * the number of buffers currently allocated is greater than or equal to
> >>>>> +      the minimum number of buffers acquired in step 6.
> >>>>
> >>>> You might want to mention that if there are insufficient buffers, then
> >>>> VIDIOC_CREATE_BUFS can be used to add more buffers.
> >>>>
> >>>
> >>> This might be a bit tricky, since at least s5p-mfc and coda can only
> >>> work on a fixed buffer set and one would need to fully reinitialize
> >>> the decoding to add one more buffer, which would effectively be the
> >>> full resolution change sequence, as below, just with REQBUFS(0),
> >>> REQBUFS(N) replaced with CREATE_BUFS.
> >>
> >> What happens today in those drivers if you try to call CREATE_BUFS?
> >
> > s5p-mfc doesn't set the .vidioc_create_bufs pointer in its
> > v4l2_ioctl_ops, so I suppose that would be -ENOTTY?
>
> Correct for s5p-mfc.

As Philipp clarified, coda supports adding buffers on the fly. I
briefly looked at venus and mtk-vcodec and they seem to use the m2m
implementation of CREATE_BUFS. Not sure if anyone tested that, though.
So the only hardware I know for sure cannot support this is s5p-mfc.

Best regards,
Tomasz
Tomasz Figa Aug. 8, 2018, 3:07 a.m. UTC | #14
On Wed, Aug 8, 2018 at 4:11 AM Maxime Jourdan <maxi.jourdan@wanadoo.fr> wrote:
>
> 2018-08-07 9:13 GMT+02:00 Hans Verkuil <hverkuil@xs4all.nl>:
> > On 07/26/2018 12:20 PM, Tomasz Figa wrote:
> >> Hi Hans,
> >>
> >> On Wed, Jul 25, 2018 at 8:59 PM Hans Verkuil <hverkuil@xs4all.nl> wrote:
> >>>> +
> >>>> +14. Call :c:func:`VIDIOC_STREAMON` to initiate decoding frames.
> >>>> +
> >>>> +Decoding
> >>>> +========
> >>>> +
> >>>> +This state is reached after a successful initialization sequence. In this
> >>>> +state, client queues and dequeues buffers to both queues via
> >>>> +:c:func:`VIDIOC_QBUF` and :c:func:`VIDIOC_DQBUF`, following standard
> >>>> +semantics.
> >>>> +
> >>>> +Both queues operate independently, following standard behavior of V4L2
> >>>> +buffer queues and memory-to-memory devices. In addition, the order of
> >>>> +decoded frames dequeued from ``CAPTURE`` queue may differ from the order of
> >>>> +queuing coded frames to ``OUTPUT`` queue, due to properties of selected
> >>>> +coded format, e.g. frame reordering. The client must not assume any direct
> >>>> +relationship between ``CAPTURE`` and ``OUTPUT`` buffers, other than
> >>>> +reported by :c:type:`v4l2_buffer` ``timestamp`` field.
> >>>
> >>> Is there a relationship between capture and output buffers w.r.t. the timestamp
> >>> field? I am not aware that there is one.
> >>
> >> I believe the decoder was expected to copy the timestamp of matching
> >> OUTPUT buffer to respective CAPTURE buffer. Both s5p-mfc and coda seem
> >> to be implementing it this way. I guess it might be a good idea to
> >> specify this more explicitly.
> >
> > What about an output buffer producing multiple capture buffers? Or the case
> > where the encoded bitstream of a frame starts at one output buffer and ends
> > at another? What happens if you have B frames and the order of the capture
> > buffers is different from the output buffers?
> >
> > In other words, for codecs there is no clear 1-to-1 relationship between an
> > output buffer and a capture buffer. And we never defined what the 'copy timestamp'
> > behavior should be in that case or if it even makes sense.
> >
> > Regards,
> >
> >         Hans
>
> As it is done right now in userspace (FFmpeg, GStreamer) and most (if
> not all?) drivers, it's a 1:1 between OUTPUT and CAPTURE. The only
> thing that changes is the ordering since OUTPUT buffers are in
> decoding order while CAPTURE buffers are in presentation order.

If I understood it correctly, there is a feature in VP9 that lets one
frame repeat several times, which would make one OUTPUT buffer produce
multiple CAPTURE buffers.

Moreover, V4L2_PIX_FMT_H264 is actually defined to be a byte stream,
without any need for framing, and yes, there are drivers that follow
this definition correctly (s5p-mfc and, AFAIR, coda). In that case,
one OUTPUT buffer can have an arbitrary amount of bitstream and lead to
multiple CAPTURE frames being produced.

>
> This almost always implies some timestamping kung-fu to match the
> OUTPUT timestamps with the corresponding CAPTURE timestamps. It's
> often done indirectly by the firmware on some platforms (rpi comes to
> mind iirc).

I don't think there is an upstream driver for it, is there? (If not,
are you aware of any work towards it?)

>
> The current constructions also imply one video packet per OUTPUT
> buffer. If a video packet is too big to fit in a buffer, FFmpeg will
> crop that packet to the maximum buffer size and will discard the
> remaining packet data. GStreamer will abort the decoding. This is
> unfortunately one of the shortcomings of having fixed-size buffers.
> And if they were to split the packet in multiple buffers, then some
> drivers in their current state wouldn't be able to handle the
> timestamping issues and/or x:1 OUTPUT:CAPTURE buffer numbers.

In Chromium, we just allocate OUTPUT buffers big enough that it is
really unlikely for a single frame not to fit inside [1]. Obviously
it's a waste of memory for formats which normally carry just a single
frame per buffer, but it seems to work in practice.

[1] https://cs.chromium.org/chromium/src/media/gpu/v4l2/v4l2_video_decode_accelerator.h?rcl=3468d5a59e00bcb2c2e946a30694e6057fd9ab21&l=118

Best regards,
Tomasz
Tomasz Figa Aug. 8, 2018, 3:11 a.m. UTC | #15
On Tue, Aug 7, 2018 at 4:13 PM Hans Verkuil <hverkuil@xs4all.nl> wrote:
>
> On 07/26/2018 12:20 PM, Tomasz Figa wrote:
> > Hi Hans,
> >
> > On Wed, Jul 25, 2018 at 8:59 PM Hans Verkuil <hverkuil@xs4all.nl> wrote:
> >>> +
> >>> +14. Call :c:func:`VIDIOC_STREAMON` to initiate decoding frames.
> >>> +
> >>> +Decoding
> >>> +========
> >>> +
> >>> +This state is reached after a successful initialization sequence. In this
> >>> +state, client queues and dequeues buffers to both queues via
> >>> +:c:func:`VIDIOC_QBUF` and :c:func:`VIDIOC_DQBUF`, following standard
> >>> +semantics.
> >>> +
> >>> +Both queues operate independently, following standard behavior of V4L2
> >>> +buffer queues and memory-to-memory devices. In addition, the order of
> >>> +decoded frames dequeued from ``CAPTURE`` queue may differ from the order of
> >>> +queuing coded frames to ``OUTPUT`` queue, due to properties of selected
> >>> +coded format, e.g. frame reordering. The client must not assume any direct
> >>> +relationship between ``CAPTURE`` and ``OUTPUT`` buffers, other than
> >>> +reported by :c:type:`v4l2_buffer` ``timestamp`` field.
> >>
> >> Is there a relationship between capture and output buffers w.r.t. the timestamp
> >> field? I am not aware that there is one.
> >
> > I believe the decoder was expected to copy the timestamp of matching
> > OUTPUT buffer to respective CAPTURE buffer. Both s5p-mfc and coda seem
> > to be implementing it this way. I guess it might be a good idea to
> > specify this more explicitly.
>
> What about an output buffer producing multiple capture buffers? Or the case
> where the encoded bitstream of a frame starts at one output buffer and ends
> at another? What happens if you have B frames and the order of the capture
> buffers is different from the output buffers?
>
> In other words, for codecs there is no clear 1-to-1 relationship between an
> output buffer and a capture buffer. And we never defined what the 'copy timestamp'
> behavior should be in that case or if it even makes sense.

You're perfectly right. There is no 1:1 relationship, but it doesn't
prevent copying timestamps. It just makes it possible for multiple
CAPTURE buffers to have the same timestamp or some OUTPUT timestamps
not to be found in any CAPTURE buffer.

Best regards,
Tomasz
Hans Verkuil Aug. 8, 2018, 6:43 a.m. UTC | #16
On 08/08/2018 05:11 AM, Tomasz Figa wrote:
> On Tue, Aug 7, 2018 at 4:13 PM Hans Verkuil <hverkuil@xs4all.nl> wrote:
>>
>> On 07/26/2018 12:20 PM, Tomasz Figa wrote:
>>> Hi Hans,
>>>
>>> On Wed, Jul 25, 2018 at 8:59 PM Hans Verkuil <hverkuil@xs4all.nl> wrote:
>>>>> +
>>>>> +14. Call :c:func:`VIDIOC_STREAMON` to initiate decoding frames.
>>>>> +
>>>>> +Decoding
>>>>> +========
>>>>> +
>>>>> +This state is reached after a successful initialization sequence. In this
>>>>> +state, client queues and dequeues buffers to both queues via
>>>>> +:c:func:`VIDIOC_QBUF` and :c:func:`VIDIOC_DQBUF`, following standard
>>>>> +semantics.
>>>>> +
>>>>> +Both queues operate independently, following standard behavior of V4L2
>>>>> +buffer queues and memory-to-memory devices. In addition, the order of
>>>>> +decoded frames dequeued from ``CAPTURE`` queue may differ from the order of
>>>>> +queuing coded frames to ``OUTPUT`` queue, due to properties of selected
>>>>> +coded format, e.g. frame reordering. The client must not assume any direct
>>>>> +relationship between ``CAPTURE`` and ``OUTPUT`` buffers, other than
>>>>> +reported by :c:type:`v4l2_buffer` ``timestamp`` field.
>>>>
>>>> Is there a relationship between capture and output buffers w.r.t. the timestamp
>>>> field? I am not aware that there is one.
>>>
>>> I believe the decoder was expected to copy the timestamp of matching
>>> OUTPUT buffer to respective CAPTURE buffer. Both s5p-mfc and coda seem
>>> to be implementing it this way. I guess it might be a good idea to
>>> specify this more explicitly.
>>
>> What about an output buffer producing multiple capture buffers? Or the case
>> where the encoded bitstream of a frame starts at one output buffer and ends
>> at another? What happens if you have B frames and the order of the capture
>> buffers is different from the output buffers?
>>
>> In other words, for codecs there is no clear 1-to-1 relationship between an
>> output buffer and a capture buffer. And we never defined what the 'copy timestamp'
>> behavior should be in that case or if it even makes sense.
> 
> You're perfectly right. There is no 1:1 relationship, but it doesn't
> prevent copying timestamps. It just makes it possible for multiple
> CAPTURE buffers to have the same timestamp or some OUTPUT timestamps
> not to be found in any CAPTURE buffer.

We need to document the behavior. Basically there are three different
corner cases that need documenting:

1) one OUTPUT buffer generates multiple CAPTURE buffers
2) multiple OUTPUT buffers generate one CAPTURE buffer
3) the decoding order differs from the presentation order (i.e. the
   CAPTURE buffers are out-of-order compared to the OUTPUT buffers).

For 1) I assume that we just copy the same OUTPUT timestamp to multiple
CAPTURE buffers.

For 2) we need to specify if the CAPTURE timestamp is copied from the first
or last OUTPUT buffer used in creating the capture buffer. Using the last
OUTPUT buffer makes more sense to me.

And 3) implies that timestamps can be out-of-order. This needs to be
very carefully documented since it is very unexpected.

This should probably be a separate patch, adding text to the v4l2_buffer
documentation (esp. the V4L2_BUF_FLAG_TIMESTAMP_COPY documentation).

Regards,

	Hans
Ian Arkver Aug. 8, 2018, 6:54 a.m. UTC | #17
Hi Hans,

On 08/08/18 07:43, Hans Verkuil wrote:
> On 08/08/2018 05:11 AM, Tomasz Figa wrote:
>> On Tue, Aug 7, 2018 at 4:13 PM Hans Verkuil <hverkuil@xs4all.nl> wrote:
>>>
>>> On 07/26/2018 12:20 PM, Tomasz Figa wrote:
>>>> Hi Hans,
>>>>
>>>> On Wed, Jul 25, 2018 at 8:59 PM Hans Verkuil <hverkuil@xs4all.nl> wrote:
>>>>>> +
>>>>>> +14. Call :c:func:`VIDIOC_STREAMON` to initiate decoding frames.
>>>>>> +
>>>>>> +Decoding
>>>>>> +========
>>>>>> +
>>>>>> +This state is reached after a successful initialization sequence. In this
>>>>>> +state, client queues and dequeues buffers to both queues via
>>>>>> +:c:func:`VIDIOC_QBUF` and :c:func:`VIDIOC_DQBUF`, following standard
>>>>>> +semantics.
>>>>>> +
>>>>>> +Both queues operate independently, following standard behavior of V4L2
>>>>>> +buffer queues and memory-to-memory devices. In addition, the order of
>>>>>> +decoded frames dequeued from ``CAPTURE`` queue may differ from the order of
>>>>>> +queuing coded frames to ``OUTPUT`` queue, due to properties of selected
>>>>>> +coded format, e.g. frame reordering. The client must not assume any direct
>>>>>> +relationship between ``CAPTURE`` and ``OUTPUT`` buffers, other than
>>>>>> +reported by :c:type:`v4l2_buffer` ``timestamp`` field.
>>>>>
>>>>> Is there a relationship between capture and output buffers w.r.t. the timestamp
>>>>> field? I am not aware that there is one.
>>>>
>>>> I believe the decoder was expected to copy the timestamp of matching
>>>> OUTPUT buffer to respective CAPTURE buffer. Both s5p-mfc and coda seem
>>>> to be implementing it this way. I guess it might be a good idea to
>>>> specify this more explicitly.
>>>
>>> What about an output buffer producing multiple capture buffers? Or the case
>>> where the encoded bitstream of a frame starts at one output buffer and ends
>>> at another? What happens if you have B frames and the order of the capture
>>> buffers is different from the output buffers?
>>>
>>> In other words, for codecs there is no clear 1-to-1 relationship between an
>>> output buffer and a capture buffer. And we never defined what the 'copy timestamp'
>>> behavior should be in that case or if it even makes sense.
>>
>> You're perfectly right. There is no 1:1 relationship, but it doesn't
>> prevent copying timestamps. It just makes it possible for multiple
>> CAPTURE buffers to have the same timestamp or some OUTPUT timestamps
>> not to be found in any CAPTURE buffer.
> 
> We need to document the behavior. Basically there are three different
> corner cases that need documenting:
> 
> 1) one OUTPUT buffer generates multiple CAPTURE buffers
> 2) multiple OUTPUT buffers generate one CAPTURE buffer
> 3) the decoding order differs from the presentation order (i.e. the
>     CAPTURE buffers are out-of-order compared to the OUTPUT buffers).
> 
> For 1) I assume that we just copy the same OUTPUT timestamp to multiple
> CAPTURE buffers.

I'm not sure how this interface would handle something like a temporal
scalability layer, but conceivably this assumption might be invalid in
that case.

Regards,
Ian.

> 
> For 2) we need to specify if the CAPTURE timestamp is copied from the first
> or last OUTPUT buffer used in creating the capture buffer. Using the last
> OUTPUT buffer makes more sense to me.
> 
> And 3) implies that timestamps can be out-of-order. This needs to be
> very carefully documented since it is very unexpected.
> 
> This should probably be a separate patch, adding text to the v4l2_buffer
> documentation (esp. the V4L2_BUF_FLAG_TIMESTAMP_COPY documentation).
> 
> Regards,
> 
> 	Hans
>
Maxime Jourdan Aug. 8, 2018, 7:19 a.m. UTC | #18
2018-08-08 5:07 GMT+02:00 Tomasz Figa <tfiga@chromium.org>:
> On Wed, Aug 8, 2018 at 4:11 AM Maxime Jourdan <maxi.jourdan@wanadoo.fr> wrote:
>>
>> 2018-08-07 9:13 GMT+02:00 Hans Verkuil <hverkuil@xs4all.nl>:
>> > On 07/26/2018 12:20 PM, Tomasz Figa wrote:
>> >> Hi Hans,
>> >>
>> >> On Wed, Jul 25, 2018 at 8:59 PM Hans Verkuil <hverkuil@xs4all.nl> wrote:
>> >>>> +
>> >>>> +14. Call :c:func:`VIDIOC_STREAMON` to initiate decoding frames.
>> >>>> +
>> >>>> +Decoding
>> >>>> +========
>> >>>> +
>> >>>> +This state is reached after a successful initialization sequence. In this
>> >>>> +state, client queues and dequeues buffers to both queues via
>> >>>> +:c:func:`VIDIOC_QBUF` and :c:func:`VIDIOC_DQBUF`, following standard
>> >>>> +semantics.
>> >>>> +
>> >>>> +Both queues operate independently, following standard behavior of V4L2
>> >>>> +buffer queues and memory-to-memory devices. In addition, the order of
>> >>>> +decoded frames dequeued from ``CAPTURE`` queue may differ from the order of
>> >>>> +queuing coded frames to ``OUTPUT`` queue, due to properties of selected
>> >>>> +coded format, e.g. frame reordering. The client must not assume any direct
>> >>>> +relationship between ``CAPTURE`` and ``OUTPUT`` buffers, other than
>> >>>> +reported by :c:type:`v4l2_buffer` ``timestamp`` field.
>> >>>
>> >>> Is there a relationship between capture and output buffers w.r.t. the timestamp
>> >>> field? I am not aware that there is one.
>> >>
>> >> I believe the decoder was expected to copy the timestamp of matching
>> >> OUTPUT buffer to respective CAPTURE buffer. Both s5p-mfc and coda seem
>> >> to be implementing it this way. I guess it might be a good idea to
>> >> specify this more explicitly.
>> >
>> > What about an output buffer producing multiple capture buffers? Or the case
>> > where the encoded bitstream of a frame starts at one output buffer and ends
>> > at another? What happens if you have B frames and the order of the capture
>> > buffers is different from the output buffers?
>> >
>> > In other words, for codecs there is no clear 1-to-1 relationship between an
>> > output buffer and a capture buffer. And we never defined what the 'copy timestamp'
>> > behavior should be in that case or if it even makes sense.
>> >
>> > Regards,
>> >
>> >         Hans
>>
>> As it is done right now in userspace (FFmpeg, GStreamer) and most (if
>> not all?) drivers, it's a 1:1 between OUTPUT and CAPTURE. The only
>> thing that changes is the ordering since OUTPUT buffers are in
>> decoding order while CAPTURE buffers are in presentation order.
>
> If I understood it correctly, there is a feature in VP9 that lets one
> frame repeat several times, which would make one OUTPUT buffer produce
> multiple CAPTURE buffers.
>
> Moreover, V4L2_PIX_FMT_H264 is actually defined to be a byte stream,
> without any need for framing, and yes, there are drivers that follow
> this definition correctly (s5p-mfc and, AFAIR, coda). In that case,
> one OUTPUT buffer can have an arbitrary amount of bitstream and lead to
> multiple CAPTURE frames being produced.

I can see from the code and your answer to Hans that in such a case,
all CAPTURE buffers will share the single OUTPUT timestamp.

Does this mean that at the end of the day, userspace disregards the
CAPTURE timestamps since you have the display order guarantee?
If so, how do you reconstruct the proper PTS on such buffers? Do you
have them saved from prior demuxing?

>>
>> This almost always implies some timestamping kung-fu to match the
>> OUTPUT timestamps with the corresponding CAPTURE timestamps. It's
>> often done indirectly by the firmware on some platforms (rpi comes to
>> mind iirc).
>
> I don't think there is an upstream driver for it, is there? (If not,
> are you aware of any work towards it?)

You're right, it's not upstream but it is in a relatively good shape
at https://github.com/6by9/linux/commits/rpi-4.14.y-v4l2-codec

>>
>> The current constructions also imply one video packet per OUTPUT
>> buffer. If a video packet is too big to fit in a buffer, FFmpeg will
>> crop that packet to the maximum buffer size and will discard the
>> remaining packet data. GStreamer will abort the decoding. This is
>> unfortunately one of the shortcomings of having fixed-size buffers.
>> And if they were to split the packet in multiple buffers, then some
>> drivers in their current state wouldn't be able to handle the
>> timestamping issues and/or x:1 OUTPUT:CAPTURE buffer numbers.
>
> In Chromium, we just allocate OUTPUT buffers big enough that it is
> really unlikely for a single frame not to fit inside [1]. Obviously
> it's a waste of memory for formats which normally carry just a single
> frame per buffer, but it seems to work in practice.
>
> [1] https://cs.chromium.org/chromium/src/media/gpu/v4l2/v4l2_video_decode_accelerator.h?rcl=3468d5a59e00bcb2c2e946a30694e6057fd9ab21&l=118

Right. As long as you don't need many OUTPUT buffers it's not that big a deal.

[snip]

>> > +      For hardware known to be mishandling seeks to a non-resume point,
>> > +      e.g. by returning corrupted decoded frames, the driver must be able
>> > +      to handle such seeks without a crash or any fatal decode error.
>>
>> This is unfortunately my case, apart from parsing the bitstream
>> manually - which is a no-no -, there is no way to know when I'll be
>> writing in an IDR frame to the HW bitstream parser. I think it would
>> be much preferable that the client starts sending in an IDR frame for
>> sure.
>
> Most of the hardware that has upstream drivers deals with this
> correctly and there is existing user space that relies on this, so we
> cannot simply add such a requirement. However, when sending your driver
> upstream, feel free to include a patch that adds a read-only control
> that tells the user space that it needs to do seeks to resume points.
> Obviously this will work only with user space aware of this
> requirement, but I don't think we can do anything better here.
>

Makes sense

>> > +      To achieve instantaneous seek, the client may restart streaming on
>> > +      ``CAPTURE`` queue to discard decoded, but not yet dequeued buffers.
>>
>> Overall, I think Drain followed by V4L2_DEC_CMD_START is a more
>> applicable scenario for seeking.
>> Heck, simply starting to queue buffers at the seek - starting with an
>> IDR - without doing any kind of streamon/off or cmd_start(stop) will
>> do the trick.
>
> Why do you think so?
>
> For a seek, as expected by a typical device user, the result should be
> discarding anything already queued and starting to decode new frames
> as soon as possible.
>
> Actually, this section doesn't describe any specific sequence, just
> possible ways to do a seek using existing primitives.

Fair enough

Regards,
Maxime
Philipp Zabel Aug. 20, 2018, 1:04 p.m. UTC | #19
On Tue, 2018-07-24 at 23:06 +0900, Tomasz Figa wrote:
[...]
> +Seek
> +====
> +
> +Seek is controlled by the ``OUTPUT`` queue, as it is the source of
> +bitstream data. ``CAPTURE`` queue remains unchanged/unaffected.
> +
> +1. Stop the ``OUTPUT`` queue to begin the seek sequence via
> +   :c:func:`VIDIOC_STREAMOFF`.
> +
> +   * **Required fields:**
> +
> +     ``type``
> +         a ``V4L2_BUF_TYPE_*`` enum appropriate for ``OUTPUT``
> +
> +   * The driver must drop all the pending ``OUTPUT`` buffers and they are
> +     treated as returned to the client (following standard semantics).
> +
> +2. Restart the ``OUTPUT`` queue via :c:func:`VIDIOC_STREAMON`
> +
> +   * **Required fields:**
> +
> +     ``type``
> +         a ``V4L2_BUF_TYPE_*`` enum appropriate for ``OUTPUT``
> +
> +   * The driver must be put in a state after seek and be ready to
> +     accept new source bitstream buffers.
> +
> +3. Start queuing buffers to ``OUTPUT`` queue containing stream data after
> +   the seek until a suitable resume point is found.
> +
> +   .. note::
> +
> +      There is no requirement to begin queuing stream starting exactly from
> +      a resume point (e.g. SPS or a keyframe). The driver must handle any
> +      data queued and must keep processing the queued buffers until it
> +      finds a suitable resume point. While looking for a resume point, the

I think the definition of a resume point is too vague here.
Can the driver decide whether or not a keyframe without SPS is a
suitable resume point? Or do drivers have to parse and store SPS/PPS if
the hardware does not support resuming from a keyframe without sending
SPS/PPS again?

regards
Philipp
Tomasz Figa Aug. 20, 2018, 1:12 p.m. UTC | #20
On Mon, Aug 20, 2018 at 10:04 PM Philipp Zabel <p.zabel@pengutronix.de> wrote:
>
> On Tue, 2018-07-24 at 23:06 +0900, Tomasz Figa wrote:
> [...]
> > +Seek
> > +====
> > +
> > +Seek is controlled by the ``OUTPUT`` queue, as it is the source of
> > +bitstream data. ``CAPTURE`` queue remains unchanged/unaffected.
> > +
> > +1. Stop the ``OUTPUT`` queue to begin the seek sequence via
> > +   :c:func:`VIDIOC_STREAMOFF`.
> > +
> > +   * **Required fields:**
> > +
> > +     ``type``
> > +         a ``V4L2_BUF_TYPE_*`` enum appropriate for ``OUTPUT``
> > +
> > +   * The driver must drop all the pending ``OUTPUT`` buffers and they are
> > +     treated as returned to the client (following standard semantics).
> > +
> > +2. Restart the ``OUTPUT`` queue via :c:func:`VIDIOC_STREAMON`
> > +
> > +   * **Required fields:**
> > +
> > +     ``type``
> > +         a ``V4L2_BUF_TYPE_*`` enum appropriate for ``OUTPUT``
> > +
> > +   * The driver must be put in a state after seek and be ready to
> > +     accept new source bitstream buffers.
> > +
> > +3. Start queuing buffers to ``OUTPUT`` queue containing stream data after
> > +   the seek until a suitable resume point is found.
> > +
> > +   .. note::
> > +
> > +      There is no requirement to begin queuing stream starting exactly from
> > +      a resume point (e.g. SPS or a keyframe). The driver must handle any
> > +      data queued and must keep processing the queued buffers until it
> > +      finds a suitable resume point. While looking for a resume point, the
>
> I think the definition of a resume point is too vague in this place.
> Can the driver decide whether or not a keyframe without SPS is a
> suitable resume point? Or do drivers have to parse and store SPS/PPS if
> the hardware does not support resuming from a keyframe without sending
> SPS/PPS again?

The thing is that existing drivers implement and user space clients
rely on the behavior described above, so we cannot really change it
anymore.

Do we have hardware for which this wouldn't work to the point that the
driver couldn't even continue with a bunch of frames corrupted? If
only frame corruption is a problem, we can add a control to tell the
user space to seek to resume points and it can happen in an
incremental patch.

Best regards,
Tomasz
Philipp Zabel Aug. 20, 2018, 2:13 p.m. UTC | #21
Hi Tomasz,

On Mon, 2018-08-20 at 22:12 +0900, Tomasz Figa wrote:
> On Mon, Aug 20, 2018 at 10:04 PM Philipp Zabel <p.zabel@pengutronix.de> wrote:
> > 
> > On Tue, 2018-07-24 at 23:06 +0900, Tomasz Figa wrote:
> > [...]
> > > +Seek
> > > +====
> > > +
> > > +Seek is controlled by the ``OUTPUT`` queue, as it is the source of
> > > +bitstream data. ``CAPTURE`` queue remains unchanged/unaffected.
> > > +
> > > +1. Stop the ``OUTPUT`` queue to begin the seek sequence via
> > > +   :c:func:`VIDIOC_STREAMOFF`.
> > > +
> > > +   * **Required fields:**
> > > +
> > > +     ``type``
> > > +         a ``V4L2_BUF_TYPE_*`` enum appropriate for ``OUTPUT``
> > > +
> > > +   * The driver must drop all the pending ``OUTPUT`` buffers and they are
> > > +     treated as returned to the client (following standard semantics).
> > > +
> > > +2. Restart the ``OUTPUT`` queue via :c:func:`VIDIOC_STREAMON`
> > > +
> > > +   * **Required fields:**
> > > +
> > > +     ``type``
> > > +         a ``V4L2_BUF_TYPE_*`` enum appropriate for ``OUTPUT``
> > > +
> > > +   * The driver must be put in a state after seek and be ready to
> > > +     accept new source bitstream buffers.
> > > +
> > > +3. Start queuing buffers to ``OUTPUT`` queue containing stream data after
> > > +   the seek until a suitable resume point is found.
> > > +
> > > +   .. note::
> > > +
> > > +      There is no requirement to begin queuing stream starting exactly from
> > > +      a resume point (e.g. SPS or a keyframe). The driver must handle any
> > > +      data queued and must keep processing the queued buffers until it
> > > +      finds a suitable resume point. While looking for a resume point, the
> > 
> > I think the definition of a resume point is too vague in this place.
> > Can the driver decide whether or not a keyframe without SPS is a
> > suitable resume point? Or do drivers have to parse and store SPS/PPS if
> > the hardware does not support resuming from a keyframe without sending
> > SPS/PPS again?
> 
> The thing is that existing drivers implement and user space clients
> rely on the behavior described above, so we cannot really change it
> anymore.

My point is that I'm not exactly sure what that behaviour is, given the
description.

Must a driver be able to resume from a keyframe even if userspace never
pushes SPS/PPS again?
If so, I think it should be mentioned more explicitly than just via an
example in parentheses, to make it clear to all driver developers that
this is a requirement that userspace is going to rely on.

Or, if that is not the case, is a driver free to define "SPS only" as
its "suitable resume point" and to discard all input including keyframes
until the next SPS/PPS is pushed?

It would be better to clearly define what a "suitable resume point" has
to be per codec, and not let the drivers decide for themselves, if at
all possible. Otherwise we'd need a way to inform userspace about the
per-driver definition.

> Do we have hardware for which this wouldn't work to the point that the
> driver couldn't even continue with a bunch of frames corrupted? If
> only frame corruption is a problem, we can add a control to tell the
> user space to seek to resume points and it can happen in an
> incremental patch.

The coda driver currently can't seek at all, it always stops and
restarts the sequence. So depending on the above I might have to either
find and store SPS/PPS in software, or figure out how to make the
firmware flush the bitstream buffer and restart without actually
stopping the sequence.
I'm sure the hardware is capable of this, it's more a question of what
behaviour is actually intended, and whether I have enough information
about the firmware interface to implement it.

regards
Philipp
Tomasz Figa Aug. 20, 2018, 2:27 p.m. UTC | #22
On Mon, Aug 20, 2018 at 11:13 PM Philipp Zabel <p.zabel@pengutronix.de> wrote:
>
> Hi Tomasz,
>
> On Mon, 2018-08-20 at 22:12 +0900, Tomasz Figa wrote:
> > On Mon, Aug 20, 2018 at 10:04 PM Philipp Zabel <p.zabel@pengutronix.de> wrote:
> > >
> > > On Tue, 2018-07-24 at 23:06 +0900, Tomasz Figa wrote:
> > > [...]
> > > > +Seek
> > > > +====
> > > > +
> > > > +Seek is controlled by the ``OUTPUT`` queue, as it is the source of
> > > > +bitstream data. ``CAPTURE`` queue remains unchanged/unaffected.
> > > > +
> > > > +1. Stop the ``OUTPUT`` queue to begin the seek sequence via
> > > > +   :c:func:`VIDIOC_STREAMOFF`.
> > > > +
> > > > +   * **Required fields:**
> > > > +
> > > > +     ``type``
> > > > +         a ``V4L2_BUF_TYPE_*`` enum appropriate for ``OUTPUT``
> > > > +
> > > > +   * The driver must drop all the pending ``OUTPUT`` buffers and they are
> > > > +     treated as returned to the client (following standard semantics).
> > > > +
> > > > +2. Restart the ``OUTPUT`` queue via :c:func:`VIDIOC_STREAMON`
> > > > +
> > > > +   * **Required fields:**
> > > > +
> > > > +     ``type``
> > > > +         a ``V4L2_BUF_TYPE_*`` enum appropriate for ``OUTPUT``
> > > > +
> > > > +   * The driver must be put in a state after seek and be ready to
> > > > +     accept new source bitstream buffers.
> > > > +
> > > > +3. Start queuing buffers to ``OUTPUT`` queue containing stream data after
> > > > +   the seek until a suitable resume point is found.
> > > > +
> > > > +   .. note::
> > > > +
> > > > +      There is no requirement to begin queuing stream starting exactly from
> > > > +      a resume point (e.g. SPS or a keyframe). The driver must handle any
> > > > +      data queued and must keep processing the queued buffers until it
> > > > +      finds a suitable resume point. While looking for a resume point, the
> > >
> > > I think the definition of a resume point is too vague in this place.
> > > Can the driver decide whether or not a keyframe without SPS is a
> > > suitable resume point? Or do drivers have to parse and store SPS/PPS if
> > > the hardware does not support resuming from a keyframe without sending
> > > SPS/PPS again?
> >
> > The thing is that existing drivers implement and user space clients
> > rely on the behavior described above, so we cannot really change it
> > anymore.
>
> My point is that I'm not exactly sure what that behaviour is, given the
> description.
>
> Must a driver be able to resume from a keyframe even if userspace never
> pushes SPS/PPS again?
> If so, I think it should be mentioned more explicitly than just via an
> example in parentheses, to make it clear to all driver developers that
> this is a requirement that userspace is going to rely on.
>
> Or, if that is not the case, is a driver free to define "SPS only" as
> its "suitable resume point" and to discard all input including keyframes
> until the next SPS/PPS is pushed?
>
> It would be better to clearly define what a "suitable resume point" has
> to be per codec, and not let the drivers decide for themselves, if at
> all possible. Otherwise we'd need a away to inform userspace about the
> per-driver definition.

The intention here is that there is exactly no requirement for the
user space to seek to any kind of resume point and so there is no
point in defining such. The only requirement here is that the
hardware/driver keeps processing the source stream until it finds a
resume point suitable for it - if the hardware keeps SPS/PPS in its
state then just a keyframe; if it doesn't then SPS/PPS. Note that this
is a documentation of the user space API, not a driver implementation
guide. We may want to create the latter separately, though.

H264 is a bit special here, because one may still seek to a key frame,
but past the relevant SPS/PPS headers. In this case, there is no way
for the hardware to know that the SPS/PPS it has in its local state is
not the one that applies to the frame. It may be worth adding that
such a case leads to undefined results, but must not cause a crash or a
fatal decode error.

What do you think?

>
> > Do we have hardware for which this wouldn't work to the point that the
> > driver couldn't even continue with a bunch of frames corrupted? If
> > only frame corruption is a problem, we can add a control to tell the
> > user space to seek to resume points and it can happen in an
> > incremental patch.
>
> The coda driver currently can't seek at all, it always stops and
> restarts the sequence. So depending on the above I might have to either
> find and store SPS/PPS in software, or figure out how to make the
> firmware flush the bitstream buffer and restart without actually
> stopping the sequence.
> I'm sure the hardware is capable of this, it's more a question of what
> behaviour is actually intended, and whether I have enough information
> about the firmware interface to implement it.

What happens if you just keep feeding it with next frames? If that
would result only in corrupted frames, I suppose the control (say
V4L2_CID_MPEG_VIDEO_NEEDS_SEEK_TO_RESUME_POINT) would solve the
problem?

Best regards,
Tomasz
Philipp Zabel Aug. 20, 2018, 3:33 p.m. UTC | #23
On Mon, 2018-08-20 at 23:27 +0900, Tomasz Figa wrote:
[...]
> > > > > +3. Start queuing buffers to ``OUTPUT`` queue containing stream data after
> > > > > +   the seek until a suitable resume point is found.
> > > > > +
> > > > > +   .. note::
> > > > > +
> > > > > +      There is no requirement to begin queuing stream starting exactly from
> > > > > +      a resume point (e.g. SPS or a keyframe). The driver must handle any
> > > > > +      data queued and must keep processing the queued buffers until it
> > > > > +      finds a suitable resume point. While looking for a resume point, the
> > > > 
> > > > I think the definition of a resume point is too vague in this place.
> > > > Can the driver decide whether or not a keyframe without SPS is a
> > > > suitable resume point? Or do drivers have to parse and store SPS/PPS if
> > > > the hardware does not support resuming from a keyframe without sending
> > > > SPS/PPS again?
> > > 
> > > The thing is that existing drivers implement and user space clients
> > > rely on the behavior described above, so we cannot really change it
> > > anymore.
> > 
> > My point is that I'm not exactly sure what that behaviour is, given the
> > description.
> > 
> > Must a driver be able to resume from a keyframe even if userspace never
> > pushes SPS/PPS again?
> > If so, I think it should be mentioned more explicitly than just via an
> > example in parentheses, to make it clear to all driver developers that
> > this is a requirement that userspace is going to rely on.
> > 
> > Or, if that is not the case, is a driver free to define "SPS only" as
> > its "suitable resume point" and to discard all input including keyframes
> > until the next SPS/PPS is pushed?
> > 
> > It would be better to clearly define what a "suitable resume point" has
> > to be per codec, and not let the drivers decide for themselves, if at
> > all possible. Otherwise we'd need a away to inform userspace about the
> > per-driver definition.
> 
> The intention here is that there is exactly no requirement for the
> user space to seek to any kind of resume point

No question about this.

> and so there is no point in defining such.

I don't agree. Let me give an example:

Assume userspace wants to play back a simple h.264 stream that has
SPS/PPS exactly once, in the beginning.

If drivers are allowed to resume from SPS/PPS only, and have no way to
communicate this to userspace, userspace always has to assume that
resuming from keyframes alone is not possible. So it has to store
SPS/PPS and resubmit them with every seek, even if a specific driver
wouldn't require it: Otherwise those drivers that don't store SPS/PPS
themselves (or in hardware) would be allowed to just drop everything
after the first seek.
This effectively would make resending SPS/PPS mandatory, which doesn't
fit well with the intention of letting userspace just seek anywhere and
start feeding data (or: NAL units) into the driver blindly.

> The only requirement here is that the
> hardware/driver keeps processing the source stream until it finds a
> resume point suitable for it - if the hardware keeps SPS/PPS in its
> state then just a keyframe; if it doesn't then SPS/PPS.

Yes, but the difference between those two might be very relevant to
userspace behaviour.

> Note that this is a documentation of the user space API, not a driver
> implementation guide. We may want to create the latter separately,
> though.

This is a good point, I keep switching the perspective from which I look
at this document.
Even for userspace it would make sense to be as specific as possible,
though. Otherwise, doesn't userspace always have to assume the worst?

> H264 is a bit special here, because one may still seek to a key frame,
> but past the relevant SPS/PPS headers. In this case, there is no way
> for the hardware to know that the SPS/PPS it has in its local state is
> not the one that applies to the frame. It may be worth adding that
> such case leads to undefined results, but must not cause crash nor a
> fatal decode error.
> 
> What do you think?

That sounds like a good idea. I haven't thought about seeking over a
SPS/PPS change. Of course userspace must not expect correct results in
this case without providing the new SPS/PPS.

> > > Do we have hardware for which this wouldn't work to the point that the
> > > driver couldn't even continue with a bunch of frames corrupted? If
> > > only frame corruption is a problem, we can add a control to tell the
> > > user space to seek to resume points and it can happen in an
> > > incremental patch.
> > 
> > The coda driver currently can't seek at all, it always stops and
> > restarts the sequence. So depending on the above I might have to either
> > find and store SPS/PPS in software, or figure out how to make the
> > firmware flush the bitstream buffer and restart without actually
> > stopping the sequence.
> > I'm sure the hardware is capable of this, it's more a question of what
> > behaviour is actually intended, and whether I have enough information
> > about the firmware interface to implement it.
> 
> What happens if you just keep feeding it with next frames?

As long as they are well formed, it should just decode them, possibly
with artifacts due to mismatched reference buffers. There is an I-Frame
search mode that should be usable to skip to the next resume point, as
well, so I'm sure coda will end up not needing the
NEEDS_SEEK_TO_RESUME_POINT flag below. I'm just not certain at this
point whether I'll be able to (or: whether I'll have to) keep the
SPS/PPS state across seeks. I have seen so many decoder hangs with
malformed input on i.MX53 that I couldn't recover from, that I'm wary
to make any guarantees without flushing the bitstream buffer first.

> If that would result only in corrupted frames, I suppose the control (say
> V4L2_CID_MPEG_VIDEO_NEEDS_SEEK_TO_RESUME_POINT) would solve the
> problem?

For this to be useful, userspace needs to know what a resume point is in
the first place, though.

regards
Philipp
Stanimir Varbanov Aug. 21, 2018, 11:29 a.m. UTC | #24
Hi Tomasz,

On 08/08/2018 05:55 AM, Tomasz Figa wrote:
> On Tue, Aug 7, 2018 at 4:37 PM Hans Verkuil <hverkuil@xs4all.nl> wrote:

>>>>>>> +7.  If all the following conditions are met, the client may resume the
>>>>>>> +    decoding instantly, by using :c:func:`VIDIOC_DECODER_CMD` with
>>>>>>> +    ``V4L2_DEC_CMD_START`` command, as in case of resuming after the drain
>>>>>>> +    sequence:
>>>>>>> +
>>>>>>> +    * ``sizeimage`` of new format is less than or equal to the size of
>>>>>>> +      currently allocated buffers,
>>>>>>> +
>>>>>>> +    * the number of buffers currently allocated is greater than or equal to
>>>>>>> +      the minimum number of buffers acquired in step 6.
>>>>>>
>>>>>> You might want to mention that if there are insufficient buffers, then
>>>>>> VIDIOC_CREATE_BUFS can be used to add more buffers.
>>>>>>
>>>>>
>>>>> This might be a bit tricky, since at least s5p-mfc and coda can only
>>>>> work on a fixed buffer set and one would need to fully reinitialize
>>>>> the decoding to add one more buffer, which would effectively be the
>>>>> full resolution change sequence, as below, just with REQBUFS(0),
>>>>> REQBUFS(N) replaced with CREATE_BUFS.
>>>>
>>>> What happens today in those drivers if you try to call CREATE_BUFS?
>>>
>>> s5p-mfc doesn't set the .vidioc_create_bufs pointer in its
>>> v4l2_ioctl_ops, so I suppose that would be -ENOTTY?
>>
>> Correct for s5p-mfc.
> 
> As Philipp clarified, coda supports adding buffers on the fly. I
> briefly looked at venus and mtk-vcodec and they seem to use m2m
> implementation of CREATE_BUFS. Not sure if anyone tested that, though.
> So the only hardware I know for sure cannot support this is s5p-mfc.

In the Venus case, CREATE_BUFS is tested with GStreamer.
Tomasz Figa Aug. 27, 2018, 4:03 a.m. UTC | #25
Hi Philipp,

On Tue, Aug 21, 2018 at 12:34 AM Philipp Zabel <p.zabel@pengutronix.de> wrote:
>
> On Mon, 2018-08-20 at 23:27 +0900, Tomasz Figa wrote:
> [...]
> > > > > > +3. Start queuing buffers to ``OUTPUT`` queue containing stream data after
> > > > > > +   the seek until a suitable resume point is found.
> > > > > > +
> > > > > > +   .. note::
> > > > > > +
> > > > > > +      There is no requirement to begin queuing stream starting exactly from
> > > > > > +      a resume point (e.g. SPS or a keyframe). The driver must handle any
> > > > > > +      data queued and must keep processing the queued buffers until it
> > > > > > +      finds a suitable resume point. While looking for a resume point, the
> > > > >
> > > > > I think the definition of a resume point is too vague in this place.
> > > > > Can the driver decide whether or not a keyframe without SPS is a
> > > > > suitable resume point? Or do drivers have to parse and store SPS/PPS if
> > > > > the hardware does not support resuming from a keyframe without sending
> > > > > SPS/PPS again?
> > > >
> > > > The thing is that existing drivers implement and user space clients
> > > > rely on the behavior described above, so we cannot really change it
> > > > anymore.
> > >
> > > My point is that I'm not exactly sure what that behaviour is, given the
> > > description.
> > >
> > > Must a driver be able to resume from a keyframe even if userspace never
> > > pushes SPS/PPS again?
> > > If so, I think it should be mentioned more explicitly than just via an
> > > example in parentheses, to make it clear to all driver developers that
> > > this is a requirement that userspace is going to rely on.
> > >
> > > Or, if that is not the case, is a driver free to define "SPS only" as
> > > its "suitable resume point" and to discard all input including keyframes
> > > until the next SPS/PPS is pushed?
> > >
> > > It would be better to clearly define what a "suitable resume point" has
> > > to be per codec, and not let the drivers decide for themselves, if at
> > > all possible. Otherwise we'd need a away to inform userspace about the
> > > per-driver definition.
> >
> > The intention here is that there is exactly no requirement for the
> > user space to seek to any kind of resume point
>
> No question about this.
>
> > and so there is no point in defining such.
>
> I don't agree. Let me give an example:
>
> Assume userspace wants to play back a simple h.264 stream that has
> SPS/PPS exactly once, in the beginning.
>
> If drivers are allowed to resume from SPS/PPS only, and have no way to
> communicate this to userspace, userspace always has to assume that
> resuming from keyframes alone is not possible. So it has to store
> SPS/PPS and resubmit them with every seek, even if a specific driver
> wouldn't require it: Otherwise those drivers that don't store SPS/PPS
> themselves (or in hardware) would be allowed to just drop everything
> after the first seek.
> This effectively would make resending SPS/PPS mandatory, which doesn't
> fit well with the intention of letting userspace just seek anywhere and
> start feeding data (or: NAL units) into the driver blindly.
>

I'd say that such a video is broken by design, because you cannot play
back any arbitrary later part of it without decoding it from the
beginning.

However, if the hardware keeps SPS/PPS across seeks (and that should
normally be the case), the case could be handled by the user space
letting the decoder initialize with the first frames and only then
seeking, which would probably be the typical case of a user opening a
video file and then moving the seek bar to desired position (or
clicking a bookmark).

If the hardware doesn't keep SPS/PPS across seeks, stateless API could
arguably be a better candidate for it, since it mandates the user
space to keep SPS/PPS around.

> > The only requirement here is that the
> > hardware/driver keeps processing the source stream until it finds a
> > resume point suitable for it - if the hardware keeps SPS/PPS in its
> > state then just a keyframe; if it doesn't then SPS/PPS.
>
> Yes, but the difference between those two might be very relevant to
> userspace behaviour.
>
> > Note that this is a documentation of the user space API, not a driver
> > implementation guide. We may want to create the latter separately,
> > though.
>
> This is a good point, I keep switching the perspective from which I look
> at this document.
> Even for userspace it would make sense to be as specific as possible,
> though. Otherwise, doesn't userspace always have to assume the worst?
>

That's right, a generic user space is expected to handle all the
possible cases possible with the interface it's using. This is
precisely why I'd like to avoid introducing the case where user space
needs to carry state around. The API is for stateful hardware, which
is expected to carry all the needed state around itself.

> > H264 is a bit special here, because one may still seek to a key frame,
> > but past the relevant SPS/PPS headers. In this case, there is no way
> > for the hardware to know that the SPS/PPS it has in its local state is
> > not the one that applies to the frame. It may be worth adding that
> > such case leads to undefined results, but must not cause crash nor a
> > fatal decode error.
> >
> > What do you think?
>
> That sounds like a good idea. I haven't thought about seeking over a
> SPS/PPS change. Of course userspace must not expect correct results in
> this case without providing the new SPS/PPS.
>

From what I discussed with Pawel, our hardware (s5p-mfc, mtk-vcodec) will
just notice that the frames refer to a different SPS/PPS (based on
seq_parameter_set_id, I assume) and keep dropping frames until next
corresponding header is encountered.

> > > > Do we have hardware for which this wouldn't work to the point that the
> > > > driver couldn't even continue with a bunch of frames corrupted? If
> > > > only frame corruption is a problem, we can add a control to tell the
> > > > user space to seek to resume points and it can happen in an
> > > > incremental patch.
> > >
> > > The coda driver currently can't seek at all, it always stops and
> > > restarts the sequence. So depending on the above I might have to either
> > > find and store SPS/PPS in software, or figure out how to make the
> > > firmware flush the bitstream buffer and restart without actually
> > > stopping the sequence.
> > > I'm sure the hardware is capable of this, it's more a question of what
> > > behaviour is actually intended, and whether I have enough information
> > > about the firmware interface to implement it.
> >
> > What happens if you just keep feeding it with next frames?
>
> As long as they are well formed, it should just decode them, possibly
> with artifacts due to mismatched reference buffers. There is an I-Frame
> search mode that should be usable to skip to the next resume point, as
> well, so I'm sure coda will end up not needing the
> NEEDS_SEEK_TO_RESUME_POINT flag below. I'm just not certain at this
> point whether I'll be able to (or: whether I'll have to) keep the
> SPS/PPS state across seeks. I have seen so many decoder hangs with
> malformed input on i.MX53 that I couldn't recover from, that I'm wary
> to make any guarantees without flushing the bitstream buffer first.

Based on the above, I believe the answer is that your hardware/driver
needs to keep SPS/PPS around. Is there a good way to do it with Coda?
We definitely don't want to do any parsing inside the driver.

>
> > If that would result only in corrupted frames, I suppose the control (say
> > V4L2_CID_MPEG_VIDEO_NEEDS_SEEK_TO_RESUME_POINT) would solve the
> > problem?
>
> For this to be useful, userspace needs to know what a resume point is in
> the first place, though.

That would be defined in the context of that control and particular
pixel format, since there is no general, yet precise enough definition
that could apply to all codecs. Right now, I would like to defer
adding such constraints until there really is hardware which needs
it and it can't be handled using the stateless API.

Best regards,
Tomasz
Tomasz Figa Aug. 27, 2018, 4:09 a.m. UTC | #26
On Tue, Aug 21, 2018 at 8:29 PM Stanimir Varbanov
<stanimir.varbanov@linaro.org> wrote:
>
> Hi Tomasz,
>
> On 08/08/2018 05:55 AM, Tomasz Figa wrote:
> > On Tue, Aug 7, 2018 at 4:37 PM Hans Verkuil <hverkuil@xs4all.nl> wrote:
>
> >>>>>>> +7.  If all the following conditions are met, the client may resume the
> >>>>>>> +    decoding instantly, by using :c:func:`VIDIOC_DECODER_CMD` with
> >>>>>>> +    ``V4L2_DEC_CMD_START`` command, as in case of resuming after the drain
> >>>>>>> +    sequence:
> >>>>>>> +
> >>>>>>> +    * ``sizeimage`` of new format is less than or equal to the size of
> >>>>>>> +      currently allocated buffers,
> >>>>>>> +
> >>>>>>> +    * the number of buffers currently allocated is greater than or equal to
> >>>>>>> +      the minimum number of buffers acquired in step 6.
> >>>>>>
> >>>>>> You might want to mention that if there are insufficient buffers, then
> >>>>>> VIDIOC_CREATE_BUFS can be used to add more buffers.
> >>>>>>
> >>>>>
> >>>>> This might be a bit tricky, since at least s5p-mfc and coda can only
> >>>>> work on a fixed buffer set and one would need to fully reinitialize
> >>>>> the decoding to add one more buffer, which would effectively be the
> >>>>> full resolution change sequence, as below, just with REQBUFS(0),
> >>>>> REQBUFS(N) replaced with CREATE_BUFS.
> >>>>
> >>>> What happens today in those drivers if you try to call CREATE_BUFS?
> >>>
> >>> s5p-mfc doesn't set the .vidioc_create_bufs pointer in its
> >>> v4l2_ioctl_ops, so I suppose that would be -ENOTTY?
> >>
> >> Correct for s5p-mfc.
> >
> > As Philipp clarified, coda supports adding buffers on the fly. I
> > briefly looked at venus and mtk-vcodec and they seem to use m2m
> > implementation of CREATE_BUFS. Not sure if anyone tested that, though.
> > So the only hardware I know for sure cannot support this is s5p-mfc.
>
> In Venus case CREATE_BUFS is tested with Gstreamer.

Stanimir: Alright. Thanks for confirmation.

Hans: Technically, we could still implement CREATE_BUFS for s5p-mfc,
but it would need to be restricted to situations where it's possible
to reinitialize the whole hardware buffer queue, i.e.
- before initial STREAMON(CAPTURE) after header parsing,
- after a resolution change and before following STREAMON(CAPTURE) or
DECODER_CMD_START (to ack resolution change without buffer
reallocation).

Would that work for your original suggestion?

Best regards,
Tomasz
Alexandre Courbot Aug. 31, 2018, 8:26 a.m. UTC | #27
Hi Tomasz, just a few thoughts I came across while writing the
stateless codec document:

On Tue, Jul 24, 2018 at 11:06 PM Tomasz Figa <tfiga@chromium.org> wrote:
[snip]
> +****************************************
> +Memory-to-memory Video Decoder Interface
> +****************************************

Since we have a m2m stateless decoder interface, can we call this the
m2m video *stateful* decoder interface? :)

> +Conventions and notation used in this document
> +==============================================
[snip]
> +Glossary
> +========

I think these sections apply to both stateless and stateful. How about
moving them into dev-codec.rst and mentioning that they apply to the
two following sections?
Tomasz Figa Sept. 5, 2018, 5:45 a.m. UTC | #28
On Fri, Aug 31, 2018 at 5:27 PM Alexandre Courbot <acourbot@chromium.org> wrote:
>
> Hi Tomasz, just a few thoughts I came across while writing the
> stateless codec document:
>
> On Tue, Jul 24, 2018 at 11:06 PM Tomasz Figa <tfiga@chromium.org> wrote:
> [snip]
> > +****************************************
> > +Memory-to-memory Video Decoder Interface
> > +****************************************
>
> Since we have a m2m stateless decoder interface, can we call this the
> m2m video *stateful* decoder interface? :)

I guess it could make sense indeed. Let's wait for some other opinions, if any.

>
> > +Conventions and notation used in this document
> > +==============================================
> [snip]
> > +Glossary
> > +========
>
> I think these sections apply to both stateless and stateful. How about
> moving then into dev-codec.rst and mentioning that they apply to the
> two following sections?

Or maybe we could put them into separate rst files and source them at
the top of each interface documentation? Personally, I'm okay with
either. On a related note, I'd love to see some kind of glossary
lookup on mouse hover, so that I don't have to scroll back and forth.
:)

Best regards,
Tomasz
Tomasz Figa Sept. 19, 2018, 10:17 a.m. UTC | #29
Hi Hans,

On Thu, Jul 26, 2018 at 7:20 PM Tomasz Figa <tfiga@chromium.org> wrote:
>
> Hi Hans,
>
> On Wed, Jul 25, 2018 at 8:59 PM Hans Verkuil <hverkuil@xs4all.nl> wrote:
> >
> > Hi Tomasz,
> >
> > Many, many thanks for working on this! It's a great document and when done
> > it will be very useful indeed.
> >
> > Review comments follow...
>
> Thanks for review!
>
> >
> > On 24/07/18 16:06, Tomasz Figa wrote:
[snip]
> > > +13. Allocate destination (raw format) buffers via :c:func:`VIDIOC_REQBUFS`
> > > +    on the ``CAPTURE`` queue.
> > > +
> > > +    * **Required fields:**
> > > +
> > > +      ``count``
> > > +          requested number of buffers to allocate; greater than zero
> > > +
> > > +      ``type``
> > > +          a ``V4L2_BUF_TYPE_*`` enum appropriate for ``CAPTURE``
> > > +
> > > +      ``memory``
> > > +          follows standard semantics
> > > +
> > > +    * **Return fields:**
> > > +
> > > +      ``count``
> > > +          adjusted to allocated number of buffers
> > > +
> > > +    * The driver must adjust count to minimum of required number of
> > > +      destination buffers for given format and stream configuration and the
> > > +      count passed. The client must check this value after the ioctl
> > > +      returns to get the number of buffers allocated.
> > > +
> > > +    .. note::
> > > +
> > > +       To allocate more than minimum number of buffers (for pipeline
> > > +       depth), use G_CTRL(``V4L2_CID_MIN_BUFFERS_FOR_CAPTURE``) to
> > > +       get minimum number of buffers required, and pass the obtained value
> > > +       plus the number of additional buffers needed in count to
> > > +       :c:func:`VIDIOC_REQBUFS`.
> >
> >
> > I think we should mention here the option of using VIDIOC_CREATE_BUFS in order
> > to allocate buffers larger than the current CAPTURE format in order to accommodate
> > future resolution changes.
>
> Ack.
>

I'm about to add a paragraph to describe this, but there is one detail
to iron out.

The VIDIOC_CREATE_BUFS ioctl accepts a v4l2_format struct. Userspace
needs to fill in this struct and the specs says that

  "Usually this will be done using the VIDIOC_TRY_FMT or VIDIOC_G_FMT
ioctls to ensure that the requested format is supported by the
driver."

However, in case of a decoder, those calls would fixup the format to
match the currently parsed stream, which would likely resolve to the
current coded resolution (~hardware alignment). How do we get a format
for the desired maximum resolution?

[snip].
> > > +
> > > +     * The driver is also allowed to and may not return all decoded frames
[snip]
> > > +       queued but not decode before the seek sequence was initiated. For
> >
> > Very confusing sentence. I think you mean this:
> >
> >           The driver may not return all decoded frames that were ready for
> >           dequeueing from before the seek sequence was initiated.
> >
> > Is this really true? Once decoded frames are marked as buffer_done by the
> > driver there is no reason for them to be removed. Or you mean something else
> > here, e.g. the frames are decoded, but the buffers not yet given back to vb2.
> >
>
> Exactly "the frames are decoded, but the buffers not yet given back to
> vb2", for example, if reordering takes place. However, if one stops
> streaming before dequeuing all buffers, they are implicitly returned
> (reset to the state after REQBUFS) and can't be dequeued anymore, so
> the frames are lost, even if the driver returned them. I guess the
> sentence was really unfortunate indeed.
>

Actually, that's not the only case.

The documentation is written from userspace point of view. Queuing an
OUTPUT buffer is not equivalent to having it decoded (and a CAPTURE
buffer given back to vb2). If userspace queues a buffer and then
stops streaming, the buffer might still be waiting in the
queue for decoding of previous buffers to finish.

So basically by "queued frames" I meant "OUTPUT buffers queued by
userspace and not sent to the hardware yet" and by "decoded frames" I
meant "CAPTURE buffers containing matching frames given back to vb2".

How about rewording like this:

     * The ``VIDIOC_STREAMOFF`` operation discards any remaining queued
       ``OUTPUT`` buffers, which means that not all of the ``OUTPUT`` buffers
       queued before the seek may have matching ``CAPTURE`` buffers produced.
       For example, [...]

> > > +       example, given an ``OUTPUT`` queue sequence: QBUF(A), QBUF(B),
> > > +       STREAMOFF(OUT), STREAMON(OUT), QBUF(G), QBUF(H), any of the
> > > +       following results on the ``CAPTURE`` queue is allowed: {A’, B’, G’,
> > > +       H’}, {A’, G’, H’}, {G’, H’}.
> > > +
> > > +   .. note::
> > > +
> > > +      To achieve instantaneous seek, the client may restart streaming on
> > > +      ``CAPTURE`` queue to discard decoded, but not yet dequeued buffers.

Best regards,
Tomasz
Hans Verkuil Oct. 8, 2018, 12:22 p.m. UTC | #30
On 09/19/2018 12:17 PM, Tomasz Figa wrote:
> Hi Hans,
> 
> On Thu, Jul 26, 2018 at 7:20 PM Tomasz Figa <tfiga@chromium.org> wrote:
>>
>> Hi Hans,
>>
>> On Wed, Jul 25, 2018 at 8:59 PM Hans Verkuil <hverkuil@xs4all.nl> wrote:
>>>
>>> Hi Tomasz,
>>>
>>> Many, many thanks for working on this! It's a great document and when done
>>> it will be very useful indeed.
>>>
>>> Review comments follow...
>>
>> Thanks for review!
>>
>>>
>>> On 24/07/18 16:06, Tomasz Figa wrote:
> [snip]
>>>> +13. Allocate destination (raw format) buffers via :c:func:`VIDIOC_REQBUFS`
>>>> +    on the ``CAPTURE`` queue.
>>>> +
>>>> +    * **Required fields:**
>>>> +
>>>> +      ``count``
>>>> +          requested number of buffers to allocate; greater than zero
>>>> +
>>>> +      ``type``
>>>> +          a ``V4L2_BUF_TYPE_*`` enum appropriate for ``CAPTURE``
>>>> +
>>>> +      ``memory``
>>>> +          follows standard semantics
>>>> +
>>>> +    * **Return fields:**
>>>> +
>>>> +      ``count``
>>>> +          adjusted to allocated number of buffers
>>>> +
>>>> +    * The driver must adjust count to minimum of required number of
>>>> +      destination buffers for given format and stream configuration and the
>>>> +      count passed. The client must check this value after the ioctl
>>>> +      returns to get the number of buffers allocated.
>>>> +
>>>> +    .. note::
>>>> +
>>>> +       To allocate more than minimum number of buffers (for pipeline
>>>> +       depth), use G_CTRL(``V4L2_CID_MIN_BUFFERS_FOR_CAPTURE``) to
>>>> +       get minimum number of buffers required, and pass the obtained value
>>>> +       plus the number of additional buffers needed in count to
>>>> +       :c:func:`VIDIOC_REQBUFS`.
>>>
>>>
>>> I think we should mention here the option of using VIDIOC_CREATE_BUFS in order
>>> to allocate buffers larger than the current CAPTURE format in order to accommodate
>>> future resolution changes.
>>
>> Ack.
>>
> 
> I'm about to add a paragraph to describe this, but there is one detail
> to iron out.
> 
> The VIDIOC_CREATE_BUFS ioctl accepts a v4l2_format struct. Userspace
> needs to fill in this struct and the specs says that
> 
>   "Usually this will be done using the VIDIOC_TRY_FMT or VIDIOC_G_FMT
> ioctls to ensure that the requested format is supported by the
> driver."
> 
> However, in case of a decoder, those calls would fixup the format to
> match the currently parsed stream, which would likely resolve to the
> current coded resolution (~hardware alignment). How do we get a format
> for the desired maximum resolution?

You would call G_FMT to get the current format/resolution, then update
width and height and call TRY_FMT.

Although to be honest you can also just set pixelformat and width/height
and zero everything else and call TRY_FMT directly, skipping the G_FMT
ioctl.

> 
> [snip].
>>>> +
>>>> +     * The driver is also allowed to and may not return all decoded frames
> [snip]
>>>> +       queued but not decode before the seek sequence was initiated. For
>>>
>>> Very confusing sentence. I think you mean this:
>>>
>>>           The driver may not return all decoded frames that were ready for
>>>           dequeueing from before the seek sequence was initiated.
>>>
>>> Is this really true? Once decoded frames are marked as buffer_done by the
>>> driver there is no reason for them to be removed. Or you mean something else
>>> here, e.g. the frames are decoded, but the buffers not yet given back to vb2.
>>>
>>
>> Exactly "the frames are decoded, but the buffers not yet given back to
>> vb2", for example, if reordering takes place. However, if one stops
>> streaming before dequeuing all buffers, they are implicitly returned
>> (reset to the state after REQBUFS) and can't be dequeued anymore, so
>> the frames are lost, even if the driver returned them. I guess the
>> sentence was really unfortunate indeed.
>>
> 
> Actually, that's not the only case.
> 
> The documentation is written from userspace point of view. Queuing an
> OUTPUT buffer is not equivalent to having it decoded (and a CAPTURE
> buffer given back to vb2). If the userspace queues a buffer and then
> stops streaming, the buffer might have been still waiting in the
> queue, for decoding of previous buffers to finish.
> 
> So basically by "queued frames" I meant "OUTPUT buffers queued by
> userspace and not sent to the hardware yet" and by "decoded frames" I
> meant "CAPTURE buffers containing matching frames given back to vb2".
> 
> How about rewording like this:
> 
>      * The ``VIDIOC_STREAMOFF`` operation discards any remaining queued
>        ``OUTPUT`` buffers, which means that not all of the ``OUTPUT`` buffers
>        queued before the seek may have matching ``CAPTURE`` buffers produced.
>        For example, [...]

That looks correct.

Regards,

	Hans

> 
>>>> +       example, given an ``OUTPUT`` queue sequence: QBUF(A), QBUF(B),
>>>> +       STREAMOFF(OUT), STREAMON(OUT), QBUF(G), QBUF(H), any of the
>>>> +       following results on the ``CAPTURE`` queue is allowed: {A’, B’, G’,
>>>> +       H’}, {A’, G’, H’}, {G’, H’}.
>>>> +
>>>> +   .. note::
>>>> +
>>>> +      To achieve instantaneous seek, the client may restart streaming on
>>>> +      ``CAPTURE`` queue to discard decoded, but not yet dequeued buffers.
> 
> Best regards,
> Tomasz
>
Tomasz Figa Oct. 9, 2018, 4:23 a.m. UTC | #31
On Mon, Oct 8, 2018 at 9:22 PM Hans Verkuil <hverkuil@xs4all.nl> wrote:
>
> On 09/19/2018 12:17 PM, Tomasz Figa wrote:
> > Hi Hans,
> >
> > On Thu, Jul 26, 2018 at 7:20 PM Tomasz Figa <tfiga@chromium.org> wrote:
> >>
> >> Hi Hans,
> >>
> >> On Wed, Jul 25, 2018 at 8:59 PM Hans Verkuil <hverkuil@xs4all.nl> wrote:
> >>>
> >>> Hi Tomasz,
> >>>
> >>> Many, many thanks for working on this! It's a great document and when done
> >>> it will be very useful indeed.
> >>>
> >>> Review comments follow...
> >>
> >> Thanks for review!
> >>
> >>>
> >>> On 24/07/18 16:06, Tomasz Figa wrote:
> > [snip]
> >>>> +13. Allocate destination (raw format) buffers via :c:func:`VIDIOC_REQBUFS`
> >>>> +    on the ``CAPTURE`` queue.
> >>>> +
> >>>> +    * **Required fields:**
> >>>> +
> >>>> +      ``count``
> >>>> +          requested number of buffers to allocate; greater than zero
> >>>> +
> >>>> +      ``type``
> >>>> +          a ``V4L2_BUF_TYPE_*`` enum appropriate for ``CAPTURE``
> >>>> +
> >>>> +      ``memory``
> >>>> +          follows standard semantics
> >>>> +
> >>>> +    * **Return fields:**
> >>>> +
> >>>> +      ``count``
> >>>> +          adjusted to allocated number of buffers
> >>>> +
> >>>> +    * The driver must adjust count to minimum of required number of
> >>>> +      destination buffers for given format and stream configuration and the
> >>>> +      count passed. The client must check this value after the ioctl
> >>>> +      returns to get the number of buffers allocated.
> >>>> +
> >>>> +    .. note::
> >>>> +
> >>>> +       To allocate more than minimum number of buffers (for pipeline
> >>>> +       depth), use G_CTRL(``V4L2_CID_MIN_BUFFERS_FOR_CAPTURE``) to
> >>>> +       get minimum number of buffers required, and pass the obtained value
> >>>> +       plus the number of additional buffers needed in count to
> >>>> +       :c:func:`VIDIOC_REQBUFS`.
> >>>
> >>>
> >>> I think we should mention here the option of using VIDIOC_CREATE_BUFS in order
> >>> to allocate buffers larger than the current CAPTURE format in order to accommodate
> >>> future resolution changes.
> >>
> >> Ack.
> >>
> >
> > I'm about to add a paragraph to describe this, but there is one detail
> > to iron out.
> >
> > The VIDIOC_CREATE_BUFS ioctl accepts a v4l2_format struct. Userspace
> > needs to fill in this struct and the specs says that
> >
> >   "Usually this will be done using the VIDIOC_TRY_FMT or VIDIOC_G_FMT
> > ioctls to ensure that the requested format is supported by the
> > driver."
> >
> > However, in case of a decoder, those calls would fixup the format to
> > match the currently parsed stream, which would likely resolve to the
> > current coded resolution (~hardware alignment). How do we get a format
> > for the desired maximum resolution?
>
> You would call G_FMT to get the current format/resolution, then update
> width and height and call TRY_FMT.
>
> Although to be honest you can also just set pixelformat and width/height
> and zero everything else and call TRY_FMT directly, skipping the G_FMT
> ioctl.
>

Wouldn't TRY_FMT adjust the width and height back to match current stream?

Best regards,
Tomasz
Hans Verkuil Oct. 9, 2018, 6:39 a.m. UTC | #32
On 10/09/2018 06:23 AM, Tomasz Figa wrote:
> On Mon, Oct 8, 2018 at 9:22 PM Hans Verkuil <hverkuil@xs4all.nl> wrote:
>>
>> On 09/19/2018 12:17 PM, Tomasz Figa wrote:
>>> Hi Hans,
>>>
>>> On Thu, Jul 26, 2018 at 7:20 PM Tomasz Figa <tfiga@chromium.org> wrote:
>>>>
>>>> Hi Hans,
>>>>
>>>> On Wed, Jul 25, 2018 at 8:59 PM Hans Verkuil <hverkuil@xs4all.nl> wrote:
>>>>>
>>>>> Hi Tomasz,
>>>>>
>>>>> Many, many thanks for working on this! It's a great document and when done
>>>>> it will be very useful indeed.
>>>>>
>>>>> Review comments follow...
>>>>
>>>> Thanks for review!
>>>>
>>>>>
>>>>> On 24/07/18 16:06, Tomasz Figa wrote:
>>> [snip]
>>>>>> +13. Allocate destination (raw format) buffers via :c:func:`VIDIOC_REQBUFS`
>>>>>> +    on the ``CAPTURE`` queue.
>>>>>> +
>>>>>> +    * **Required fields:**
>>>>>> +
>>>>>> +      ``count``
>>>>>> +          requested number of buffers to allocate; greater than zero
>>>>>> +
>>>>>> +      ``type``
>>>>>> +          a ``V4L2_BUF_TYPE_*`` enum appropriate for ``CAPTURE``
>>>>>> +
>>>>>> +      ``memory``
>>>>>> +          follows standard semantics
>>>>>> +
>>>>>> +    * **Return fields:**
>>>>>> +
>>>>>> +      ``count``
>>>>>> +          adjusted to allocated number of buffers
>>>>>> +
>>>>>> +    * The driver must adjust count to minimum of required number of
>>>>>> +      destination buffers for given format and stream configuration and the
>>>>>> +      count passed. The client must check this value after the ioctl
>>>>>> +      returns to get the number of buffers allocated.
>>>>>> +
>>>>>> +    .. note::
>>>>>> +
>>>>>> +       To allocate more than minimum number of buffers (for pipeline
>>>>>> +       depth), use G_CTRL(``V4L2_CID_MIN_BUFFERS_FOR_CAPTURE``) to
>>>>>> +       get minimum number of buffers required, and pass the obtained value
>>>>>> +       plus the number of additional buffers needed in count to
>>>>>> +       :c:func:`VIDIOC_REQBUFS`.
>>>>>
>>>>>
>>>>> I think we should mention here the option of using VIDIOC_CREATE_BUFS in order
>>>>> to allocate buffers larger than the current CAPTURE format in order to accommodate
>>>>> future resolution changes.
>>>>
>>>> Ack.
>>>>
>>>
>>> I'm about to add a paragraph to describe this, but there is one detail
>>> to iron out.
>>>
>>> The VIDIOC_CREATE_BUFS ioctl accepts a v4l2_format struct. Userspace
>>> needs to fill in this struct and the specs says that
>>>
>>>   "Usually this will be done using the VIDIOC_TRY_FMT or VIDIOC_G_FMT
>>> ioctls to ensure that the requested format is supported by the
>>> driver."
>>>
>>> However, in case of a decoder, those calls would fixup the format to
>>> match the currently parsed stream, which would likely resolve to the
>>> current coded resolution (~hardware alignment). How do we get a format
>>> for the desired maximum resolution?
>>
>> You would call G_FMT to get the current format/resolution, then update
>> width and height and call TRY_FMT.
>>
>> Although to be honest you can also just set pixelformat and width/height
>> and zero everything else and call TRY_FMT directly, skipping the G_FMT
>> ioctl.
>>
> 
> Wouldn't TRY_FMT adjust the width and height back to match current stream?

Huh. Hmm. Grrr.

Good point and I didn't read your original comment carefully enough.

Suggestions on a postcard...

Regards,

	Hans
Tomasz Figa Oct. 15, 2018, 10:13 a.m. UTC | #33
On Wed, Aug 8, 2018 at 11:55 AM Tomasz Figa <tfiga@chromium.org> wrote:
>
> On Tue, Aug 7, 2018 at 4:37 PM Hans Verkuil <hverkuil@xs4all.nl> wrote:
> >
> > On 08/07/2018 09:05 AM, Tomasz Figa wrote:
> > > On Thu, Jul 26, 2018 at 7:57 PM Hans Verkuil <hverkuil@xs4all.nl> wrote:
> > >>>> I wonder if we should make these min buffer controls required. It might be easier
> > >>>> that way.
> > >>>
> > >>> Agreed. Although userspace is still free to ignore it, because REQBUFS
> > >>> would do the right thing anyway.
> > >>
> > >> It's never been entirely clear to me what the purpose of those min buffers controls
> > >> is. REQBUFS ensures that the number of buffers is at least the minimum needed to
> > >> make the HW work. So why would you need these controls? It only makes sense if they
> > >> return something different from REQBUFS.
> > >>
> > >
> > > The purpose of those controls is to let the client allocate a number
> > > of buffers bigger than minimum, without the need to allocate the
> > > minimum number of buffers first (to just learn the number), free them
> > > and then allocate a bigger number again.
> >
> > I don't feel this is particularly useful. One problem with the minimum number
> > of buffers as used in the kernel is that it is often the minimum number of
> > buffers required to make the hardware work, but it may not be optimal. E.g.
> > quite a few capture drivers set the minimum to 2, which is enough for the
> > hardware, but it will likely lead to dropped frames. You really need 3
> > (one is being DMAed, one is queued and linked into the DMA engine and one is
> > being processed by userspace).
> >
> > I would actually prefer this to be the recommended minimum number of buffers,
> > which is >= the minimum REQBUFS uses.
> >
> > I.e., if you use this number and you have no special requirements, then you'll
> > get good performance.
>
> I guess we could make it so. It would make existing user space request
> more buffers than it used to with the original meaning, but I guess it
> shouldn't be a big problem.

I gave it a bit more thought and I feel like the kernel is not the right
place to put any assumptions on what the userspace expects "good
performance" to be. Actually, having these controls return the minimum
number of buffers as REQBUFS would allocate makes it very well
specified - with this number you can only process frame by frame and
the number of buffers added by userspace defines exactly the queue
depth. It leaves no space for driver-specific quirks, because the
driver doesn't decide what's "good performance" anymore.

Best regards,
Tomasz
Nicolas Dufresne Oct. 16, 2018, 1:09 a.m. UTC | #34
Le lundi 15 octobre 2018 à 19:13 +0900, Tomasz Figa a écrit :
> On Wed, Aug 8, 2018 at 11:55 AM Tomasz Figa <tfiga@chromium.org> wrote:
> > 
> > On Tue, Aug 7, 2018 at 4:37 PM Hans Verkuil <hverkuil@xs4all.nl> wrote:
> > > 
> > > On 08/07/2018 09:05 AM, Tomasz Figa wrote:
> > > > On Thu, Jul 26, 2018 at 7:57 PM Hans Verkuil <hverkuil@xs4all.nl> wrote:
> > > > > > > I wonder if we should make these min buffer controls required. It might be easier
> > > > > > > that way.
> > > > > > 
> > > > > > Agreed. Although userspace is still free to ignore it, because REQBUFS
> > > > > > would do the right thing anyway.
> > > > > 
> > > > > It's never been entirely clear to me what the purpose of those min buffers controls
> > > > > is. REQBUFS ensures that the number of buffers is at least the minimum needed to
> > > > > make the HW work. So why would you need these controls? It only makes sense if they
> > > > > return something different from REQBUFS.
> > > > > 
> > > > 
> > > > The purpose of those controls is to let the client allocate a number
> > > > of buffers bigger than minimum, without the need to allocate the
> > > > minimum number of buffers first (to just learn the number), free them
> > > > and then allocate a bigger number again.
> > > 
> > > I don't feel this is particularly useful. One problem with the minimum number
> > > of buffers as used in the kernel is that it is often the minimum number of
> > > buffers required to make the hardware work, but it may not be optimal. E.g.
> > > quite a few capture drivers set the minimum to 2, which is enough for the
> > > hardware, but it will likely lead to dropped frames. You really need 3
> > > (one is being DMAed, one is queued and linked into the DMA engine and one is
> > > being processed by userspace).
> > > 
> > > I would actually prefer this to be the recommended minimum number of buffers,
> > > which is >= the minimum REQBUFS uses.
> > > 
> > > I.e., if you use this number and you have no special requirements, then you'll
> > > get good performance.
> > 
> > I guess we could make it so. It would make existing user space request
> > more buffers than it used to with the original meaning, but I guess it
> > shouldn't be a big problem.
> 
> I gave it a bit more thought and I feel like the kernel is not the right
> place to put any assumptions on what the userspace expects "good
> performance" to be. Actually, having these controls return the minimum
> number of buffers as REQBUFS would allocate makes it very well
> specified - with this number you can only process frame by frame and
> the number of buffers added by userspace defines exactly the queue
> depth. It leaves no space for driver-specific quirks, because the
> driver doesn't decide what's "good performance" anymore.

I agree on that and I would add that the driver making any assumption
would lead to memory waste in contexts where fewer buffers will still work
(think of fence-based operation as an example).

> 
> Best regards,
> Tomasz
Laurent Pinchart Oct. 17, 2018, 1:34 p.m. UTC | #35
Hi Tomasz,

Thank you for the patch.

On Tuesday, 24 July 2018 17:06:20 EEST Tomasz Figa wrote:
> Due to complexity of the video decoding process, the V4L2 drivers of
> stateful decoder hardware require specific sequences of V4L2 API calls
> to be followed. These include capability enumeration, initialization,
> decoding, seek, pause, dynamic resolution change, drain and end of
> stream.
> 
> Specifics of the above have been discussed during Media Workshops at
> LinuxCon Europe 2012 in Barcelona and then later Embedded Linux
> Conference Europe 2014 in Düsseldorf. The de facto Codec API that
> originated at those events was later implemented by the drivers we already
> have merged in mainline, such as s5p-mfc or coda.
> 
> The only thing missing was the real specification included as a part of
> Linux Media documentation. Fix it now and document the decoder part of
> the Codec API.
> 
> Signed-off-by: Tomasz Figa <tfiga@chromium.org>
> ---
>  Documentation/media/uapi/v4l/dev-decoder.rst | 872 +++++++++++++++++++
>  Documentation/media/uapi/v4l/devices.rst     |   1 +
>  Documentation/media/uapi/v4l/v4l2.rst        |  10 +-
>  3 files changed, 882 insertions(+), 1 deletion(-)
>  create mode 100644 Documentation/media/uapi/v4l/dev-decoder.rst
> 
> diff --git a/Documentation/media/uapi/v4l/dev-decoder.rst b/Documentation/media/uapi/v4l/dev-decoder.rst
> new file mode 100644
> index 000000000000..f55d34d2f860
> --- /dev/null
> +++ b/Documentation/media/uapi/v4l/dev-decoder.rst
> @@ -0,0 +1,872 @@
> +.. -*- coding: utf-8; mode: rst -*-
> +
> +.. _decoder:
> +
> +****************************************
> +Memory-to-memory Video Decoder Interface
> +****************************************
> +
> +Input data to a video decoder are buffers containing unprocessed video
> +stream (e.g. Annex-B H.264/HEVC stream, raw VP8/9 stream). The driver is
> +expected not to require any additional information from the client to
> +process these buffers. Output data are raw video frames returned in display
> +order.
> +
> +Performing software parsing, processing etc. of the stream in the driver
> +in order to support this interface is strongly discouraged. In case such
> +operations are needed, use of Stateless Video Decoder Interface (in
> +development) is strongly advised.
> +
> +Conventions and notation used in this document
> +==============================================
> +
> +1. The general V4L2 API rules apply if not specified in this document
> +   otherwise.
> +
> +2. The meaning of words “must”, “may”, “should”, etc. is as per RFC
> +   2119.
> +
> +3. All steps not marked “optional” are required.
> +
> +4. :c:func:`VIDIOC_G_EXT_CTRLS`, :c:func:`VIDIOC_S_EXT_CTRLS` may be used
> +   interchangeably with :c:func:`VIDIOC_G_CTRL`, :c:func:`VIDIOC_S_CTRL`,
> +   unless specified otherwise.
> +
> +5. Single-plane API (see spec) and applicable structures may be used
> +   interchangeably with Multi-plane API, unless specified otherwise,
> +   depending on driver capabilities and following the general V4L2
> +   guidelines.

How about also allowing VIDIOC_CREATE_BUFS where VIDIOC_REQBUFS is mentioned ?

> +6. i = [a..b]: sequence of integers from a to b, inclusive, i.e. i =
> +   [0..2]: i = 0, 1, 2.
> +
> +7. For ``OUTPUT`` buffer A, A’ represents a buffer on the ``CAPTURE`` queue
> +   containing data (decoded frame/stream) that resulted from processing + 
>  buffer A.
> +
> +Glossary
> +========
> +
> +CAPTURE
> +   the destination buffer queue; the queue of buffers containing decoded
> +   frames; ``V4L2_BUF_TYPE_VIDEO_CAPTURE`` or
> +   ``V4L2_BUF_TYPE_VIDEO_CAPTURE_MPLANE``; data are captured from the
> +   hardware into ``CAPTURE`` buffers
> +
> +client
> +   application client communicating with the driver implementing this API
> +
> +coded format
> +   encoded/compressed video bitstream format (e.g. H.264, VP8, etc.); see
> +   also: raw format
> +
> +coded height
> +   height for given coded resolution
> +
> +coded resolution
> +   stream resolution in pixels aligned to codec and hardware requirements;
> +   typically visible resolution rounded up to full macroblocks;
> +   see also: visible resolution
> +
> +coded width
> +   width for given coded resolution
> +
> +decode order
> +   the order in which frames are decoded; may differ from display order if
> +   coded format includes a feature of frame reordering; ``OUTPUT`` buffers
> +   must be queued by the client in decode order
> +
> +destination
> +   data resulting from the decode process; ``CAPTURE``
> +
> +display order
> +   the order in which frames must be displayed; ``CAPTURE`` buffers must be
> +   returned by the driver in display order
> +
> +DPB
> +   Decoded Picture Buffer; a H.264 term for a buffer that stores a picture
> +   that is encoded or decoded and available for reference in further
> +   decode/encode steps.

By "encoded or decoded", do you mean "raw frames to be encoded (in the encoder 
use case) or decoded raw frames (in the decoder use case)" ? I think this 
should be clarified.

> +EOS
> +   end of stream
> +
> +IDR
> +   a type of a keyframe in H.264-encoded stream, which clears the list of
> +   earlier reference frames (DPBs)
> +
> +keyframe
> +   an encoded frame that does not reference frames decoded earlier, i.e.
> +   can be decoded fully on its own.
> +
> +OUTPUT
> +   the source buffer queue; the queue of buffers containing encoded
> +   bitstream; ``V4L2_BUF_TYPE_VIDEO_OUTPUT`` or
> +   ``V4L2_BUF_TYPE_VIDEO_OUTPUT_MPLANE``; the hardware is fed with data
> +   from ``OUTPUT`` buffers
> +
> +PPS
> +   Picture Parameter Set; a type of metadata entity in H.264 bitstream
> +
> +raw format
> +   uncompressed format containing raw pixel data (e.g. YUV, RGB formats)
> +
> +resume point
> +   a point in the bitstream from which decoding may start/continue, without
> +   any previous state/data present, e.g.: a keyframe (VP8/VP9) or
> +   SPS/PPS/IDR sequence (H.264); a resume point is required to start
> +   decode of a new stream, or to resume decoding after a seek
> +
> +source
> +   data fed to the decoder; ``OUTPUT``
> +
> +SPS
> +   Sequence Parameter Set; a type of metadata entity in H.264 bitstream
> +
> +visible height
> +   height for given visible resolution; display height
> +
> +visible resolution
> +   stream resolution of the visible picture, in pixels, to be used for
> +   display purposes; must be smaller or equal to coded resolution;
> +   display resolution
> +
> +visible width
> +   width for given visible resolution; display width
> +
> +Querying capabilities
> +=====================
> +
> +1. To enumerate the set of coded formats supported by the driver, the
> +   client may call :c:func:`VIDIOC_ENUM_FMT` on ``OUTPUT``.
> +
> +   * The driver must always return the full set of supported formats,
> +     irrespective of the format set on the ``CAPTURE``.
> +
> +2. To enumerate the set of supported raw formats, the client may call
> +   :c:func:`VIDIOC_ENUM_FMT` on ``CAPTURE``.
> +
> +   * The driver must return only the formats supported for the format
> +     currently active on ``OUTPUT``.
> +
> +   * In order to enumerate raw formats supported by a given coded format,
> +     the client must first set that coded format on ``OUTPUT`` and then
> +     enumerate the ``CAPTURE`` queue.

Maybe s/enumerate the/enumerate formats on the/ ?

> +3. The client may use :c:func:`VIDIOC_ENUM_FRAMESIZES` to detect supported
> +   resolutions for a given format, passing desired pixel format in
> +   :c:type:`v4l2_frmsizeenum` ``pixel_format``.
> +
> +   * Values returned by :c:func:`VIDIOC_ENUM_FRAMESIZES` on ``OUTPUT``
> +     must include all possible coded resolutions supported by the decoder
> +     for given coded pixel format.
> +
> +   * Values returned by :c:func:`VIDIOC_ENUM_FRAMESIZES` on ``CAPTURE``
> +     must include all possible frame buffer resolutions supported by the
> +     decoder for given raw pixel format and coded format currently set on
> +     ``OUTPUT``.
> +
> +    .. note::
> +
> +       The client may derive the supported resolution range for a
> +       combination of coded and raw format by setting width and height of
> +       ``OUTPUT`` format to 0 and calculating the intersection of
> +       resolutions returned from calls to :c:func:`VIDIOC_ENUM_FRAMESIZES`
> +       for the given coded and raw formats.

I'm confused by the note, I'm not sure to understand what you mean.

> +4. Supported profiles and levels for given format, if applicable, may be
> +   queried using their respective controls via :c:func:`VIDIOC_QUERYCTRL`.
> +
> +Initialization
> +==============
> +
> +1. *[optional]* Enumerate supported ``OUTPUT`` formats and resolutions. See
> +   capability enumeration.
> +
> +2. Set the coded format on ``OUTPUT`` via :c:func:`VIDIOC_S_FMT`
> +
> +   * **Required fields:**
> +
> +     ``type``
> +         a ``V4L2_BUF_TYPE_*`` enum appropriate for ``OUTPUT``
> +
> +     ``pixelformat``
> +         a coded pixel format
> +
> +     ``width``, ``height``
> +         required only if cannot be parsed from the stream for the given
> +         coded format; optional otherwise - set to zero to ignore
> +
> +     other fields
> +         follow standard semantics
> +
> +   * For coded formats including stream resolution information, if width
> +     and height are set to non-zero values, the driver will propagate the
> +     resolution to ``CAPTURE`` and signal a source change event
> +     instantly.

Maybe s/instantly/immediately before returning from :c:func:`VIDIOC_S_FMT`/ ?

> However, after the decoder is done parsing the
> +     information embedded in the stream, it will update ``CAPTURE``

s/update/update the/

> +     format with new values and signal a source change event again, if

s/, if/ if/

> +     the values do not match.
> +
> +   .. note::
> +
> +      Changing ``OUTPUT`` format may change currently set ``CAPTURE``

Do you have a particular dislike for definite articles ? :-) I would have 
written "Changing the ``OUTPUT`` format may change the currently set 
``CAPTURE`` ...". I won't repeat the comment through the whole review, but 
many places seem to be missing a definite article.

> +      format. The driver will derive a new ``CAPTURE`` format from
> +      ``OUTPUT`` format being set, including resolution, colorimetry
> +      parameters, etc. If the client needs a specific ``CAPTURE`` format,
> +      it must adjust it afterwards.
> +
> +3.  *[optional]* Get minimum number of buffers required for ``OUTPUT``
> +    queue via :c:func:`VIDIOC_G_CTRL`. This is useful if client intends to
> +    use more buffers than minimum required by hardware/format.
> +
> +    * **Required fields:**
> +
> +      ``id``
> +          set to ``V4L2_CID_MIN_BUFFERS_FOR_OUTPUT``
> +
> +    * **Return fields:**
> +
> +      ``value``
> +          required number of ``OUTPUT`` buffers for the currently set
> +          format

s/required/required minimum/

> +
> +4.  Allocate source (bitstream) buffers via :c:func:`VIDIOC_REQBUFS` on
> +    ``OUTPUT``.
> +
> +    * **Required fields:**
> +
> +      ``count``
> +          requested number of buffers to allocate; greater than zero
> +
> +      ``type``
> +          a ``V4L2_BUF_TYPE_*`` enum appropriate for ``OUTPUT``
> +
> +      ``memory``
> +          follows standard semantics
> +
> +      ``sizeimage``
> +          follows standard semantics; the client is free to choose any
> +          suitable size, however, it may be subject to change by the
> +          driver
> +
> +    * **Return fields:**
> +
> +      ``count``
> +          actual number of buffers allocated
> +
> +    * The driver must adjust count to minimum of required number of
> +      ``OUTPUT`` buffers for given format and count passed.

Isn't it the maximum, not the minimum ?

> The client must
> +      check this value after the ioctl returns to get the number of
> +      buffers allocated.
> +
> +    .. note::
> +
> +       To allocate more than minimum number of buffers (for pipeline
> +       depth), use G_CTRL(``V4L2_CID_MIN_BUFFERS_FOR_OUTPUT``) to
> +       get minimum number of buffers required by the driver/format,
> +       and pass the obtained value plus the number of additional
> +       buffers needed in count to :c:func:`VIDIOC_REQBUFS`.
> +
> +5.  Start streaming on ``OUTPUT`` queue via :c:func:`VIDIOC_STREAMON`.
> +
> +6.  This step only applies to coded formats that contain resolution
> +    information in the stream. Continue queuing/dequeuing bitstream
> +    buffers to/from the ``OUTPUT`` queue via :c:func:`VIDIOC_QBUF` and
> +    :c:func:`VIDIOC_DQBUF`. The driver must keep processing and returning
> +    each buffer to the client until required metadata to configure the
> +    ``CAPTURE`` queue are found. This is indicated by the driver sending
> +    a ``V4L2_EVENT_SOURCE_CHANGE`` event with
> +    ``V4L2_EVENT_SRC_CH_RESOLUTION`` source change type. There is no
> +    requirement to pass enough data for this to occur in the first buffer
> +    and the driver must be able to process any number.
> +
> +    * If data in a buffer that triggers the event is required to decode
> +      the first frame, the driver must not return it to the client,
> +      but must retain it for further decoding.
> +
> +    * If the client set width and height of ``OUTPUT`` format to 0, calling
> +      :c:func:`VIDIOC_G_FMT` on the ``CAPTURE`` queue will return -EPERM,
> +      until the driver configures ``CAPTURE`` format according to stream
> +      metadata.

That's a pretty harsh handling for this condition. What's the rationale for 
returning -EPERM instead of for instance succeeding with width and height set 
to 0 ?

> +    * If the client subscribes to ``V4L2_EVENT_SOURCE_CHANGE`` events and
> +      the event is signaled, the decoding process will not continue until
> +      it is acknowledged by either (re-)starting streaming on ``CAPTURE``,
> +      or via :c:func:`VIDIOC_DECODER_CMD` with ``V4L2_DEC_CMD_START``
> +      command.
> +
> +    .. note::
> +
> +       No decoded frames are produced during this phase.
> +
> +7.  This step only applies to coded formats that contain resolution
> +    information in the stream.
> +    Receive and handle ``V4L2_EVENT_SOURCE_CHANGE`` from the driver
> +    via :c:func:`VIDIOC_DQEVENT`. The driver must send this event once
> +    enough data is obtained from the stream to allocate ``CAPTURE``
> +    buffers and to begin producing decoded frames.

Doesn't the last sentence belong to step 6 (where it's already explained to 
some extent) ?

> +
> +    * **Required fields:**
> +
> +      ``type``
> +          set to ``V4L2_EVENT_SOURCE_CHANGE``

Isn't the type field set by the driver ?

> +    * **Return fields:**
> +
> +      ``u.src_change.changes``
> +          set to ``V4L2_EVENT_SRC_CH_RESOLUTION``
> +
> +    * Any client query issued after the driver queues the event must return
> +      values applying to the just parsed stream, including queue formats,
> +      selection rectangles and controls.

To align with the wording used so far, I would say that "the driver must" 
return values applying to the just parsed stream.

I think I would also move this to step 6, as it's related to queuing the 
event, not dequeuing it.

> +8.  Call :c:func:`VIDIOC_G_FMT` for ``CAPTURE`` queue to get format for the
> +    destination buffers parsed/decoded from the bitstream.
> +
> +    * **Required fields:**
> +
> +      ``type``
> +          a ``V4L2_BUF_TYPE_*`` enum appropriate for ``CAPTURE``
> +
> +    * **Return fields:**
> +
> +      ``width``, ``height``
> +          frame buffer resolution for the decoded frames
> +
> +      ``pixelformat``
> +          pixel format for decoded frames
> +
> +      ``num_planes`` (for _MPLANE ``type`` only)
> +          number of planes for pixelformat
> +
> +      ``sizeimage``, ``bytesperline``
> +          as per standard semantics; matching frame buffer format
> +
> +    .. note::
> +
> +       The value of ``pixelformat`` may be any pixel format supported and
> +       must be supported for current stream, based on the information
> +       parsed from the stream and hardware capabilities. It is suggested
> +       that driver chooses the preferred/optimal format for given

In compliance with RFC 2119, how about using "Drivers should choose" instead 
of "It is suggested that driver chooses" ?

> +       configuration. For example, a YUV format may be preferred over an
> +       RGB format, if additional conversion step would be required.
> +
> +9.  *[optional]* Enumerate ``CAPTURE`` formats via
> +    :c:func:`VIDIOC_ENUM_FMT` on ``CAPTURE`` queue. Once the stream
> +    information is parsed and known, the client may use this ioctl to
> +    discover which raw formats are supported for given stream and select on

s/select on/select one/

> +    of them via :c:func:`VIDIOC_S_FMT`.
> +
> +    .. note::
> +
> +       The driver will return only formats supported for the current stream
> +       parsed in this initialization sequence, even if more formats may be
> +       supported by the driver in general.
> +
> +       For example, a driver/hardware may support YUV and RGB formats for
> +       resolutions 1920x1088 and lower, but only YUV for higher
> +       resolutions (due to hardware limitations). After parsing
> +       a resolution of 1920x1088 or lower, :c:func:`VIDIOC_ENUM_FMT` may
> +       return a set of YUV and RGB pixel formats, but after parsing
> +       resolution higher than 1920x1088, the driver will not return RGB,
> +       unsupported for this resolution.
> +
> +       However, subsequent resolution change event triggered after
> +       discovering a resolution change within the same stream may switch
> +       the stream into a lower resolution and :c:func:`VIDIOC_ENUM_FMT`
> +       would return RGB formats again in that case.
> +
> +10.  *[optional]* Choose a different ``CAPTURE`` format than suggested via
> +     :c:func:`VIDIOC_S_FMT` on ``CAPTURE`` queue. It is possible for the
> +     client to choose a different format than selected/suggested by the

And here, "A client may choose" ?

> +     driver in :c:func:`VIDIOC_G_FMT`.
> +
> +     * **Required fields:**
> +
> +       ``type``
> +           a ``V4L2_BUF_TYPE_*`` enum appropriate for ``CAPTURE``
> +
> +       ``pixelformat``
> +           a raw pixel format
> +
> +     .. note::
> +
> +        Calling :c:func:`VIDIOC_ENUM_FMT` to discover currently available
> +        formats after receiving ``V4L2_EVENT_SOURCE_CHANGE`` is useful to
> +        find out a set of allowed formats for given configuration, but not
> +        required, if the client can accept the defaults.

s/required/required,/

> +
> +11. *[optional]* Acquire visible resolution via
> +    :c:func:`VIDIOC_G_SELECTION`.
> +
> +    * **Required fields:**
> +
> +      ``type``
> +          a ``V4L2_BUF_TYPE_*`` enum appropriate for ``CAPTURE``
> +
> +      ``target``
> +          set to ``V4L2_SEL_TGT_COMPOSE``
> +
> +    * **Return fields:**
> +
> +      ``r.left``, ``r.top``, ``r.width``, ``r.height``
> +          visible rectangle; this must fit within frame buffer resolution
> +          returned by :c:func:`VIDIOC_G_FMT`.
> +
> +    * The driver must expose following selection targets on ``CAPTURE``:
> +
> +      ``V4L2_SEL_TGT_CROP_BOUNDS``
> +          corresponds to coded resolution of the stream
> +
> +      ``V4L2_SEL_TGT_CROP_DEFAULT``
> +          a rectangle covering the part of the frame buffer that contains
> +          meaningful picture data (visible area); width and height will be
> +          equal to visible resolution of the stream
> +
> +      ``V4L2_SEL_TGT_CROP``
> +          rectangle within coded resolution to be output to ``CAPTURE``;
> +          defaults to ``V4L2_SEL_TGT_CROP_DEFAULT``; read-only on hardware
> +          without additional compose/scaling capabilities
> +
> +      ``V4L2_SEL_TGT_COMPOSE_BOUNDS``
> +          maximum rectangle within ``CAPTURE`` buffer, which the cropped
> +          frame can be output into; equal to ``V4L2_SEL_TGT_CROP``, if the
> +          hardware does not support compose/scaling
> +
> +      ``V4L2_SEL_TGT_COMPOSE_DEFAULT``
> +          equal to ``V4L2_SEL_TGT_CROP``
> +
> +      ``V4L2_SEL_TGT_COMPOSE``
> +          rectangle inside ``OUTPUT`` buffer into which the cropped frame

s/OUTPUT/CAPTURE/ ?

> +          is output; defaults to ``V4L2_SEL_TGT_COMPOSE_DEFAULT``;

and "is captured" or "is written" ?

> +          read-only on hardware without additional compose/scaling
> +          capabilities
> +
> +      ``V4L2_SEL_TGT_COMPOSE_PADDED``
> +          rectangle inside ``OUTPUT`` buffer which is overwritten by the

Here too ?

> +          hardware; equal to ``V4L2_SEL_TGT_COMPOSE``, if the hardware

s/, if/ if/

> +          does not write padding pixels
> +
> +12. *[optional]* Get minimum number of buffers required for ``CAPTURE``
> +    queue via :c:func:`VIDIOC_G_CTRL`. This is useful if client intends to
> +    use more buffers than minimum required by hardware/format.
> +
> +    * **Required fields:**
> +
> +      ``id``
> +          set to ``V4L2_CID_MIN_BUFFERS_FOR_CAPTURE``
> +
> +    * **Return fields:**
> +
> +      ``value``
> +          minimum number of buffers required to decode the stream parsed in
> +          this initialization sequence.
> +
> +    .. note::
> +
> +       Note that the minimum number of buffers must be at least the number
> +       required to successfully decode the current stream. This may for
> +       example be the required DPB size for an H.264 stream given the
> +       parsed stream configuration (resolution, level).
> +
> +13. Allocate destination (raw format) buffers via :c:func:`VIDIOC_REQBUFS`
> +    on the ``CAPTURE`` queue.
> +
> +    * **Required fields:**
> +
> +      ``count``
> +          requested number of buffers to allocate; greater than zero
> +
> +      ``type``
> +          a ``V4L2_BUF_TYPE_*`` enum appropriate for ``CAPTURE``
> +
> +      ``memory``
> +          follows standard semantics
> +
> +    * **Return fields:**
> +
> +      ``count``
> +          adjusted to allocated number of buffers
> +
> +    * The driver must adjust count to minimum of required number of

s/minimum/maximum/ ?

Should we also mention that if count > minimum, the driver may additionally 
limit the number of buffers based on internal limits (such as maximum memory 
consumption) ?

> +      destination buffers for given format and stream configuration and the
> +      count passed. The client must check this value after the ioctl
> +      returns to get the number of buffers allocated.
> +
> +    .. note::
> +
> +       To allocate more than minimum number of buffers (for pipeline
> +       depth), use G_CTRL(``V4L2_CID_MIN_BUFFERS_FOR_CAPTURE``) to
> +       get minimum number of buffers required, and pass the obtained value
> +       plus the number of additional buffers needed in count to
> +       :c:func:`VIDIOC_REQBUFS`.
> +
> +14. Call :c:func:`VIDIOC_STREAMON` to initiate decoding frames.
> +
> +Decoding
> +========
> +
> +This state is reached after a successful initialization sequence. In this
> +state, client queues and dequeues buffers to both queues via
> +:c:func:`VIDIOC_QBUF` and :c:func:`VIDIOC_DQBUF`, following standard
> +semantics.
> +
> +Both queues operate independently, following standard behavior of V4L2
> +buffer queues and memory-to-memory devices. In addition, the order of
> +decoded frames dequeued from ``CAPTURE`` queue may differ from the order of
> +queuing coded frames to ``OUTPUT`` queue, due to properties of selected
> +coded format, e.g. frame reordering. The client must not assume any direct
> +relationship between ``CAPTURE`` and ``OUTPUT`` buffers, other than
> +reported by :c:type:`v4l2_buffer` ``timestamp`` field.
> +
> +The contents of source ``OUTPUT`` buffers depend on active coded pixel
> +format and might be affected by codec-specific extended controls, as stated

s/might/may/

> +in documentation of each format individually.
> +
> +The client must not assume any direct relationship between ``CAPTURE``
> +and ``OUTPUT`` buffers and any specific timing of buffers becoming
> +available to dequeue. Specifically:
> +
> +* a buffer queued to ``OUTPUT`` may result in no buffers being produced
> +  on ``CAPTURE`` (e.g. if it does not contain encoded data, or if only
> +  metadata syntax structures are present in it),
> +
> +* a buffer queued to ``OUTPUT`` may result in more than 1 buffer produced
> +  on ``CAPTURE`` (if the encoded data contained more than one frame, or if
> +  returning a decoded frame allowed the driver to return a frame that
> +  preceded it in decode, but succeeded it in display order),
> +
> +* a buffer queued to ``OUTPUT`` may result in a buffer being produced on
> +  ``CAPTURE`` later into decode process, and/or after processing further
> +  ``OUTPUT`` buffers, or be returned out of order, e.g. if display
> +  reordering is used,
> +
> +* buffers may become available on the ``CAPTURE`` queue without additional

s/buffers/Buffers/

> +  buffers queued to ``OUTPUT`` (e.g. during drain or ``EOS``), because of
> +  ``OUTPUT`` buffers being queued in the past and decoding result of which
> +  being available only at later time, due to specifics of the decoding
> +  process.

I understand what you mean, but the wording is weird to my eyes. How about

* Buffers may become available on the ``CAPTURE`` queue without additional 
buffers queued to ``OUTPUT`` (e.g. during drain or ``EOS``), because of 
``OUTPUT`` buffers queued in the past whose decoding results are only 
available at later time, due to specifics of the decoding process.

> +Seek
> +====
> +
> +Seek is controlled by the ``OUTPUT`` queue, as it is the source of
> +bitstream data. ``CAPTURE`` queue remains unchanged/unaffected.

I assume that a seek may result in a source resolution change event, in which 
case the capture queue will be affected. How about stating here that 
controlling seek doesn't require any specific operation on the capture queue, 
but that the capture queue may be affected as per normal decoder operation ? 
We may also want to mention the event as an example.

> +1. Stop the ``OUTPUT`` queue to begin the seek sequence via
> +   :c:func:`VIDIOC_STREAMOFF`.
> +
> +   * **Required fields:**
> +
> +     ``type``
> +         a ``V4L2_BUF_TYPE_*`` enum appropriate for ``OUTPUT``
> +
> +   * The driver must drop all the pending ``OUTPUT`` buffers and they are
> +     treated as returned to the client (following standard semantics).
> +
> +2. Restart the ``OUTPUT`` queue via :c:func:`VIDIOC_STREAMON`
> +
> +   * **Required fields:**
> +
> +     ``type``
> +         a ``V4L2_BUF_TYPE_*`` enum appropriate for ``OUTPUT``
> +
> +   * The driver must be put in a state after seek and be ready to

What do you mean by "a state after seek" ?

> +     accept new source bitstream buffers.
> +
> +3. Start queuing buffers to ``OUTPUT`` queue containing stream data after
> +   the seek until a suitable resume point is found.
> +
> +   .. note::
> +
> +      There is no requirement to begin queuing stream starting exactly from

s/stream/buffers/ ?

> +      a resume point (e.g. SPS or a keyframe). The driver must handle any
> +      data queued and must keep processing the queued buffers until it
> +      finds a suitable resume point. While looking for a resume point, the
> +      driver processes ``OUTPUT`` buffers and returns them to the client
> +      without producing any decoded frames.
> +
> +      For hardware known to be mishandling seeks to a non-resume point,
> +      e.g. by returning corrupted decoded frames, the driver must be able
> +      to handle such seeks without a crash or any fatal decode error.

This should be true for any hardware; there should never be any crash or fatal 
decode error. I'd write it as

Some hardware is known to mishandle seeks to a non-resume point. Such an 
operation may result in an unspecified number of corrupted decoded frames 
being made available on ``CAPTURE``. Drivers must ensure that no fatal 
decoding errors or crashes occur, and implement any necessary handling and 
work-arounds for hardware issues related to seek operations.

> +4. After a resume point is found, the driver will start returning
> +   ``CAPTURE`` buffers with decoded frames.
> +
> +   * There is no precise specification for ``CAPTURE`` queue of when it
> +     will start producing buffers containing decoded data from buffers
> +     queued after the seek, as it operates independently
> +     from ``OUTPUT`` queue.
> +
> +     * The driver is allowed to and may return a number of remaining

s/is allowed to and may/may/

> +       ``CAPTURE`` buffers containing decoded frames from before the seek
> +       after the seek sequence (STREAMOFF-STREAMON) is performed.

Shouldn't all these buffers be returned when STREAMOFF is called on the OUTPUT 
side ?

> +     * The driver is also allowed to and may not return all decoded frames

s/is also allowed to and may not return/may also not return/

> +       queued but not decode before the seek sequence was initiated. For

s/not decode/not decoded/

> +       example, given an ``OUTPUT`` queue sequence: QBUF(A), QBUF(B),
> +       STREAMOFF(OUT), STREAMON(OUT), QBUF(G), QBUF(H), any of the
> +       following results on the ``CAPTURE`` queue is allowed: {A’, B’, G’,
> +       H’}, {A’, G’, H’}, {G’, H’}.

Related to the previous point, shouldn't this be moved to step 1 ?

> +   .. note::
> +
> +      To achieve instantaneous seek, the client may restart streaming on
> +      ``CAPTURE`` queue to discard decoded, but not yet dequeued buffers.
> +
> +Pause
> +=====
> +
> +In order to pause, the client should just cease queuing buffers onto the
> +``OUTPUT`` queue. This is different from the general V4L2 API definition of
> +pause, which involves calling :c:func:`VIDIOC_STREAMOFF` on the queue.
> +Without source bitstream data, there is no data to process and the hardware
> +remains idle.
> +
> +Conversely, using :c:func:`VIDIOC_STREAMOFF` on ``OUTPUT`` queue indicates
> +a seek, which
> +
> +1. drops all ``OUTPUT`` buffers in flight and
> +2. after a subsequent :c:func:`VIDIOC_STREAMON`, will look for and only
> +   continue from a resume point.
> +
> +This is usually undesirable for pause. The STREAMOFF-STREAMON sequence is
> +intended for seeking.
> +
> +Similarly, ``CAPTURE`` queue should remain streaming as well, as the
> +STREAMOFF-STREAMON sequence on it is intended solely for changing buffer
> +sets.

And also to drop decoded buffers for instant seek ?

> +Dynamic resolution change
> +=========================
> +
> +A video decoder implementing this interface must support dynamic resolution
> +change, for streams, which include resolution metadata in the bitstream.

s/for streams, which/for streams that/

> +When the decoder encounters a resolution change in the stream, the dynamic
> +resolution change sequence is started.
> +
> +1.  After encountering a resolution change in the stream, the driver must
> +    first process and decode all remaining buffers from before the
> +    resolution change point.
> +
> +2.  After all buffers containing decoded frames from before the resolution
> +    change point are ready to be dequeued on the ``CAPTURE`` queue, the
> +    driver sends a ``V4L2_EVENT_SOURCE_CHANGE`` event for source change
> +    type ``V4L2_EVENT_SRC_CH_RESOLUTION``.
> +
> +    * The last buffer from before the change must be marked with
> +      :c:type:`v4l2_buffer` ``flags`` flag ``V4L2_BUF_FLAG_LAST`` as in the
> +      drain sequence. The last buffer might be empty (with
> +      :c:type:`v4l2_buffer` ``bytesused`` = 0) and must be ignored by the
> +      client, since it does not contain any decoded frame.
> +
> +    * Any client query issued after the driver queues the event must return
> +      values applying to the stream after the resolution change, including
> +      queue formats, selection rectangles and controls.
> +
> +    * If the client subscribes to ``V4L2_EVENT_SOURCE_CHANGE`` events and
> +      the event is signaled, the decoding process will not continue until
> +      it is acknowledged by either (re-)starting streaming on ``CAPTURE``,
> +      or via :c:func:`VIDIOC_DECODER_CMD` with ``V4L2_DEC_CMD_START``
> +      command.

This usage of V4L2_DEC_CMD_START isn't aligned with the documentation of the 
command. I'm not opposed to this, but I think the use cases of decoder 
commands for codecs should be explained in the VIDIOC_DECODER_CMD 
documentation. What bothers me in particular is usage of V4L2_DEC_CMD_START to 
restart the decoder, while no V4L2_DEC_CMD_STOP has been issued. Should we add 
a section that details the decoder state machine with the implicit and 
explicit ways in which it is started and stopped ?

I would also reference step 7 here.

> +    .. note::
> +
> +       Any attempts to dequeue more buffers beyond the buffer marked
> +       with ``V4L2_BUF_FLAG_LAST`` will result in a -EPIPE error from
> +       :c:func:`VIDIOC_DQBUF`.
> +
> +3.  The client calls :c:func:`VIDIOC_G_FMT` for ``CAPTURE`` to get the new
> +    format information. This is identical to calling :c:func:`VIDIOC_G_FMT`
> +    after ``V4L2_EVENT_SRC_CH_RESOLUTION`` in the initialization sequence
> +    and should be handled similarly.

As the source resolution change event is mentioned in multiple places, how 
about extracting the related ioctls sequence to a specific section, and 
referencing it where needed (at least from the initialization sequence and 
here) ?

> +    .. note::
> +
> +       It is allowed for the driver not to support the same pixel format as

"Drivers may not support ..."

> +       previously used (before the resolution change) for the new
> +       resolution. The driver must select a default supported pixel format,
> +       return it, if queried using :c:func:`VIDIOC_G_FMT`, and the client
> +       must take note of it.
> +
> +4.  The client acquires visible resolution as in initialization sequence.
> +
> +5.  *[optional]* The client is allowed to enumerate available formats and

s/is allowed to/may/

> +    select a different one than currently chosen (returned via
> +    :c:func:`VIDIOC_G_FMT)`. This is identical to a corresponding step in
> +    the initialization sequence.
> +
> +6.  *[optional]* The client acquires minimum number of buffers as in
> +    initialization sequence.
> +
> +7.  If all the following conditions are met, the client may resume the
> +    decoding instantly, by using :c:func:`VIDIOC_DECODER_CMD` with
> +    ``V4L2_DEC_CMD_START`` command, as in case of resuming after the drain
> +    sequence:
> +
> +    * ``sizeimage`` of new format is less than or equal to the size of
> +      currently allocated buffers,
> +
> +    * the number of buffers currently allocated is greater than or equal to
> +      the minimum number of buffers acquired in step 6.
> +
> +    In such case, the remaining steps do not apply.
> +
> +    However, if the client intends to change the buffer set, to lower
> +    memory usage or for any other reasons, it may be achieved by following
> +    the steps below.
> +
> +8.  After dequeuing all remaining buffers from the ``CAPTURE`` queue,

This is optional, isn't it ?

> the
> +    client must call :c:func:`VIDIOC_STREAMOFF` on the ``CAPTURE`` queue.
> +    The ``OUTPUT`` queue must remain streaming (calling STREAMOFF on it

:c:func:`VIDIOC_STREAMOFF`

> +    would trigger a seek).
> +
> +9.  The client frees the buffers on the ``CAPTURE`` queue using
> +    :c:func:`VIDIOC_REQBUFS`.
> +
> +    * **Required fields:**
> +
> +      ``count``
> +          set to 0
> +
> +      ``type``
> +          a ``V4L2_BUF_TYPE_*`` enum appropriate for ``CAPTURE``
> +
> +      ``memory``
> +          follows standard semantics
> +
> +10. The client allocates a new set of buffers for the ``CAPTURE`` queue via
> +    :c:func:`VIDIOC_REQBUFS`. This is identical to a corresponding step in
> +    the initialization sequence.
> +
> +11. The client resumes decoding by issuing :c:func:`VIDIOC_STREAMON` on the
> +    ``CAPTURE`` queue.
> +
> +During the resolution change sequence, the ``OUTPUT`` queue must remain
> +streaming. Calling :c:func:`VIDIOC_STREAMOFF` on ``OUTPUT`` queue would
> +initiate a seek.
> +
> +The ``OUTPUT`` queue operates separately from the ``CAPTURE`` queue for the
> +duration of the entire resolution change sequence. It is allowed (and
> +recommended for best performance and simplicity) for the client to keep

"The client should (for best performance and simplicity) keep ..."

> +queuing/dequeuing buffers from/to ``OUTPUT`` queue even while processing

s/from\/to/to\/from/

> +this sequence.
> +
> +.. note::
> +
> +   It is also possible for this sequence to be triggered without a change

"This sequence may be triggered ..."

> +   in coded resolution, if a different number of ``CAPTURE`` buffers is
> +   required in order to continue decoding the stream or the visible
> +   resolution changes.
> +
> +Drain
> +=====
> +
> +To ensure that all queued ``OUTPUT`` buffers have been processed and
> +related ``CAPTURE`` buffers output to the client, the following drain
> +sequence may be followed. After the drain sequence is complete, the client
> +has received all decoded frames for all ``OUTPUT`` buffers queued before
> +the sequence was started.
> +
> +1. Begin drain by issuing :c:func:`VIDIOC_DECODER_CMD`.
> +
> +   * **Required fields:**
> +
> +     ``cmd``
> +         set to ``V4L2_DEC_CMD_STOP``
> +
> +     ``flags``
> +         set to 0
> +
> +     ``pts``
> +         set to 0
> +
> +2. The driver must process and decode as normal all ``OUTPUT`` buffers
> +   queued by the client before the :c:func:`VIDIOC_DECODER_CMD` was issued.
> +   Any operations triggered as a result of processing these buffers
> +   (including the initialization and resolution change sequences) must be
> +   processed as normal by both the driver and the client before proceeding
> +   with the drain sequence.
> +
> +3. Once all ``OUTPUT`` buffers queued before ``V4L2_DEC_CMD_STOP`` are
> +   processed:
> +
> +   * If the ``CAPTURE`` queue is streaming, once all decoded frames (if
> +     any) are ready to be dequeued on the ``CAPTURE`` queue, the driver
> +     must send a ``V4L2_EVENT_EOS``.

s/\./event./

Is the event sent on the OUTPUT or CAPTURE queue ? I assume the latter, should 
it be explicitly documented ?

> The driver must also set
> +     ``V4L2_BUF_FLAG_LAST`` in :c:type:`v4l2_buffer` ``flags`` field on the
> +     buffer on the ``CAPTURE`` queue containing the last frame (if any)
> +     produced as a result of processing the ``OUTPUT`` buffers queued
> +     before ``V4L2_DEC_CMD_STOP``. If no more frames are left to be
> +     returned at the point of handling ``V4L2_DEC_CMD_STOP``, the driver
> +     must return an empty buffer (with :c:type:`v4l2_buffer`
> +     ``bytesused`` = 0) as the last buffer with ``V4L2_BUF_FLAG_LAST`` set
> +     instead. Any attempts to dequeue more buffers beyond the buffer marked
> +     with ``V4L2_BUF_FLAG_LAST`` will result in a -EPIPE error from
> +     :c:func:`VIDIOC_DQBUF`.
> +
> +   * If the ``CAPTURE`` queue is NOT streaming, no action is necessary for
> +     ``CAPTURE`` queue and the driver must send a ``V4L2_EVENT_EOS``
> +     immediately after all ``OUTPUT`` buffers in question have been
> +     processed.

What is the use case for this ? Can't we just return an error if decoder isn't 
streaming ?

> +4. At this point, decoding is paused and the driver will accept, but not
> +   process any newly queued ``OUTPUT`` buffers until the client issues
> +   ``V4L2_DEC_CMD_START`` or restarts streaming on any queue.
> +
> +* Once the drain sequence is initiated, the client needs to drive it to
> +  completion, as described by the above steps, unless it aborts the process
> +  by issuing :c:func:`VIDIOC_STREAMOFF` on ``OUTPUT`` queue.  The client
> +  is not allowed to issue ``V4L2_DEC_CMD_START`` or ``V4L2_DEC_CMD_STOP``
> +  again while the drain sequence is in progress and they will fail with
> +  -EBUSY error code if attempted.

While this seems OK to me, I think drivers will need help to implement all the 
corner cases correctly without race conditions.

> +* Restarting streaming on ``OUTPUT`` queue will implicitly end the paused
> +  state and reinitialize the decoder (similarly to the seek sequence).
> +  Restarting ``CAPTURE`` queue will not affect an in-progress drain
> +  sequence.
> +
> +* The drivers must also implement :c:func:`VIDIOC_TRY_DECODER_CMD`, as a
> +  way to let the client query the availability of decoder commands.
> +
> +End of stream
> +=============
> +
> +If the decoder encounters an end of stream marking in the stream, the
> +driver must send a ``V4L2_EVENT_EOS`` event

On which queue ?

> to the client after all frames
> +are decoded and ready to be dequeued on the ``CAPTURE`` queue, with the
> +:c:type:`v4l2_buffer` ``flags`` set to ``V4L2_BUF_FLAG_LAST``. This
> +behavior is identical to the drain sequence triggered by the client via
> +``V4L2_DEC_CMD_STOP``.
> +
> +Commit points
> +=============
> +
> +Setting formats and allocating buffers triggers changes in the behavior

s/triggers/trigger/

> +of the driver.
> +
> +1. Setting format on ``OUTPUT`` queue may change the set of formats
> +   supported/advertised on the ``CAPTURE`` queue. In particular, it also
> +   means that ``CAPTURE`` format may be reset and the client must not
> +   rely on the previously set format being preserved.
> +
> +2. Enumerating formats on ``CAPTURE`` queue must only return formats
> +   supported for the ``OUTPUT`` format currently set.
> +
> +3. Setting/changing format on ``CAPTURE`` queue does not change formats

Why not just "Setting format" ?

> +   available on ``OUTPUT`` queue. An attempt to set ``CAPTURE`` format that
> +   is not supported for the currently selected ``OUTPUT`` format must
> +   result in the driver adjusting the requested format to an acceptable
> +   one.
> +
> +4. Enumerating formats on ``OUTPUT`` queue always returns the full set of
> +   supported coded formats, irrespective of the current ``CAPTURE``
> +   format.
> +
> +5. After allocating buffers on the ``OUTPUT`` queue, it is not possible to
> +   change format on it.

I'd phrase this as

"While buffers are allocated on the ``OUTPUT`` queue, clients must not change 
the format on the queue. Drivers must return <error code> for any such format 
change attempt."

> +
> +To summarize, setting formats and allocation must always start with the
> +``OUTPUT`` queue and the ``OUTPUT`` queue is the master that governs the
> +set of supported formats for the ``CAPTURE`` queue.

[snip]
Tomasz Figa Oct. 18, 2018, 10:03 a.m. UTC | #36
Hi Laurent,

On Wed, Oct 17, 2018 at 10:34 PM Laurent Pinchart
<laurent.pinchart@ideasonboard.com> wrote:
>
> Hi Tomasz,
>
> Thank you for the patch.

Thanks for your comments! Please see my replies inline.

>
> On Tuesday, 24 July 2018 17:06:20 EEST Tomasz Figa wrote:
> > Due to complexity of the video decoding process, the V4L2 drivers of
> > stateful decoder hardware require specific sequences of V4L2 API calls
> > to be followed. These include capability enumeration, initialization,
> > decoding, seek, pause, dynamic resolution change, drain and end of
> > stream.
> >
> > Specifics of the above have been discussed during Media Workshops at
> > LinuxCon Europe 2012 in Barcelona and then later Embedded Linux
> > Conference Europe 2014 in Düsseldorf. The de facto Codec API that
> > originated at those events was later implemented by the drivers we already
> > have merged in mainline, such as s5p-mfc or coda.
> >
> > The only thing missing was the real specification included as a part of
> > Linux Media documentation. Fix it now and document the decoder part of
> > the Codec API.
> >
> > Signed-off-by: Tomasz Figa <tfiga@chromium.org>
> > ---
> >  Documentation/media/uapi/v4l/dev-decoder.rst | 872 +++++++++++++++++++
> >  Documentation/media/uapi/v4l/devices.rst     |   1 +
> >  Documentation/media/uapi/v4l/v4l2.rst        |  10 +-
> >  3 files changed, 882 insertions(+), 1 deletion(-)
> >  create mode 100644 Documentation/media/uapi/v4l/dev-decoder.rst
> >
> > diff --git a/Documentation/media/uapi/v4l/dev-decoder.rst
> > b/Documentation/media/uapi/v4l/dev-decoder.rst new file mode 100644
> > index 000000000000..f55d34d2f860
> > --- /dev/null
> > +++ b/Documentation/media/uapi/v4l/dev-decoder.rst
> > @@ -0,0 +1,872 @@
> > +.. -*- coding: utf-8; mode: rst -*-
> > +
> > +.. _decoder:
> > +
> > +****************************************
> > +Memory-to-memory Video Decoder Interface
> > +****************************************
> > +
> > +Input data to a video decoder are buffers containing unprocessed video
> > +stream (e.g. Annex-B H.264/HEVC stream, raw VP8/9 stream). The driver is
> > +expected not to require any additional information from the client to
> > +process these buffers. Output data are raw video frames returned in display
> > +order.
> > +
> > +Performing software parsing, processing etc. of the stream in the driver
> > +in order to support this interface is strongly discouraged. In case such
> > +operations are needed, use of Stateless Video Decoder Interface (in
> > +development) is strongly advised.
> > +
> > +Conventions and notation used in this document
> > +==============================================
> > +
> > +1. The general V4L2 API rules apply if not specified in this document
> > +   otherwise.
> > +
> > +2. The meaning of words “must”, “may”, “should”, etc. is as per RFC
> > +   2119.
> > +
> > +3. All steps not marked “optional” are required.
> > +
> > +4. :c:func:`VIDIOC_G_EXT_CTRLS`, :c:func:`VIDIOC_S_EXT_CTRLS` may be used
> > +   interchangeably with :c:func:`VIDIOC_G_CTRL`, :c:func:`VIDIOC_S_CTRL`,
> > +   unless specified otherwise.
> > +
> > +5. Single-plane API (see spec) and applicable structures may be used
> > +   interchangeably with Multi-plane API, unless specified otherwise,
> > +   depending on driver capabilities and following the general V4L2
> > +   guidelines.
>
> How about also allowing VIDIOC_CREATE_BUFS where VIDIOC_REQBUFS is mentioned ?
>

In my draft of v2, I explicitly described VIDIOC_CREATE_BUFS in every
step mentioning VIDIOC_REQBUFS. Do you think that's fine too?

> > +6. i = [a..b]: sequence of integers from a to b, inclusive, i.e. i =
> > +   [0..2]: i = 0, 1, 2.
> > +
> > +7. For ``OUTPUT`` buffer A, A’ represents a buffer on the ``CAPTURE`` queue
> > +   containing data (decoded frame/stream) that resulted from processing
> > +   buffer A.
> > +
> > +Glossary
> > +========
> > +
> > +CAPTURE
> > +   the destination buffer queue; the queue of buffers containing decoded
> > +   frames; ``V4L2_BUF_TYPE_VIDEO_CAPTURE`` or
> > +   ``V4L2_BUF_TYPE_VIDEO_CAPTURE_MPLANE``; data are captured from the
> > +   hardware into ``CAPTURE`` buffers
> > +
> > +client
> > +   application client communicating with the driver implementing this API
> > +
> > +coded format
> > +   encoded/compressed video bitstream format (e.g. H.264, VP8, etc.); see
> > +   also: raw format
> > +
> > +coded height
> > +   height for given coded resolution
> > +
> > +coded resolution
> > +   stream resolution in pixels aligned to codec and hardware requirements;
> > +   typically visible resolution rounded up to full macroblocks;
> > +   see also: visible resolution
> > +
> > +coded width
> > +   width for given coded resolution
> > +
> > +decode order
> > +   the order in which frames are decoded; may differ from display order if
> > +   coded format includes a feature of frame reordering; ``OUTPUT`` buffers
> > +   must be queued by the client in decode order
> > +
> > +destination
> > +   data resulting from the decode process; ``CAPTURE``
> > +
> > +display order
> > +   the order in which frames must be displayed; ``CAPTURE`` buffers must be
> > +   returned by the driver in display order
> > +
> > +DPB
> > +   Decoded Picture Buffer; a H.264 term for a buffer that stores a picture
> > +   that is encoded or decoded and available for reference in further
> > +   decode/encode steps.
>
> By "encoded or decoded", do you mean "raw frames to be encoded (in the encoder
> use case) or decoded raw frames (in the decoder use case)" ? I think this
> should be clarified.
>

Actually it's a decoder-specific term, so I've changed both the decoder
and encoder documents to:

DPB
   Decoded Picture Buffer; an H.264 term for a buffer that stores a decoded
   raw frame available for reference in further decoding steps.

Does it sound better now?

> > +EOS
> > +   end of stream
> > +
> > +IDR
> > +   a type of a keyframe in H.264-encoded stream, which clears the list of
> > +   earlier reference frames (DPBs)
> > +
> > +keyframe
> > +   an encoded frame that does not reference frames decoded earlier, i.e.
> > +   can be decoded fully on its own.
> > +
> > +OUTPUT
> > +   the source buffer queue; the queue of buffers containing encoded
> > +   bitstream; ``V4L2_BUF_TYPE_VIDEO_OUTPUT`` or
> > +   ``V4L2_BUF_TYPE_VIDEO_OUTPUT_MPLANE``; the hardware is fed with data
> > +   from ``OUTPUT`` buffers
> > +
> > +PPS
> > +   Picture Parameter Set; a type of metadata entity in H.264 bitstream
> > +
> > +raw format
> > +   uncompressed format containing raw pixel data (e.g. YUV, RGB formats)
> > +
> > +resume point
> > +   a point in the bitstream from which decoding may start/continue, without
> > +   any previous state/data present, e.g.: a keyframe (VP8/VP9) or
> > +   SPS/PPS/IDR sequence (H.264); a resume point is required to start decode
> > +   of a new stream, or to resume decoding after a seek
> > +
> > +source
> > +   data fed to the decoder; ``OUTPUT``
> > +
> > +SPS
> > +   Sequence Parameter Set; a type of metadata entity in H.264 bitstream
> > +
> > +visible height
> > +   height for given visible resolution; display height
> > +
> > +visible resolution
> > +   stream resolution of the visible picture, in pixels, to be used for
> > +   display purposes; must be smaller or equal to coded resolution;
> > +   display resolution
> > +
> > +visible width
> > +   width for given visible resolution; display width
> > +
> > +Querying capabilities
> > +=====================
> > +
> > +1. To enumerate the set of coded formats supported by the driver, the
> > +   client may call :c:func:`VIDIOC_ENUM_FMT` on ``OUTPUT``.
> > +
> > +   * The driver must always return the full set of supported formats,
> > +     irrespective of the format set on the ``CAPTURE``.
> > +
> > +2. To enumerate the set of supported raw formats, the client may call
> > +   :c:func:`VIDIOC_ENUM_FMT` on ``CAPTURE``.
> > +
> > +   * The driver must return only the formats supported for the format
> > +     currently active on ``OUTPUT``.
> > +
> > +   * In order to enumerate raw formats supported by a given coded format,
> > +     the client must first set that coded format on ``OUTPUT`` and then
> > +     enumerate the ``CAPTURE`` queue.
>
> Maybe s/enumerate the/enumerate formats on the/ ?
>
> > +3. The client may use :c:func:`VIDIOC_ENUM_FRAMESIZES` to detect supported
> > +   resolutions for a given format, passing desired pixel format in
> > +   :c:type:`v4l2_frmsizeenum` ``pixel_format``.
> > +
> > +   * Values returned by :c:func:`VIDIOC_ENUM_FRAMESIZES` on ``OUTPUT``
> > +     must include all possible coded resolutions supported by the decoder
> > +     for given coded pixel format.
> > +
> > +   * Values returned by :c:func:`VIDIOC_ENUM_FRAMESIZES` on ``CAPTURE``
> > +     must include all possible frame buffer resolutions supported by the
> > +     decoder for given raw pixel format and coded format currently set on
> > +     ``OUTPUT``.
> > +
> > +    .. note::
> > +
> > +       The client may derive the supported resolution range for a
> > +       combination of coded and raw format by setting width and height of
> > +       ``OUTPUT`` format to 0 and calculating the intersection of
> > +       resolutions returned from calls to :c:func:`VIDIOC_ENUM_FRAMESIZES`
> > +       for the given coded and raw formats.
>
> I'm confused by the note, I'm not sure to understand what you mean.
>

I'm actually going to remove this. This special case of 0 width and
height is not only ugly, but also wouldn't work with decoders that
actually can do scaling, because the scaling ratio range is often
constant, so the supported scaled frame sizes depend on the exact
coded format.

> > +4. Supported profiles and levels for given format, if applicable, may be
> > +   queried using their respective controls via :c:func:`VIDIOC_QUERYCTRL`.
> > +
> > +Initialization
> > +==============
> > +
> > +1. *[optional]* Enumerate supported ``OUTPUT`` formats and resolutions. See
> > +   capability enumeration.
> > +
> > +2. Set the coded format on ``OUTPUT`` via :c:func:`VIDIOC_S_FMT`
> > +
> > +   * **Required fields:**
> > +
> > +     ``type``
> > +         a ``V4L2_BUF_TYPE_*`` enum appropriate for ``OUTPUT``
> > +
> > +     ``pixelformat``
> > +         a coded pixel format
> > +
> > +     ``width``, ``height``
> > +         required only if cannot be parsed from the stream for the given
> > +         coded format; optional otherwise - set to zero to ignore
> > +
> > +     other fields
> > +         follow standard semantics
> > +
> > +   * For coded formats including stream resolution information, if width
> > +     and height are set to non-zero values, the driver will propagate the
> > +     resolution to ``CAPTURE`` and signal a source change event
> > +     instantly.
>
> Maybe s/instantly/immediately before returning from :c:func:`VIDIOC_S_FMT`/ ?
>
> > However, after the decoder is done parsing the
> > +     information embedded in the stream, it will update ``CAPTURE``
>
> s/update/update the/
>
> > +     format with new values and signal a source change event again, if
>
> s/, if/ if/
>
> > +     the values do not match.
> > +
> > +   .. note::
> > +
> > +      Changing ``OUTPUT`` format may change currently set ``CAPTURE``
>
> Do you have a particular dislike for definite articles ? :-) I would have
> written "Changing the ``OUTPUT`` format may change the currently set
> ``CAPTURE`` ...". I won't repeat the comment through the whole review, but
> many places seem to be missing a definite article.

Saving the^Wworld bandwidth one "the " at a time. ;)

Hans also pointed some of those and I should have most of the missing
ones added in my draft of v2. Thanks.

>
> > +      format. The driver will derive a new ``CAPTURE`` format from
> > +      ``OUTPUT`` format being set, including resolution, colorimetry
> > +      parameters, etc. If the client needs a specific ``CAPTURE`` format,
> > +      it must adjust it afterwards.
> > +
> > +3.  *[optional]* Get minimum number of buffers required for ``OUTPUT``
> > +    queue via :c:func:`VIDIOC_G_CTRL`. This is useful if client intends to
> > +    use more buffers than minimum required by hardware/format.
> > +
> > +    * **Required fields:**
> > +
> > +      ``id``
> > +          set to ``V4L2_CID_MIN_BUFFERS_FOR_OUTPUT``
> > +
> > +    * **Return fields:**
> > +
> > +      ``value``
> > +          required number of ``OUTPUT`` buffers for the currently set
> > +          format
>
> s/required/required minimum/

I made it "the minimum number of [...] buffers required".

>
> > +
> > +4.  Allocate source (bitstream) buffers via :c:func:`VIDIOC_REQBUFS` on
> > +    ``OUTPUT``.
> > +
> > +    * **Required fields:**
> > +
> > +      ``count``
> > +          requested number of buffers to allocate; greater than zero
> > +
> > +      ``type``
> > +          a ``V4L2_BUF_TYPE_*`` enum appropriate for ``OUTPUT``
> > +
> > +      ``memory``
> > +          follows standard semantics
> > +
> > +      ``sizeimage``
> > +          follows standard semantics; the client is free to choose any
> > +          suitable size, however, it may be subject to change by the
> > +          driver
> > +
> > +    * **Return fields:**
> > +
> > +      ``count``
> > +          actual number of buffers allocated
> > +
> > +    * The driver must adjust count to minimum of required number of
> > +      ``OUTPUT`` buffers for given format and count passed.
>
> Isn't it the maximum, not the minimum ?
>

It's actually neither. All we can generally say here is that the
number will be adjusted and the client must note it.

> > The client must
> > +      check this value after the ioctl returns to get the number of
> > +      buffers allocated.
> > +
> > +    .. note::
> > +
> > +       To allocate more than minimum number of buffers (for pipeline
> > +       depth), use G_CTRL(``V4L2_CID_MIN_BUFFERS_FOR_OUTPUT``) to
> > +       get minimum number of buffers required by the driver/format,
> > +       and pass the obtained value plus the number of additional
> > +       buffers needed in count to :c:func:`VIDIOC_REQBUFS`.
> > +
> > +5.  Start streaming on ``OUTPUT`` queue via :c:func:`VIDIOC_STREAMON`.
> > +
> > +6.  This step only applies to coded formats that contain resolution
> > +    information in the stream. Continue queuing/dequeuing bitstream
> > +    buffers to/from the ``OUTPUT`` queue via :c:func:`VIDIOC_QBUF` and
> > +    :c:func:`VIDIOC_DQBUF`. The driver must keep processing and returning
> > +    each buffer to the client until required metadata to configure the
> > +    ``CAPTURE`` queue are found. This is indicated by the driver sending
> > +    a ``V4L2_EVENT_SOURCE_CHANGE`` event with
> > +    ``V4L2_EVENT_SRC_CH_RESOLUTION`` source change type. There is no
> > +    requirement to pass enough data for this to occur in the first buffer
> > +    and the driver must be able to process any number.
> > +
> > +    * If data in a buffer that triggers the event is required to decode
> > +      the first frame, the driver must not return it to the client,
> > +      but must retain it for further decoding.
> > +
> > +    * If the client set width and height of ``OUTPUT`` format to 0, calling
> > +      :c:func:`VIDIOC_G_FMT` on the ``CAPTURE`` queue will return -EPERM,
> > +      until the driver configures ``CAPTURE`` format according to stream
> > +      metadata.
>
> That's a pretty harsh handling for this condition. What's the rationale for
> returning -EPERM instead of for instance succeeding with width and height set
> to 0 ?

I don't like it, but the error condition must stay for compatibility
reasons as that's what current drivers implement and applications
expect. (Technically current drivers would return -EINVAL, but we
concluded that existing applications don't care about the exact value,
so we can change it to make more sense.)

>
> > +    * If the client subscribes to ``V4L2_EVENT_SOURCE_CHANGE`` events and
> > +      the event is signaled, the decoding process will not continue until
> > +      it is acknowledged by either (re-)starting streaming on ``CAPTURE``,
> > +      or via :c:func:`VIDIOC_DECODER_CMD` with ``V4L2_DEC_CMD_START``
> > +      command.
> > +
> > +    .. note::
> > +
> > +       No decoded frames are produced during this phase.
> > +
> > +7.  This step only applies to coded formats that contain resolution
> > +    information in the stream.
> > +    Receive and handle ``V4L2_EVENT_SOURCE_CHANGE`` from the driver
> > +    via :c:func:`VIDIOC_DQEVENT`. The driver must send this event once
> > +    enough data is obtained from the stream to allocate ``CAPTURE``
> > +    buffers and to begin producing decoded frames.
>
> Doesn't the last sentence belong to step 6 (where it's already explained to
> some extent) ?
>
> > +
> > +    * **Required fields:**
> > +
> > +      ``type``
> > +          set to ``V4L2_EVENT_SOURCE_CHANGE``
>
> Isn't the type field set by the driver ?
>
> > +    * **Return fields:**
> > +
> > +      ``u.src_change.changes``
> > +          set to ``V4L2_EVENT_SRC_CH_RESOLUTION``
> > +
> > +    * Any client query issued after the driver queues the event must return
> > +      values applying to the just parsed stream, including queue formats,
> > +      selection rectangles and controls.
>
> To align with the wording used so far, I would say that "the driver must"
> return values applying to the just parsed stream.
>
> I think I would also move this to step 6, as it's related to queuing the
> event, not dequeuing it.

As I've rephrased the whole document to be more userspace-oriented,
this step is actually going away. Step 6 will have a note about driver
behavior.

>
> > +8.  Call :c:func:`VIDIOC_G_FMT` for ``CAPTURE`` queue to get format for the
> > +    destination buffers parsed/decoded from the bitstream.
> > +
> > +    * **Required fields:**
> > +
> > +      ``type``
> > +          a ``V4L2_BUF_TYPE_*`` enum appropriate for ``CAPTURE``
> > +
> > +    * **Return fields:**
> > +
> > +      ``width``, ``height``
> > +          frame buffer resolution for the decoded frames
> > +
> > +      ``pixelformat``
> > +          pixel format for decoded frames
> > +
> > +      ``num_planes`` (for _MPLANE ``type`` only)
> > +          number of planes for pixelformat
> > +
> > +      ``sizeimage``, ``bytesperline``
> > +          as per standard semantics; matching frame buffer format
> > +
> > +    .. note::
> > +
> > +       The value of ``pixelformat`` may be any pixel format supported and
> > +       must be supported for current stream, based on the information
> > +       parsed from the stream and hardware capabilities. It is suggested
> > +       that driver chooses the preferred/optimal format for given
>
> In compliance with RFC 2119, how about using "Drivers should choose" instead
> of "It is suggested that driver chooses" ?

The whole paragraph became:

       The value of ``pixelformat`` may be any pixel format supported by the
       decoder for the current stream. It is expected that the decoder chooses
       a preferred/optimal format for the default configuration. For example, a
       YUV format may be preferred over an RGB format, if additional conversion
       step would be required.

>
> > +       configuration. For example, a YUV format may be preferred over an
> > +       RGB format, if additional conversion step would be required.
> > +
> > +9.  *[optional]* Enumerate ``CAPTURE`` formats via
> > +    :c:func:`VIDIOC_ENUM_FMT` on ``CAPTURE`` queue. Once the stream
> > +    information is parsed and known, the client may use this ioctl to
> > +    discover which raw formats are supported for given stream and select on
>
> s/select on/select one/

Done.

>
> > +    of them via :c:func:`VIDIOC_S_FMT`.
> > +
> > +    .. note::
> > +
> > +       The driver will return only formats supported for the current stream
> > +       parsed in this initialization sequence, even if more formats may be
> > +       supported by the driver in general.
> > +
> > +       For example, a driver/hardware may support YUV and RGB formats for
> > +       resolutions 1920x1088 and lower, but only YUV for higher
> > +       resolutions (due to hardware limitations). After parsing
> > +       a resolution of 1920x1088 or lower, :c:func:`VIDIOC_ENUM_FMT` may
> > +       return a set of YUV and RGB pixel formats, but after parsing
> > +       resolution higher than 1920x1088, the driver will not return RGB,
> > +       unsupported for this resolution.
> > +
> > +       However, subsequent resolution change event triggered after
> > +       discovering a resolution change within the same stream may switch
> > +       the stream into a lower resolution and :c:func:`VIDIOC_ENUM_FMT`
> > +       would return RGB formats again in that case.
> > +
> > +10.  *[optional]* Choose a different ``CAPTURE`` format than suggested via
> > +     :c:func:`VIDIOC_S_FMT` on ``CAPTURE`` queue. It is possible for the
> > +     client to choose a different format than selected/suggested by the
>
> And here, "A client may choose" ?
>
> > +     driver in :c:func:`VIDIOC_G_FMT`.
> > +
> > +     * **Required fields:**
> > +
> > +       ``type``
> > +           a ``V4L2_BUF_TYPE_*`` enum appropriate for ``CAPTURE``
> > +
> > +       ``pixelformat``
> > +           a raw pixel format
> > +
> > +     .. note::
> > +
> > +        Calling :c:func:`VIDIOC_ENUM_FMT` to discover currently available
> > +        formats after receiving ``V4L2_EVENT_SOURCE_CHANGE`` is useful to
> > +        find out a set of allowed formats for given configuration, but not
> > +        required, if the client can accept the defaults.
>
> s/required/required,/

That would become "[...]but not required,, if the client[...]". Is
that your suggestion? ;)

>
> > +
> > +11. *[optional]* Acquire visible resolution via
> > +    :c:func:`VIDIOC_G_SELECTION`.
> > +
> > +    * **Required fields:**
> > +
> > +      ``type``
> > +          a ``V4L2_BUF_TYPE_*`` enum appropriate for ``CAPTURE``
> > +
> > +      ``target``
> > +          set to ``V4L2_SEL_TGT_COMPOSE``
> > +
> > +    * **Return fields:**
> > +
> > +      ``r.left``, ``r.top``, ``r.width``, ``r.height``
> > +          visible rectangle; this must fit within frame buffer resolution
> > +          returned by :c:func:`VIDIOC_G_FMT`.
> > +
> > +    * The driver must expose following selection targets on ``CAPTURE``:
> > +
> > +      ``V4L2_SEL_TGT_CROP_BOUNDS``
> > +          corresponds to coded resolution of the stream
> > +
> > +      ``V4L2_SEL_TGT_CROP_DEFAULT``
> > +          a rectangle covering the part of the frame buffer that contains
> > +          meaningful picture data (visible area); width and height will be
> > +          equal to visible resolution of the stream
> > +
> > +      ``V4L2_SEL_TGT_CROP``
> > +          rectangle within coded resolution to be output to ``CAPTURE``;
> > +          defaults to ``V4L2_SEL_TGT_CROP_DEFAULT``; read-only on hardware
> > +          without additional compose/scaling capabilities
> > +
> > +      ``V4L2_SEL_TGT_COMPOSE_BOUNDS``
> > +          maximum rectangle within ``CAPTURE`` buffer, which the cropped
> > +          frame can be output into; equal to ``V4L2_SEL_TGT_CROP``, if the
> > +          hardware does not support compose/scaling
> > +
> > +      ``V4L2_SEL_TGT_COMPOSE_DEFAULT``
> > +          equal to ``V4L2_SEL_TGT_CROP``
> > +
> > +      ``V4L2_SEL_TGT_COMPOSE``
> > +          rectangle inside ``OUTPUT`` buffer into which the cropped frame
>
> s/OUTPUT/CAPTURE/ ?
>
> > +          is output; defaults to ``V4L2_SEL_TGT_COMPOSE_DEFAULT``;
>
> and "is captured" or "is written" ?
>
> > +          read-only on hardware without additional compose/scaling
> > +          capabilities
> > +
> > +      ``V4L2_SEL_TGT_COMPOSE_PADDED``
> > +          rectangle inside ``OUTPUT`` buffer which is overwritten by the
>
> Here too ?
>
> > +          hardware; equal to ``V4L2_SEL_TGT_COMPOSE``, if the hardware
>
> s/, if/ if/

Ack +3

>
> > +          does not write padding pixels
> > +
> > +12. *[optional]* Get minimum number of buffers required for ``CAPTURE``
> > +    queue via :c:func:`VIDIOC_G_CTRL`. This is useful if client intends to
> > +    use more buffers than minimum required by hardware/format.
> > +
> > +    * **Required fields:**
> > +
> > +      ``id``
> > +          set to ``V4L2_CID_MIN_BUFFERS_FOR_CAPTURE``
> > +
> > +    * **Return fields:**
> > +
> > +      ``value``
> > +          minimum number of buffers required to decode the stream parsed in
> > +          this initialization sequence.
> > +
> > +    .. note::
> > +
> > +       Note that the minimum number of buffers must be at least the number
> > +       required to successfully decode the current stream. This may for
> > +       example be the required DPB size for an H.264 stream given the
> > +       parsed stream configuration (resolution, level).
> > +
> > +13. Allocate destination (raw format) buffers via :c:func:`VIDIOC_REQBUFS`
> > +    on the ``CAPTURE`` queue.
> > +
> > +    * **Required fields:**
> > +
> > +      ``count``
> > +          requested number of buffers to allocate; greater than zero
> > +
> > +      ``type``
> > +          a ``V4L2_BUF_TYPE_*`` enum appropriate for ``CAPTURE``
> > +
> > +      ``memory``
> > +          follows standard semantics
> > +
> > +    * **Return fields:**
> > +
> > +      ``count``
> > +          adjusted to allocated number of buffers
> > +
> > +    * The driver must adjust count to minimum of required number of
>
> s/minimum/maximum/ ?
>
> Should we also mentioned that if count > minimum, the driver may additionally
> limit the number of buffers based on internal limits (such as maximum memory
> consumption) ?

I made it less specific:

    * The count will be adjusted by the decoder to match the stream and hardware
      requirements. The client must check the final value after the ioctl
      returns to get the number of buffers allocated.

>
> > +      destination buffers for given format and stream configuration and the
> > +      count passed. The client must check this value after the ioctl
> > +      returns to get the number of buffers allocated.
> > +
> > +    .. note::
> > +
> > +       To allocate more than minimum number of buffers (for pipeline
> > +       depth), use G_CTRL(``V4L2_CID_MIN_BUFFERS_FOR_CAPTURE``) to
> > +       get minimum number of buffers required, and pass the obtained value
> > +       plus the number of additional buffers needed in count to
> > +       :c:func:`VIDIOC_REQBUFS`.
> > +
> > +14. Call :c:func:`VIDIOC_STREAMON` to initiate decoding frames.
> > +
> > +Decoding
> > +========
> > +
> > +This state is reached after a successful initialization sequence. In this
> > +state, client queues and dequeues buffers to both queues via
> > +:c:func:`VIDIOC_QBUF` and :c:func:`VIDIOC_DQBUF`, following standard
> > +semantics.
> > +
> > +Both queues operate independently, following standard behavior of V4L2
> > +buffer queues and memory-to-memory devices. In addition, the order of
> > +decoded frames dequeued from ``CAPTURE`` queue may differ from the order of
> > +queuing coded frames to ``OUTPUT`` queue, due to properties of selected
> > +coded format, e.g. frame reordering. The client must not assume any direct
> > +relationship between ``CAPTURE`` and ``OUTPUT`` buffers, other than
> > +reported by :c:type:`v4l2_buffer` ``timestamp`` field.
> > +
> > +The contents of source ``OUTPUT`` buffers depend on active coded pixel
> > +format and might be affected by codec-specific extended controls, as stated
>
> s/might/may/
>
> > +in documentation of each format individually.
> > +
> > +The client must not assume any direct relationship between ``CAPTURE``
> > +and ``OUTPUT`` buffers and any specific timing of buffers becoming
> > +available to dequeue. Specifically:
> > +
> > +* a buffer queued to ``OUTPUT`` may result in no buffers being produced
> > +  on ``CAPTURE`` (e.g. if it does not contain encoded data, or if only
> > +  metadata syntax structures are present in it),
> > +
> > +* a buffer queued to ``OUTPUT`` may result in more than 1 buffer produced
> > +  on ``CAPTURE`` (if the encoded data contained more than one frame, or if
> > +  returning a decoded frame allowed the driver to return a frame that
> > +  preceded it in decode, but succeeded it in display order),
> > +
> > +* a buffer queued to ``OUTPUT`` may result in a buffer being produced on
> > +  ``CAPTURE`` later into decode process, and/or after processing further
> > +  ``OUTPUT`` buffers, or be returned out of order, e.g. if display
> > +  reordering is used,
> > +
> > +* buffers may become available on the ``CAPTURE`` queue without additional
>
> s/buffers/Buffers/
>

I don't think the items should be capitalized here.

> > +  buffers queued to ``OUTPUT`` (e.g. during drain or ``EOS``), because of
> > +  ``OUTPUT`` buffers being queued in the past and decoding result of which
> > +  being available only at later time, due to specifics of the decoding
> > +  process.
>
> I understand what you mean, but the wording is weird to my eyes. How about
>
> * Buffers may become available on the ``CAPTURE`` queue without additional
> buffers queued to ``OUTPUT`` (e.g. during drain or ``EOS``), because of
> ``OUTPUT`` buffers queued in the past whose decoding results are only
> available at later time, due to specifics of the decoding process.

Done, thanks.

>
> > +Seek
> > +====
> > +
> > +Seek is controlled by the ``OUTPUT`` queue, as it is the source of
> > +bitstream data. ``CAPTURE`` queue remains unchanged/unaffected.
>
> I assume that a seek may result in a source resolution change event, in which
> case the capture queue will be affected. How about stating here that
> controlling seek doesn't require any specific operation on the capture queue,
> but that the capture queue may be affected as per normal decoder operation ?
> We may also want to mention the event as an example.

Done. I've also added a general section about decoder-initiated
sequences in the Decoding section.

>
> > +1. Stop the ``OUTPUT`` queue to begin the seek sequence via
> > +   :c:func:`VIDIOC_STREAMOFF`.
> > +
> > +   * **Required fields:**
> > +
> > +     ``type``
> > +         a ``V4L2_BUF_TYPE_*`` enum appropriate for ``OUTPUT``
> > +
> > +   * The driver must drop all the pending ``OUTPUT`` buffers and they are
> > +     treated as returned to the client (following standard semantics).
> > +
> > +2. Restart the ``OUTPUT`` queue via :c:func:`VIDIOC_STREAMON`
> > +
> > +   * **Required fields:**
> > +
> > +     ``type``
> > +         a ``V4L2_BUF_TYPE_*`` enum appropriate for ``OUTPUT``
> > +
> > +   * The driver must be put in a state after seek and be ready to
>
> What do you mean by "a state after seek" ?
>

   * The decoder will start accepting new source bitstream buffers after the
     call returns.

> > +     accept new source bitstream buffers.
> > +
> > +3. Start queuing buffers to ``OUTPUT`` queue containing stream data after
> > +   the seek until a suitable resume point is found.
> > +
> > +   .. note::
> > +
> > +      There is no requirement to begin queuing stream starting exactly from
>
> s/stream/buffers/ ?

Perhaps "stream data"? The buffers don't have a resume point, the stream does.

>
> > +      a resume point (e.g. SPS or a keyframe). The driver must handle any
> > +      data queued and must keep processing the queued buffers until it
> > +      finds a suitable resume point. While looking for a resume point, the
> > +      driver processes ``OUTPUT`` buffers and returns them to the client
> > +      without producing any decoded frames.
> > +
> > +      For hardware known to be mishandling seeks to a non-resume point,
> > +      e.g. by returning corrupted decoded frames, the driver must be able
> > +      to handle such seeks without a crash or any fatal decode error.
>
> This should be true for any hardware, there should never be any crash or fatal
> decode error. I'd write it as
>
> Some hardware is known to mishandle seeks to a non-resume point. Such an
> operation may result in an unspecified number of corrupted decoded frames
> being made available on ``CAPTURE``. Drivers must ensure that no fatal
> decoding errors or crashes occur, and implement any necessary handling and
> work-arounds for hardware issues related to seek operations.
>

Done.

> > +4. After a resume point is found, the driver will start returning
> > +   ``CAPTURE`` buffers with decoded frames.
> > +
> > +   * There is no precise specification for ``CAPTURE`` queue of when it
> > +     will start producing buffers containing decoded data from buffers
> > +     queued after the seek, as it operates independently
> > +     from ``OUTPUT`` queue.
> > +
> > +     * The driver is allowed to and may return a number of remaining
>
> s/is allowed to and may/may/
>
> > +       ``CAPTURE`` buffers containing decoded frames from before the seek
> > +       after the seek sequence (STREAMOFF-STREAMON) is performed.
>
> Shouldn't all these buffers be returned when STREAMOFF is called on the OUTPUT
> side ?

The queues are independent, so STREAMOFF on OUTPUT would only return
the OUTPUT buffers.

That's why there is the note suggesting that the application may also
stop streaming on CAPTURE to avoid stale frames being returned.

>
> > +     * The driver is also allowed to and may not return all decoded frames
>
> s/is also allowed to and may not return/may also not return/
>
> > +       queued but not decode before the seek sequence was initiated. For
>
> s/not decode/not decoded/
>
> > +       example, given an ``OUTPUT`` queue sequence: QBUF(A), QBUF(B),
> > +       STREAMOFF(OUT), STREAMON(OUT), QBUF(G), QBUF(H), any of the
> > +       following results on the ``CAPTURE`` queue is allowed: {A’, B’, G’,
> > +       H’}, {A’, G’, H’}, {G’, H’}.
>
> Related to the previous point, shouldn't this be moved to step 1 ?

I've made it a general warning after the whole sequence.

>
> > +   .. note::
> > +
> > +      To achieve instantaneous seek, the client may restart streaming on
> > +      ``CAPTURE`` queue to discard decoded, but not yet dequeued buffers.
> > +
> > +Pause
> > +=====
> > +
> > +In order to pause, the client should just cease queuing buffers onto the
> > +``OUTPUT`` queue. This is different from the general V4L2 API definition of
> > +pause, which involves calling :c:func:`VIDIOC_STREAMOFF` on the queue.
> > +Without source bitstream data, there is no data to process and the
> > +hardware remains idle.
> > +
> > +Conversely, using :c:func:`VIDIOC_STREAMOFF` on ``OUTPUT`` queue indicates
> > +a seek, which
> > +
> > +1. drops all ``OUTPUT`` buffers in flight and
> > +2. after a subsequent :c:func:`VIDIOC_STREAMON`, will look for and only
> > +   continue from a resume point.
> > +
> > +This is usually undesirable for pause. The STREAMOFF-STREAMON sequence is
> > +intended for seeking.
> > +
> > +Similarly, ``CAPTURE`` queue should remain streaming as well, as the
> > +STREAMOFF-STREAMON sequence on it is intended solely for changing buffer
> > +sets.
>
> And also to drop decoded buffers for instant seek ?
>

I've dropped the Pause section completely. It doesn't provide any
useful information IMHO and only duplicates the general semantics of
mem2mem devices.

> > +Dynamic resolution change
> > +=========================
> > +
> > +A video decoder implementing this interface must support dynamic resolution
> > +change, for streams, which include resolution metadata in the bitstream.
>
> s/for streams, which/for streams that/
>
> > +When the decoder encounters a resolution change in the stream, the dynamic
> > +resolution change sequence is started.
> > +
> > +1.  After encountering a resolution change in the stream, the driver must
> > +    first process and decode all remaining buffers from before the
> > +    resolution change point.
> > +
> > +2.  After all buffers containing decoded frames from before the resolution
> > +    change point are ready to be dequeued on the ``CAPTURE`` queue, the
> > +    driver sends a ``V4L2_EVENT_SOURCE_CHANGE`` event for source change
> > +    type ``V4L2_EVENT_SRC_CH_RESOLUTION``.
> > +
> > +    * The last buffer from before the change must be marked with
> > +      :c:type:`v4l2_buffer` ``flags`` flag ``V4L2_BUF_FLAG_LAST`` as in the
> > +      drain sequence. The last buffer might be empty (with
> > +      :c:type:`v4l2_buffer` ``bytesused`` = 0) and must be ignored by the
> > +      client, since it does not contain any decoded frame.
> > +
> > +    * Any client query issued after the driver queues the event must return
> > +      values applying to the stream after the resolution change, including
> > +      queue formats, selection rectangles and controls.
> > +
> > +    * If the client subscribes to ``V4L2_EVENT_SOURCE_CHANGE`` events and
> > +      the event is signaled, the decoding process will not continue until
> > +      it is acknowledged by either (re-)starting streaming on ``CAPTURE``,
> > +      or via :c:func:`VIDIOC_DECODER_CMD` with ``V4L2_DEC_CMD_START``
> > +      command.
>
> This usage of V4L2_DEC_CMD_START isn't aligned with the documentation of the
> command. I'm not opposed to this, but I think the use cases of decoder
> commands for codecs should be explained in the VIDIOC_DECODER_CMD
> documentation. What bothers me in particular is usage of V4L2_DEC_CMD_START to
> restart the decoder, while no V4L2_DEC_CMD_STOP has been issued. Should we add
> a section that details the decoder state machine with the implicit and
> explicit ways in which it is started and stopped ?

Yes, we should probably extend the VIDIOC_DECODER_CMD documentation.

As for diagrams, they would indeed be nice to have, but maybe we could
add them in a follow up patch?

>
> I would also reference step 7 here.
>
> > +    .. note::
> > +
> > +       Any attempts to dequeue more buffers beyond the buffer marked
> > +       with ``V4L2_BUF_FLAG_LAST`` will result in a -EPIPE error from
> > +       :c:func:`VIDIOC_DQBUF`.
> > +
> > +3.  The client calls :c:func:`VIDIOC_G_FMT` for ``CAPTURE`` to get the new
> > +    format information. This is identical to calling :c:func:`VIDIOC_G_FMT`
> > +    after ``V4L2_EVENT_SRC_CH_RESOLUTION`` in the initialization sequence
> > +    and should be handled similarly.
>
> As the source resolution change event is mentioned in multiple places, how
> about extracting the related ioctls sequence to a specific section, and
> referencing it where needed (at least from the initialization sequence and
> here) ?

I made the text here refer to the Initialization sequence.

>
> > +    .. note::
> > +
> > +       It is allowed for the driver not to support the same pixel format as
>
> "Drivers may not support ..."
>
> > +       previously used (before the resolution change) for the new
> > +       resolution. The driver must select a default supported pixel format,
> > +       return it, if queried using :c:func:`VIDIOC_G_FMT`, and the client
> > +       must take note of it.
> > +
> > +4.  The client acquires visible resolution as in initialization sequence.
> > +
> > +5.  *[optional]* The client is allowed to enumerate available formats and
>
> s/is allowed to/may/
>
> > +    select a different one than currently chosen (returned via
> > +    :c:func:`VIDIOC_G_FMT)`. This is identical to a corresponding step in
> > +    the initialization sequence.
> > +
> > +6.  *[optional]* The client acquires minimum number of buffers as in
> > +    initialization sequence.
> > +
> > +7.  If all the following conditions are met, the client may resume the
> > +    decoding instantly, by using :c:func:`VIDIOC_DECODER_CMD` with
> > +    ``V4L2_DEC_CMD_START`` command, as in case of resuming after the drain
> > +    sequence:
> > +
> > +    * ``sizeimage`` of new format is less than or equal to the size of
> > +      currently allocated buffers,
> > +
> > +    * the number of buffers currently allocated is greater than or equal to
> > +      the minimum number of buffers acquired in step 6.
> > +
> > +    In such case, the remaining steps do not apply.
> > +
> > +    However, if the client intends to change the buffer set, to lower
> > +    memory usage or for any other reasons, it may be achieved by following
> > +    the steps below.
> > +
> > +8.  After dequeuing all remaining buffers from the ``CAPTURE`` queue,
>
> This is optional, isn't it ?
>

I wouldn't call it optional, since it depends on what the client does
and what the decoder supports. That's why the point above just states
that the remaining steps do not apply.

Also added a note:

       To fulfill those requirements, the client may attempt to use
       :c:func:`VIDIOC_CREATE_BUFS` to add more buffers. However, due to
       hardware limitations, the decoder may not support adding buffers at this
       point and the client must be able to handle a failure using the steps
       below.

> > the
> > +    client must call :c:func:`VIDIOC_STREAMOFF` on the ``CAPTURE`` queue.
> > +    The ``OUTPUT`` queue must remain streaming (calling STREAMOFF on it
>
> :c:func:`VIDIOC_STREAMOFF`
>
> > +    would trigger a seek).
> > +
> > +9.  The client frees the buffers on the ``CAPTURE`` queue using
> > +    :c:func:`VIDIOC_REQBUFS`.
> > +
> > +    * **Required fields:**
> > +
> > +      ``count``
> > +          set to 0
> > +
> > +      ``type``
> > +          a ``V4L2_BUF_TYPE_*`` enum appropriate for ``CAPTURE``
> > +
> > +      ``memory``
> > +          follows standard semantics
> > +
> > +10. The client allocates a new set of buffers for the ``CAPTURE`` queue via
> > +    :c:func:`VIDIOC_REQBUFS`. This is identical to a corresponding step in
> > +    the initialization sequence.
> > +
> > +11. The client resumes decoding by issuing :c:func:`VIDIOC_STREAMON` on the
> > +    ``CAPTURE`` queue.
> > +
> > +During the resolution change sequence, the ``OUTPUT`` queue must remain
> > +streaming. Calling :c:func:`VIDIOC_STREAMOFF` on ``OUTPUT`` queue would
> > +initiate a seek.
> > +
> > +The ``OUTPUT`` queue operates separately from the ``CAPTURE`` queue for the
> > +duration of the entire resolution change sequence. It is allowed (and
> > +recommended for best performance and simplicity) for the client to keep
>
> "The client should (for best performance and simplicity) keep ..."
>
> > +queuing/dequeuing buffers from/to ``OUTPUT`` queue even while processing
>
> s/from\to/to\/from/
>
> > +this sequence.
> > +
> > +.. note::
> > +
> > +   It is also possible for this sequence to be triggered without a change
>
> "This sequence may be triggered ..."
>
> > +   in coded resolution, if a different number of ``CAPTURE`` buffers is
> > +   required in order to continue decoding the stream or the visible
> > +   resolution changes.
> > +
> > +Drain
> > +=====
> > +
> > +To ensure that all queued ``OUTPUT`` buffers have been processed and
> > +related ``CAPTURE`` buffers output to the client, the following drain
> > +sequence may be followed. After the drain sequence is complete, the client
> > +has received all decoded frames for all ``OUTPUT`` buffers queued before
> > +the sequence was started.
> > +
> > +1. Begin drain by issuing :c:func:`VIDIOC_DECODER_CMD`.
> > +
> > +   * **Required fields:**
> > +
> > +     ``cmd``
> > +         set to ``V4L2_DEC_CMD_STOP``
> > +
> > +     ``flags``
> > +         set to 0
> > +
> > +     ``pts``
> > +         set to 0
> > +
> > +2. The driver must process and decode as normal all ``OUTPUT`` buffers
> > +   queued by the client before the :c:func:`VIDIOC_DECODER_CMD` was issued.
> > +   Any operations triggered as a result of processing these buffers
> > +   (including the initialization and resolution change sequences) must be
> > +   processed as normal by both the driver and the client before proceeding
> > +   with the drain sequence.
> > +
> > +3. Once all ``OUTPUT`` buffers queued before ``V4L2_DEC_CMD_STOP`` are
> > +   processed:
> > +
> > +   * If the ``CAPTURE`` queue is streaming, once all decoded frames (if
> > +     any) are ready to be dequeued on the ``CAPTURE`` queue, the driver
> > +     must send a ``V4L2_EVENT_EOS``.
>
> s/\./event./
>
> Is the event sent on the OUTPUT or CAPTURE queue ? I assume the latter, should
> it be explicitly documented ?
>

AFAICS, there is no queue type indication in the v4l2_event struct.

In any case, I've removed this event, because existing drivers don't
implement it for the drain sequence and it also makes it more
consistent, since events would be only signaled for decoder-initiated
sequences. It would also allow distinguishing between an EOS mark in
the stream (event signaled) or end of a drain sequence (no event).

> > The driver must also set
> > +     ``V4L2_BUF_FLAG_LAST`` in :c:type:`v4l2_buffer` ``flags`` field on the
> > +     buffer on the ``CAPTURE`` queue containing the last frame (if any)
> > +     produced as a result of processing the ``OUTPUT`` buffers queued
> > +     before ``V4L2_DEC_CMD_STOP``. If no more frames are left to be
> > +     returned at the point of handling ``V4L2_DEC_CMD_STOP``, the driver
> > +     must return an empty buffer (with :c:type:`v4l2_buffer`
> > +     ``bytesused`` = 0) as the last buffer with ``V4L2_BUF_FLAG_LAST`` set
> > +     instead. Any attempts to dequeue more buffers beyond the buffer marked
> > +     with ``V4L2_BUF_FLAG_LAST`` will result in a -EPIPE error from
> > +     :c:func:`VIDIOC_DQBUF`.
> > +
> > +   * If the ``CAPTURE`` queue is NOT streaming, no action is necessary for
> > +     ``CAPTURE`` queue and the driver must send a ``V4L2_EVENT_EOS``
> > +     immediately after all ``OUTPUT`` buffers in question have been
> > +     processed.
>
> What is the use case for this ? Can't we just return an error if decoder isn't
> streaming ?
>

Actually this is wrong. We want the queued OUTPUT buffers to be
processed and decoded, so if the CAPTURE queue is not yet set up
(initialization sequence not completed yet), handling the
initialization sequence first will be needed as a part of the drain
sequence. I've updated the document with that.

> > +4. At this point, decoding is paused and the driver will accept, but not
> > +   process any newly queued ``OUTPUT`` buffers until the client issues
> > +   ``V4L2_DEC_CMD_START`` or restarts streaming on any queue.
> > +
> > +* Once the drain sequence is initiated, the client needs to drive it to
> > +  completion, as described by the above steps, unless it aborts the process
> > +  by issuing :c:func:`VIDIOC_STREAMOFF` on ``OUTPUT`` queue.  The client
> > +  is not allowed to issue ``V4L2_DEC_CMD_START`` or ``V4L2_DEC_CMD_STOP``
> > +  again while the drain sequence is in progress and they will fail with
> > +  -EBUSY error code if attempted.
>
> While this seems OK to me, I think drivers will need help to implement all the
> corner cases correctly without race conditions.

We went through the possible list of corner cases and concluded that
there is no use in handling them, especially considering how much they
would complicate both the userspace and the drivers. Not even
mentioning some hardware, like s5p-mfc, which actually has a dedicated
flush operation that needs to complete before the decoder can switch
back to normal mode.

>
> > +* Restarting streaming on ``OUTPUT`` queue will implicitly end the paused
> > +  state and reinitialize the decoder (similarly to the seek sequence).
> > +  Restarting ``CAPTURE`` queue will not affect an in-progress drain
> > +  sequence.
> > +
> > +* The drivers must also implement :c:func:`VIDIOC_TRY_DECODER_CMD`, as a
> > +  way to let the client query the availability of decoder commands.
> > +
> > +End of stream
> > +=============
> > +
> > +If the decoder encounters an end of stream marking in the stream, the
> > +driver must send a ``V4L2_EVENT_EOS`` event
>
> On which queue ?
>

Hmm?

> > to the client after all frames
> > +are decoded and ready to be dequeued on the ``CAPTURE`` queue, with the
> > +:c:type:`v4l2_buffer` ``flags`` set to ``V4L2_BUF_FLAG_LAST``. This
> > +behavior is identical to the drain sequence triggered by the client via
> > +``V4L2_DEC_CMD_STOP``.
> > +
> > +Commit points
> > +=============
> > +
> > +Setting formats and allocating buffers triggers changes in the behavior
>
> s/triggers/trigger/
>
> > +of the driver.
> > +
> > +1. Setting format on ``OUTPUT`` queue may change the set of formats
> > +   supported/advertised on the ``CAPTURE`` queue. In particular, it also
> > +   means that ``CAPTURE`` format may be reset and the client must not
> > +   rely on the previously set format being preserved.
> > +
> > +2. Enumerating formats on ``CAPTURE`` queue must only return formats
> > +   supported for the ``OUTPUT`` format currently set.
> > +
> > +3. Setting/changing format on ``CAPTURE`` queue does not change formats
>
> Why not just "Setting format" ?
>
> > +   available on ``OUTPUT`` queue. An attempt to set ``CAPTURE`` format that
> > +   is not supported for the currently selected ``OUTPUT`` format must
> > +   result in the driver adjusting the requested format to an acceptable
> > +   one.
> > +
> > +4. Enumerating formats on ``OUTPUT`` queue always returns the full set of
> > +   supported coded formats, irrespective of the current ``CAPTURE``
> > +   format.
> > +
> > +5. After allocating buffers on the ``OUTPUT`` queue, it is not possible to
> > +   change format on it.
>
> I'd phrase this as
>
> "While buffers are allocated on the ``OUTPUT`` queue, clients must not change
> the format on the queue. Drivers must return <error code> for any such format
> change attempt."

Done, thanks.

Best regards,
Tomasz
Laurent Pinchart Oct. 18, 2018, 11:22 a.m. UTC | #37
Hi Tomasz,

I've stripped out all the parts on which I have no specific comment or just 
agree with your proposal. Please see below for a few additional remarks.

On Thursday, 18 October 2018 13:03:33 EEST Tomasz Figa wrote:
> On Wed, Oct 17, 2018 at 10:34 PM Laurent Pinchart wrote:
> > On Tuesday, 24 July 2018 17:06:20 EEST Tomasz Figa wrote:
> >> Due to complexity of the video decoding process, the V4L2 drivers of
> >> stateful decoder hardware require specific sequences of V4L2 API calls
> >> to be followed. These include capability enumeration, initialization,
> >> decoding, seek, pause, dynamic resolution change, drain and end of
> >> stream.
> >> 
> >> Specifics of the above have been discussed during Media Workshops at
> >> LinuxCon Europe 2012 in Barcelona and then later Embedded Linux
> >> Conference Europe 2014 in Düsseldorf. The de facto Codec API that
> >> originated at those events was later implemented by the drivers we
> >> already have merged in mainline, such as s5p-mfc or coda.
> >> 
> >> The only thing missing was the real specification included as a part of
> >> Linux Media documentation. Fix it now and document the decoder part of
> >> the Codec API.
> >> 
> >> Signed-off-by: Tomasz Figa <tfiga@chromium.org>
> >> ---
> >> 
> >>  Documentation/media/uapi/v4l/dev-decoder.rst | 872 +++++++++++++++++++
> >>  Documentation/media/uapi/v4l/devices.rst     |   1 +
> >>  Documentation/media/uapi/v4l/v4l2.rst        |  10 +-
> >>  3 files changed, 882 insertions(+), 1 deletion(-)
> >>  create mode 100644 Documentation/media/uapi/v4l/dev-decoder.rst
> >> 
> >> diff --git a/Documentation/media/uapi/v4l/dev-decoder.rst
> >> b/Documentation/media/uapi/v4l/dev-decoder.rst new file mode 100644
> >> index 000000000000..f55d34d2f860
> >> --- /dev/null
> >> +++ b/Documentation/media/uapi/v4l/dev-decoder.rst
> >> @@ -0,0 +1,872 @@

[snip]

> >> +4.  Allocate source (bitstream) buffers via :c:func:`VIDIOC_REQBUFS` on
> >> +    ``OUTPUT``.
> >> +
> >> +    * **Required fields:**
> >> +
> >> +      ``count``
> >> +          requested number of buffers to allocate; greater than zero
> >> +
> >> +      ``type``
> >> +          a ``V4L2_BUF_TYPE_*`` enum appropriate for ``OUTPUT``
> >> +
> >> +      ``memory``
> >> +          follows standard semantics
> >> +
> >> +      ``sizeimage``
> >> +          follows standard semantics; the client is free to choose any
> >> +          suitable size, however, it may be subject to change by the
> >> +          driver
> >> +
> >> +    * **Return fields:**
> >> +
> >> +      ``count``
> >> +          actual number of buffers allocated
> >> +
> >> +    * The driver must adjust count to minimum of required number of
> >> +      ``OUTPUT`` buffers for given format and count passed.
> > 
> > Isn't it the maximum, not the minimum ?
> 
> It's actually neither. All we can generally say here is that the
> number will be adjusted and the client must note it.

I expect it to be clamp(requested count, driver minimum, driver maximum). I'm 
not sure it's worth capturing this in the document though, but we could say

"The driver must clamp count between the minimum and maximum numbers of
``OUTPUT`` buffers required for the given format."
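In code, the adjustment being proposed is just a clamp (the limit values here are purely illustrative, not taken from any driver):

```c
/*
 * Sketch of the proposed REQBUFS count adjustment: the driver clamps
 * the requested count to its per-format minimum and maximum.
 */
static unsigned int adjust_reqbufs_count(unsigned int requested,
					 unsigned int drv_min,
					 unsigned int drv_max)
{
	if (requested < drv_min)
		return drv_min;
	if (requested > drv_max)
		return drv_max;
	return requested;
}
```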

> >> The client must
> >> +      check this value after the ioctl returns to get the number of
> >> +      buffers allocated.
> >> +
> >> +    .. note::
> >> +
> >> +       To allocate more than minimum number of buffers (for pipeline
> >> +       depth), use G_CTRL(``V4L2_CID_MIN_BUFFERS_FOR_OUTPUT``) to
> >> +       get minimum number of buffers required by the driver/format,
> >> +       and pass the obtained value plus the number of additional
> >> +       buffers needed in count to :c:func:`VIDIOC_REQBUFS`.
> >> +
> >> +5.  Start streaming on ``OUTPUT`` queue via :c:func:`VIDIOC_STREAMON`.
> >> +
> >> +6.  This step only applies to coded formats that contain resolution
> >> +    information in the stream. Continue queuing/dequeuing bitstream
> >> +    buffers to/from the ``OUTPUT`` queue via :c:func:`VIDIOC_QBUF` and
> >> +    :c:func:`VIDIOC_DQBUF`. The driver must keep processing and returning
> >> +    each buffer to the client until required metadata to configure the
> >> +    ``CAPTURE`` queue are found. This is indicated by the driver sending
> >> +    a ``V4L2_EVENT_SOURCE_CHANGE`` event with
> >> +    ``V4L2_EVENT_SRC_CH_RESOLUTION`` source change type. There is no
> >> +    requirement to pass enough data for this to occur in the first buffer
> >> +    and the driver must be able to process any number.
> >> +
> >> +    * If data in a buffer that triggers the event is required to decode
> >> +      the first frame, the driver must not return it to the client,
> >> +      but must retain it for further decoding.
> >> +
> >> +    * If the client set width and height of ``OUTPUT`` format to 0, calling
> >> +      :c:func:`VIDIOC_G_FMT` on the ``CAPTURE`` queue will return -EPERM,
> >> +      until the driver configures ``CAPTURE`` format according to stream
> >> +      metadata.
> > 
> > That's a pretty harsh handling for this condition. What's the rationale
> > for returning -EPERM instead of for instance succeeding with width and
> > height set to 0 ?
> 
> I don't like it, but the error condition must stay for compatibility
> reasons as that's what current drivers implement and applications
> expect. (Technically current drivers would return -EINVAL, but we
> concluded that existing applications don't care about the exact value,
> so we can change it to make more sense.)

Fair enough :-/ A bit of a shame though. Should we try to use an error code 
that would have less chance of being confused with an actual permission 
problem ? -EILSEQ could be an option for "illegal sequence" of operations, but 
better options could exist.

> >> +    * If the client subscribes to ``V4L2_EVENT_SOURCE_CHANGE`` events and
> >> +      the event is signaled, the decoding process will not continue until
> >> +      it is acknowledged by either (re-)starting streaming on ``CAPTURE``,
> >> +      or via :c:func:`VIDIOC_DECODER_CMD` with ``V4L2_DEC_CMD_START``
> >> +      command.
> >> +
> >> +    .. note::
> >> +
> >> +       No decoded frames are produced during this phase.
> >> +

[snip]

> >> +8.  Call :c:func:`VIDIOC_G_FMT` for ``CAPTURE`` queue to get format for the
> >> +    destination buffers parsed/decoded from the bitstream.
> >> +
> >> +    * **Required fields:**
> >> +
> >> +      ``type``
> >> +          a ``V4L2_BUF_TYPE_*`` enum appropriate for ``CAPTURE``
> >> +
> >> +    * **Return fields:**
> >> +
> >> +      ``width``, ``height``
> >> +          frame buffer resolution for the decoded frames
> >> +
> >> +      ``pixelformat``
> >> +          pixel format for decoded frames
> >> +
> >> +      ``num_planes`` (for _MPLANE ``type`` only)
> >> +          number of planes for pixelformat
> >> +
> >> +      ``sizeimage``, ``bytesperline``
> >> +          as per standard semantics; matching frame buffer format
> >> +
> >> +    .. note::
> >> +
> >> +       The value of ``pixelformat`` may be any pixel format supported and
> >> +       must be supported for current stream, based on the information
> >> +       parsed from the stream and hardware capabilities. It is suggested
> >> +       that driver chooses the preferred/optimal format for given
> > 
> > In compliance with RFC 2119, how about using "Drivers should choose"
> > instead of "It is suggested that driver chooses" ?
> 
> The whole paragraph became:
> 
>        The value of ``pixelformat`` may be any pixel format supported by the
> decoder for the current stream. It is expected that the decoder chooses a
> preferred/optimal format for the default configuration. For example, a YUV
> format may be preferred over an RGB format, if additional conversion step
> would be required.

How about using "should" instead of "it is expected that" ?

[snip]

> >> +10.  *[optional]* Choose a different ``CAPTURE`` format than suggested via
> >> +     :c:func:`VIDIOC_S_FMT` on ``CAPTURE`` queue. It is possible for the
> >> +     client to choose a different format than selected/suggested by the
> > 
> > And here, "A client may choose" ?
> > 
> >> +     driver in :c:func:`VIDIOC_G_FMT`.
> >> +
> >> +     * **Required fields:**
> >> +
> >> +       ``type``
> >> +           a ``V4L2_BUF_TYPE_*`` enum appropriate for ``CAPTURE``
> >> +
> >> +       ``pixelformat``
> >> +           a raw pixel format
> >> +
> >> +     .. note::
> >> +
> >> +        Calling :c:func:`VIDIOC_ENUM_FMT` to discover currently available
> >> +        formats after receiving ``V4L2_EVENT_SOURCE_CHANGE`` is useful to
> >> +        find out a set of allowed formats for given configuration, but not
> >> +        required, if the client can accept the defaults.
> > 
> > s/required/required,/
> 
> That would become "[...]but not required,, if the client[...]". Is
> that your suggestion? ;)

Oops, the other way around of course :-)

[snip]

> >> +3. Start queuing buffers to ``OUTPUT`` queue containing stream data after
> >> +   the seek until a suitable resume point is found.
> >> +
> >> +   .. note::
> >> +
> >> +      There is no requirement to begin queuing stream starting exactly from
> > 
> > s/stream/buffers/ ?
> 
> Perhaps "stream data"? The buffers don't have a resume point, the stream
> does.

Maybe "coded data" ?

> >> +      a resume point (e.g. SPS or a keyframe). The driver must handle any
> >> +      data queued and must keep processing the queued buffers until it
> >> +      finds a suitable resume point. While looking for a resume point, the
> >> +      driver processes ``OUTPUT`` buffers and returns them to the client
> >> +      without producing any decoded frames.
> >> +
> >> +      For hardware known to be mishandling seeks to a non-resume point,
> >> +      e.g. by returning corrupted decoded frames, the driver must be able
> >> +      to handle such seeks without a crash or any fatal decode error.
> > 
> > This should be true for any hardware, there should never be any crash or
> > fatal decode error. I'd write it as
> > 
> > Some hardware is known to mishandle seeks to a non-resume point. Such an
> > operation may result in an unspecified number of corrupted decoded frames
> > being made available on ``CAPTURE``. Drivers must ensure that no fatal
> > decoding errors or crashes occur, and implement any necessary handling and
> > work-arounds for hardware issues related to seek operations.
> 
> Done.

[snip]

> >> +2.  After all buffers containing decoded frames from before the resolution
> >> +    change point are ready to be dequeued on the ``CAPTURE`` queue, the
> >> +    driver sends a ``V4L2_EVENT_SOURCE_CHANGE`` event for source change
> >> +    type ``V4L2_EVENT_SRC_CH_RESOLUTION``.
> >> +
> >> +    * The last buffer from before the change must be marked with
> >> +      :c:type:`v4l2_buffer` ``flags`` flag ``V4L2_BUF_FLAG_LAST`` as in the
> >> +      drain sequence. The last buffer might be empty (with
> >> +      :c:type:`v4l2_buffer` ``bytesused`` = 0) and must be ignored by the
> >> +      client, since it does not contain any decoded frame.
> >> +
> >> +    * Any client query issued after the driver queues the event must return
> >> +      values applying to the stream after the resolution change, including
> >> +      queue formats, selection rectangles and controls.
> >> +
> >> +    * If the client subscribes to ``V4L2_EVENT_SOURCE_CHANGE`` events and
> >> +      the event is signaled, the decoding process will not continue until
> >> +      it is acknowledged by either (re-)starting streaming on ``CAPTURE``,
> >> +      or via :c:func:`VIDIOC_DECODER_CMD` with ``V4L2_DEC_CMD_START``
> >> +      command.
> > 
> > This usage of V4L2_DEC_CMD_START isn't aligned with the documentation of
> > the command. I'm not opposed to this, but I think the use cases of
> > decoder commands for codecs should be explained in the VIDIOC_DECODER_CMD
> > documentation. What bothers me in particular is usage of
> > V4L2_DEC_CMD_START to restart the decoder, while no V4L2_DEC_CMD_STOP has
> > been issued. Should we add a section that details the decoder state
> > machine with the implicit and explicit ways in which it is started and
> > stopped ?
> 
> Yes, we should probably extend the VIDIOC_DECODER_CMD documentation.
> 
> As for diagrams, they would indeed be nice to have, but maybe we could
> add them in a follow up patch?

That's another way to say it won't happen, right ? ;-) I'm OK with that, but I 
think we should still clarify that the source change generates an implicit 
V4L2_DEC_CMD_STOP.

> > I would also reference step 7 here.
> > 
> >> +    .. note::
> >> +
> >> +       Any attempts to dequeue more buffers beyond the buffer marked
> >> +       with ``V4L2_BUF_FLAG_LAST`` will result in a -EPIPE error from
> >> +       :c:func:`VIDIOC_DQBUF`.
> >> +
> >> +3.  The client calls :c:func:`VIDIOC_G_FMT` for ``CAPTURE`` to get the new
> >> +    format information. This is identical to calling :c:func:`VIDIOC_G_FMT`
> >> +    after ``V4L2_EVENT_SRC_CH_RESOLUTION`` in the initialization sequence
> >> +    and should be handled similarly.
> > 
> > As the source resolution change event is mentioned in multiple places, how
> > about extracting the related ioctls sequence to a specific section, and
> > referencing it where needed (at least from the initialization sequence and
> > here) ?
> 
> I made the text here refer to the Initialization sequence.

Wouldn't it be clearer if those steps were extracted to a standalone sequence 
referenced from both locations ?

> >> +    .. note::
> >> +
> >> +       It is allowed for the driver not to support the same pixel format as
> > 
> > "Drivers may not support ..."
> > 
> >> +       previously used (before the resolution change) for the new
> >> +       resolution. The driver must select a default supported pixel
> >> +       format, return it, if queried using :c:func:`VIDIOC_G_FMT`,
> >> +       and the client must take note of it.
> >> +
> >> +4.  The client acquires visible resolution as in initialization
> >> +    sequence.
> >> +
> >> +5.  *[optional]* The client is allowed to enumerate available
> >> +    formats and
> > 
> > s/is allowed to/may/
> > 
> >> +    select a different one than currently chosen (returned via
> >> +    :c:func:`VIDIOC_G_FMT)`. This is identical to a corresponding
> >> +    step in the initialization sequence.
> >> +
> >> +6.  *[optional]* The client acquires minimum number of buffers as in
> >> +    initialization sequence.
> >> +
> >> +7.  If all the following conditions are met, the client may resume
> >> +    the decoding instantly, by using :c:func:`VIDIOC_DECODER_CMD`
> >> +    with ``V4L2_DEC_CMD_START`` command, as in case of resuming
> >> +    after the drain sequence:
> >> +
> >> +    * ``sizeimage`` of new format is less than or equal to the size
> >> +      of currently allocated buffers,
> >> +
> >> +    * the number of buffers currently allocated is greater than or
> >> +      equal to the minimum number of buffers acquired in step 6.
> >> +
> >> +    In such case, the remaining steps do not apply.
> >> +
> >> +    However, if the client intends to change the buffer set, to lower
> >> +    memory usage or for any other reasons, it may be achieved by
> >> +    following the steps below.
> >> +
> >> +8.  After dequeuing all remaining buffers from the ``CAPTURE`` queue,
> > 
> > This is optional, isn't it ?
> 
> I wouldn't call it optional, since it depends on what the client does
> and what the decoder supports. That's why the point above just states
> that the remaining steps do not apply.

I meant isn't the "After dequeuing all remaining buffers from the CAPTURE 
queue" part optional ? As far as I understand, the client may decide not to 
dequeue them.

> Also added a note:
> 
>        To fulfill those requirements, the client may attempt to use
>        :c:func:`VIDIOC_CREATE_BUFS` to add more buffers. However, due to
>        hardware limitations, the decoder may not support adding buffers at
>        this point and the client must be able to handle a failure using the
>        steps below.

I wonder if there could be a way to work around those limitations on the 
driver side. At the beginning of step 7, the decoder is effectively stopped. 
If the hardware doesn't support adding new buffers on the fly, can't the 
driver support the VIDIOC_CREATE_BUFS + V4L2_DEC_CMD_START sequence the same 
way it would support the VIDIOC_STREAMOFF + VIDIOC_REQBUFS(0) + 
VIDIOC_REQBUFS(n) + VIDIOC_STREAMON ?

> >> +    the client must call :c:func:`VIDIOC_STREAMOFF` on the
> >> +    ``CAPTURE`` queue. The ``OUTPUT`` queue must remain streaming
> >> +    (calling STREAMOFF on it
> >
> > :c:func:`VIDIOC_STREAMOFF`
> >
> >> +    would trigger a seek).

[snip]
Tomasz Figa Oct. 20, 2018, 8:52 a.m. UTC | #38
On Thu, Oct 18, 2018 at 8:22 PM Laurent Pinchart
<laurent.pinchart@ideasonboard.com> wrote:
>
> Hi Tomasz,
>
> I've stripped out all the parts on which I have no specific comment or just
> agree with your proposal. Please see below for a few additional remarks.
>
> On Thursday, 18 October 2018 13:03:33 EEST Tomasz Figa wrote:
> > On Wed, Oct 17, 2018 at 10:34 PM Laurent Pinchart wrote:
> > > On Tuesday, 24 July 2018 17:06:20 EEST Tomasz Figa wrote:
> > >> Due to complexity of the video decoding process, the V4L2 drivers of
> > >> stateful decoder hardware require specific sequences of V4L2 API calls
> > >> to be followed. These include capability enumeration, initialization,
> > >> decoding, seek, pause, dynamic resolution change, drain and end of
> > >> stream.
> > >>
> > >> Specifics of the above have been discussed during Media Workshops at
> > >> LinuxCon Europe 2012 in Barcelona and then later Embedded Linux
> > >> Conference Europe 2014 in Düsseldorf. The de facto Codec API that
> > >> originated at those events was later implemented by the drivers we
> > >> already have merged in mainline, such as s5p-mfc or coda.
> > >>
> > >> The only thing missing was the real specification included as a part of
> > >> Linux Media documentation. Fix it now and document the decoder part of
> > >> the Codec API.
> > >>
> > >> Signed-off-by: Tomasz Figa <tfiga@chromium.org>
> > >> ---
> > >>
> > >>  Documentation/media/uapi/v4l/dev-decoder.rst | 872 +++++++++++++++++++
> > >>  Documentation/media/uapi/v4l/devices.rst     |   1 +
> > >>  Documentation/media/uapi/v4l/v4l2.rst        |  10 +-
> > >>  3 files changed, 882 insertions(+), 1 deletion(-)
> > >>  create mode 100644 Documentation/media/uapi/v4l/dev-decoder.rst
> > >>
> > >> diff --git a/Documentation/media/uapi/v4l/dev-decoder.rst
> > >> b/Documentation/media/uapi/v4l/dev-decoder.rst new file mode 100644
> > >> index 000000000000..f55d34d2f860
> > >> --- /dev/null
> > >> +++ b/Documentation/media/uapi/v4l/dev-decoder.rst
> > >> @@ -0,0 +1,872 @@
>
> [snip]
>
> > >> +4.  Allocate source (bitstream) buffers via :c:func:`VIDIOC_REQBUFS` on
> > >> +    ``OUTPUT``.
> > >> +
> > >> +    * **Required fields:**
> > >> +
> > >> +      ``count``
> > >> +          requested number of buffers to allocate; greater than zero
> > >> +
> > >> +      ``type``
> > >> +          a ``V4L2_BUF_TYPE_*`` enum appropriate for ``OUTPUT``
> > >> +
> > >> +      ``memory``
> > >> +          follows standard semantics
> > >> +
> > >> +      ``sizeimage``
> > >> +          follows standard semantics; the client is free to choose any
> > >> +          suitable size, however, it may be subject to change by the
> > >> +          driver
> > >> +
> > >> +    * **Return fields:**
> > >> +
> > >> +      ``count``
> > >> +          actual number of buffers allocated
> > >> +
> > >> +    * The driver must adjust count to minimum of required number of
> > >> +      ``OUTPUT`` buffers for given format and count passed.
> > >
> > > Isn't it the maximum, not the minimum ?
> >
> > It's actually neither. All we can generally say here is that the
> > number will be adjusted and the client must note it.
>
> I expect it to be clamp(requested count, driver minimum, driver maximum). I'm
> not sure it's worth capturing this in the document though, but we could say
>
> "The driver must clamp count to the minimum and maximum number of required
> ``OUTPUT`` buffers for the given format."
>

I'd leave the details to the documentation of VIDIOC_REQBUFS, if
needed. This document focuses on the decoder UAPI and with this note I
want to ensure that the applications don't assume that exactly the
requested number of buffers is always allocated.

How about making it even simpler:

The actual number of allocated buffers may differ from the ``count``
given. The client must check the updated value of ``count`` after the
call returns.

> > >> +      The client must check this value after the ioctl returns to
> > >> +      get the number of buffers allocated.
> > >> +
> > >> +    .. note::
> > >> +
> > >> +       To allocate more than minimum number of buffers (for pipeline
> > >> +       depth), use G_CTRL(``V4L2_CID_MIN_BUFFERS_FOR_OUTPUT``) to
> > >> +       get minimum number of buffers required by the driver/format,
> > >> +       and pass the obtained value plus the number of additional
> > >> +       buffers needed in count to :c:func:`VIDIOC_REQBUFS`.
> > >> +
> > >> +5.  Start streaming on ``OUTPUT`` queue via :c:func:`VIDIOC_STREAMON`.
> > >> +
> > >> +6.  This step only applies to coded formats that contain resolution
> > >> +    information in the stream. Continue queuing/dequeuing bitstream
> > >> +    buffers to/from the ``OUTPUT`` queue via :c:func:`VIDIOC_QBUF`
> > >> +    and :c:func:`VIDIOC_DQBUF`. The driver must keep processing and
> > >> +    returning each buffer to the client until required metadata to
> > >> +    configure the ``CAPTURE`` queue are found. This is indicated by
> > >> +    the driver sending a ``V4L2_EVENT_SOURCE_CHANGE`` event with
> > >> +    ``V4L2_EVENT_SRC_CH_RESOLUTION`` source change type. There is no
> > >> +    requirement to pass enough data for this to occur in the first
> > >> +    buffer and the driver must be able to process any number.
> > >> +
> > >> +    * If data in a buffer that triggers the event is required to decode
> > >> +      the first frame, the driver must not return it to the client,
> > >> +      but must retain it for further decoding.
> > >> +
> > >> +    * If the client set width and height of ``OUTPUT`` format to
> > >> +      0, calling :c:func:`VIDIOC_G_FMT` on the ``CAPTURE`` queue
> > >> +      will return -EPERM, until the driver configures ``CAPTURE``
> > >> +      format according to stream metadata.
> > >
> > > That's a pretty harsh handling for this condition. What's the rationale
> > > for returning -EPERM instead of for instance succeeding with width and
> > > height set to 0 ?
> >
> > I don't like it, but the error condition must stay for compatibility
> > reasons as that's what current drivers implement and applications
> > expect. (Technically current drivers would return -EINVAL, but we
> > concluded that existing applications don't care about the exact value,
> > so we can change it to make more sense.)
>
> Fair enough :-/ A bit of a shame though. Should we try to use an error code
> that would have less chance of being confused with an actual permission
> problem ? -EILSEQ could be an option for "illegal sequence" of operations, but
> better options could exist.
>

In the Request API we concluded that -EACCES is the right code to return
for G_EXT_CTRLS on a request that has not finished yet. The case here
is similar - the capture queue is not yet set up. What do you think?

> > >> +    * If the client subscribes to ``V4L2_EVENT_SOURCE_CHANGE``
> > >> +      events and the event is signaled, the decoding process will
> > >> +      not continue until it is acknowledged by either (re-)starting
> > >> +      streaming on ``CAPTURE``, or via :c:func:`VIDIOC_DECODER_CMD`
> > >> +      with ``V4L2_DEC_CMD_START`` command.
> > >> +
> > >> +    .. note::
> > >> +
> > >> +       No decoded frames are produced during this phase.
> > >> +
>
> [snip]
>
> > >> +8.  Call :c:func:`VIDIOC_G_FMT` for ``CAPTURE`` queue to get format
> > >> +    for the destination buffers parsed/decoded from the bitstream.
> > >> +
> > >> +    * **Required fields:**
> > >> +
> > >> +      ``type``
> > >> +          a ``V4L2_BUF_TYPE_*`` enum appropriate for ``CAPTURE``
> > >> +
> > >> +    * **Return fields:**
> > >> +
> > >> +      ``width``, ``height``
> > >> +          frame buffer resolution for the decoded frames
> > >> +
> > >> +      ``pixelformat``
> > >> +          pixel format for decoded frames
> > >> +
> > >> +      ``num_planes`` (for _MPLANE ``type`` only)
> > >> +          number of planes for pixelformat
> > >> +
> > >> +      ``sizeimage``, ``bytesperline``
> > >> +          as per standard semantics; matching frame buffer format
> > >> +
> > >> +    .. note::
> > >> +
> > >> +       The value of ``pixelformat`` may be any pixel format
> > >> +       supported and must be supported for current stream, based on
> > >> +       the information parsed from the stream and hardware
> > >> +       capabilities. It is suggested that driver chooses the
> > >> +       preferred/optimal format for given
> > >
> > > In compliance with RFC 2119, how about using "Drivers should choose"
> > > instead of "It is suggested that driver chooses" ?
> >
> > The whole paragraph became:
> >
> >        The value of ``pixelformat`` may be any pixel format supported by the
> > decoder for the current stream. It is expected that the decoder chooses a
> > preferred/optimal format for the default configuration. For example, a YUV
> > format may be preferred over an RGB format, if additional conversion step
> > would be required.
>
> How about using "should" instead of "it is expected that" ?
>

Done.

> [snip]
>
> > >> +10.  *[optional]* Choose a different ``CAPTURE`` format than
> > >> +     suggested via :c:func:`VIDIOC_S_FMT` on ``CAPTURE`` queue.
> > >> +     It is possible for the
> > >> +     client to choose a different format than selected/suggested by the
> > >
> > > And here, "A client may choose" ?
> > >
> > >> +     driver in :c:func:`VIDIOC_G_FMT`.
> > >> +
> > >> +     * **Required fields:**
> > >> +
> > >> +       ``type``
> > >> +           a ``V4L2_BUF_TYPE_*`` enum appropriate for ``CAPTURE``
> > >> +
> > >> +       ``pixelformat``
> > >> +           a raw pixel format
> > >> +
> > >> +     .. note::
> > >> +
> > >> +        Calling :c:func:`VIDIOC_ENUM_FMT` to discover currently
> > >> +        available formats after receiving ``V4L2_EVENT_SOURCE_CHANGE``
> > >> +        is useful to find out a set of allowed formats for given
> > >> +        configuration, but not required, if the client can accept
> > >> +        the defaults.
> > >
> > > s/required/required,/
> >
> > That would become "[...]but not required,, if the client[...]". Is
> > that your suggestion? ;)
>
> Oops, the other way around of course :-)

Done.

>
> [snip]
>
> > >> +3. Start queuing buffers to ``OUTPUT`` queue containing stream
> > >> +   data after the seek until a suitable resume point is found.
> > >> +
> > >> +   .. note::
> > >> +
> > >> +      There is no requirement to begin queuing stream starting exactly
> > >> from
> > >
> > > s/stream/buffers/ ?
> >
> > Perhaps "stream data"? The buffers don't have a resume point, the stream
> > does.
>
> Maybe "coded data" ?
>

Done.

> > >> +      a resume point (e.g. SPS or a keyframe). The driver must
> > >> +      handle any data queued and must keep processing the queued
> > >> +      buffers until it finds a suitable resume point. While looking
> > >> +      for a resume point, the driver processes ``OUTPUT`` buffers
> > >> +      and returns them to the client without producing any decoded
> > >> +      frames.
> > >> +
> > >> +      For hardware known to be mishandling seeks to a non-resume
> > >> +      point, e.g. by returning corrupted decoded frames, the driver
> > >> +      must be able to handle such seeks without a crash or any
> > >> +      fatal decode error.
> > >
> > > This should be true for any hardware, there should never be any crash or
> > > fatal decode error. I'd write it as
> > >
> > > Some hardware is known to mishandle seeks to a non-resume point. Such an
> > > operation may result in an unspecified number of corrupted decoded frames
> > > being made available on ``CAPTURE``. Drivers must ensure that no fatal
> > > decoding errors or crashes occur, and implement any necessary handling and
> > > work-arounds for hardware issues related to seek operations.
> >
> > Done.
>
> [snip]
>
> > >> +2.  After all buffers containing decoded frames from before the
> > >> +    resolution change point are ready to be dequeued on the
> > >> +    ``CAPTURE`` queue, the driver sends a
> > >> +    ``V4L2_EVENT_SOURCE_CHANGE`` event for source change type
> > >> +    ``V4L2_EVENT_SRC_CH_RESOLUTION``.
> > >> +
> > >> +    * The last buffer from before the change must be marked with
> > >> +      :c:type:`v4l2_buffer` ``flags`` flag ``V4L2_BUF_FLAG_LAST`` as
> > >> +      in the drain sequence. The last buffer might be empty (with
> > >> +      :c:type:`v4l2_buffer` ``bytesused`` = 0) and must be ignored by
> > >> +      the client, since it does not contain any decoded frame.
> > >> +
> > >> +    * Any client query issued after the driver queues the event
> > >> +      must return values applying to the stream after the
> > >> +      resolution change, including queue formats, selection
> > >> +      rectangles and controls.
> > >> +
> > >> +    * If the client subscribes to ``V4L2_EVENT_SOURCE_CHANGE``
> > >> +      events and the event is signaled, the decoding process will
> > >> +      not continue until it is acknowledged by either (re-)starting
> > >> +      streaming on ``CAPTURE``, or via :c:func:`VIDIOC_DECODER_CMD`
> > >> +      with ``V4L2_DEC_CMD_START`` command.
> > >
> > > This usage of V4L2_DEC_CMD_START isn't aligned with the documentation of
> > > the command. I'm not opposed to this, but I think the use cases of
> > > decoder commands for codecs should be explained in the VIDIOC_DECODER_CMD
> > > documentation. What bothers me in particular is usage of
> > > V4L2_DEC_CMD_START to restart the decoder, while no V4L2_DEC_CMD_STOP has
> > > been issued. Should we add a section that details the decoder state
> > > machine with the implicit and explicit ways in which it is started and
> > > stopped ?
> >
> > Yes, we should probably extend the VIDIOC_DECODER_CMD documentation.
> >
> > As for diagrams, they would indeed be nice to have, but maybe we could
> > add them in a follow up patch?
>
> That's another way to say it won't happen, right ? ;-)

I'd prefer to focus on the basic description first, since for the last
6 years we haven't had any documentation at all. I hope we can later
have more contributors follow up with patches to make it easier to
read, e.g. add nice diagrams.

Anyway, I'll try to add a simple state machine diagram in dot, but
would appreciate if we could postpone any not critical improvements.

> I'm OK with that, but I
> think we should still clarify that the source change generates an implicit
> V4L2_DEC_CMD_STOP.
>

Good idea, thanks.

> > > I would also reference step 7 here.
> > >
> > >> +    .. note::
> > >> +
> > >> +       Any attempts to dequeue more buffers beyond the buffer marked
> > >> +       with ``V4L2_BUF_FLAG_LAST`` will result in a -EPIPE error from
> > >> +       :c:func:`VIDIOC_DQBUF`.
> > >> +
> > >> +3.  The client calls :c:func:`VIDIOC_G_FMT` for ``CAPTURE`` to get
> > >> +    the new format information. This is identical to calling
> > >> +    :c:func:`VIDIOC_G_FMT` after ``V4L2_EVENT_SRC_CH_RESOLUTION`` in
> > >> +    the initialization sequence and should be handled similarly.
> > >
> > > As the source resolution change event is mentioned in multiple places, how
> > > about extracting the related ioctls sequence to a specific section, and
> > > referencing it where needed (at least from the initialization sequence and
> > > here) ?
> >
> > I made the text here refer to the Initialization sequence.
>
> Wouldn't it be clearer if those steps were extracted to a standalone sequence
> referenced from both locations ?
>

It might be possible to extract the operations on the CAPTURE queue
into a "Capture setup" sequence. Let me check that.

> > >> +    .. note::
> > >> +
> > >> +       It is allowed for the driver not to support the same pixel
> > >> +       format as
> > >
> > > "Drivers may not support ..."
> > >
> > >> +       previously used (before the resolution change) for the new
> > >> +       resolution. The driver must select a default supported pixel
> > >> +       format, return it, if queried using :c:func:`VIDIOC_G_FMT`,
> > >> +       and the client must take note of it.
> > >> +
> > >> +4.  The client acquires visible resolution as in initialization
> > >> +    sequence.
> > >> +
> > >> +5.  *[optional]* The client is allowed to enumerate available
> > >> +    formats and
> > >
> > > s/is allowed to/may/
> > >
> > >> +    select a different one than currently chosen (returned via
> > >> +    :c:func:`VIDIOC_G_FMT)`. This is identical to a corresponding
> > >> +    step in the initialization sequence.
> > >> +
> > >> +6.  *[optional]* The client acquires minimum number of buffers as in
> > >> +    initialization sequence.
> > >> +
> > >> +7.  If all the following conditions are met, the client may resume
> > >> +    the decoding instantly, by using :c:func:`VIDIOC_DECODER_CMD`
> > >> +    with ``V4L2_DEC_CMD_START`` command, as in case of resuming
> > >> +    after the drain sequence:
> > >> +
> > >> +    * ``sizeimage`` of new format is less than or equal to the size
> > >> +      of currently allocated buffers,
> > >> +
> > >> +    * the number of buffers currently allocated is greater than or
> > >> +      equal to the minimum number of buffers acquired in step 6.
> > >> +
> > >> +    In such case, the remaining steps do not apply.
> > >> +
> > >> +    However, if the client intends to change the buffer set, to lower
> > >> +    memory usage or for any other reasons, it may be achieved by
> > >> +    following the steps below.
> > >> +
> > >> +8.  After dequeuing all remaining buffers from the ``CAPTURE`` queue,
> > >
> > > This is optional, isn't it ?
> >
> > I wouldn't call it optional, since it depends on what the client does
> > and what the decoder supports. That's why the point above just states
> > that the remaining steps do not apply.
>
> I meant isn't the "After dequeuing all remaining buffers from the CAPTURE
> queue" part optional ? As far as I understand, the client may decide not to
> dequeue them.
>

A STREAMOFF would discard the already decoded but not yet dequeued
frames. While it's technically fine, it doesn't make sense, because it
would lead to a frame drop. Therefore, I'd rather keep it required,
for simplicity.

> > Also added a note:
> >
> >        To fulfill those requirements, the client may attempt to use
> >        :c:func:`VIDIOC_CREATE_BUFS` to add more buffers. However, due to
> >        hardware limitations, the decoder may not support adding buffers at
> >        this point and the client must be able to handle a failure using the
> >        steps below.
>
> I wonder if there could be a way to work around those limitations on the
> driver side. At the beginning of step 7, the decoder is effectively stopped.
> If the hardware doesn't support adding new buffers on the fly, can't the
> driver support the VIDIOC_CREATE_BUFS + V4L2_DEC_CMD_START sequence the same
> way it would support the VIDIOC_STREAMOFF + VIDIOC_REQBUFS(0) +
> VIDIOC_REQBUFS(n) + VIDIOC_STREAMON ?
>

I guess that would work. I would only allow it for the case where
existing buffers are already big enough and just more buffers are
needed. Otherwise it would lead to some weird cases, such as some old
buffers already in the CAPTURE queue, blocking the decode of further
frames. (While it could be handled by the driver returning them with
an error state, it would only complicate the interface.)

Best regards,
Tomasz
Tomasz Figa Oct. 20, 2018, 10:24 a.m. UTC | #39
On Thu, Oct 18, 2018 at 7:03 PM Tomasz Figa <tfiga@chromium.org> wrote:
>
> Hi Laurent,
>
> On Wed, Oct 17, 2018 at 10:34 PM Laurent Pinchart
> <laurent.pinchart@ideasonboard.com> wrote:
> >
> > Hi Tomasz,
> >
> > Thank you for the patch.
>
> Thanks for your comments! Please see my replies inline.
>
> >
> > On Tuesday, 24 July 2018 17:06:20 EEST Tomasz Figa wrote:
[snip]
> > > +4. At this point, decoding is paused and the driver will accept, but not
> > > +   process any newly queued ``OUTPUT`` buffers until the client issues
> > > +   ``V4L2_DEC_CMD_START`` or restarts streaming on any queue.
> > > +
> > > +* Once the drain sequence is initiated, the client needs to drive it to
> > > +  completion, as described by the above steps, unless it aborts the process
> > > +  by issuing :c:func:`VIDIOC_STREAMOFF` on ``OUTPUT`` queue.  The client
> > > +  is not allowed to issue ``V4L2_DEC_CMD_START`` or ``V4L2_DEC_CMD_STOP``
> > > +  again while the drain sequence is in progress and they will fail with
> > > +  -EBUSY error code if attempted.
> >
> > While this seems OK to me, I think drivers will need help to implement all the
> > corner cases correctly without race conditions.
>
> We went through the possible list of corner cases and concluded that
> there is no use in handling them, especially considering how much they
> would complicate both the userspace and the drivers. Not even
> mentioning some hardware, like s5p-mfc, which actually has a dedicated
> flush operation, that needs to complete before the decoder can switch
> back to normal mode.

Actually I misread your comment.

Agreed that the decoder commands are a bit tricky to implement
properly. That's one of the reasons I decided to make them return
-EBUSY while an existing drain is in progress.

Do you have any particular simplification in mind that could avoid
some corner cases?

Best regards,
Tomasz
Tomasz Figa Oct. 20, 2018, 3:39 p.m. UTC | #40
On Thu, Oct 18, 2018 at 7:03 PM Tomasz Figa <tfiga@chromium.org> wrote:
>
> Hi Laurent,
>
> On Wed, Oct 17, 2018 at 10:34 PM Laurent Pinchart
> <laurent.pinchart@ideasonboard.com> wrote:
> >
> > Hi Tomasz,
> >
> > Thank you for the patch.
>
> Thanks for your comments! Please see my replies inline.
>
> >
> > On Tuesday, 24 July 2018 17:06:20 EEST Tomasz Figa wrote:
[snip]
> > > The driver must also set
> > > +     ``V4L2_BUF_FLAG_LAST`` in :c:type:`v4l2_buffer` ``flags`` field on the
> > > +     buffer on the ``CAPTURE`` queue containing the last frame (if any)
> > > +     produced as a result of processing the ``OUTPUT`` buffers queued
> > > +     before ``V4L2_DEC_CMD_STOP``. If no more frames are left to be
> > > +     returned at the point of handling ``V4L2_DEC_CMD_STOP``, the driver
> > > +     must return an empty buffer (with :c:type:`v4l2_buffer`
> > > +     ``bytesused`` = 0) as the last buffer with ``V4L2_BUF_FLAG_LAST`` set
> > > +     instead. Any attempts to dequeue more buffers beyond the buffer marked
> > > +     with ``V4L2_BUF_FLAG_LAST`` will result in a -EPIPE error from
> > > +     :c:func:`VIDIOC_DQBUF`.
> > > +
> > > +   * If the ``CAPTURE`` queue is NOT streaming, no action is necessary for
> > > +     ``CAPTURE`` queue and the driver must send a ``V4L2_EVENT_EOS``
> > > +     immediately after all ``OUTPUT`` buffers in question have been
> > > +     processed.
> >
> > What is the use case for this ? Can't we just return an error if decoder isn't
> > streaming ?
> >
>
> Actually this is wrong. We want the queued OUTPUT buffers to be
> processed and decoded, so if the CAPTURE queue is not yet set up
> (initialization sequence not completed yet), handling the
> initialization sequence first will be needed as a part of the drain
> sequence. I've updated the document with that.

I might want to take this back. The client could just drive the
initialization to completion on its own and start the drain sequence
after that. Let me think if it makes anything easier. For reference, I
don't see any compatibility constraint here, since the existing user
space already works like that.

Best regards,
Tomasz
Laurent Pinchart Oct. 21, 2018, 9:23 a.m. UTC | #41
Hi Tomasz,

On Saturday, 20 October 2018 11:52:57 EEST Tomasz Figa wrote:
> On Thu, Oct 18, 2018 at 8:22 PM Laurent Pinchart wrote:
> > On Thursday, 18 October 2018 13:03:33 EEST Tomasz Figa wrote:
> >> On Wed, Oct 17, 2018 at 10:34 PM Laurent Pinchart wrote:
> >>> On Tuesday, 24 July 2018 17:06:20 EEST Tomasz Figa wrote:
> >>>> Due to complexity of the video decoding process, the V4L2 drivers of
> >>>> stateful decoder hardware require specific sequences of V4L2 API
> >>>> calls to be followed. These include capability enumeration,
> >>>> initialization, decoding, seek, pause, dynamic resolution change, drain
> >>>> and end of stream.
> >>>> 
> >>>> Specifics of the above have been discussed during Media Workshops at
> >>>> LinuxCon Europe 2012 in Barcelona and then later Embedded Linux
> >>>> Conference Europe 2014 in Düsseldorf. The de facto Codec API that
> >>>> originated at those events was later implemented by the drivers we
> >>>> already have merged in mainline, such as s5p-mfc or coda.
> >>>> 
> >>>> The only thing missing was the real specification included as a part
> >>>> of Linux Media documentation. Fix it now and document the decoder part
> >>>> of the Codec API.
> >>>> 
> >>>> Signed-off-by: Tomasz Figa <tfiga@chromium.org>
> >>>> ---
> >>>> 
> >>>>  Documentation/media/uapi/v4l/dev-decoder.rst | 872 +++++++++++++++++++
> >>>>  Documentation/media/uapi/v4l/devices.rst     |   1 +
> >>>>  Documentation/media/uapi/v4l/v4l2.rst        |  10 +-
> >>>>  3 files changed, 882 insertions(+), 1 deletion(-)
> >>>>  create mode 100644 Documentation/media/uapi/v4l/dev-decoder.rst
> >>>> 
> >>>> diff --git a/Documentation/media/uapi/v4l/dev-decoder.rst
> >>>> b/Documentation/media/uapi/v4l/dev-decoder.rst new file mode 100644
> >>>> index 000000000000..f55d34d2f860
> >>>> --- /dev/null
> >>>> +++ b/Documentation/media/uapi/v4l/dev-decoder.rst
> >>>> @@ -0,0 +1,872 @@
> > 
> > [snip]
> > 
> >>>> +4.  Allocate source (bitstream) buffers via
> >>>> +    :c:func:`VIDIOC_REQBUFS` on ``OUTPUT``.
> >>>> +
> >>>> +    * **Required fields:**
> >>>> +
> >>>> +      ``count``
> >>>> +          requested number of buffers to allocate; greater than zero
> >>>> +
> >>>> +      ``type``
> >>>> +          a ``V4L2_BUF_TYPE_*`` enum appropriate for ``OUTPUT``
> >>>> +
> >>>> +      ``memory``
> >>>> +          follows standard semantics
> >>>> +
> >>>> +      ``sizeimage``
> >>>> +          follows standard semantics; the client is free to
> >>>> +          choose any
> >>>> +          suitable size, however, it may be subject to change by the
> >>>> +          driver
> >>>> +
> >>>> +    * **Return fields:**
> >>>> +
> >>>> +      ``count``
> >>>> +          actual number of buffers allocated
> >>>> +
> >>>> +    * The driver must adjust count to minimum of required number of
> >>>> +      ``OUTPUT`` buffers for given format and count passed.
> >>> 
> >>> Isn't it the maximum, not the minimum ?
> >> 
> >> It's actually neither. All we can generally say here is that the
> >> number will be adjusted and the client must note it.
> > 
> > I expect it to be clamp(requested count, driver minimum, driver maximum).
> > I'm not sure it's worth capturing this in the document though, but we
> > could say
> > 
> > "The driver must clamp count to the minimum and maximum number of required
> > ``OUTPUT`` buffers for the given format."
> 
> I'd leave the details to the documentation of VIDIOC_REQBUFS, if
> needed. This document focuses on the decoder UAPI and with this note I
> want to ensure that the applications don't assume that exactly the
> requested number of buffers is always allocated.
> 
> How about making it even simpler:
> 
> The actual number of allocated buffers may differ from the ``count``
> given. The client must check the updated value of ``count`` after the
> call returns.

That works for me. You may want to say "... given, as specified in the
VIDIOC_REQBUFS documentation.".

> >>>> +      The client must check this value after the ioctl returns to
> >>>> +      get the number of buffers allocated.
> >>>> +
> >>>> +    .. note::
> >>>> +
> >>>> +       To allocate more than minimum number of buffers (for pipeline
> >>>> +       depth), use G_CTRL(``V4L2_CID_MIN_BUFFERS_FOR_OUTPUT``) to
> >>>> +       get minimum number of buffers required by the driver/format,
> >>>> +       and pass the obtained value plus the number of additional
> >>>> +       buffers needed in count to :c:func:`VIDIOC_REQBUFS`.
> >>>> +
> >>>> +5.  Start streaming on ``OUTPUT`` queue via
> >>>> :c:func:`VIDIOC_STREAMON`.
> >>>> +
> >>>> +6.  This step only applies to coded formats that contain resolution
> >>>> +    information in the stream. Continue queuing/dequeuing bitstream
> >>>> +    buffers to/from the ``OUTPUT`` queue via :c:func:`VIDIOC_QBUF`
> >>>> +    and :c:func:`VIDIOC_DQBUF`. The driver must keep processing and
> >>>> +    returning each buffer to the client until required metadata to
> >>>> +    configure the ``CAPTURE`` queue are found. This is indicated by
> >>>> +    the driver sending a ``V4L2_EVENT_SOURCE_CHANGE`` event with
> >>>> +    ``V4L2_EVENT_SRC_CH_RESOLUTION`` source change type. There is no
> >>>> +    requirement to pass enough data for this to occur in the first
> >>>> +    buffer and the driver must be able to process any number.
> >>>> +
> >>>> +    * If data in a buffer that triggers the event is required to
> >>>> decode
> >>>> +      the first frame, the driver must not return it to the client,
> >>>> +      but must retain it for further decoding.
> >>>> +
> >>>> +    * If the client set width and height of ``OUTPUT`` format to 0,
> >>>> calling
> >>>> +      :c:func:`VIDIOC_G_FMT` on the ``CAPTURE`` queue will return
> >>>> -EPERM,
> >>>> +      until the driver configures ``CAPTURE`` format according to
> >>>> stream
> >>>> +      metadata.
> >>> 
> >>> That's a pretty harsh handling for this condition. What's the
> >>> rationale for returning -EPERM instead of for instance succeeding with
> >>> width and height set to 0 ?
> >> 
> >> I don't like it, but the error condition must stay for compatibility
> >> reasons as that's what current drivers implement and applications
> >> expect. (Technically current drivers would return -EINVAL, but we
> >> concluded that existing applications don't care about the exact value,
> >> so we can change it to make more sense.)
> > 
> > Fair enough :-/ A bit of a shame though. Should we try to use an error
> > code that would have less chance of being confused with an actual
> > permission problem ? -EILSEQ could be an option for "illegal sequence" of
> > operations, but better options could exist.
> 
> In Request API we concluded that -EACCES is the right code to return
> for G_EXT_CTRLS on a request that has not finished yet. The case here
> is similar - the capture queue is not yet set up. What do you think?

Good question. -EPERM is documented as "Operation not permitted", while -
EACCES is documented as "Permission denied". The former appears to be 
understood as "This isn't a good idea, I can't let you do that", and the 
latter as "You don't have sufficient privileges, if you retry with the correct 
privileges this will succeed". Neither are a perfect match, but -EACCES might 
be better if you replace getting privileges by performing the required setup.

> >>>> +    * If the client subscribes to ``V4L2_EVENT_SOURCE_CHANGE``
> >>>> events and
> >>>> +      the event is signaled, the decoding process will not continue
> >>>> until
> >>>> +      it is acknowledged by either (re-)starting streaming on
> >>>> ``CAPTURE``,
> >>>> +      or via :c:func:`VIDIOC_DECODER_CMD` with
> >>>> ``V4L2_DEC_CMD_START``
> >>>> +      command.
> >>>> +
> >>>> +    .. note::
> >>>> +
> >>>> +       No decoded frames are produced during this phase.
> >>>> +

[snip]

> >> Also added a note:
> >>        To fulfill those requirements, the client may attempt to use
> >>        :c:func:`VIDIOC_CREATE_BUFS` to add more buffers. However, due to
> >>        hardware limitations, the decoder may not support adding buffers
> >>        at this point and the client must be able to handle a failure
> >>        using the steps below.
> > 
> > I wonder if there could be a way to work around those limitations on the
> > driver side. At the beginning of step 7, the decoder is effectively
> > stopped. If the hardware doesn't support adding new buffers on the fly,
> > can't the driver support the VIDIOC_CREATE_BUFS + V4L2_DEC_CMD_START
> > sequence the same way it would support the VIDIOC_STREAMOFF +
> > VIDIOC_REQBUFS(0) +
> > VIDIOC_REQBUFS(n) + VIDIOC_STREAMON ?
> 
> I guess that would work. I would only allow it for the case where
> existing buffers are already big enough and just more buffers are
> needed. Otherwise it would lead to some weird cases, such as some old
> buffers already in the CAPTURE queue, blocking the decode of further
> frames. (While it could be handled by the driver returning them with
> an error state, it would only complicate the interface.)

Good point. I wonder if this could be handled in the framework. If it can't, 
or with non-trivial support code on the driver side, then I would agree with 
you. Otherwise, handling the workaround in the framework would ensure 
consistent behaviour across drivers with minimal cost, and simplify the 
userspace API, so I think it would be a good thing.
Laurent Pinchart Oct. 21, 2018, 9:26 a.m. UTC | #42
Hi Tomasz,

On Saturday, 20 October 2018 13:24:20 EEST Tomasz Figa wrote:
> On Thu, Oct 18, 2018 at 7:03 PM Tomasz Figa wrote:
> > On Wed, Oct 17, 2018 at 10:34 PM Laurent Pinchart wrote:
> >> Hi Tomasz,
> >> 
> >> Thank you for the patch.
> > 
> > Thanks for your comments! Please see my replies inline.
> > 
> >> On Tuesday, 24 July 2018 17:06:20 EEST Tomasz Figa wrote:
> 
> [snip]
> 
> >>> +4. At this point, decoding is paused and the driver will accept, but
> >>> not
> >>> +   process any newly queued ``OUTPUT`` buffers until the client
> >>> issues
> >>> +   ``V4L2_DEC_CMD_START`` or restarts streaming on any queue.
> >>> +
> >>> +* Once the drain sequence is initiated, the client needs to drive it
> >>> to
> >>> +  completion, as described by the above steps, unless it aborts the
> >>> process
> >>> +  by issuing :c:func:`VIDIOC_STREAMOFF` on ``OUTPUT`` queue. The client
> >>> +  is not allowed to issue ``V4L2_DEC_CMD_START`` or
> >>> ``V4L2_DEC_CMD_STOP``
> >>> +  again while the drain sequence is in progress and they will fail with
> >>> +  -EBUSY error code if attempted.
> >> 
> >> While this seems OK to me, I think drivers will need help to implement
> >> all the corner cases correctly without race conditions.
> > 
> > We went through the possible list of corner cases and concluded that
> > there is no use in handling them, especially considering how much they
> > would complicate both the userspace and the drivers. Not even
> > mentioning some hardware, like s5p-mfc, which actually has a dedicated
> > flush operation, that needs to complete before the decoder can switch
> > back to normal mode.
> 
> Actually I misread your comment.
> 
> Agreed that the decoder commands are a bit tricky to implement
> properly. That's one of the reasons I decided to make the return
> -EBUSY while an existing drain is in progress.
> 
> Do you have any particular simplification in mind that could avoid
> some corner cases?

Not really on the spec side. I think we'll have to implement helper functions 
for drivers to use if we want to ensure a consistent and bug-free behaviour.
Tomasz Figa Oct. 22, 2018, 6:19 a.m. UTC | #43
On Sun, Oct 21, 2018 at 6:23 PM Laurent Pinchart
<laurent.pinchart@ideasonboard.com> wrote:
>
> Hi Tomasz,
>
> On Saturday, 20 October 2018 11:52:57 EEST Tomasz Figa wrote:
> > On Thu, Oct 18, 2018 at 8:22 PM Laurent Pinchart wrote:
> > > On Thursday, 18 October 2018 13:03:33 EEST Tomasz Figa wrote:
> > >> On Wed, Oct 17, 2018 at 10:34 PM Laurent Pinchart wrote:
> > >>> On Tuesday, 24 July 2018 17:06:20 EEST Tomasz Figa wrote:
> > >>>> Due to complexity of the video decoding process, the V4L2 drivers of
> > >>>> stateful decoder hardware require specific sequences of V4L2 API
> > >>>> calls to be followed. These include capability enumeration,
> > >>>> initialization, decoding, seek, pause, dynamic resolution change, drain
> > >>>> and end of stream.
> > >>>>
> > >>>> Specifics of the above have been discussed during Media Workshops at
> > >>>> LinuxCon Europe 2012 in Barcelona and then later Embedded Linux
> > >>>> Conference Europe 2014 in Düsseldorf. The de facto Codec API that
> > >>>> originated at those events was later implemented by the drivers we
> > >>>> already have merged in mainline, such as s5p-mfc or coda.
> > >>>>
> > >>>> The only thing missing was the real specification included as a part
> > >>>> of Linux Media documentation. Fix it now and document the decoder part
> > >>>> of the Codec API.
> > >>>>
> > >>>> Signed-off-by: Tomasz Figa <tfiga@chromium.org>
> > >>>> ---
> > >>>>
> > >>>>  Documentation/media/uapi/v4l/dev-decoder.rst | 872 +++++++++++++++++++
> > >>>>  Documentation/media/uapi/v4l/devices.rst     |   1 +
> > >>>>  Documentation/media/uapi/v4l/v4l2.rst        |  10 +-
> > >>>>  3 files changed, 882 insertions(+), 1 deletion(-)
> > >>>>  create mode 100644 Documentation/media/uapi/v4l/dev-decoder.rst
> > >>>>
> > >>>> diff --git a/Documentation/media/uapi/v4l/dev-decoder.rst
> > >>>> b/Documentation/media/uapi/v4l/dev-decoder.rst new file mode 100644
> > >>>> index 000000000000..f55d34d2f860
> > >>>> --- /dev/null
> > >>>> +++ b/Documentation/media/uapi/v4l/dev-decoder.rst
> > >>>> @@ -0,0 +1,872 @@
> > >
> > > [snip]
> > >
> > >>>> +4.  Allocate source (bitstream) buffers via :c:func:`VIDIOC_REQBUFS`
> > >>>> on
> > >>>> +    ``OUTPUT``.
> > >>>> +
> > >>>> +    * **Required fields:**
> > >>>> +
> > >>>> +      ``count``
> > >>>> +          requested number of buffers to allocate; greater than zero
> > >>>> +
> > >>>> +      ``type``
> > >>>> +          a ``V4L2_BUF_TYPE_*`` enum appropriate for ``OUTPUT``
> > >>>> +
> > >>>> +      ``memory``
> > >>>> +          follows standard semantics
> > >>>> +
> > >>>> +      ``sizeimage``
> > >>>> +          follows standard semantics; the client is free to choose
> > >>>> any
> > >>>> +          suitable size, however, it may be subject to change by the
> > >>>> +          driver
> > >>>> +
> > >>>> +    * **Return fields:**
> > >>>> +
> > >>>> +      ``count``
> > >>>> +          actual number of buffers allocated
> > >>>> +
> > >>>> +    * The driver must adjust count to minimum of required number of
> > >>>> +      ``OUTPUT`` buffers for given format and count passed.
> > >>>
> > >>> Isn't it the maximum, not the minimum ?
> > >>
> > >> It's actually neither. All we can generally say here is that the
> > >> number will be adjusted and the client must note it.
> > >
> > > I expect it to be clamp(requested count, driver minimum, driver maximum).
> > > I'm not sure it's worth capturing this in the document though, but we
> > > could say
> > >
> > > "The driver must clamp count to the minimum and maximum number of required
> > > ``OUTPUT`` buffers for the given format."
> >
> > I'd leave the details to the documentation of VIDIOC_REQBUFS, if
> > needed. This document focuses on the decoder UAPI and with this note I
> > want to ensure that the applications don't assume that exactly the
> > requested number of buffers is always allocated.
> >
> > How about making it even simpler:
> >
> > The actual number of allocated buffers may differ from the ``count``
> > given. The client must check the updated value of ``count`` after the
> > call returns.
>
> That works for me. You may want to say "... given, as specified in the
> VIDIOC_REQBUFS documentation."
>

The "Conventions[...]" section mentions that

1. The general V4L2 API rules apply if not specified in this document
   otherwise.

so I think I'll skip this additional explanation.

> > >>>> The client must
> > >>>> +      check this value after the ioctl returns to get the number of
> > >>>> +      buffers allocated.
> > >>>> +
> > >>>> +    .. note::
> > >>>> +
> > >>>> +       To allocate more than minimum number of buffers (for pipeline
> > >>>> +       depth), use G_CTRL(``V4L2_CID_MIN_BUFFERS_FOR_OUTPUT``) to
> > >>>> +       get minimum number of buffers required by the driver/format,
> > >>>> +       and pass the obtained value plus the number of additional
> > >>>> +       buffers needed in count to :c:func:`VIDIOC_REQBUFS`.
> > >>>> +
> > >>>> +5.  Start streaming on ``OUTPUT`` queue via
> > >>>> :c:func:`VIDIOC_STREAMON`.
> > >>>> +
> > >>>> +6.  This step only applies to coded formats that contain resolution
> > >>>> +    information in the stream. Continue queuing/dequeuing bitstream
> > >>>> +    buffers to/from the ``OUTPUT`` queue via :c:func:`VIDIOC_QBUF`
> > >>>> and
> > >>>> +    :c:func:`VIDIOC_DQBUF`. The driver must keep processing and
> > >>>> returning
> > >>>> +    each buffer to the client until required metadata to configure
> > >>>> the
> > >>>> +    ``CAPTURE`` queue are found. This is indicated by the driver
> > >>>> sending
> > >>>> +    a ``V4L2_EVENT_SOURCE_CHANGE`` event with
> > >>>> +    ``V4L2_EVENT_SRC_CH_RESOLUTION`` source change type. There is no
> > >>>> +    requirement to pass enough data for this to occur in the first
> > >>>> buffer
> > >>>> +    and the driver must be able to process any number.
> > >>>> +
> > >>>> +    * If data in a buffer that triggers the event is required to
> > >>>> decode
> > >>>> +      the first frame, the driver must not return it to the client,
> > >>>> +      but must retain it for further decoding.
> > >>>> +
> > >>>> +    * If the client set width and height of ``OUTPUT`` format to 0,
> > >>>> calling
> > >>>> +      :c:func:`VIDIOC_G_FMT` on the ``CAPTURE`` queue will return
> > >>>> -EPERM,
> > >>>> +      until the driver configures ``CAPTURE`` format according to
> > >>>> stream
> > >>>> +      metadata.
> > >>>
> > >>> That's a pretty harsh handling for this condition. What's the
> > >>> rationale for returning -EPERM instead of for instance succeeding with
> > >>> width and height set to 0 ?
> > >>
> > >> I don't like it, but the error condition must stay for compatibility
> > >> reasons as that's what current drivers implement and applications
> > >> expect. (Technically current drivers would return -EINVAL, but we
> > >> concluded that existing applications don't care about the exact value,
> > >> so we can change it to make more sense.)
> > >
> > > Fair enough :-/ A bit of a shame though. Should we try to use an error
> > > code that would have less chance of being confused with an actual
> > > permission problem ? -EILSEQ could be an option for "illegal sequence" of
> > > operations, but better options could exist.
> >
> > In Request API we concluded that -EACCES is the right code to return
> > for G_EXT_CTRLS on a request that has not finished yet. The case here
> > is similar - the capture queue is not yet set up. What do you think?
>
> Good question. -EPERM is documented as "Operation not permitted", while -
> EACCES is documented as "Permission denied". The former appears to be
> understood as "This isn't a good idea, I can't let you do that", and the
> latter as "You don't have sufficient privileges, if you retry with the correct
> privileges this will succeed". Neither are a perfect match, but -EACCES might
> be better if you replace getting privileges by performing the required setup.
>

AFAIR that was also the rationale behind it for the Request API.

> > >>>> +    * If the client subscribes to ``V4L2_EVENT_SOURCE_CHANGE``
> > >>>> events and
> > >>>> +      the event is signaled, the decoding process will not continue
> > >>>> until
> > >>>> +      it is acknowledged by either (re-)starting streaming on
> > >>>> ``CAPTURE``,
> > >>>> +      or via :c:func:`VIDIOC_DECODER_CMD` with
> > >>>> ``V4L2_DEC_CMD_START``
> > >>>> +      command.
> > >>>> +
> > >>>> +    .. note::
> > >>>> +
> > >>>> +       No decoded frames are produced during this phase.
> > >>>> +
>
> [snip]
>
> > >> Also added a note:
> > >>        To fulfill those requirements, the client may attempt to use
> > >>        :c:func:`VIDIOC_CREATE_BUFS` to add more buffers. However, due to
> > >>        hardware limitations, the decoder may not support adding buffers
> > >>        at this point and the client must be able to handle a failure
> > >>        using the steps below.
> > >
> > > I wonder if there could be a way to work around those limitations on the
> > > driver side. At the beginning of step 7, the decoder is effectively
> > > stopped. If the hardware doesn't support adding new buffers on the fly,
> > > can't the driver support the VIDIOC_CREATE_BUFS + V4L2_DEC_CMD_START
> > > sequence the same way it would support the VIDIOC_STREAMOFF +
> > > VIDIOC_REQBUFS(0) +
> > > VIDIOC_REQBUFS(n) + VIDIOC_STREAMON ?
> >
> > I guess that would work. I would only allow it for the case where
> > existing buffers are already big enough and just more buffers are
> > needed. Otherwise it would lead to some weird cases, such as some old
> > buffers already in the CAPTURE queue, blocking the decode of further
> > frames. (While it could be handled by the driver returning them with
> > an error state, it would only complicate the interface.)
>
> Good point. I wonder if this could be handled in the framework. If it can't,
> or with non-trivial support code on the driver side, then I would agree with
> you. Otherwise, handling the workaround in the framework would ensure
> consistent behaviour across drivers with minimal cost, and simplify the
> userspace API, so I think it would be a good thing.

I think it should be possible to handle in the framework, but right
now we don't have a framework for codecs and it would definitely be a
non-trivial piece of code.

I'd stick to the restricted behavior for now, since it's easy to lift
the restrictions in the future, but if we make it mandatory, the
userspace could start relying on it.

Best regards,
Tomasz

Patch

diff --git a/Documentation/media/uapi/v4l/dev-decoder.rst b/Documentation/media/uapi/v4l/dev-decoder.rst
new file mode 100644
index 000000000000..f55d34d2f860
--- /dev/null
+++ b/Documentation/media/uapi/v4l/dev-decoder.rst
@@ -0,0 +1,872 @@ 
+.. -*- coding: utf-8; mode: rst -*-
+
+.. _decoder:
+
+****************************************
+Memory-to-memory Video Decoder Interface
+****************************************
+
+Input data to a video decoder are buffers containing unprocessed video
+stream (e.g. Annex-B H.264/HEVC stream, raw VP8/9 stream). The driver is
+expected not to require any additional information from the client to
+process these buffers. Output data are raw video frames returned in display
+order.
+
+Performing software parsing, processing etc. of the stream in the driver
+in order to support this interface is strongly discouraged. In case such
+operations are needed, use of the Stateless Video Decoder Interface (in
+development) is strongly advised.
+
+Conventions and notation used in this document
+==============================================
+
+1. The general V4L2 API rules apply if not specified in this document
+   otherwise.
+
+2. The meaning of words “must”, “may”, “should”, etc. is as per RFC
+   2119.
+
+3. All steps not marked “optional” are required.
+
+4. :c:func:`VIDIOC_G_EXT_CTRLS`, :c:func:`VIDIOC_S_EXT_CTRLS` may be used
+   interchangeably with :c:func:`VIDIOC_G_CTRL`, :c:func:`VIDIOC_S_CTRL`,
+   unless specified otherwise.
+
+5. Single-plane API (see spec) and applicable structures may be used
+   interchangeably with Multi-plane API, unless specified otherwise,
+   depending on driver capabilities and following the general V4L2
+   guidelines.
+
+6. i = [a..b]: sequence of integers from a to b, inclusive, e.g. i =
+   [0..2]: i = 0, 1, 2.
+
+7. For ``OUTPUT`` buffer A, A’ represents a buffer on the ``CAPTURE`` queue
+   containing data (decoded frame/stream) that resulted from processing
+   buffer A.
+
+Glossary
+========
+
+CAPTURE
+   the destination buffer queue; the queue of buffers containing decoded
+   frames; ``V4L2_BUF_TYPE_VIDEO_CAPTURE`` or
+   ``V4L2_BUF_TYPE_VIDEO_CAPTURE_MPLANE``; data are captured from the
+   hardware into ``CAPTURE`` buffers
+
+client
+   application client communicating with the driver implementing this API
+
+coded format
+   encoded/compressed video bitstream format (e.g. H.264, VP8, etc.); see
+   also: raw format
+
+coded height
+   height for given coded resolution
+
+coded resolution
+   stream resolution in pixels aligned to codec and hardware requirements;
+   typically visible resolution rounded up to full macroblocks;
+   see also: visible resolution
+
+coded width
+   width for given coded resolution
+
+decode order
+   the order in which frames are decoded; may differ from display order if
+   coded format includes a feature of frame reordering; ``OUTPUT`` buffers
+   must be queued by the client in decode order
+
+destination
+   data resulting from the decode process; ``CAPTURE``
+
+display order
+   the order in which frames must be displayed; ``CAPTURE`` buffers must be
+   returned by the driver in display order
+
+DPB
+   Decoded Picture Buffer; an H.264 term for a buffer that stores a picture
+   that is encoded or decoded and available for reference in further
+   decode/encode steps.
+
+EOS
+   end of stream
+
+IDR
+   a type of a keyframe in H.264-encoded stream, which clears the list of
+   earlier reference frames (DPBs)
+
+keyframe
+   an encoded frame that does not reference frames decoded earlier, i.e.
+   can be decoded fully on its own.
+
+OUTPUT
+   the source buffer queue; the queue of buffers containing encoded
+   bitstream; ``V4L2_BUF_TYPE_VIDEO_OUTPUT`` or
+   ``V4L2_BUF_TYPE_VIDEO_OUTPUT_MPLANE``; the hardware is fed with data
+   from ``OUTPUT`` buffers
+
+PPS
+   Picture Parameter Set; a type of metadata entity in H.264 bitstream
+
+raw format
+   uncompressed format containing raw pixel data (e.g. YUV, RGB formats)
+
+resume point
+   a point in the bitstream from which decoding may start/continue, without
+   any previous state/data present, e.g.: a keyframe (VP8/VP9) or
+   SPS/PPS/IDR sequence (H.264); a resume point is required to start decode
+   of a new stream, or to resume decoding after a seek
+
+source
+   data fed to the decoder; ``OUTPUT``
+
+SPS
+   Sequence Parameter Set; a type of metadata entity in H.264 bitstream
+
+visible height
+   height for given visible resolution; display height
+
+visible resolution
+   stream resolution of the visible picture, in pixels, to be used for
+   display purposes; must be smaller or equal to coded resolution;
+   display resolution
+
+visible width
+   width for given visible resolution; display width
+
+Querying capabilities
+=====================
+
+1. To enumerate the set of coded formats supported by the driver, the
+   client may call :c:func:`VIDIOC_ENUM_FMT` on ``OUTPUT``.
+
+   * The driver must always return the full set of supported formats,
+     irrespective of the format set on the ``CAPTURE``.
+
+2. To enumerate the set of supported raw formats, the client may call
+   :c:func:`VIDIOC_ENUM_FMT` on ``CAPTURE``.
+
+   * The driver must return only the formats supported for the format
+     currently active on ``OUTPUT``.
+
+   * In order to enumerate raw formats supported by a given coded format,
+     the client must first set that coded format on ``OUTPUT`` and then
+     enumerate the ``CAPTURE`` queue.
+
+3. The client may use :c:func:`VIDIOC_ENUM_FRAMESIZES` to detect supported
+   resolutions for a given format, passing desired pixel format in
+   :c:type:`v4l2_frmsizeenum` ``pixel_format``.
+
+   * Values returned by :c:func:`VIDIOC_ENUM_FRAMESIZES` on ``OUTPUT``
+     must include all possible coded resolutions supported by the decoder
+     for given coded pixel format.
+
+   * Values returned by :c:func:`VIDIOC_ENUM_FRAMESIZES` on ``CAPTURE``
+     must include all possible frame buffer resolutions supported by the
+     decoder for given raw pixel format and coded format currently set on
+     ``OUTPUT``.
+
+   .. note::
+
+      The client may derive the supported resolution range for a
+      combination of coded and raw format by setting width and height of
+      ``OUTPUT`` format to 0 and calculating the intersection of
+      resolutions returned from calls to :c:func:`VIDIOC_ENUM_FRAMESIZES`
+      for the given coded and raw formats.
+
+4. Supported profiles and levels for given format, if applicable, may be
+   queried using their respective controls via :c:func:`VIDIOC_QUERYCTRL`.
+
+Initialization
+==============
+
+1. *[optional]* Enumerate supported ``OUTPUT`` formats and resolutions. See
+   capability enumeration.
+
+2. Set the coded format on ``OUTPUT`` via :c:func:`VIDIOC_S_FMT`
+
+   * **Required fields:**
+
+     ``type``
+         a ``V4L2_BUF_TYPE_*`` enum appropriate for ``OUTPUT``
+
+     ``pixelformat``
+         a coded pixel format
+
+     ``width``, ``height``
+         required only if it cannot be parsed from the stream for the given
+         coded format; optional otherwise - set to zero to ignore
+
+     other fields
+         follow standard semantics
+
+   * For coded formats including stream resolution information, if width
+     and height are set to non-zero values, the driver will propagate the
+     resolution to ``CAPTURE`` and signal a source change event
+     instantly. However, after the decoder is done parsing the
+     information embedded in the stream, it will update ``CAPTURE``
+     format with new values and signal a source change event again, if
+     the values do not match.
+
+   .. note::
+
+      Changing ``OUTPUT`` format may change currently set ``CAPTURE``
+      format. The driver will derive a new ``CAPTURE`` format from
+      ``OUTPUT`` format being set, including resolution, colorimetry
+      parameters, etc. If the client needs a specific ``CAPTURE`` format,
+      it must adjust it afterwards.
+
+3.  *[optional]* Get minimum number of buffers required for ``OUTPUT``
+    queue via :c:func:`VIDIOC_G_CTRL`. This is useful if the client intends to
+    use more buffers than minimum required by hardware/format.
+
+    * **Required fields:**
+
+      ``id``
+          set to ``V4L2_CID_MIN_BUFFERS_FOR_OUTPUT``
+
+    * **Return fields:**
+
+      ``value``
+          required number of ``OUTPUT`` buffers for the currently set
+          format
+
+4.  Allocate source (bitstream) buffers via :c:func:`VIDIOC_REQBUFS` on
+    ``OUTPUT``.
+
+    * **Required fields:**
+
+      ``count``
+          requested number of buffers to allocate; greater than zero
+
+      ``type``
+          a ``V4L2_BUF_TYPE_*`` enum appropriate for ``OUTPUT``
+
+      ``memory``
+          follows standard semantics
+
+      ``sizeimage``
+          follows standard semantics; the client is free to choose any
+          suitable size, however, it may be subject to change by the
+          driver
+
+    * **Return fields:**
+
+      ``count``
+          actual number of buffers allocated
+
+    * The driver must adjust count to minimum of required number of
+      ``OUTPUT`` buffers for given format and count passed. The client must
+      check this value after the ioctl returns to get the number of
+      buffers allocated.
+
+    .. note::
+
+       To allocate more than minimum number of buffers (for pipeline
+       depth), use G_CTRL(``V4L2_CID_MIN_BUFFERS_FOR_OUTPUT``) to
+       get minimum number of buffers required by the driver/format,
+       and pass the obtained value plus the number of additional
+       buffers needed in count to :c:func:`VIDIOC_REQBUFS`.
+
+5.  Start streaming on ``OUTPUT`` queue via :c:func:`VIDIOC_STREAMON`.
+
+6.  This step only applies to coded formats that contain resolution
+    information in the stream. Continue queuing/dequeuing bitstream
+    buffers to/from the ``OUTPUT`` queue via :c:func:`VIDIOC_QBUF` and
+    :c:func:`VIDIOC_DQBUF`. The driver must keep processing and returning
+    each buffer to the client until the metadata required to configure the
+    ``CAPTURE`` queue is found. This is indicated by the driver sending
+    a ``V4L2_EVENT_SOURCE_CHANGE`` event with
+    ``V4L2_EVENT_SRC_CH_RESOLUTION`` source change type. There is no
+    requirement to pass enough data for this to occur in the first buffer
+    and the driver must be able to process any number.
+
+    * If data in a buffer that triggers the event is required to decode
+      the first frame, the driver must not return it to the client,
+      but must retain it for further decoding.
+
+    * If the client set width and height of ``OUTPUT`` format to 0, calling
+      :c:func:`VIDIOC_G_FMT` on the ``CAPTURE`` queue will return -EPERM,
+      until the driver configures ``CAPTURE`` format according to stream
+      metadata.
+
+    * If the client subscribes to ``V4L2_EVENT_SOURCE_CHANGE`` events and
+      the event is signaled, the decoding process will not continue until
+      it is acknowledged by either (re-)starting streaming on ``CAPTURE``,
+      or via :c:func:`VIDIOC_DECODER_CMD` with ``V4L2_DEC_CMD_START``
+      command.
+
+    .. note::
+
+       No decoded frames are produced during this phase.
+
+7.  This step only applies to coded formats that contain resolution
+    information in the stream.
+    Receive and handle ``V4L2_EVENT_SOURCE_CHANGE`` from the driver
+    via :c:func:`VIDIOC_DQEVENT`. The driver must send this event once
+    enough data is obtained from the stream to allocate ``CAPTURE``
+    buffers and to begin producing decoded frames.
+
+    * **Required fields:**
+
+      ``type``
+          set to ``V4L2_EVENT_SOURCE_CHANGE``
+
+    * **Return fields:**
+
+      ``u.src_change.changes``
+          set to ``V4L2_EVENT_SRC_CH_RESOLUTION``
+
+    * Any client query issued after the driver queues the event must return
+      values applying to the just parsed stream, including queue formats,
+      selection rectangles and controls.
+
+8.  Call :c:func:`VIDIOC_G_FMT` for ``CAPTURE`` queue to get format for the
+    destination buffers parsed/decoded from the bitstream.
+
+    * **Required fields:**
+
+      ``type``
+          a ``V4L2_BUF_TYPE_*`` enum appropriate for ``CAPTURE``
+
+    * **Return fields:**
+
+      ``width``, ``height``
+          frame buffer resolution for the decoded frames
+
+      ``pixelformat``
+          pixel format for decoded frames
+
+      ``num_planes`` (for _MPLANE ``type`` only)
+          number of planes for pixelformat
+
+      ``sizeimage``, ``bytesperline``
+          as per standard semantics; matching frame buffer format
+
+    .. note::
+
+       The value of ``pixelformat`` may be any pixel format supported by
+       the driver for the current stream, based on the information parsed
+       from the stream and hardware capabilities. It is suggested that the
+       driver choose the preferred/optimal format for the given
+       configuration. For example, a YUV format may be preferred over an
+       RGB format, if an additional conversion step would be required.
+
+9.  *[optional]* Enumerate ``CAPTURE`` formats via
+    :c:func:`VIDIOC_ENUM_FMT` on ``CAPTURE`` queue. Once the stream
+    information is parsed and known, the client may use this ioctl to
+    discover which raw formats are supported for the given stream and
+    select one of them via :c:func:`VIDIOC_S_FMT`.
+
+    .. note::
+
+       The driver will return only formats supported for the current stream
+       parsed in this initialization sequence, even if more formats may be
+       supported by the driver in general.
+
+       For example, a driver/hardware may support YUV and RGB formats for
+       resolutions 1920x1088 and lower, but only YUV for higher
+       resolutions (due to hardware limitations). After parsing
+       a resolution of 1920x1088 or lower, :c:func:`VIDIOC_ENUM_FMT` may
+       return a set of YUV and RGB pixel formats, but after parsing
+       a resolution higher than 1920x1088, the driver will not return RGB
+       formats, as they are unsupported for this resolution.
+
+       However, subsequent resolution change event triggered after
+       discovering a resolution change within the same stream may switch
+       the stream into a lower resolution and :c:func:`VIDIOC_ENUM_FMT`
+       would return RGB formats again in that case.
+
+10.  *[optional]* Choose a different ``CAPTURE`` format than suggested via
+     :c:func:`VIDIOC_S_FMT` on ``CAPTURE`` queue. It is possible for the
+     client to choose a different format than selected/suggested by the
+     driver in :c:func:`VIDIOC_G_FMT`.
+
+     * **Required fields:**
+
+       ``type``
+           a ``V4L2_BUF_TYPE_*`` enum appropriate for ``CAPTURE``
+
+       ``pixelformat``
+           a raw pixel format
+
+     .. note::
+
+        Calling :c:func:`VIDIOC_ENUM_FMT` to discover currently available
+        formats after receiving ``V4L2_EVENT_SOURCE_CHANGE`` is useful to
+        find out a set of allowed formats for given configuration, but not
+        required, if the client can accept the defaults.
+
+11. *[optional]* Acquire visible resolution via
+    :c:func:`VIDIOC_G_SELECTION`.
+
+    * **Required fields:**
+
+      ``type``
+          a ``V4L2_BUF_TYPE_*`` enum appropriate for ``CAPTURE``
+
+      ``target``
+          set to ``V4L2_SEL_TGT_COMPOSE``
+
+    * **Return fields:**
+
+      ``r.left``, ``r.top``, ``r.width``, ``r.height``
+          visible rectangle; it must fit within the frame buffer resolution
+          returned by :c:func:`VIDIOC_G_FMT`.
+
+    * The driver must expose following selection targets on ``CAPTURE``:
+
+      ``V4L2_SEL_TGT_CROP_BOUNDS``
+          corresponds to coded resolution of the stream
+
+      ``V4L2_SEL_TGT_CROP_DEFAULT``
+          a rectangle covering the part of the frame buffer that contains
+          meaningful picture data (visible area); width and height will be
+          equal to visible resolution of the stream
+
+      ``V4L2_SEL_TGT_CROP``
+          rectangle within coded resolution to be output to ``CAPTURE``;
+          defaults to ``V4L2_SEL_TGT_CROP_DEFAULT``; read-only on hardware
+          without additional compose/scaling capabilities
+
+      ``V4L2_SEL_TGT_COMPOSE_BOUNDS``
+          maximum rectangle within ``CAPTURE`` buffer, which the cropped
+          frame can be output into; equal to ``V4L2_SEL_TGT_CROP``, if the
+          hardware does not support compose/scaling
+
+      ``V4L2_SEL_TGT_COMPOSE_DEFAULT``
+          equal to ``V4L2_SEL_TGT_CROP``
+
+      ``V4L2_SEL_TGT_COMPOSE``
+          rectangle inside the ``CAPTURE`` buffer into which the cropped
+          frame is output; defaults to ``V4L2_SEL_TGT_COMPOSE_DEFAULT``;
+          read-only on hardware without additional compose/scaling
+          capabilities
+
+      ``V4L2_SEL_TGT_COMPOSE_PADDED``
+          rectangle inside the ``CAPTURE`` buffer which is overwritten by
+          the hardware; equal to ``V4L2_SEL_TGT_COMPOSE``, if the hardware
+          does not write padding pixels
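+As a minimal illustration of the constraint that the visible rectangle must
+fit within the frame buffer resolution, consider the following sketch.
+``struct rect`` is a simplified stand-in for ``struct v4l2_rect`` and the
+helper is hypothetical client-side validation code, not part of the API.

```c
#include <stdbool.h>
#include <stdint.h>

/* Simplified stand-in for struct v4l2_rect from the UAPI headers. */
struct rect {
	int32_t left, top;
	uint32_t width, height;
};

/* Check that a compose (or crop) rectangle fits within a frame buffer
 * of the given dimensions, as required for rectangles returned by
 * VIDIOC_G_SELECTION relative to the VIDIOC_G_FMT resolution. */
static bool rect_fits(const struct rect *r, uint32_t fb_width,
		      uint32_t fb_height)
{
	if (r->left < 0 || r->top < 0)
		return false;
	return (uint32_t)r->left + r->width <= fb_width &&
	       (uint32_t)r->top + r->height <= fb_height;
}
```

+For example, a 1920x1080 visible rectangle fits within a 1920x1088 coded
+frame buffer, while a 1920x1089 rectangle would not.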
+
+12. *[optional]* Get minimum number of buffers required for ``CAPTURE``
+    queue via :c:func:`VIDIOC_G_CTRL`. This is useful if the client intends
+    to use more buffers than the minimum required by hardware/format.
+
+    * **Required fields:**
+
+      ``id``
+          set to ``V4L2_CID_MIN_BUFFERS_FOR_CAPTURE``
+
+    * **Return fields:**
+
+      ``value``
+          minimum number of buffers required to decode the stream parsed in
+          this initialization sequence.
+
+    .. note::
+
+       The minimum number of buffers must be at least the number required
+       to successfully decode the current stream. This may for example be
+       the required DPB size for an H.264 stream given the parsed stream
+       configuration (resolution, level).
+
+13. Allocate destination (raw format) buffers via :c:func:`VIDIOC_REQBUFS`
+    on the ``CAPTURE`` queue.
+
+    * **Required fields:**
+
+      ``count``
+          requested number of buffers to allocate; greater than zero
+
+      ``type``
+          a ``V4L2_BUF_TYPE_*`` enum appropriate for ``CAPTURE``
+
+      ``memory``
+          follows standard semantics
+
+    * **Return fields:**
+
+      ``count``
+          adjusted to allocated number of buffers
+
+    * The driver must adjust the count to be at least the minimum number of
+      destination buffers required for the given format and stream
+      configuration. The client must check this value after the ioctl
+      returns to get the actual number of buffers allocated.
+
+    .. note::
+
+       To allocate more than the minimum number of buffers (for pipeline
+       depth), use G_CTRL(``V4L2_CID_MIN_BUFFERS_FOR_CAPTURE``) to
+       get the minimum number of buffers required, and pass the obtained
+       value plus the number of additional buffers needed in ``count`` to
+       :c:func:`VIDIOC_REQBUFS`.
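+The buffer count handling described above can be summarized with two
+hypothetical helpers, one for each side of the API. These are illustrative
+stand-ins, not actual driver or library code.

```c
#include <stdint.h>

/* Client side: the number of buffers to request in VIDIOC_REQBUFS, i.e.
 * the value of V4L2_CID_MIN_BUFFERS_FOR_CAPTURE plus extra buffers for
 * pipeline depth. */
static uint32_t capture_reqbufs_count(uint32_t min_bufs, uint32_t extra)
{
	return min_bufs + extra;
}

/* Driver side: the count actually allocated is never below the minimum
 * required for the stream, which is why the client must re-check the
 * count returned by VIDIOC_REQBUFS. */
static uint32_t adjust_reqbufs_count(uint32_t requested,
				     uint32_t min_required)
{
	return requested < min_required ? min_required : requested;
}
```

+For instance, with a minimum of 4 buffers and 2 extra buffers of pipeline
+depth, the client requests 6; a request of only 2 would be adjusted up to 4
+by the driver.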
+
+14. Call :c:func:`VIDIOC_STREAMON` to initiate decoding frames.
+
+Decoding
+========
+
+This state is reached after a successful initialization sequence. In this
+state, the client queues and dequeues buffers on both queues via
+:c:func:`VIDIOC_QBUF` and :c:func:`VIDIOC_DQBUF`, following the standard
+semantics.
+
+Both queues operate independently, following the standard behavior of V4L2
+buffer queues and memory-to-memory devices. In addition, the order of
+decoded frames dequeued from the ``CAPTURE`` queue may differ from the
+order of queuing coded frames on the ``OUTPUT`` queue, due to properties
+of the selected coded format, e.g. frame reordering. The client must not
+assume any direct relationship between ``CAPTURE`` and ``OUTPUT`` buffers
+other than that reported by the :c:type:`v4l2_buffer` ``timestamp`` field.
+
+The contents of source ``OUTPUT`` buffers depend on the active coded pixel
+format and might be affected by codec-specific extended controls, as stated
+in the documentation of each format.
+
+The client must not assume any direct relationship between ``CAPTURE``
+and ``OUTPUT`` buffers and any specific timing of buffers becoming
+available to dequeue. Specifically:
+
+* a buffer queued to ``OUTPUT`` may result in no buffers being produced
+  on ``CAPTURE`` (e.g. if it does not contain encoded data, or if only
+  metadata syntax structures are present in it),
+
+* a buffer queued to ``OUTPUT`` may result in more than one buffer produced
+  on ``CAPTURE`` (if the encoded data contained more than one frame, or if
+  returning a decoded frame allowed the driver to return a frame that
+  preceded it in decode order, but succeeded it in display order),
+
+* a buffer queued to ``OUTPUT`` may result in a buffer being produced on
+  ``CAPTURE`` later into decode process, and/or after processing further
+  ``OUTPUT`` buffers, or be returned out of order, e.g. if display
+  reordering is used,
+
+* buffers may become available on the ``CAPTURE`` queue without additional
+  buffers queued to ``OUTPUT`` (e.g. during drain or ``EOS``), because of
+  ``OUTPUT`` buffers queued in the past, whose decoding results become
+  available only at a later time, due to specifics of the decoding process.
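+Since the ``timestamp`` field is the only guaranteed association between
+``OUTPUT`` and ``CAPTURE`` buffers, a client typically keeps its own table
+of queued timestamps. The sketch below is hypothetical client bookkeeping
+code; the structures are simplified stand-ins for the real UAPI types.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified record of an OUTPUT buffer queued by the client; the driver
 * copies the timestamp to the CAPTURE buffer(s) holding the resulting
 * decoded frame(s). */
struct queued_output {
	uint64_t timestamp;   /* stand-in for the struct timeval value */
	uint32_t user_cookie; /* client bookkeeping, e.g. a frame id */
};

/* Find which OUTPUT buffer a dequeued CAPTURE buffer originated from by
 * matching timestamps; this is the only association the API guarantees. */
static bool match_capture_to_output(const struct queued_output *pending,
				    size_t n, uint64_t cap_timestamp,
				    uint32_t *cookie_out)
{
	for (size_t i = 0; i < n; i++) {
		if (pending[i].timestamp == cap_timestamp) {
			*cookie_out = pending[i].user_cookie;
			return true;
		}
	}
	return false;
}
```

+Note that a single timestamp may legitimately match several ``CAPTURE``
+buffers when one ``OUTPUT`` buffer produces multiple frames.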
+
+Seek
+====
+
+Seek is controlled by the ``OUTPUT`` queue, as it is the source of
+bitstream data. The ``CAPTURE`` queue remains unaffected.
+
+1. Stop the ``OUTPUT`` queue to begin the seek sequence via
+   :c:func:`VIDIOC_STREAMOFF`.
+
+   * **Required fields:**
+
+     ``type``
+         a ``V4L2_BUF_TYPE_*`` enum appropriate for ``OUTPUT``
+
+   * The driver must drop all the pending ``OUTPUT`` buffers and they are
+     treated as returned to the client (following standard semantics).
+
+2. Restart the ``OUTPUT`` queue via :c:func:`VIDIOC_STREAMON`
+
+   * **Required fields:**
+
+     ``type``
+         a ``V4L2_BUF_TYPE_*`` enum appropriate for ``OUTPUT``
+
+   * After this call, the driver is in the post-seek state and must be
+     ready to accept new source bitstream buffers.
+
+3. Start queuing buffers to ``OUTPUT`` queue containing stream data after
+   the seek until a suitable resume point is found.
+
+   .. note::
+
+      There is no requirement to begin queuing coded data starting exactly
+      from a resume point (e.g. SPS or a keyframe). The driver must handle
+      any
+      data queued and must keep processing the queued buffers until it
+      finds a suitable resume point. While looking for a resume point, the
+      driver processes ``OUTPUT`` buffers and returns them to the client
+      without producing any decoded frames.
+
+      For hardware known to mishandle seeks to a non-resume point,
+      e.g. by returning corrupted decoded frames, the driver must be able
+      to handle such seeks without a crash or any fatal decode error.
+
+4. After a resume point is found, the driver will start returning
+   ``CAPTURE`` buffers with decoded frames.
+
+   * There is no precise specification of when the ``CAPTURE`` queue will
+     start producing buffers containing decoded data from the buffers
+     queued after the seek, as it operates independently from the
+     ``OUTPUT`` queue.
+
+     * The driver is allowed to return a number of remaining ``CAPTURE``
+       buffers containing decoded frames from before the seek after the
+       seek sequence (STREAMOFF-STREAMON) is performed.
+
+     * The driver is also allowed not to return all the frames that were
+       queued, but not yet decoded, before the seek sequence was initiated.
+       For example, given an ``OUTPUT`` queue sequence: QBUF(A), QBUF(B),
+       STREAMOFF(OUT), STREAMON(OUT), QBUF(G), QBUF(H), any of the
+       following results on the ``CAPTURE`` queue is allowed: {A’, B’, G’,
+       H’}, {A’, G’, H’}, {G’, H’}.
+
+   .. note::
+
+      To achieve an instantaneous seek, the client may restart streaming on
+      the ``CAPTURE`` queue to discard buffers that are decoded, but not
+      yet dequeued.
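+The driver-side behavior of step 3, consuming ``OUTPUT`` buffers without
+producing frames until a resume point is found, can be reduced to the
+following sketch, where each queued chunk of coded data is abstracted to a
+single flag. This illustrates the contract only; it is not actual driver
+code.

```c
#include <stdbool.h>
#include <stddef.h>

/* While looking for a resume point after a seek, OUTPUT buffers are
 * consumed and returned to the client without producing decoded frames.
 * Each queued chunk is reduced here to a flag saying whether it starts
 * at a resume point (e.g. an SPS or a keyframe). Returns the index at
 * which decoding can restart, or n if no resume point was found yet. */
static size_t find_resume_point(const bool *is_resume_point, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (is_resume_point[i])
			break;
	return i;
}
```

+All buffers before the returned index correspond to ``OUTPUT`` buffers that
+are processed and returned without any matching ``CAPTURE`` output.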
+
+Pause
+=====
+
+In order to pause, the client should just cease queuing buffers onto the
+``OUTPUT`` queue. This is different from the general V4L2 API definition of
+pause, which involves calling :c:func:`VIDIOC_STREAMOFF` on the queue.
+Without source bitstream data, there is no data to process and the hardware
+remains idle.
+
+Conversely, using :c:func:`VIDIOC_STREAMOFF` on ``OUTPUT`` queue indicates
+a seek, which
+
+1. drops all ``OUTPUT`` buffers in flight and
+2. after a subsequent :c:func:`VIDIOC_STREAMON`, will look for and only
+   continue from a resume point.
+
+This is usually undesirable for pause. The STREAMOFF-STREAMON sequence is
+intended for seeking.
+
+Similarly, ``CAPTURE`` queue should remain streaming as well, as the
+STREAMOFF-STREAMON sequence on it is intended solely for changing buffer
+sets.
+
+Dynamic resolution change
+=========================
+
+A video decoder implementing this interface must support dynamic resolution
+change for streams that include resolution metadata in the bitstream.
+When the decoder encounters a resolution change in the stream, the dynamic
+resolution change sequence is started.
+
+1.  After encountering a resolution change in the stream, the driver must
+    first process and decode all remaining buffers from before the
+    resolution change point.
+
+2.  After all buffers containing decoded frames from before the resolution
+    change point are ready to be dequeued on the ``CAPTURE`` queue, the
+    driver sends a ``V4L2_EVENT_SOURCE_CHANGE`` event with ``changes`` set
+    to ``V4L2_EVENT_SRC_CH_RESOLUTION``.
+
+    * The last buffer from before the change must be marked with
+      :c:type:`v4l2_buffer` ``flags`` flag ``V4L2_BUF_FLAG_LAST`` as in the
+      drain sequence. The last buffer might be empty (with
+      :c:type:`v4l2_buffer` ``bytesused`` = 0) and must be ignored by the
+      client, since it does not contain any decoded frame.
+
+    * Any client query issued after the driver queues the event must return
+      values applying to the stream after the resolution change, including
+      queue formats, selection rectangles and controls.
+
+    * If the client subscribes to ``V4L2_EVENT_SOURCE_CHANGE`` events and
+      the event is signaled, the decoding process will not continue until
+      it is acknowledged by either (re-)starting streaming on ``CAPTURE``,
+      or via :c:func:`VIDIOC_DECODER_CMD` with ``V4L2_DEC_CMD_START``
+      command.
+
+    .. note::
+
+       Any attempts to dequeue more buffers beyond the buffer marked
+       with ``V4L2_BUF_FLAG_LAST`` will result in a -EPIPE error from
+       :c:func:`VIDIOC_DQBUF`.
+
+3.  The client calls :c:func:`VIDIOC_G_FMT` for ``CAPTURE`` to get the new
+    format information. This is identical to calling :c:func:`VIDIOC_G_FMT`
+    after ``V4L2_EVENT_SRC_CH_RESOLUTION`` in the initialization sequence
+    and should be handled similarly.
+
+    .. note::
+
+       The driver is allowed not to support the same pixel format for the
+       new resolution as the one used previously (before the resolution
+       change). In that case, the driver must select a default supported
+       pixel format and return it when queried using :c:func:`VIDIOC_G_FMT`,
+       and the client must take note of it.
+
+4.  The client acquires the visible resolution as in the initialization
+    sequence.
+
+5.  *[optional]* The client is allowed to enumerate available formats and
+    select a different one than currently chosen (returned via
+    :c:func:`VIDIOC_G_FMT`). This is identical to a corresponding step in
+    the initialization sequence.
+
+6.  *[optional]* The client acquires the minimum number of buffers as in
+    the initialization sequence.
+
+7.  If all the following conditions are met, the client may resume the
+    decoding instantly, by using :c:func:`VIDIOC_DECODER_CMD` with
+    ``V4L2_DEC_CMD_START`` command, as in case of resuming after the drain
+    sequence:
+
+    * ``sizeimage`` of the new format is less than or equal to the size of
+      the currently allocated buffers,
+
+    * the number of buffers currently allocated is greater than or equal to
+      the minimum number of buffers acquired in step 6.
+
+    In such a case, the remaining steps do not apply.
+
+    However, if the client intends to change the buffer set, to lower
+    memory usage or for any other reason, it may do so by following the
+    steps below.
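+The two conditions in step 7 can be expressed as a single predicate. This
+is a hypothetical client-side helper illustrating the decision, not part of
+the V4L2 API.

```c
#include <stdbool.h>
#include <stdint.h>

/* Whether decoding can resume with V4L2_DEC_CMD_START right after a
 * resolution change, without reallocating CAPTURE buffers: the new
 * sizeimage must fit in the existing buffers and enough buffers must
 * already be allocated. */
static bool can_resume_without_realloc(uint32_t new_sizeimage,
				       uint32_t allocated_buf_size,
				       uint32_t allocated_count,
				       uint32_t min_required_count)
{
	return new_sizeimage <= allocated_buf_size &&
	       allocated_count >= min_required_count;
}
```

+If the predicate is false, the client proceeds with steps 8-11 below to
+reallocate the ``CAPTURE`` buffer set.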
+
+8.  After dequeuing all remaining buffers from the ``CAPTURE`` queue, the
+    client must call :c:func:`VIDIOC_STREAMOFF` on the ``CAPTURE`` queue.
+    The ``OUTPUT`` queue must remain streaming (calling STREAMOFF on it
+    would trigger a seek).
+
+9.  The client frees the buffers on the ``CAPTURE`` queue using
+    :c:func:`VIDIOC_REQBUFS`.
+
+    * **Required fields:**
+
+      ``count``
+          set to 0
+
+      ``type``
+          a ``V4L2_BUF_TYPE_*`` enum appropriate for ``CAPTURE``
+
+      ``memory``
+          follows standard semantics
+
+10. The client allocates a new set of buffers for the ``CAPTURE`` queue via
+    :c:func:`VIDIOC_REQBUFS`. This is identical to a corresponding step in
+    the initialization sequence.
+
+11. The client resumes decoding by issuing :c:func:`VIDIOC_STREAMON` on the
+    ``CAPTURE`` queue.
+
+During the resolution change sequence, the ``OUTPUT`` queue must remain
+streaming. Calling :c:func:`VIDIOC_STREAMOFF` on ``OUTPUT`` queue would
+initiate a seek.
+
+The ``OUTPUT`` queue operates separately from the ``CAPTURE`` queue for the
+duration of the entire resolution change sequence. It is allowed (and
+recommended for best performance and simplicity) for the client to keep
+queuing buffers to and dequeuing them from the ``OUTPUT`` queue even while
+processing this sequence.
+
+.. note::
+
+   It is also possible for this sequence to be triggered without a change
+   in coded resolution, if a different number of ``CAPTURE`` buffers is
+   required in order to continue decoding the stream or if the visible
+   resolution changes.
+
+Drain
+=====
+
+To ensure that all queued ``OUTPUT`` buffers have been processed and
+related ``CAPTURE`` buffers output to the client, the following drain
+sequence may be followed. After the drain sequence is complete, the client
+has received all decoded frames for all ``OUTPUT`` buffers queued before
+the sequence was started.
+
+1. Begin drain by issuing :c:func:`VIDIOC_DECODER_CMD`.
+
+   * **Required fields:**
+
+     ``cmd``
+         set to ``V4L2_DEC_CMD_STOP``
+
+     ``flags``
+         set to 0
+
+     ``pts``
+         set to 0
+
+2. The driver must process and decode as normal all ``OUTPUT`` buffers
+   queued by the client before the :c:func:`VIDIOC_DECODER_CMD` was issued.
+   Any operations triggered as a result of processing these buffers
+   (including the initialization and resolution change sequences) must be
+   processed as normal by both the driver and the client before proceeding
+   with the drain sequence.
+
+3. Once all ``OUTPUT`` buffers queued before ``V4L2_DEC_CMD_STOP`` are
+   processed:
+
+   * If the ``CAPTURE`` queue is streaming, once all decoded frames (if
+     any) are ready to be dequeued on the ``CAPTURE`` queue, the driver
+     must send a ``V4L2_EVENT_EOS`` event. The driver must also set
+     ``V4L2_BUF_FLAG_LAST`` in :c:type:`v4l2_buffer` ``flags`` field on the
+     buffer on the ``CAPTURE`` queue containing the last frame (if any)
+     produced as a result of processing the ``OUTPUT`` buffers queued
+     before ``V4L2_DEC_CMD_STOP``. If no more frames are left to be
+     returned at the point of handling ``V4L2_DEC_CMD_STOP``, the driver
+     must return an empty buffer (with :c:type:`v4l2_buffer`
+     ``bytesused`` = 0) as the last buffer with ``V4L2_BUF_FLAG_LAST`` set
+     instead. Any attempts to dequeue more buffers beyond the buffer marked
+     with ``V4L2_BUF_FLAG_LAST`` will result in a -EPIPE error from
+     :c:func:`VIDIOC_DQBUF`.
+
+   * If the ``CAPTURE`` queue is NOT streaming, no action is necessary for
+     the ``CAPTURE`` queue and the driver must send a ``V4L2_EVENT_EOS``
+     event immediately after all ``OUTPUT`` buffers in question have been
+     processed.
+
+4. At this point, decoding is paused and the driver will accept, but not
+   process, any newly queued ``OUTPUT`` buffers until the client issues
+   ``V4L2_DEC_CMD_START`` or restarts streaming on either queue.
+
+* Once the drain sequence is initiated, the client needs to drive it to
+  completion, as described by the steps above, unless it aborts the process
+  by issuing :c:func:`VIDIOC_STREAMOFF` on the ``OUTPUT`` queue. The client
+  is not allowed to issue ``V4L2_DEC_CMD_START`` or ``V4L2_DEC_CMD_STOP``
+  again while the drain sequence is in progress; such attempts will fail
+  with the -EBUSY error code.
+
+* Restarting streaming on ``OUTPUT`` queue will implicitly end the paused
+  state and reinitialize the decoder (similarly to the seek sequence).
+  Restarting ``CAPTURE`` queue will not affect an in-progress drain
+  sequence.
+
+* The driver must also implement :c:func:`VIDIOC_TRY_DECODER_CMD`, as a
+  way to let the client query the availability of decoder commands.
+
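+From the client's perspective, the drain sequence on the ``CAPTURE`` queue
+boils down to dequeuing until the buffer marked ``V4L2_BUF_FLAG_LAST``,
+ignoring an empty last buffer. The sketch below models already-dequeued
+buffers as an array; the flag value and structures are simplified stand-ins
+for the real UAPI definitions.

```c
#include <stddef.h>
#include <stdint.h>

#define BUF_FLAG_LAST 0x00100000 /* stand-in for V4L2_BUF_FLAG_LAST */

/* Simplified view of a dequeued CAPTURE buffer. */
struct cap_buf {
	uint32_t flags;
	uint32_t bytesused;
};

/* Client-side drain loop sketch: process dequeued buffers until the one
 * marked LAST, counting only buffers that actually carry a decoded
 * frame; an empty LAST buffer (bytesused == 0) is ignored. In the real
 * API, further VIDIOC_DQBUF calls would then fail with -EPIPE. */
static size_t drain_capture(const struct cap_buf *bufs, size_t n)
{
	size_t frames = 0;

	for (size_t i = 0; i < n; i++) {
		if (bufs[i].bytesused > 0)
			frames++;
		if (bufs[i].flags & BUF_FLAG_LAST)
			break;
	}
	return frames;
}
```

+The same loop applies to the dynamic resolution change sequence, where the
+buffer marked ``V4L2_BUF_FLAG_LAST`` may likewise be empty.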
+End of stream
+=============
+
+If the decoder encounters an end of stream marker in the stream, the
+driver must send a ``V4L2_EVENT_EOS`` event to the client after all frames
+are decoded and ready to be dequeued on the ``CAPTURE`` queue, and must
+mark the last buffer with ``V4L2_BUF_FLAG_LAST`` in the
+:c:type:`v4l2_buffer` ``flags`` field. This behavior is identical to the
+drain sequence triggered by the client via ``V4L2_DEC_CMD_STOP``.
+
+Commit points
+=============
+
+Setting formats and allocating buffers triggers changes in the behavior
+of the driver.
+
+1. Setting format on ``OUTPUT`` queue may change the set of formats
+   supported/advertised on the ``CAPTURE`` queue. In particular, it also
+   means that ``CAPTURE`` format may be reset and the client must not
+   rely on the previously set format being preserved.
+
+2. Enumerating formats on ``CAPTURE`` queue must only return formats
+   supported for the ``OUTPUT`` format currently set.
+
+3. Setting/changing the format on the ``CAPTURE`` queue does not change
+   the formats available on the ``OUTPUT`` queue. An attempt to set a
+   ``CAPTURE`` format that is not supported for the currently selected
+   ``OUTPUT`` format must result in the driver adjusting the requested
+   format to an acceptable one.
+
+4. Enumerating formats on ``OUTPUT`` queue always returns the full set of
+   supported coded formats, irrespective of the current ``CAPTURE``
+   format.
+
+5. After allocating buffers on the ``OUTPUT`` queue, it is not possible to
+   change format on it.
+
+To summarize, setting formats and allocation must always start with the
+``OUTPUT`` queue and the ``OUTPUT`` queue is the master that governs the
+set of supported formats for the ``CAPTURE`` queue.
diff --git a/Documentation/media/uapi/v4l/devices.rst b/Documentation/media/uapi/v4l/devices.rst
index fb7f8c26cf09..12d43fe711cf 100644
--- a/Documentation/media/uapi/v4l/devices.rst
+++ b/Documentation/media/uapi/v4l/devices.rst
@@ -15,6 +15,7 @@  Interfaces
     dev-output
     dev-osd
     dev-codec
+    dev-decoder
     dev-effect
     dev-raw-vbi
     dev-sliced-vbi
diff --git a/Documentation/media/uapi/v4l/v4l2.rst b/Documentation/media/uapi/v4l/v4l2.rst
index b89e5621ae69..65dc096199ad 100644
--- a/Documentation/media/uapi/v4l/v4l2.rst
+++ b/Documentation/media/uapi/v4l/v4l2.rst
@@ -53,6 +53,10 @@  Authors, in alphabetical order:
 
   - Original author of the V4L2 API and documentation.
 
+- Figa, Tomasz <tfiga@chromium.org>
+
+  - Documented the memory-to-memory decoder interface.
+
 - H Schimek, Michael <mschimek@gmx.at>
 
   - Original author of the V4L2 API and documentation.
@@ -61,6 +65,10 @@  Authors, in alphabetical order:
 
   - Documented the Digital Video timings API.
 
+- Osciak, Pawel <posciak@chromium.org>
+
+  - Documented the memory-to-memory decoder interface.
+
 - Osciak, Pawel <pawel@osciak.com>
 
   - Designed and documented the multi-planar API.
@@ -85,7 +93,7 @@  Authors, in alphabetical order:
 
   - Designed and documented the VIDIOC_LOG_STATUS ioctl, the extended control ioctls, major parts of the sliced VBI API, the MPEG encoder and decoder APIs and the DV Timings API.
 
-**Copyright** |copy| 1999-2016: Bill Dirks, Michael H. Schimek, Hans Verkuil, Martin Rubli, Andy Walls, Muralidharan Karicheri, Mauro Carvalho Chehab, Pawel Osciak, Sakari Ailus & Antti Palosaari.
+**Copyright** |copy| 1999-2018: Bill Dirks, Michael H. Schimek, Hans Verkuil, Martin Rubli, Andy Walls, Muralidharan Karicheri, Mauro Carvalho Chehab, Pawel Osciak, Sakari Ailus, Antti Palosaari & Tomasz Figa.
 
 Except when explicitly stated as GPL, programming examples within this
 part can be used and distributed without restrictions.