
[PATCHv4,0/2] Document memory-to-memory video codec interfaces

Message ID 20190603112835.19661-1-hverkuil-cisco@xs4all.nl (mailing list archive)

Message

Hans Verkuil June 3, 2019, 11:28 a.m. UTC
Since Tomasz was very busy with other things, I've taken over this
patch series. This v4 includes his draft changes and additional changes
from me.

This series attempts to add the documentation of what was discussed
during the Media Workshops at LinuxCon Europe 2012 in Barcelona and
later at Embedded Linux Conference Europe 2014 in Düsseldorf. It was
eventually written down by Pawel Osciak and tweaked by the Chrome OS
video team (mostly cosmetically, or to make the document more precise)
during the several years that Chrome OS has been using these APIs in
production.

Note that most, if not all, of the API is already implemented in
existing mainline drivers, such as s5p-mfc or mtk-vcodec. The intention
of this series is simply to formalize what we already have.

Thanks to everyone for the huge number of useful comments on previous
versions of this series. Much of the credit goes to Pawel Osciak, too,
for writing most of the original text of the initial RFC.

This v4 incorporates all known comments (let me know if I missed
something!) and should be complete for the decoder.

For the encoder there are two remaining TODOs for the API:

1) Setting the frame rate so bitrate controls can make sense, since
   they need to know this information.

   Suggested solution: require support for ENUM_FRAMEINTERVALS for the
   coded pixelformats and S_PARM(OUTPUT). Open question: some drivers
   (mediatek, hva, coda) require S_PARM(OUTPUT), while others (venus) allow
   both S_PARM(CAPTURE) and S_PARM(OUTPUT). I am inclined to allow both,
   since this is not a CAPTURE vs OUTPUT thing; it is global to both queues.

2) Interactions between OUTPUT and CAPTURE formats.

   The main problem is what to do if the capture sizeimage is too small
   for the OUTPUT resolution when streaming starts.

   Proposal: the width and height of S_FMT(OUTPUT) are used to
   calculate a minimum sizeimage (the application may request more).
   This calculation is driver-specific.

   V4L2_FMT_FLAG_FIXED_RESOLUTION is always set for codec formats
   for the encoder (i.e. we don't support mid-stream resolution
   changes for now) and V4L2_EVENT_SOURCE_CHANGE is not
   supported. See https://patchwork.linuxtv.org/patch/56478/ for
   the patch adding this flag.

   Of course, if we start to support mid-stream resolution
   changes (or other changes that require a source change event),
   then this flag should be dropped by the encoder driver and
   how to handle the source change event should be documented
   in the encoder spec. I prefer to postpone this until we have
   an encoder that can actually do mid-stream resolution changes.

   If the CAPTURE sizeimage is too small for the OUTPUT
   resolution and V4L2_EVENT_SOURCE_CHANGE is not supported,
   then the second STREAMON (either CAPTURE or OUTPUT) will
   return -ENOMEM, since there is not enough memory to do the
   encode.
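   A sketch of the driver-specific minimum described above; the
   width * height * 3 / 2 worst case and the 64 KiB floor are purely
   illustrative assumptions, as each driver defines its own calculation.

```c
#include <stdint.h>

/* Hypothetical, driver-specific minimum for the coded (CAPTURE) buffer
 * size derived from the OUTPUT resolution: an uncompressed 4:2:0 frame
 * as the worst case, with a small floor for tiny resolutions. */
static uint32_t min_coded_sizeimage(uint32_t width, uint32_t height)
{
	uint32_t sz = width * height * 3 / 2;

	return sz < 65536 ? 65536 : sz;
}
```

   An application that sees STREAMON fail with ENOMEM would stream off,
   set a format with a larger sizeimage, reallocate buffers via REQBUFS,
   and retry.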

   If V4L2_FMT_FLAG_FIXED_RESOLUTION is set (which should be the
   case for all current encoders), then any bitrate controls
   will be limited in range to what the current state (CAPTURE and
   OUTPUT formats and frame rate) supports.

Comments regarding these two encoder proposals are welcome!

Regards,

	Hans

Changes since v3:

- Lots of stylistic fixes and fixing typos/grammar/etc.

Decoder:

- width/height for S_FMT(OUTPUT):

  The output width and height are now expected to always be a valid
  resolution (i.e. never 0x0), and G/S/TRY_FMT and REQBUFS will use it
  instead of returning an error. Note that this resolution is a placeholder
  until the actual resolution is parsed from the stream.

- Dropped step 3 (query the minimum number of buffers required for the CAPTURE
  queue via VIDIOC_G_CTRL()) in the Capture Setup section. It was a
  left-over from earlier versions; the same information is also in step 10,
  so there is no need to have it in two places.

- Added step 5 in the Capture Setup section: set COMPOSE rectangle if needed.

- VIDIOC_DECODER_CMD: document the EBUSY return while draining the queue.

Encoder:

- width/height for S_FMT(CAPTURE): The width/height of the CAPTURE format
  are marked as read-only and are derived from the encoder's current state,
  such as the OUTPUT format.

- Drop TGT_COMPOSE support in the encoder: there are currently
  no encoders that can do composing/scaling.

- Document corner cases in the Drain sequence.

- Document error handling.

- VIDIOC_ENCODER_CMD: document the EBUSY return while draining the queue.

Changes since v2:
(https://lore.kernel.org/patchwork/cover/1002474/)
Decoder:
 - Specified that the initial source change event is signaled
   regardless of whether the client-set format matches the
   stream format.
 - Dropped V4L2_CID_MIN_BUFFERS_FOR_OUTPUT since it's meaningless
   for the bitstream input buffers of decoders.
 - Explicitly stated that VIDIOC_REQBUFS is not allowed on CAPTURE
   if the stream information is not available.
 - Described decode error handling.
 - Mentioned that timestamps can be observed after a seek to
   determine whether the CAPTURE buffers originated from before
   or after the seek.
 - Explicitly stated that after a pair of V4L2_DEC_CMD_STOP and
   V4L2_DEC_CMD_START, the decoder is not reset and preserves
   all the state.

Encoder:
 - Specified that width and height of CAPTURE format are ignored
   and always zero.
 - Explicitly noted the common use case for the CROP target with
   macroblock-unaligned video resolutions.
 - Added a reference to Request API.
 - Dropped V4L2_CID_MIN_BUFFERS_FOR_CAPTURE since it's meaningless
   for the bitstream output buffers of encoders.
 - Explicitly stated that after a pair of V4L2_ENC_CMD_STOP and
   V4L2_ENC_CMD_START, the encoder is not reset and preserves
   all the state.

General:
 - Dropped format enumeration from "Initialization", since it's already
   a part of "Querying capabilities".
 - Many spelling, grammar, stylistic, etc. changes.
 - Changed the style of note blocks.
 - Rebased onto Hans' documentation cleanup series.
   (https://patchwork.kernel.org/cover/10775407/
    https://patchwork.kernel.org/patch/10776737/)
 - Moved the interfaces under the "Video Memory-To-Memory Interface"
   section.

For changes since v1 see the v2:
https://lore.kernel.org/patchwork/cover/1002474/

For changes since RFC see the v1:
https://patchwork.kernel.org/cover/10542207/

Tomasz Figa (2):
  media: docs-rst: Document memory-to-memory video decoder interface
  media: docs-rst: Document memory-to-memory video encoder interface

 Documentation/media/uapi/v4l/dev-decoder.rst  | 1084 +++++++++++++++++
 Documentation/media/uapi/v4l/dev-encoder.rst  |  608 +++++++++
 Documentation/media/uapi/v4l/dev-mem2mem.rst  |    9 +-
 Documentation/media/uapi/v4l/pixfmt-v4l2.rst  |   10 +
 Documentation/media/uapi/v4l/v4l2.rst         |   12 +-
 .../media/uapi/v4l/vidioc-decoder-cmd.rst     |   41 +-
 .../media/uapi/v4l/vidioc-encoder-cmd.rst     |   51 +-
 7 files changed, 1779 insertions(+), 36 deletions(-)
 create mode 100644 Documentation/media/uapi/v4l/dev-decoder.rst
 create mode 100644 Documentation/media/uapi/v4l/dev-encoder.rst

Comments

Hans Verkuil June 3, 2019, 2:02 p.m. UTC | #1
These changes + the stateless decoder v4 patch are now available here:

https://hverkuil.home.xs4all.nl/codec-api/uapi/v4l/dev-mem2mem.html

Easier to read than a patch :-)

Regards,

	Hans

Nicolas Dufresne June 4, 2019, 3:19 p.m. UTC | #2
Le lundi 03 juin 2019 à 13:28 +0200, Hans Verkuil a écrit :
> 1) Setting the frame rate so bitrate control can make sense, since
>    they need to know this information.
> 
>    Suggested solution: require support for ENUM_FRAMEINTERVALS for the
>    coded pixelformats and S_PARM(OUTPUT). Open question: some drivers
>    (mediatek, hva, coda) require S_PARM(OUTPUT), some (venus) allow both
>    S_PARM(CAPTURE) and S_PARM(OUTPUT). I am inclined to allow both since
>    this is not a CAPTURE vs OUTPUT thing, it is global to both queues.

I agree, as long as it's documented. I can imagine how this could be
confusing for new users.

> 
> 2) Interactions between OUTPUT and CAPTURE formats.
> 
>    The main problem is what to do if the capture sizeimage is too small
>    for the OUTPUT resolution when streaming starts.
>    If sizeimage of the OUTPUT is too small for the CAPTURE
>    resolution and V4L2_EVENT_SOURCE_CHANGE is not supported,
>    then the second STREAMON (either CAPTURE or OUTPUT) will
>    return -ENOMEM since there is not enough memory to do the
>    encode.

You seem confident that we will know immediately if it's too small. But
what I remember is that the HW has an interrupt for this, allowing
userspace to allocate a larger buffer and resume.

Should we make the capture queue independent of the streaming state, so
that we can streamoff/reqbufs/.../streamon to resume from an ENOMEM
error? And shouldn't ENOMEM be returned by the following capture DQBUF
when such an interrupt is raised?

Nicolas Dufresne June 10, 2019, 3:57 p.m. UTC | #3
Le lundi 03 juin 2019 à 13:28 +0200, Hans Verkuil a écrit :
> 1) Setting the frame rate so bitrate control can make sense, since
>    they need to know this information.
> 
>    Suggested solution: require support for ENUM_FRAMEINTERVALS for the
>    coded pixelformats and S_PARM(OUTPUT). Open question: some drivers
>    (mediatek, hva, coda) require S_PARM(OUTPUT), some (venus) allow both
>    S_PARM(CAPTURE) and S_PARM(OUTPUT). I am inclined to allow both since
>    this is not a CAPTURE vs OUTPUT thing, it is global to both queues.

Is ENUM_FRAMEINTERVALS really required? This will only be a hint to the
encoder, so having the encoder round to its internal precision does not
seem very important.

> 
> 2) Interactions between OUTPUT and CAPTURE formats.
>    V4L2_FMT_FLAG_FIXED_RESOLUTION is always set for codec formats
>    for the encoder (i.e. we don't support mid-stream resolution
>    changes for now) and V4L2_EVENT_SOURCE_CHANGE is not
>    supported. See https://patchwork.linuxtv.org/patch/56478/ for
>    the patch adding this flag.
> 
>    Of course, if we start to support mid-stream resolution
>    changes (or other changes that require a source change event),
>    then this flag should be dropped by the encoder driver and
>    documentation on how to handle the source change event should
>    be documented in the encoder spec. I prefer to postpone this
>    until we have an encoder that can actually do mid-stream
>    resolution changes.

For H264, mid-stream changes would make sense for the case where we'd
like the stateful encoder to emit multiple PPS at start; the switch
would then simply change the PPS index. The problem seems to be how we
expose "multiple" resolutions in our interface. As there is currently
no solution to this, I do not see much use for supporting this at the
moment.

Hans Verkuil June 11, 2019, 8:35 a.m. UTC | #4
On 6/10/19 5:57 PM, Nicolas Dufresne wrote:
> Le lundi 03 juin 2019 à 13:28 +0200, Hans Verkuil a écrit :
>> 1) Setting the frame rate so bitrate control can make sense, since
>>    they need to know this information.
>>
>>    Suggested solution: require support for ENUM_FRAMEINTERVALS for the
>>    coded pixelformats and S_PARM(OUTPUT). Open question: some drivers
>>    (mediatek, hva, coda) require S_PARM(OUTPUT), some (venus) allow both
>>    S_PARM(CAPTURE) and S_PARM(OUTPUT). I am inclined to allow both since
>>    this is not a CAPTURE vs OUTPUT thing, it is global to both queues.
> 
> Is ENUM_FRAMEINTERVALS really required? This will be a hint to the
> encoder, so having the encoder round to its internal precision does not
> seem very important.

I don't like this proposal. Especially the use of S_PARM (I *hate* that ioctl).
I think the frame period should be a control with a min/max range, like
any other control.

FRAMEINTERVALS really refers to the rate at which frames arrive when
capturing, and that's not quite what is happening in an encoder.

For now I want to concentrate on the decoder spec, and I'll come back to
this later.

> 
>>
>> 2) Interactions between OUTPUT and CAPTURE formats.
>>
>>    The main problem is what to do if the capture sizeimage is too small
>>    for the OUTPUT resolution when streaming starts.
>>
>>    Proposal: width and height of S_FMT(OUTPUT) are used to
>>    calculate a minimum sizeimage (app may request more). This is
>>    driver-specific.
>>
>>    V4L2_FMT_FLAG_FIXED_RESOLUTION is always set for codec formats
>>    for the encoder (i.e. we don't support mid-stream resolution
>>    changes for now) and V4L2_EVENT_SOURCE_CHANGE is not
>>    supported. See https://patchwork.linuxtv.org/patch/56478/ for
>>    the patch adding this flag.
>>
>>    Of course, if we start to support mid-stream resolution
>>    changes (or other changes that require a source change event),
>>    then this flag should be dropped by the encoder driver and
>>    documentation on how to handle the source change event should
>>    be documented in the encoder spec. I prefer to postpone this
>>    until we have an encoder that can actually do mid-stream
>>    resolution changes.
> 
> For H264, mid-stream changes would make sense for the case where we'd like
> the stateful encoder to emit multiple PPS at start, so the switch
> would simply change the PPS index. The problem seems to be how do we
> expose "multiple" resolutions in our interface? As there is currently
> no solution to this, I would not see much use for having this supported
> at the moment.

I agree. That's why I want to postpone that part.

> 
>>
>>    If sizeimage of the CAPTURE is too small for the OUTPUT
>>    resolution and V4L2_EVENT_SOURCE_CHANGE is not supported,
>>    then the second STREAMON (either CAPTURE or OUTPUT) will
>>    return -ENOMEM since there is not enough memory to do the
>>    encode.
>>
>>    If V4L2_FMT_FLAG_FIXED_RESOLUTION is set (i.e. that should
>>    be the case for all current encoders), then any bitrate controls
>>    will be limited in range to what the current state (CAPTURE and
>>    OUTPUT formats and frame rate) supports.

Note that this flag will be inverted: V4L2_FMT_FLAG_DYN_RESOLUTION.
So this text is out of date in that regard.

Regards,

	Hans

Nicolas Dufresne June 12, 2019, 12:33 a.m. UTC | #5
Le mardi 11 juin 2019 à 10:35 +0200, Hans Verkuil a écrit :
> On 6/10/19 5:57 PM, Nicolas Dufresne wrote:
> > Le lundi 03 juin 2019 à 13:28 +0200, Hans Verkuil a écrit :
> > > For the encoder there are two remaining TODOs for the API:
> > > 
> > > 1) Setting the frame rate so bitrate control can make sense, since
> > >    they need to know this information.
> > > 
> > >    Suggested solution: require support for ENUM_FRAMEINTERVALS for the
> > >    coded pixelformats and S_PARM(OUTPUT). Open question: some drivers
> > >    (mediatek, hva, coda) require S_PARM(OUTPUT), some (venus) allow both
> > >    S_PARM(CAPTURE) and S_PARM(OUTPUT). I am inclined to allow both since
> > >    this is not a CAPTURE vs OUTPUT thing, it is global to both queues.
> > 
> > Is ENUM_FRAMEINTERVALS really required? This will be a hint to the
> > encoder, so having the encoder round to its internal precision does not
> > seem very important.
> 
> I don't like this proposal. Especially the use of S_PARM (I *hate* that ioctl).
> I think the frame period should be a control with a min/max range, like
> any other control.
> 
> FRAMEINTERVALS really refers to the rate at which frames arrive when
> capturing, and that's not quite what is happening in an encoder.

The only catch is that this is exactly what happens when doing live
encoding (to cite GStreamer):

   v4l2src ! v4l2encoder ! ...

The frame interval is exactly what the V4L2 capture device reports.
That said, a lot of encoders used for RTP transcoding take a per-frame
duration as input, as RTP does not always expose the framerate (aka
frame duration). Or, the other way around, the frame interval is
calculated and updated periodically (S_PARM does have the advantage that
you are supposed to be able to change it while streaming; no idea
whether that works in practice). I do not like the ioctl() though. This
could have been a control, and it could be called framerate to match
what the rest of the world uses.

> 
> For now I want to concentrate on the decoder spec, and I'll come back to
> this later.

Ok.

Hans Verkuil June 13, 2019, 6:48 a.m. UTC | #6
On 6/3/19 1:28 PM, Hans Verkuil wrote:
> For the encoder there are two remaining TODOs for the API:
> 
> 1) Setting the frame rate so bitrate control can make sense, since
>    they need to know this information.
> 
>    Suggested solution: require support for ENUM_FRAMEINTERVALS for the
>    coded pixelformats and S_PARM(OUTPUT). Open question: some drivers
>    (mediatek, hva, coda) require S_PARM(OUTPUT), some (venus) allow both
>    S_PARM(CAPTURE) and S_PARM(OUTPUT). I am inclined to allow both since
>    this is not a CAPTURE vs OUTPUT thing, it is global to both queues.

Alternative proposal:

1) Add support for fractions (struct v4l2_fract) as a control type:
   V4L2_CTRL_TYPE_FRACT.

2) Add a new V4L2_CID_MPEG_FRAME_INTERVAL control.

Encoders shall support this control.

3) For backwards compatibility reasons encoder drivers still have to
support G/S_PARM, but this can now be implemented by standard helpers
that query this control. Drivers also have to implement ENUM_FRAMEINTERVALS.
If the range of intervals is always the same regardless of the frame size,
then a helper can be used that queries the min/max/step of the control, but
if it is dependent on the frame size, then it has to be implemented in the
driver itself.

I'm sticking to frame intervals instead of frame rates for the simple reason
that that's what V4L2 has used since the beginning. I think it is too confusing
to change this to a frame rate. This is just my opinion, though.

I also chose to make this a codec control, not a generic user control: this
value together with the bit rate control(s) determines the compression size;
it does not determine the actual time it takes for the encoder to compress
the raw frames. Hence it is really not the same thing as the frame interval
of a video capture device. If we want to use a control for that as well in
the future as a replacement for G/S_PARM, then that should be a new control.
And we would likely need per-pad controls as well in order to implement that.
Which is a lot more work.
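To illustrate the intended semantics of such a fraction control (a model
only: V4L2_CTRL_TYPE_FRACT and V4L2_CID_MPEG_FRAME_INTERVAL are proposals
from this thread, not existing uAPI), clamping a requested frame interval
to the control's min/max range could be done with exact rational
comparisons, avoiding floating point:

```c
#include <stdint.h>

/* Minimal model of a fraction-valued control's range check.  The interval
 * min (shortest, e.g. 1/120 s for 120 fps) and max (longest, e.g. 1/1 s
 * for 1 fps) play the role of the control's min/max values. */
struct fract { uint32_t numerator, denominator; };

/* Compare a/b against c/d via cross-multiplication: a*d <=> c*b. */
static int fract_cmp(struct fract a, struct fract b)
{
	uint64_t l = (uint64_t)a.numerator * b.denominator;
	uint64_t r = (uint64_t)b.numerator * a.denominator;
	return l < r ? -1 : l > r ? 1 : 0;
}

/* Clamp a requested interval into [min, max], as the control framework
 * would do for an out-of-range VIDIOC_S_EXT_CTRLS value. */
static struct fract fract_clamp(struct fract v, struct fract min,
				struct fract max)
{
	if (fract_cmp(v, min) < 0)
		return min;
	if (fract_cmp(v, max) > 0)
		return max;
	return v;
}
```

A request of 1/240 s (240 fps) against a 1/120..1/1 range would come back
as 1/120, mirroring how integer controls are clamped today.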

Regards,

	Hans
Nicolas Dufresne June 14, 2019, 1:09 a.m. UTC | #7
Le jeudi 13 juin 2019 à 08:48 +0200, Hans Verkuil a écrit :
> On 6/3/19 1:28 PM, Hans Verkuil wrote:
> > For the encoder there are two remaining TODOs for the API:
> > 
> > 1) Setting the frame rate so bitrate control can make sense, since
> >    they need to know this information.
> > 
> >    Suggested solution: require support for ENUM_FRAMEINTERVALS for the
> >    coded pixelformats and S_PARM(OUTPUT). Open question: some drivers
> >    (mediatek, hva, coda) require S_PARM(OUTPUT), some (venus) allow both
> >    S_PARM(CAPTURE) and S_PARM(OUTPUT). I am inclined to allow both since
> >    this is not a CAPTURE vs OUTPUT thing, it is global to both queues.
> 
> Alternative proposal:
> 
> 1) Add support for fractions (struct v4l2_fract) as a control type:
>    V4L2_CTRL_TYPE_FRACT.
> 
> 2) Add a new V4L2_CID_MPEG_FRAME_INTERVAL control.

Is the MPEG namespace historical? That might be confusing for users.

> 
> Encoders shall support this control.
> 
> 3) For backwards compatibility reasons encoder drivers still have to
> support G/S_PARM, but this can now be implemented by standard helpers
> that query this control. Drivers also have to implement ENUM_FRAMEINTERVALS.

That won't be very friendly for UI generators like qv4l2. Support for
v4l2_fract as a control should include a way to describe the supported
values of that control in the usual way, I think.

Also, long term, it would be nice to have two sets of frame rates: the
one that the HW can handle in real time, and the one that can be used for
bitrate calculation. So staying away from ENUM_FRAMEINTERVALS for
bitrate configuration would be nicer.

> If the range of intervals is always the same regardless of the frame size,
> then a helper can be used that queries the min/max/step of the control, but
> if it is dependent on the frame size, then it has to be implemented in the
> driver itself.
> 
> I'm sticking to frame intervals instead of frame rates for the simple reason
> that that's what V4L2 has used since the beginning. I think it is too confusing
> to change this to a frame rate. This is just my opinion, though.

I suggested frame rate since this is what I saw implemented by HW
registers (if you think it's worth it, I can try and make a list).
Also, frame-interval steps are not compatible with frame-rate steps
(something that was raised through a venus driver bug last year). Even
v4l2-ctl was displaying that in a very confusing way. Something as
simple as 1 to 30 fps cannot be exposed through ENUM_FRAMEINTERVALS. You
are forced to expose the full fractional range of intervals, from 1/30
to 1/1. For venus it was not that much of a trouble, since it stores a
framerate as Q16.
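The incompatibility is easy to verify: the gap between successive
intervals 1/f and 1/(f-1) is 1/(f*(f-1)), which varies with f, so no
single step value can describe an integer fps range. A small sketch
(the helper is illustrative, not driver code):

```c
#include <stdint.h>

/* Check whether the frame intervals 1/fps for fps = max_fps..1 form a
 * fixed-step progression, as V4L2_FRMIVAL_TYPE_STEPWISE would require.
 * The step from 1/f to 1/(f-1) is 1/(f*(f-1)), so the intervals are
 * uniformly spaced only if f*(f-1) is the same for every f, i.e. only
 * for trivially short ranges. */
static int intervals_have_uniform_step(uint32_t max_fps)
{
	uint64_t first = 0;

	for (uint32_t f = max_fps; f >= 2; f--) {
		uint64_t d = (uint64_t)f * (f - 1); /* 1/d is the step */
		if (!first)
			first = d;
		else if (d != first)
			return 0;
	}
	return 1;
}
```

For max_fps = 30 this returns 0 (the steps 1/870, 1/812, ... all differ),
which is exactly why a 1..30 fps range cannot be expressed as a stepwise
set of frame intervals.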

> 
> I also chose to make this a codec control, not a generic user control: this
> value together with the bit rate control(s) determine the compression size,
> it does not determine the actual time it takes for the encoder to compress
> the raw frames. Hence it is really not the same thing as the frame interval

That's a good point.

> of a video capture device. If we want to use a control for that as well in
> the future as a replacement for G/S_PARM, then that should be a new control.
> And we would like need per-pad controls as well in order to implement that.
> Which is a lot more work.
> 
> Regards,
> 
> 	Hans
Hans Verkuil June 15, 2019, 8:08 a.m. UTC | #8
On 6/14/19 3:09 AM, Nicolas Dufresne wrote:
> Le jeudi 13 juin 2019 à 08:48 +0200, Hans Verkuil a écrit :
>> On 6/3/19 1:28 PM, Hans Verkuil wrote:
>>> For the encoder there are two remaining TODOs for the API:
>>>
>>> 1) Setting the frame rate so bitrate control can make sense, since
>>>    they need to know this information.
>>>
>>>    Suggested solution: require support for ENUM_FRAMEINTERVALS for the
>>>    coded pixelformats and S_PARM(OUTPUT). Open question: some drivers
>>>    (mediatek, hva, coda) require S_PARM(OUTPUT), some (venus) allow both
>>>    S_PARM(CAPTURE) and S_PARM(OUTPUT). I am inclined to allow both since
>>>    this is not a CAPTURE vs OUTPUT thing, it is global to both queues.
>>
>> Alternative proposal:
>>
>> 1) Add support for fractions (struct v4l2_fract) as a control type:
>>    V4L2_CTRL_TYPE_FRACT.
>>
>> 2) Add a new V4L2_CID_MPEG_FRAME_INTERVAL control.
> 
> Is the MPEG namespace historical ? That might be confusing for users.

Yes, it's historical. I have toyed with the idea of renaming all the
defines to something like V4L2_CID_CODEC_... (keeping the old defines, of
course), but I'm not sure it is worth it.

> 
>>
>> Encoders shall support this control.
>>
>> 3) For backwards compatibility reasons encoder drivers still have to
>> support G/S_PARM, but this can now be implemented by standard helpers
>> that query this control. Drivers also have to implement ENUM_FRAMEINTERVALS.
> 
> That won't be very friendly for UI generators like qv4l2. Support for
> v4l2_fract as a control should include a way to describe the supported
> values of that control in the usual way, I think.

Such a control will definitely have the usual min/max/step/default control
values.

> Also, long term, it would be nice to have two sets of frame rates: the
> one that the HW can handle in real time, and the one that can be used for
> bitrate calculation. So staying away from ENUM_FRAMEINTERVALS for
> bitrate configuration would be nicer.

I'm not sure if that's feasible in practice, although the idea is nice.
The 'real-time' framerate will likely depend to a huge extent on the
frequency of various internal clocks and the content of the bitstream.

I suspect it will be very hard if not impossible to report realistic
ENUM_FRAMEINTERVAL values for codecs.

> 
>> If the range of intervals is always the same regardless of the frame size,
>> then a helper can be used that queries the min/max/step of the control, but
>> if it is dependent on the frame size, then it has to be implemented in the
>> driver itself.
>>
>> I'm sticking to frame intervals instead of frame rates for the simple reason
>> that that's what V4L2 has used since the beginning. I think it is too confusing
>> to change this to a frame rate. This is just my opinion, though.
> 
> I suggested frame rate since this is what I saw implemented by HW
> registers (if you think it's worth it, I can try and make a list).
> Also, frame-interval steps are not compatible with frame-rate steps
> (something that was raised through a venus driver bug last year). Even
> v4l2-ctl was displaying that in a very confusing way. Something as
> simple as 1 to 30 fps cannot be exposed through ENUM_FRAMEINTERVALS. You
> are forced to expose the full fractional range of intervals, from 1/30
> to 1/1. For venus it was not that much of a trouble, since it stores a
> framerate as Q16.

Since this is used for bitrate calculations, and not for determining the
exact framerate, I'm not sure it matters all that much in this particular
case. Since you still need to implement G/S_PARM in drivers for backwards
compatibility reasons I think it is easiest to keep using frame interval
instead of frame rate.

Actually, that raises a new question: are there codecs that store the frame rate
in the bitstream? I haven't heard of that, so I suspect not, but I'm not certain.

Regards,

	Hans

Nicolas Dufresne June 16, 2019, 12:17 a.m. UTC | #9
Le samedi 15 juin 2019 à 10:08 +0200, Hans Verkuil a écrit :
> On 6/14/19 3:09 AM, Nicolas Dufresne wrote:
> > Le jeudi 13 juin 2019 à 08:48 +0200, Hans Verkuil a écrit :
> > > On 6/3/19 1:28 PM, Hans Verkuil wrote:
> > > > Since Tomasz was very busy with other things, I've taken over this
> > > > patch series. This v4 includes his draft changes and additional changes
> > > > from me.
> > > > 
> > > > This series attempts to add the documentation of what was discussed
> > > > during Media Workshops at LinuxCon Europe 2012 in Barcelona and then
> > > > later Embedded Linux Conference Europe 2014 in Düsseldorf and then
> > > > eventually written down by Pawel Osciak and tweaked a bit by Chrome OS
> > > > video team (but mostly in a cosmetic way or making the document more
> > > > precise), during the several years of Chrome OS using the APIs in
> > > > production.
> > > > 
> > > > Note that most, if not all, of the API is already implemented in
> > > > existing mainline drivers, such as s5p-mfc or mtk-vcodec. Intention of
> > > > this series is just to formalize what we already have.
> > > > 
> > > > Thanks everyone for the huge amount of useful comments to previous
> > > > versions of this series. Much of the credits should go to Pawel Osciak
> > > > too, for writing most of the original text of the initial RFC.
> > > > 
> > > > This v4 incorporates all known comments (let me know if I missed
> > > > something!) and should be complete for the decoder.
> > > > 
> > > > For the encoder there are two remaining TODOs for the API:
> > > > 
> > > > 1) Setting the frame rate so bitrate control can make sense, since
> > > >    they need to know this information.
> > > > 
> > > >    Suggested solution: require support for ENUM_FRAMEINTERVALS for the
> > > >    coded pixelformats and S_PARM(OUTPUT). Open question: some drivers
> > > >    (mediatek, hva, coda) require S_PARM(OUTPUT), some (venus) allow both
> > > >    S_PARM(CAPTURE) and S_PARM(OUTPUT). I am inclined to allow both since
> > > >    this is not a CAPTURE vs OUTPUT thing, it is global to both queues.
> > > 
> > > Alternative proposal:
> > > 
> > > 1) Add support for fractions (struct v4l2_fract) as a control type:
> > >    V4L2_CTRL_TYPE_FRACT.
> > > 
> > > 2) Add a new V4L2_CID_MPEG_FRAME_INTERVAL control.
> > 
> > Is the MPEG namespace historical ? That might be confusing for users.
> 
> Yes, it's historical. I have toyed with the idea of renaming all the
> defines to something like V4L2_CID_CODEC_... (keeping the old defines, of
> course), but I'm not sure it is worth it.
> 
> > > Encoders shall support this control.
> > > 
> > > 3) For backwards compatibility reasons encoder drivers still have to
> > > support G/S_PARM, but this can now be implemented by standard helpers
> > > that query this control. Drivers also have to implement ENUM_FRAMEINTERVALS.
> > 
> > That won't be very friendly for UI generators like qv4l2. Support for
> > v4l2_fract as a control should include a way to describe the supported
> > values of that control in the usual way, I think.
> 
> Such a control will definitely have the usual min/max/step/default control
> values.
> 
> > Also, long term, it would be nice to have two sets of frame rates. The
> > one that the HW can handle "real-time" and the one that can be used for
> > bitrate calculation. So staying away from ENUM_FRAMEINTERVALS for
> > bitrate configuration would be nicer.
> 
> I'm not sure if that's feasible in practice, although the idea is nice.
> The 'real-time' framerate will likely depend to a huge extent on the
> frequency of various internal clocks and the content of the bitstream.
> 
> I suspect it will be very hard, if not impossible, to report realistic
> ENUM_FRAMEINTERVALS values for codecs.
> 
> > > If the range of intervals is always the same regardless of the frame size,
> > > then a helper can be used that queries the min/max/step of the control, but
> > > if it is dependent on the frame size, then it has to be implemented in the
> > > driver itself.
> > > 
> > > I'm sticking to frame intervals instead of frame rates for the simple reason
> > > that that's what V4L2 has used since the beginning. I think it is too confusing
> > > to change this to a frame rate. This is just my opinion, though.
> > 
> > I suggested frame rate since this is what I saw implemented by HW
> > registers (if you think it's worth it, I can try and make a list).
> > Also, frame-interval steps are not compatible with frame-rate steps
> > (something that was raised through a venus driver bug last year). Even
> > v4l2-ctl was displaying that in a very confusing way. Something as
> > simple as 1 to 30 fps cannot be exposed through ENUM_FRAMEINTERVALS. You
> > are forced to expose the full fractional range of intervals, from 1/30
> > to 1/1. For venus it was not that much of a trouble, since it stores the
> > framerate as Q16.
> 
> Since this is used for bitrate calculations, and not for determining the
> exact framerate, I'm not sure it matters all that much in this particular
> case. Since you still need to implement G/S_PARM in drivers for backwards
> compatibility reasons I think it is easiest to keep using frame interval
> instead of frame rate.
> 
> Actually, that raises a new question: are there codecs that store the frame rate
> in the bitstream? I haven't heard of that, so I suspect not, but I'm not certain.

Yes, H.264 and HEVC carry this in the VUI timing info; in HEVC you can
also pass it in the VPS. Both are fractions. I'm not certain, but I
think you can embed a timecode too, and those also carry the fps.

For a stateful encoder, it's always nice if the encoder can produce the
VUI/VPS bits. The worst that could happen is an IP that always produces
that information paired with userspace that didn't set anything.

For decoders, it's not something I really care about, since we do
parse this in GStreamer. That being said, I know that the firmware on
the ZynqMP uses the framerate to calculate the number of cores it uses.
So if you don't pass it, it will choose based on the maximum for the
profile/level, which may greatly reduce the number of concurrent
streams you can run.

> 
> Regards,
> 
> 	Hans
> 
> > > I also chose to make this a codec control, not a generic user control: this
> > > value together with the bit rate control(s) determines the compressed size;
> > > it does not determine the actual time it takes for the encoder to compress
> > > the raw frames. Hence it is really not the same thing as the frame interval
> > 
> > That's a good point.
> > 
> > > of a video capture device. If we want to use a control for that as well in
> > > the future as a replacement for G/S_PARM, then that should be a new control.
> > > And we would likely need per-pad controls as well in order to implement that.
> > > Which is a lot more work.
> > > 
> > > Regards,
> > > 
> > > 	Hans
Tomasz Figa July 3, 2019, 9:04 a.m. UTC | #10
On Wed, Jun 5, 2019 at 12:19 AM Nicolas Dufresne <nicolas@ndufresne.ca> wrote:
>
> Le lundi 03 juin 2019 à 13:28 +0200, Hans Verkuil a écrit :
> > Since Tomasz was very busy with other things, I've taken over this
> > patch series. This v4 includes his draft changes and additional changes
> > from me.
> >
> > This series attempts to add the documentation of what was discussed
> > during Media Workshops at LinuxCon Europe 2012 in Barcelona and then
> > later Embedded Linux Conference Europe 2014 in Düsseldorf and then
> > eventually written down by Pawel Osciak and tweaked a bit by Chrome OS
> > video team (but mostly in a cosmetic way or making the document more
> > precise), during the several years of Chrome OS using the APIs in
> > production.
> >
> > Note that most, if not all, of the API is already implemented in
> > existing mainline drivers, such as s5p-mfc or mtk-vcodec. Intention of
> > this series is just to formalize what we already have.
> >
> > Thanks everyone for the huge amount of useful comments to previous
> > versions of this series. Much of the credits should go to Pawel Osciak
> > too, for writing most of the original text of the initial RFC.
> >
> > This v4 incorporates all known comments (let me know if I missed
> > something!) and should be complete for the decoder.
> >
> > For the encoder there are two remaining TODOs for the API:
> >
> > 1) Setting the frame rate so bitrate control can make sense, since
> >    they need to know this information.
> >
> >    Suggested solution: require support for ENUM_FRAMEINTERVALS for the
> >    coded pixelformats and S_PARM(OUTPUT). Open question: some drivers
> >    (mediatek, hva, coda) require S_PARM(OUTPUT), some (venus) allow both
> >    S_PARM(CAPTURE) and S_PARM(OUTPUT). I am inclined to allow both since
> >    this is not a CAPTURE vs OUTPUT thing, it is global to both queues.
>
> I agree, as long as it's documented. I can imagine how this could be
> confusing for new users.
>
> >
> > 2) Interactions between OUTPUT and CAPTURE formats.
> >
> >    The main problem is what to do if the capture sizeimage is too small
> >    for the OUTPUT resolution when streaming starts.
> >
> >    Proposal: width and height of S_FMT(OUTPUT) are used to
> >    calculate a minimum sizeimage (app may request more). This is
> >    driver-specific.
> >
> >    V4L2_FMT_FLAG_FIXED_RESOLUTION is always set for codec formats
> >    for the encoder (i.e. we don't support mid-stream resolution
> >    changes for now) and V4L2_EVENT_SOURCE_CHANGE is not
> >    supported. See https://patchwork.linuxtv.org/patch/56478/ for
> >    the patch adding this flag.
> >
> >    Of course, if we start to support mid-stream resolution
> >    changes (or other changes that require a source change event),
> >    then this flag should be dropped by the encoder driver and
> >    documentation on how to handle the source change event should
> >    be documented in the encoder spec. I prefer to postpone this
> >    until we have an encoder that can actually do mid-stream
> >    resolution changes.
> >
> >    If sizeimage of the OUTPUT is too small for the CAPTURE
> >    resolution and V4L2_EVENT_SOURCE_CHANGE is not supported,
> >    then the second STREAMON (either CAPTURE or OUTPUT) will
> >    return -ENOMEM since there is not enough memory to do the
> >    encode.
>
> You seem confident that we will know immediately if it's too small. But
> what I remember is that HW has an interrupt for this, allowing
> userspace to allocate a larger buffer and resume.
>
> Should we make the capture queue independent of the streaming state, so
> that we can streamoff/reqbufs/.../streamon to resume from an ENOMEM
> error ? And shouldn't ENOMEM be returned by the following capture DQBUF
> when such an interrupt is raised ?
>

The idea was that stopping the CAPTURE queue would reset the encoder,
i.e. start encoding a new, independent stream after the streaming
starts again. Still, given that one would normally need to reallocate
the buffers on some significant stream parameter change, that would
normally require emitting all the relevant headers anyway, so it
probably doesn't break anything?

Best regards,
Tomasz
Nicolas Dufresne July 3, 2019, 5:07 p.m. UTC | #11
Le mercredi 03 juillet 2019 à 18:04 +0900, Tomasz Figa a écrit :
> On Wed, Jun 5, 2019 at 12:19 AM Nicolas Dufresne <nicolas@ndufresne.ca> wrote:
> > Le lundi 03 juin 2019 à 13:28 +0200, Hans Verkuil a écrit :
> > > Since Tomasz was very busy with other things, I've taken over this
> > > patch series. This v4 includes his draft changes and additional changes
> > > from me.
> > > 
> > > This series attempts to add the documentation of what was discussed
> > > during Media Workshops at LinuxCon Europe 2012 in Barcelona and then
> > > later Embedded Linux Conference Europe 2014 in Düsseldorf and then
> > > eventually written down by Pawel Osciak and tweaked a bit by Chrome OS
> > > video team (but mostly in a cosmetic way or making the document more
> > > precise), during the several years of Chrome OS using the APIs in
> > > production.
> > > 
> > > Note that most, if not all, of the API is already implemented in
> > > existing mainline drivers, such as s5p-mfc or mtk-vcodec. Intention of
> > > this series is just to formalize what we already have.
> > > 
> > > Thanks everyone for the huge amount of useful comments to previous
> > > versions of this series. Much of the credits should go to Pawel Osciak
> > > too, for writing most of the original text of the initial RFC.
> > > 
> > > This v4 incorporates all known comments (let me know if I missed
> > > something!) and should be complete for the decoder.
> > > 
> > > For the encoder there are two remaining TODOs for the API:
> > > 
> > > 1) Setting the frame rate so bitrate control can make sense, since
> > >    they need to know this information.
> > > 
> > >    Suggested solution: require support for ENUM_FRAMEINTERVALS for the
> > >    coded pixelformats and S_PARM(OUTPUT). Open question: some drivers
> > >    (mediatek, hva, coda) require S_PARM(OUTPUT), some (venus) allow both
> > >    S_PARM(CAPTURE) and S_PARM(OUTPUT). I am inclined to allow both since
> > >    this is not a CAPTURE vs OUTPUT thing, it is global to both queues.
> > 
> > I agree, as long as it's documented. I can imagine how this could be
> > confusing for new users.
> > 
> > > 2) Interactions between OUTPUT and CAPTURE formats.
> > > 
> > >    The main problem is what to do if the capture sizeimage is too small
> > >    for the OUTPUT resolution when streaming starts.
> > > 
> > >    Proposal: width and height of S_FMT(OUTPUT) are used to
> > >    calculate a minimum sizeimage (app may request more). This is
> > >    driver-specific.
> > > 
> > >    V4L2_FMT_FLAG_FIXED_RESOLUTION is always set for codec formats
> > >    for the encoder (i.e. we don't support mid-stream resolution
> > >    changes for now) and V4L2_EVENT_SOURCE_CHANGE is not
> > >    supported. See https://patchwork.linuxtv.org/patch/56478/ for
> > >    the patch adding this flag.
> > > 
> > >    Of course, if we start to support mid-stream resolution
> > >    changes (or other changes that require a source change event),
> > >    then this flag should be dropped by the encoder driver and
> > >    documentation on how to handle the source change event should
> > >    be documented in the encoder spec. I prefer to postpone this
> > >    until we have an encoder that can actually do mid-stream
> > >    resolution changes.
> > > 
> > >    If sizeimage of the OUTPUT is too small for the CAPTURE
> > >    resolution and V4L2_EVENT_SOURCE_CHANGE is not supported,
> > >    then the second STREAMON (either CAPTURE or OUTPUT) will
> > >    return -ENOMEM since there is not enough memory to do the
> > >    encode.
> > 
> > You seem confident that we will know immediately if it's too small. But
> > what I remember is that HW has an interrupt for this, allowing
> > userspace to allocate a larger buffer and resume.
> > 
> > Should we make the capture queue independent of the streaming state, so
> > that we can streamoff/reqbufs/.../streamon to resume from an ENOMEM
> > error ? And shouldn't ENOMEM be returned by the following capture DQBUF
> > when such an interrupt is raised ?
> > 
> 
> The idea was that stopping the CAPTURE queue would reset the encoder,
> i.e. start encoding a new, independent stream after the streaming
> starts again. Still, given that one would normally need to reallocate
> the buffers on some significant stream parameter change, that would
> normally require emitting all the relevant headers anyway, so it
> probably doesn't break anything?

The capture buffer size is a prediction, so even without any parameter
changes, the size could become insufficient. On the other hand, we have
managed to predict quite well so far in many applications.

Note, I didn't remember that encoder CAPTURE queue streamoff was the
one triggering the reset. In GStreamer, I always streamoff both, so I
never had to think about this. One thing is clear though: it will be
really hard to extend later with this hard relationship between
allocation, streaming state and encoder state. I'm sure we can survive
this; there are probably plenty of workarounds, including spreading
encoded data across multiple buffers as Hans suggested.

> 
> Best regards,
> Tomasz