mbox series

[PATCHv3,0/3] media: mc: add manual request completion support

Message ID cover.1724928939.git.hverkuil-cisco@xs4all.nl (mailing list archive)
Headers show
Series media: mc: add manual request completion support | expand

Message

Hans Verkuil Aug. 29, 2024, 10:55 a.m. UTC
Normally a request contains one or more request objects, and once all
objects are marked as 'completed' the request itself is completed and
userspace gets a signal that the request is complete.

Calling vb2_buffer_done will complete a buffer object, and
v4l2_ctrl_request_complete will complete a control handler object.

In some cases (e.g. VP9 codecs) there is only a buffer object, so
as soon as the buffer is marked done, the request is marked as
completed. But in the case of mediatek, while the buffer is done
(i.e. the data is consumed by the hardware), the request isn't
completed yet as the data is still being processed. Once the
data is fully processed, the driver wants to call
v4l2_ctrl_request_complete() which will either update an existing
control handler object, or add a new control handler object to the
request containing the latest control values. But since the
request is already completed, calling v4l2_ctrl_request_complete()
will fail.

One option is to simply postpone calling vb2_buffer_done() and do
it after the call to v4l2_ctrl_request_complete(). However, in some
use-cases (e.g. secure memory) the number of available buffers is
very limited and you really want to return a buffer as soon as
possible.

In that case you want to postpone request completion until you
know the request is really ready.

Originally I thought the best way would be to make a dummy request
object, but that turned out to be overly complicated. So instead
I just add a bool manual_completion, which you set to true in
req_queue, and you call media_request_manual_complete() when you
know the request is complete. That was a lot less complicated.

The first patch adds this new manual completion code, the second
patch adds this to vicodec so it is included in regression testing,
and the last patch is an updated old patch of mine that adds debugfs
code to check if all requests and request objects are properly freed.
Without it it is really hard to verify that there are no dangling
requests or objects.

I prefer to merge this third patch as well, but if there are
objections, then I can live without it.

Regards,

	Hans

Changes since v2:
- fixed use-after-free bug in the third patch in media_request_object_release().

Changes since the RFC:

- Added WARN_ONs
- vicodec was calling media_request_manual_complete() without
  checking that it was the stateless output queue first.
- Some minor cleanups in patch 3.

Hans Verkuil (3):
  media: mc: add manual request completion
  media: vicodec: add support for manual completion
  media: mc: add debugfs node to keep track of requests

 drivers/media/mc/mc-device.c                  | 30 +++++++++++++
 drivers/media/mc/mc-devnode.c                 |  5 +++
 drivers/media/mc/mc-request.c                 | 44 ++++++++++++++++++-
 .../media/test-drivers/vicodec/vicodec-core.c | 21 +++++++--
 include/media/media-device.h                  |  9 ++++
 include/media/media-devnode.h                 |  4 ++
 include/media/media-request.h                 | 38 +++++++++++++++-
 7 files changed, 144 insertions(+), 7 deletions(-)

Comments

Steve Cho Oct. 3, 2024, 10:03 p.m. UTC | #1
Hi Hans,

Few questions here.
After this patch, is there any need to reflect in the Request API doc?

On Thu, Aug 29, 2024 at 3:58 AM Hans Verkuil <hverkuil-cisco@xs4all.nl> wrote:
>
> Normally a request contains one or more request objects, and once all
> objects are marked as 'completed' the request itself is completed and
> userspace gets a signal that the request is complete.
>
> Calling vb2_buffer_done will complete a buffer object, and
> v4l2_ctrl_request_complete will complete a control handler object.
>
> In some cases (e.g. VP9 codecs) there is only a buffer object, so
What codecs have the same behavior here?
How are things different with other codecs?

> as soon as the buffer is marked done, the request is marked as
> completed. But in the case of mediatek, while the buffer is done
> (i.e. the data is consumed by the hardware), the request isn't
> completed yet as the data is still being processed. Once the
This sounds like standard HW behavior to me...
How is it different on rockchip, which I heard was not reproducing this issue?

> data is fully processed, the driver wants to call
> v4l2_ctrl_request_complete() which will either update an existing
> control handler object, or add a new control handler object to the
> request containing the latest control values. But since the
> request is already completed, calling v4l2_ctrl_request_complete()
> will fail.
>
> One option is to simply postpone calling vb2_buffer_done() and do
> it after the call to v4l2_ctrl_request_complete(). However, in some
> use-cases (e.g. secure memory) the number of available buffers is
> very limited and you really want to return a buffer as soon as
> possible.
>
> In that case you want to postpone request completion until you
> know the request is really ready.
>
> Originally I thought the best way would be to make a dummy request
> object, but that turned out to be overly complicated. So instead
> I just add a bool manual_completion, which you set to true in
> req_queue, and you call media_request_manual_complete() when you
> know the request is complete. That was a lot less complicated.
>
> The first patch adds this new manual completion code, the second
> patch adds this to vicodec so it is included in regression testing,
> and the last patch is an updated old patch of mine that adds debugfs
> code to check if all requests and request objects are properly freed.
> Without it it is really hard to verify that there are no dangling
> requests or objects.
>
> I prefer to merge this third patch as well, but if there are
> objections, then I can live without it.
>
> Regards,
>
>         Hans
>
> Changes since v2:
> - fixed use-after-free bug in the third patch in media_request_object_release().
>
> Changes since the RFC:
>
> - Added WARN_ONs
> - vicodec was calling media_request_manual_complete() without
>   checking that it was the stateless output queue first.
> - Some minor cleanups in patch 3.
>
> Hans Verkuil (3):
>   media: mc: add manual request completion
>   media: vicodec: add support for manual completion
>   media: mc: add debugfs node to keep track of requests
>
>  drivers/media/mc/mc-device.c                  | 30 +++++++++++++
>  drivers/media/mc/mc-devnode.c                 |  5 +++
>  drivers/media/mc/mc-request.c                 | 44 ++++++++++++++++++-
>  .../media/test-drivers/vicodec/vicodec-core.c | 21 +++++++--
>  include/media/media-device.h                  |  9 ++++
>  include/media/media-devnode.h                 |  4 ++
>  include/media/media-request.h                 | 38 +++++++++++++++-
>  7 files changed, 144 insertions(+), 7 deletions(-)
>
> --
> 2.43.0
>
>
Nicolas Dufresne Oct. 5, 2024, 2:54 a.m. UTC | #2
Hi Steve,

Le jeudi 03 octobre 2024 à 15:03 -0700, Steve Cho a écrit :
> Hi Hans,
> 
> Few questions here.
> After this patch, is there any need to reflect in the Request API doc?
> 
> On Thu, Aug 29, 2024 at 3:58 AM Hans Verkuil <hverkuil-cisco@xs4all.nl> wrote:
> > 
> > Normally a request contains one or more request objects, and once all
> > objects are marked as 'completed' the request itself is completed and
> > userspace gets a signal that the request is complete.
> > 
> > Calling vb2_buffer_done will complete a buffer object, and
> > v4l2_ctrl_request_complete will complete a control handler object.
> > 
> > In some cases (e.g. VP9 codecs) there is only a buffer object, so
> What codecs have the same behavior here?
> How are things different with other codecs?

CODECs with single header are typically affected. VP9 have no sequence header of
any sort. Userspace will have to set an initial VP9 Frame header out of request
in order to negotiate the format with the driver. VP8 and AV1 are also possible
candidates.

> 
> > as soon as the buffer is marked done, the request is marked as
> > completed. But in the case of mediatek, while the buffer is done
> > (i.e. the data is consumed by the hardware), the request isn't
> > completed yet as the data is still being processed. Once the
> This sounds like standard HW behavior to me...
> How is it different on rockchip, which I heard was not reproducing this issue?

This only affects MTK VCODEC, and only very recent version of Chromium. Event if
Hantro and RKVDEC are not affected, at the time RK3399 support was dropped by
ChromeOS team, Chromium was still sending all the controls into all the request.
Incidently, we never tested the effect of having request without controls.

The intention here is to give drivers the control over when the following steps
occurs. In practice, the most logical event signalling order would be:

- Apply the new controls to the global state
- Mark the bitstream buffer done (bitstream buffers can immediately be refilled)
- Mark the picture buffer done
- Signal the request completion

But all drivers, except MTK, uses the helper v4l2_m2m_buf_done_and_job_finish(),
which imply

- Apply the new controls to the global state
- Mark the picture buffer done
- Mark the bitstream buffer done
- After which the last object in the request is dropped and the implicit
signalling triggers

In order to render the bistream buffers available earlier to user space, the MTK
driver is currently delaying the application of controls. But if there is no
control object in the request, this mechanism is not working and lead to the
splat we are trying to fix in a clean way.

Nicolas

> 
> > data is fully processed, the driver wants to call
> > v4l2_ctrl_request_complete() which will either update an existing
> > control handler object, or add a new control handler object to the
> > request containing the latest control values. But since the
> > request is already completed, calling v4l2_ctrl_request_complete()
> > will fail.
> > 
> > One option is to simply postpone calling vb2_buffer_done() and do
> > it after the call to v4l2_ctrl_request_complete(). However, in some
> > use-cases (e.g. secure memory) the number of available buffers is
> > very limited and you really want to return a buffer as soon as
> > possible.
> > 
> > In that case you want to postpone request completion until you
> > know the request is really ready.
> > 
> > Originally I thought the best way would be to make a dummy request
> > object, but that turned out to be overly complicated. So instead
> > I just add a bool manual_completion, which you set to true in
> > req_queue, and you call media_request_manual_complete() when you
> > know the request is complete. That was a lot less complicated.
> > 
> > The first patch adds this new manual completion code, the second
> > patch adds this to vicodec so it is included in regression testing,
> > and the last patch is an updated old patch of mine that adds debugfs
> > code to check if all requests and request objects are properly freed.
> > Without it it is really hard to verify that there are no dangling
> > requests or objects.
> > 
> > I prefer to merge this third patch as well, but if there are
> > objections, then I can live without it.
> > 
> > Regards,
> > 
> >         Hans
> > 
> > Changes since v2:
> > - fixed use-after-free bug in the third patch in media_request_object_release().
> > 
> > Changes since the RFC:
> > 
> > - Added WARN_ONs
> > - vicodec was calling media_request_manual_complete() without
> >   checking that it was the stateless output queue first.
> > - Some minor cleanups in patch 3.
> > 
> > Hans Verkuil (3):
> >   media: mc: add manual request completion
> >   media: vicodec: add support for manual completion
> >   media: mc: add debugfs node to keep track of requests
> > 
> >  drivers/media/mc/mc-device.c                  | 30 +++++++++++++
> >  drivers/media/mc/mc-devnode.c                 |  5 +++
> >  drivers/media/mc/mc-request.c                 | 44 ++++++++++++++++++-
> >  .../media/test-drivers/vicodec/vicodec-core.c | 21 +++++++--
> >  include/media/media-device.h                  |  9 ++++
> >  include/media/media-devnode.h                 |  4 ++
> >  include/media/media-request.h                 | 38 +++++++++++++++-
> >  7 files changed, 144 insertions(+), 7 deletions(-)
> > 
> > --
> > 2.43.0
> > 
> >
Steve Cho Oct. 7, 2024, 4:36 p.m. UTC | #3
Thank you Nicolas for the explanation.

On Fri, Oct 4, 2024 at 7:54 PM Nicolas Dufresne
<nicolas.dufresne@collabora.com> wrote:
>
> Hi Steve,
>
> Le jeudi 03 octobre 2024 à 15:03 -0700, Steve Cho a écrit :
> > Hi Hans,
> >
> > Few questions here.
> > After this patch, is there any need to reflect in the Request API doc?
> >
> > On Thu, Aug 29, 2024 at 3:58 AM Hans Verkuil <hverkuil-cisco@xs4all.nl> wrote:
> > >
> > > Normally a request contains one or more request objects, and once all
> > > objects are marked as 'completed' the request itself is completed and
> > > userspace gets a signal that the request is complete.
> > >
> > > Calling vb2_buffer_done will complete a buffer object, and
> > > v4l2_ctrl_request_complete will complete a control handler object.
> > >
> > > In some cases (e.g. VP9 codecs) there is only a buffer object, so
> > What codecs have the same behavior here?
> > How are things different with other codecs?
>
> CODECs with single header are typically affected. VP9 have no sequence header of
> any sort. Userspace will have to set an initial VP9 Frame header out of request
> in order to negotiate the format with the driver. VP8 and AV1 are also possible
> candidates.
>
> >
> > > as soon as the buffer is marked done, the request is marked as
> > > completed. But in the case of mediatek, while the buffer is done
> > > (i.e. the data is consumed by the hardware), the request isn't
> > > completed yet as the data is still being processed. Once the
> > This sounds like standard HW behavior to me...
> > How is it different on rockchip, which I heard was not reproducing this issue?
>
> This only affects MTK VCODEC, and only very recent version of Chromium. Event if
I assume you are referring to what we call "v4l2 flat stateless"
implementation.
This was developed, but it was not enabled.

> Hantro and RKVDEC are not affected, at the time RK3399 support was dropped by
> ChromeOS team, Chromium was still sending all the controls into all the request.
> Incidently, we never tested the effect of having request without controls.
>
> The intention here is to give drivers the control over when the following steps
> occurs. In practice, the most logical event signalling order would be:
>
> - Apply the new controls to the global state
> - Mark the bitstream buffer done (bitstream buffers can immediately be refilled)
> - Mark the picture buffer done
> - Signal the request completion
>
> But all drivers, except MTK, uses the helper v4l2_m2m_buf_done_and_job_finish(),
> which imply
>
> - Apply the new controls to the global state
> - Mark the picture buffer done
> - Mark the bitstream buffer done
> - After which the last object in the request is dropped and the implicit
> signalling triggers
>
> In order to render the bistream buffers available earlier to user space, the MTK
> driver is currently delaying the application of controls. But if there is no
> control object in the request, this mechanism is not working and lead to the
> splat we are trying to fix in a clean way.
>
> Nicolas
>
> >
> > > data is fully processed, the driver wants to call
> > > v4l2_ctrl_request_complete() which will either update an existing
> > > control handler object, or add a new control handler object to the
> > > request containing the latest control values. But since the
> > > request is already completed, calling v4l2_ctrl_request_complete()
> > > will fail.
> > >
> > > One option is to simply postpone calling vb2_buffer_done() and do
> > > it after the call to v4l2_ctrl_request_complete(). However, in some
> > > use-cases (e.g. secure memory) the number of available buffers is
> > > very limited and you really want to return a buffer as soon as
> > > possible.
> > >
> > > In that case you want to postpone request completion until you
> > > know the request is really ready.
> > >
> > > Originally I thought the best way would be to make a dummy request
> > > object, but that turned out to be overly complicated. So instead
> > > I just add a bool manual_completion, which you set to true in
> > > req_queue, and you call media_request_manual_complete() when you
> > > know the request is complete. That was a lot less complicated.
> > >
> > > The first patch adds this new manual completion code, the second
> > > patch adds this to vicodec so it is included in regression testing,
> > > and the last patch is an updated old patch of mine that adds debugfs
> > > code to check if all requests and request objects are properly freed.
> > > Without it it is really hard to verify that there are no dangling
> > > requests or objects.
> > >
> > > I prefer to merge this third patch as well, but if there are
> > > objections, then I can live without it.
> > >
> > > Regards,
> > >
> > >         Hans
> > >
> > > Changes since v2:
> > > - fixed use-after-free bug in the third patch in media_request_object_release().
> > >
> > > Changes since the RFC:
> > >
> > > - Added WARN_ONs
> > > - vicodec was calling media_request_manual_complete() without
> > >   checking that it was the stateless output queue first.
> > > - Some minor cleanups in patch 3.
> > >
> > > Hans Verkuil (3):
> > >   media: mc: add manual request completion
> > >   media: vicodec: add support for manual completion
> > >   media: mc: add debugfs node to keep track of requests
> > >
> > >  drivers/media/mc/mc-device.c                  | 30 +++++++++++++
> > >  drivers/media/mc/mc-devnode.c                 |  5 +++
> > >  drivers/media/mc/mc-request.c                 | 44 ++++++++++++++++++-
> > >  .../media/test-drivers/vicodec/vicodec-core.c | 21 +++++++--
> > >  include/media/media-device.h                  |  9 ++++
> > >  include/media/media-devnode.h                 |  4 ++
> > >  include/media/media-request.h                 | 38 +++++++++++++++-
> > >  7 files changed, 144 insertions(+), 7 deletions(-)
> > >
> > > --
> > > 2.43.0
> > >
> > >
>