diff mbox

[v3,4/9] media: venus: vdec: add video decoder files

Message ID 15975057-dd6a-6946-07ac-93a748b6a176@linaro.org (mailing list archive)
State New, archived
Headers show

Commit Message

Stanimir Varbanov Nov. 21, 2016, 3:29 p.m. UTC
Hi Hans,

On 11/21/2016 05:04 PM, Hans Verkuil wrote:
> On 18/11/16 10:11, Stanimir Varbanov wrote:
>> Hi Hans,
>>
>>>>> +
>>>>> +static int
>>>>> +vdec_reqbufs(struct file *file, void *fh, struct
>>>>> v4l2_requestbuffers *b)
>>>>> +{
>>>>> +    struct vb2_queue *queue = to_vb2q(file, b->type);
>>>>> +
>>>>> +    if (!queue)
>>>>> +        return -EINVAL;
>>>>> +
>>>>> +    return vb2_reqbufs(queue, b);
>>>>> +}
>>>>
>>>> Is there any reason why the v4l2_m2m_ioctl_reqbufs et al helper
>>>> functions
>>>> can't be used? I strongly recommend that, unless there is a specific
>>>> reason
>>>> why that won't work.
>>>
>>> So that means I need to completely rewrite the v4l2 part and adopt it
>>> for mem2mem device APIs.
>>>
>>> If that is what you meant I can invest some time to make a estimation
>>> what would be the changes and time needed. After that we can decide what
>>> to do - take the driver as is and port it to mem2mem device APIs later
>>> on or wait for the this transition to happen before merging.
>>>
>>
>> I made an attempt to adopt v4l2 part of the venus driver to m2m API's
>> and the result was ~300 less lines of code, but with the price of few
>> extensions in m2m APIs (and I still have issues with running
>> simultaneously multiple instances).
>>
>> I have to add few functions/macros to iterate over a list (list_for_each
>> and friends). This is used to find the returned from decoder buffers by
>> address and associate them to vb2_buffer, because the decoder can change
>> the order of the output buffers.
>>
>> The main problem I have is registering of the capture buffers before
>> session_start. This is requirement (disadvantage) of the firmware
>> implementation i.e. I need to announce capture buffers (address and size
>> of the buffer) to the firmware before start buffer interaction by
>> session_start.
>>
>> So having that I think I will need one more week to stabilize the driver
>> to the state that it was before this m2m transition.
>>
>> Thoughts?
>>
> 
> It sounds like this it worth doing, since if you need these extensions,
> then
> it is likely someone else will need it as well.

Meanwhile I have found bigger obstacle - I cannot run multiple instances
simultaneously. By m2m design it can execute only one job (m2m context)
at a time per m2m device. Can you confirm that my observation is correct?

If so one solution could be on every fops::open I can create m2m_dev and
init m2m_cxt.

> 
> Can you mail me a preliminary patch with the core m2m changes? That will be
> helpful for me to look at.

Something like below diffs:

  *

Comments

Hans Verkuil Nov. 21, 2016, 3:33 p.m. UTC | #1
On 21/11/16 16:29, Stanimir Varbanov wrote:
> Hi Hans,
>
> On 11/21/2016 05:04 PM, Hans Verkuil wrote:
>> On 18/11/16 10:11, Stanimir Varbanov wrote:
>>> Hi Hans,
>>>
>>>>>> +
>>>>>> +static int
>>>>>> +vdec_reqbufs(struct file *file, void *fh, struct
>>>>>> v4l2_requestbuffers *b)
>>>>>> +{
>>>>>> +    struct vb2_queue *queue = to_vb2q(file, b->type);
>>>>>> +
>>>>>> +    if (!queue)
>>>>>> +        return -EINVAL;
>>>>>> +
>>>>>> +    return vb2_reqbufs(queue, b);
>>>>>> +}
>>>>>
>>>>> Is there any reason why the v4l2_m2m_ioctl_reqbufs et al helper
>>>>> functions
>>>>> can't be used? I strongly recommend that, unless there is a specific
>>>>> reason
>>>>> why that won't work.
>>>>
>>>> So that means I need to completely rewrite the v4l2 part and adopt it
>>>> for mem2mem device APIs.
>>>>
>>>> If that is what you meant I can invest some time to make a estimation
>>>> what would be the changes and time needed. After that we can decide what
>>>> to do - take the driver as is and port it to mem2mem device APIs later
>>>> on or wait for the this transition to happen before merging.
>>>>
>>>
>>> I made an attempt to adopt v4l2 part of the venus driver to m2m API's
>>> and the result was ~300 less lines of code, but with the price of few
>>> extensions in m2m APIs (and I still have issues with running
>>> simultaneously multiple instances).
>>>
>>> I have to add few functions/macros to iterate over a list (list_for_each
>>> and friends). This is used to find the returned from decoder buffers by
>>> address and associate them to vb2_buffer, because the decoder can change
>>> the order of the output buffers.
>>>
>>> The main problem I have is registering of the capture buffers before
>>> session_start. This is requirement (disadvantage) of the firmware
>>> implementation i.e. I need to announce capture buffers (address and size
>>> of the buffer) to the firmware before start buffer interaction by
>>> session_start.
>>>
>>> So having that I think I will need one more week to stabilize the driver
>>> to the state that it was before this m2m transition.
>>>
>>> Thoughts?
>>>
>>
>> It sounds like this it worth doing, since if you need these extensions,
>> then
>> it is likely someone else will need it as well.
>
> Meanwhile I have found bigger obstacle - I cannot run multiple instances
> simultaneously. By m2m design it can execute only one job (m2m context)
> at a time per m2m device. Can you confirm that my observation is correct?

The m2m framework assumes a single HW instance, yes. Do you have multiple
HW decoders? I might not understand what you mean...

Regards,

	Hans

>
> If so one solution could be on every fops::open I can create m2m_dev and
> init m2m_cxt.
>
>>
>> Can you mail me a preliminary patch with the core m2m changes? That will be
>> helpful for me to look at.
>
> Something like below diffs:
>
> diff --git a/drivers/media/v4l2-core/v4l2-mem2mem.c
> b/drivers/media/v4l2-core/v4l2-mem2mem.c
> index 61d56c940f80..52e22ec0f67b 100644
> --- a/drivers/media/v4l2-core/v4l2-mem2mem.c
> +++ b/drivers/media/v4l2-core/v4l2-mem2mem.c
> @@ -136,6 +136,28 @@ void *v4l2_m2m_buf_remove(struct v4l2_m2m_queue_ctx
> *q_ctx)
>  }
>  EXPORT_SYMBOL_GPL(v4l2_m2m_buf_remove);
>
> +struct vb2_v4l2_buffer *
> +v4l2_m2m_buf_remove_match(struct v4l2_m2m_queue_ctx *q_ctx, void *priv,
> +                          int (*match)(void *priv, struct
> vb2_v4l2_buffer *vb))
> +{
> +       struct v4l2_m2m_buffer *b, *tmp;
> +       struct vb2_v4l2_buffer *ret = NULL;
> +       unsigned long flags;
> +
> +       spin_lock_irqsave(&q_ctx->rdy_spinlock, flags);
> +       list_for_each_entry_safe(b, tmp, &q_ctx->rdy_queue, list) {
> +               if (match(priv, &b->vb)) {
> +                       list_del(&b->list);
> +                       ret = &b->vb;
> +                       break;
> +               }
> +       }
> +       spin_unlock_irqrestore(&q_ctx->rdy_spinlock, flags);
> +
> +       return ret;
> +}
> +EXPORT_SYMBOL_GPL(v4l2_m2m_buf_remove_match);
> +
>  /*
>   * Scheduling handlers
>   */
> diff --git a/include/media/v4l2-mem2mem.h b/include/media/v4l2-mem2mem.h
> index 64e1819ea66d..e943609209ba 100644
> --- a/include/media/v4l2-mem2mem.h
> +++ b/include/media/v4l2-mem2mem.h
> @@ -263,6 +263,24 @@ static inline void *v4l2_m2m_dst_buf_remove(struct
> v4l2_m2m_ctx *m2m_ctx)
>         return v4l2_m2m_buf_remove(&m2m_ctx->cap_q_ctx);
>  }
>
> +struct vb2_v4l2_buffer *
> +v4l2_m2m_buf_remove_match(struct v4l2_m2m_queue_ctx *q_ctx, void *priv,
> +                       int (*match)(void *priv, struct vb2_v4l2_buffer
> *vb));
> +
> +static inline struct vb2_v4l2_buffer *
> +v4l2_m2m_src_buf_remove_match(struct v4l2_m2m_ctx *m2m_ctx, void *priv,
> +                       int (*match)(void *priv, struct vb2_v4l2_buffer
> *vb))
> +{
> +       return v4l2_m2m_buf_remove_match(&m2m_ctx->out_q_ctx, priv, match);
> +}
> +
> +static inline struct vb2_v4l2_buffer *
> +v4l2_m2m_dst_buf_remove_match(struct v4l2_m2m_ctx *m2m_ctx, void *priv,
> +                       int (*match)(void *priv, struct vb2_v4l2_buffer
> *vb))
> +{
> +       return v4l2_m2m_buf_remove_match(&m2m_ctx->cap_q_ctx, priv, match);
> +}
>
> diff --git a/include/media/v4l2-mem2mem.h b/include/media/v4l2-mem2mem.h
> index 5a9597dd1ee0..64e1819ea66d 100644
> --- a/include/media/v4l2-mem2mem.h
> +++ b/include/media/v4l2-mem2mem.h
> @@ -211,6 +211,12 @@ static inline void *v4l2_m2m_next_dst_buf(struct
> v4l2_m2m_ctx *m2m_ctx)
>         return v4l2_m2m_next_buf(&m2m_ctx->cap_q_ctx);
>  }
>
> +#define v4l2_m2m_for_each_dst_buf(q_ctx, b)    \
> +       list_for_each_entry(b, &q_ctx->cap_q_ctx.rdy_queue, list)
> +
> +#define v4l2_m2m_for_each_src_buf(q_ctx, b)    \
> +       list_for_each_entry(b, &q_ctx->out_q_ctx.rdy_queue, list)
> +
>  /**
>   * v4l2_m2m_get_src_vq() - return vb2_queue for source buffers
>   *
>
--
To unsubscribe from this list: send the line "unsubscribe linux-media" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Stanimir Varbanov Nov. 21, 2016, 4:09 p.m. UTC | #2
On 11/21/2016 05:33 PM, Hans Verkuil wrote:
> On 21/11/16 16:29, Stanimir Varbanov wrote:
>> Hi Hans,
>>
>> On 11/21/2016 05:04 PM, Hans Verkuil wrote:
>>> On 18/11/16 10:11, Stanimir Varbanov wrote:
>>>> Hi Hans,
>>>>
>>>>>>> +
>>>>>>> +static int
>>>>>>> +vdec_reqbufs(struct file *file, void *fh, struct
>>>>>>> v4l2_requestbuffers *b)
>>>>>>> +{
>>>>>>> +    struct vb2_queue *queue = to_vb2q(file, b->type);
>>>>>>> +
>>>>>>> +    if (!queue)
>>>>>>> +        return -EINVAL;
>>>>>>> +
>>>>>>> +    return vb2_reqbufs(queue, b);
>>>>>>> +}
>>>>>>
>>>>>> Is there any reason why the v4l2_m2m_ioctl_reqbufs et al helper
>>>>>> functions
>>>>>> can't be used? I strongly recommend that, unless there is a specific
>>>>>> reason
>>>>>> why that won't work.
>>>>>
>>>>> So that means I need to completely rewrite the v4l2 part and adopt it
>>>>> for mem2mem device APIs.
>>>>>
>>>>> If that is what you meant I can invest some time to make a estimation
>>>>> what would be the changes and time needed. After that we can decide
>>>>> what
>>>>> to do - take the driver as is and port it to mem2mem device APIs later
>>>>> on or wait for the this transition to happen before merging.
>>>>>
>>>>
>>>> I made an attempt to adopt v4l2 part of the venus driver to m2m API's
>>>> and the result was ~300 less lines of code, but with the price of few
>>>> extensions in m2m APIs (and I still have issues with running
>>>> simultaneously multiple instances).
>>>>
>>>> I have to add few functions/macros to iterate over a list
>>>> (list_for_each
>>>> and friends). This is used to find the returned from decoder buffers by
>>>> address and associate them to vb2_buffer, because the decoder can
>>>> change
>>>> the order of the output buffers.
>>>>
>>>> The main problem I have is registering of the capture buffers before
>>>> session_start. This is requirement (disadvantage) of the firmware
>>>> implementation i.e. I need to announce capture buffers (address and
>>>> size
>>>> of the buffer) to the firmware before start buffer interaction by
>>>> session_start.
>>>>
>>>> So having that I think I will need one more week to stabilize the
>>>> driver
>>>> to the state that it was before this m2m transition.
>>>>
>>>> Thoughts?
>>>>
>>>
>>> It sounds like this it worth doing, since if you need these extensions,
>>> then
>>> it is likely someone else will need it as well.
>>
>> Meanwhile I have found bigger obstacle - I cannot run multiple instances
>> simultaneously. By m2m design it can execute only one job (m2m context)
>> at a time per m2m device. Can you confirm that my observation is correct?
> 
> The m2m framework assumes a single HW instance, yes. Do you have multiple
> HW decoders? I might not understand what you mean...
> 

I mean that I can start and execute up to 16 decoder sessions
simultaneously. Its a firmware responsibility how those sessions are
scheduled and how the hardware is shared between them. Of course
depending on the resolution the firmware can refuse to start the session
because the hardware will be overloaded and will not be able to satisfy
the bitrate requirements.
Nicolas Dufresne Nov. 23, 2016, 8:24 p.m. UTC | #3
Le lundi 21 novembre 2016 à 18:09 +0200, Stanimir Varbanov a écrit :
> >> Meanwhile I have found bigger obstacle - I cannot run multiple
> instances
> >> simultaneously. By m2m design it can execute only one job (m2m
> context)
> >> at a time per m2m device. Can you confirm that my observation is
> correct?
> > 
> > The m2m framework assumes a single HW instance, yes. Do you have
> multiple
> > HW decoders? I might not understand what you mean...
> > 
> 
> I mean that I can start and execute up to 16 decoder sessions
> simultaneously. Its a firmware responsibility how those sessions are
> scheduled and how the hardware is shared between them. Of course
> depending on the resolution the firmware can refuse to start the
> session
> because the hardware will be overloaded and will not be able to
> satisfy
> the bitrate requirements.

This is similar to S5P-MFC driver, which you may have notice not use
m2m framework.

Nicolas
Stanimir Varbanov Nov. 24, 2016, 1:16 p.m. UTC | #4
Hi Nicolas,

On 11/23/2016 10:24 PM, Nicolas Dufresne wrote:
> Le lundi 21 novembre 2016 à 18:09 +0200, Stanimir Varbanov a écrit :
>>>> Meanwhile I have found bigger obstacle - I cannot run multiple
>> instances
>>>> simultaneously. By m2m design it can execute only one job (m2m
>> context)
>>>> at a time per m2m device. Can you confirm that my observation is
>> correct?
>>>  
>>> The m2m framework assumes a single HW instance, yes. Do you have
>> multiple
>>> HW decoders? I might not understand what you mean...
>>>  
>>
>> I mean that I can start and execute up to 16 decoder sessions
>> simultaneously. Its a firmware responsibility how those sessions are
>> scheduled and how the hardware is shared between them. Of course
>> depending on the resolution the firmware can refuse to start the
>> session
>> because the hardware will be overloaded and will not be able to
>> satisfy
>> the bitrate requirements.
> 
> This is similar to S5P-MFC driver, which you may have notice not use
> m2m framework.

Thanks for the note.

I have started to look into m2m because Hans asked me to reuse the ioctl
helpers that it provides.

I have no problem with usage of the m2m API if they help me to reduce
code size and doesn't impact performance.
diff mbox

Patch

diff --git a/drivers/media/v4l2-core/v4l2-mem2mem.c
b/drivers/media/v4l2-core/v4l2-mem2mem.c
index 61d56c940f80..52e22ec0f67b 100644
--- a/drivers/media/v4l2-core/v4l2-mem2mem.c
+++ b/drivers/media/v4l2-core/v4l2-mem2mem.c
@@ -136,6 +136,28 @@  void *v4l2_m2m_buf_remove(struct v4l2_m2m_queue_ctx
*q_ctx)
 }
 EXPORT_SYMBOL_GPL(v4l2_m2m_buf_remove);

+struct vb2_v4l2_buffer *
+v4l2_m2m_buf_remove_match(struct v4l2_m2m_queue_ctx *q_ctx, void *priv,
+                          int (*match)(void *priv, struct
vb2_v4l2_buffer *vb))
+{
+       struct v4l2_m2m_buffer *b, *tmp;
+       struct vb2_v4l2_buffer *ret = NULL;
+       unsigned long flags;
+
+       spin_lock_irqsave(&q_ctx->rdy_spinlock, flags);
+       list_for_each_entry_safe(b, tmp, &q_ctx->rdy_queue, list) {
+               if (match(priv, &b->vb)) {
+                       list_del(&b->list);
+                       ret = &b->vb;
+                       break;
+               }
+       }
+       spin_unlock_irqrestore(&q_ctx->rdy_spinlock, flags);
+
+       return ret;
+}
+EXPORT_SYMBOL_GPL(v4l2_m2m_buf_remove_match);
+
 /*
  * Scheduling handlers
  */
diff --git a/include/media/v4l2-mem2mem.h b/include/media/v4l2-mem2mem.h
index 64e1819ea66d..e943609209ba 100644
--- a/include/media/v4l2-mem2mem.h
+++ b/include/media/v4l2-mem2mem.h
@@ -263,6 +263,24 @@  static inline void *v4l2_m2m_dst_buf_remove(struct
v4l2_m2m_ctx *m2m_ctx)
        return v4l2_m2m_buf_remove(&m2m_ctx->cap_q_ctx);
 }

+struct vb2_v4l2_buffer *
+v4l2_m2m_buf_remove_match(struct v4l2_m2m_queue_ctx *q_ctx, void *priv,
+                       int (*match)(void *priv, struct vb2_v4l2_buffer
*vb));
+
+static inline struct vb2_v4l2_buffer *
+v4l2_m2m_src_buf_remove_match(struct v4l2_m2m_ctx *m2m_ctx, void *priv,
+                       int (*match)(void *priv, struct vb2_v4l2_buffer
*vb))
+{
+       return v4l2_m2m_buf_remove_match(&m2m_ctx->out_q_ctx, priv, match);
+}
+
+static inline struct vb2_v4l2_buffer *
+v4l2_m2m_dst_buf_remove_match(struct v4l2_m2m_ctx *m2m_ctx, void *priv,
+                       int (*match)(void *priv, struct vb2_v4l2_buffer
*vb))
+{
+       return v4l2_m2m_buf_remove_match(&m2m_ctx->cap_q_ctx, priv, match);
+}

diff --git a/include/media/v4l2-mem2mem.h b/include/media/v4l2-mem2mem.h
index 5a9597dd1ee0..64e1819ea66d 100644
--- a/include/media/v4l2-mem2mem.h
+++ b/include/media/v4l2-mem2mem.h
@@ -211,6 +211,12 @@  static inline void *v4l2_m2m_next_dst_buf(struct
v4l2_m2m_ctx *m2m_ctx)
        return v4l2_m2m_next_buf(&m2m_ctx->cap_q_ctx);
 }

+#define v4l2_m2m_for_each_dst_buf(q_ctx, b)    \
+       list_for_each_entry(b, &q_ctx->cap_q_ctx.rdy_queue, list)
+
+#define v4l2_m2m_for_each_src_buf(q_ctx, b)    \
+       list_for_each_entry(b, &q_ctx->out_q_ctx.rdy_queue, list)
+
 /**
  * v4l2_m2m_get_src_vq() - return vb2_queue for source buffers