mbox series

[RFC,0/1] Add LTR controls

Message ID 1591871121-25420-1-git-send-email-dikshita@codeaurora.org (mailing list archive)
Headers show
Series Add LTR controls | expand

Message

Dikshita Agarwal June 11, 2020, 10:25 a.m. UTC
LTR (Long Term Reference) frames are the frames that are encoded sometime in the past
and stored in the DPB buffer list to be used as reference to encode future frames.
One usage of LTR encoding is to reduce error propagation for video transmission
in packet lossy networks.  For example, encoder may want to specify some key frames as
LTR pictures and use them as reference frames for encoding. With extra protection
selectively on these LTR frames or synchronization with the receiver of reception of
the LTR frames during transmission, decoder can receive reference frames more reliably
than other non-reference frames. As a result, transmission error can be effectively
restricted within certain frames rather than propagated to future frames.

We are introducing below V4l2 Controls for this feature
1. V4L2_CID_MPEG_VIDEO_LTRCOUNT
    a. This is used to query or configure the number of LTR frames.
       This is a static control and is controlled by the client.
    b. The LTR index varies from 0 to the max LTR-1.
    c. If LTR Count is more than max supported LTR count (max LTR) by driver, it will be rejected.
    d. Auto Marking : If LTR count is non zero,
        1) first LTR count frames would be mark as LTR automatically after
      	   every IDR frame (inclusive).
        2) For multilayer encoding: first LTR count base layer reference frames starting after
           every IDR frame (inclusive) in encoding order would be marked as LTR frames by the encoder.
        3) Auto marking of LTR due to IDR should consider following conditions:
            1. The frame is not already set to be marked as LTR.
            2. The frame is part of the base layer in the hierarchical layer case.
            3. The number of frames currently marked as LTR is less than the maximum LTR frame index plus 1.
    e. Encoder needs to handle explicit Mark/Use command when encoder is still doing "auto" marking

2. V4L2_CID_MPEG_VIDEO_MARKLTRFRAME :
    a. This signals to mark the current frame as LTR frame. It is a dynamic control and also provide the LTR index to be used.
    b. the LTR index provided by this control should never exceed the max LTR-1. Else it will be rejected.

3. V4L2_CID_MPEG_VIDEO_USELTRFRAME :
    a. This specifies the LTR frame(s) to be used for encoding the current frame. This is a dynamic control.
    b. LTR Use Bitmap : this consists of bits [0, 15]. A total of N LSB bits of this field are valid,
       where N is the maximum number of LTRs supported. All the other bits are invalid and should be rejected.
       The LSB corresponds to the LTR index 0. Bit N-1 from the LSB corresponds to the LTR index max LTR-1.

Dikshita Agarwal (1):
  media: v4l2-ctrls:  add control for ltr

 drivers/media/v4l2-core/v4l2-ctrls.c | 6 ++++++
 include/uapi/linux/v4l2-controls.h   | 4 ++++
 2 files changed, 10 insertions(+)

Comments

Nicolas Dufresne June 11, 2020, 2:22 p.m. UTC | #1
Le jeudi 11 juin 2020 à 15:55 +0530, Dikshita Agarwal a écrit :
> LTR (Long Term Reference) frames are the frames that are encoded sometime in the past
> and stored in the DPB buffer list to be used as reference to encode future frames.
> One usage of LTR encoding is to reduce error propagation for video transmission
> in packet lossy networks.  For example, encoder may want to specify some key frames as
> LTR pictures and use them as reference frames for encoding. With extra protection
> selectively on these LTR frames or synchronization with the receiver of reception of
> the LTR frames during transmission, decoder can receive reference frames more reliably
> than other non-reference frames. As a result, transmission error can be effectively
> restricted within certain frames rather than propagated to future frames.
> 
> We are introducing below V4l2 Controls for this feature
> 1. V4L2_CID_MPEG_VIDEO_LTRCOUNT
>     a. This is used to query or configure the number of LTR frames.
>        This is a static control and is controlled by the client.
>     b. The LTR index varies from 0 to the max LTR-1.
>     c. If LTR Count is more than max supported LTR count (max LTR) by driver, it will be rejected.
>     d. Auto Marking : If LTR count is non zero,
>         1) first LTR count frames would be mark as LTR automatically after
>       	   every IDR frame (inclusive).
>         2) For multilayer encoding: first LTR count base layer reference frames starting after
>            every IDR frame (inclusive) in encoding order would be marked as LTR frames by the encoder.
>         3) Auto marking of LTR due to IDR should consider following conditions:
>             1. The frame is not already set to be marked as LTR.
>             2. The frame is part of the base layer in the hierarchical layer case.
>             3. The number of frames currently marked as LTR is less than the maximum LTR frame index plus 1.
>     e. Encoder needs to handle explicit Mark/Use command when encoder is still doing "auto" marking

Perhaps we are missing a LONG_TERM_REFERENCE_MODE ? I bet some encoder
can select by themself long term references and even some encoders may
not let the user decide.

(not huge han of LTR acronyme, but that could be fine too, assuming you
add more _).

> 
> 2. V4L2_CID_MPEG_VIDEO_MARKLTRFRAME :
>     a. This signals to mark the current frame as LTR frame. It is a dynamic control and also provide the LTR index to be used.
>     b. the LTR index provided by this control should never exceed the max LTR-1. Else it will be rejected.

The "current" frame seems a bit loose. Perhaps you wanted to use buffer
flags ? A bit like what we have to signal TOP/BOTTOM fields in
alternate interlacing.

> 
> 3. V4L2_CID_MPEG_VIDEO_USELTRFRAME :
>     a. This specifies the LTR frame(s) to be used for encoding the current frame. This is a dynamic control.
>     b. LTR Use Bitmap : this consists of bits [0, 15]. A total of N LSB bits of this field are valid,
>        where N is the maximum number of LTRs supported. All the other bits are invalid and should be rejected.
>        The LSB corresponds to the LTR index 0. Bit N-1 from the LSB corresponds to the LTR index max LTR-1.

Note, I haven't captured very well the userspace control flow, perhaps
this could be enhanced through writing some documentation.

As per all other generic encoder controls, we need to make sure it will
be usable and flexible enough for multiple HW blocks, as it can be
tedious to extend later otherwise. It is important that along with this
RFC you provide some comparisons with with other HW / SW APIs in order
to help justify the design decisions. I also think there should be 
link made V4L2_CID_MPEG_VIDEO_GOP_* , number of B-Frames etc.

regards,
Nicolas

> 
> Dikshita Agarwal (1):
>   media: v4l2-ctrls:  add control for ltr
> 
>  drivers/media/v4l2-core/v4l2-ctrls.c | 6 ++++++
>  include/uapi/linux/v4l2-controls.h   | 4 ++++
>  2 files changed, 10 insertions(+)
>
Hans Verkuil June 12, 2020, 9:11 a.m. UTC | #2
Hi Dikshita, Nicolas,

On 11/06/2020 16:22, Nicolas Dufresne wrote:
> Le jeudi 11 juin 2020 à 15:55 +0530, Dikshita Agarwal a écrit :
>> LTR (Long Term Reference) frames are the frames that are encoded sometime in the past
>> and stored in the DPB buffer list to be used as reference to encode future frames.
>> One usage of LTR encoding is to reduce error propagation for video transmission
>> in packet lossy networks.  For example, encoder may want to specify some key frames as
>> LTR pictures and use them as reference frames for encoding. With extra protection
>> selectively on these LTR frames or synchronization with the receiver of reception of
>> the LTR frames during transmission, decoder can receive reference frames more reliably
>> than other non-reference frames. As a result, transmission error can be effectively
>> restricted within certain frames rather than propagated to future frames.
>>
>> We are introducing below V4l2 Controls for this feature
>> 1. V4L2_CID_MPEG_VIDEO_LTRCOUNT
>>     a. This is used to query or configure the number of LTR frames.
>>        This is a static control and is controlled by the client.
>>     b. The LTR index varies from 0 to the max LTR-1.
>>     c. If LTR Count is more than max supported LTR count (max LTR) by driver, it will be rejected.
>>     d. Auto Marking : If LTR count is non zero,
>>         1) first LTR count frames would be mark as LTR automatically after
>>       	   every IDR frame (inclusive).
>>         2) For multilayer encoding: first LTR count base layer reference frames starting after
>>            every IDR frame (inclusive) in encoding order would be marked as LTR frames by the encoder.
>>         3) Auto marking of LTR due to IDR should consider following conditions:
>>             1. The frame is not already set to be marked as LTR.
>>             2. The frame is part of the base layer in the hierarchical layer case.
>>             3. The number of frames currently marked as LTR is less than the maximum LTR frame index plus 1.
>>     e. Encoder needs to handle explicit Mark/Use command when encoder is still doing "auto" marking

I don't follow this, quite possibly due to lack of experience with encoders.

I kind of would expect to see two modes: either automatic where encoders can
mark up to LTR_COUNT frames as long term reference, and userspace just sets
LTR_COUNT and doesn't have to do anything else.

Or it is manual mode where userspace explicitly marks long term reference
frames.

From the proposal above it looks like you can mix auto and manual modes.

BTW, how do you 'unmark' long term reference frames?

This feature is for stateful encoders, right?

> 
> Perhaps we are missing a LONG_TERM_REFERENCE_MODE ? I bet some encoder
> can select by themself long term references and even some encoders may
> not let the user decide.
> 
> (not huge han of LTR acronyme, but that could be fine too, assuming you
> add more _).
> 
>>
>> 2. V4L2_CID_MPEG_VIDEO_MARKLTRFRAME :
>>     a. This signals to mark the current frame as LTR frame. It is a dynamic control and also provide the LTR index to be used.
>>     b. the LTR index provided by this control should never exceed the max LTR-1. Else it will be rejected.
> 
> The "current" frame seems a bit loose. Perhaps you wanted to use buffer
> flags ? A bit like what we have to signal TOP/BOTTOM fields in
> alternate interlacing.

I was thinking the same thing. Using a control for this doesn't seem right.

> 
>>
>> 3. V4L2_CID_MPEG_VIDEO_USELTRFRAME :
>>     a. This specifies the LTR frame(s) to be used for encoding the current frame. This is a dynamic control.
>>     b. LTR Use Bitmap : this consists of bits [0, 15]. A total of N LSB bits of this field are valid,
>>        where N is the maximum number of LTRs supported. All the other bits are invalid and should be rejected.
>>        The LSB corresponds to the LTR index 0. Bit N-1 from the LSB corresponds to the LTR index max LTR-1.

How would userspace know this? Esp. with auto marking since userspace would have
to predict how auto marking works (if I understand this correctly).

For which HW encoder is this meant?

> 
> Note, I haven't captured very well the userspace control flow, perhaps
> this could be enhanced through writing some documentation.
> 
> As per all other generic encoder controls, we need to make sure it will
> be usable and flexible enough for multiple HW blocks, as it can be
> tedious to extend later otherwise. It is important that along with this
> RFC you provide some comparisons with with other HW / SW APIs in order
> to help justify the design decisions. I also think there should be 
> link made V4L2_CID_MPEG_VIDEO_GOP_* , number of B-Frames etc.

I agree with Nicolas.

Regards,

	Hans

> 
> regards,
> Nicolas
> 
>>
>> Dikshita Agarwal (1):
>>   media: v4l2-ctrls:  add control for ltr
>>
>>  drivers/media/v4l2-core/v4l2-ctrls.c | 6 ++++++
>>  include/uapi/linux/v4l2-controls.h   | 4 ++++
>>  2 files changed, 10 insertions(+)
>>
>
Dikshita Agarwal June 16, 2020, 6:41 p.m. UTC | #3
Hi Hans, Nicolas,

Thanks for your comments.

On 2020-06-12 14:41, Hans Verkuil wrote:
> Hi Dikshita, Nicolas,
> 
> On 11/06/2020 16:22, Nicolas Dufresne wrote:
>> Le jeudi 11 juin 2020 à 15:55 +0530, Dikshita Agarwal a écrit :
>>> LTR (Long Term Reference) frames are the frames that are encoded 
>>> sometime in the past
>>> and stored in the DPB buffer list to be used as reference to encode 
>>> future frames.
>>> One usage of LTR encoding is to reduce error propagation for video 
>>> transmission
>>> in packet lossy networks.  For example, encoder may want to specify 
>>> some key frames as
>>> LTR pictures and use them as reference frames for encoding. With 
>>> extra protection
>>> selectively on these LTR frames or synchronization with the receiver 
>>> of reception of
>>> the LTR frames during transmission, decoder can receive reference 
>>> frames more reliably
>>> than other non-reference frames. As a result, transmission error can 
>>> be effectively
>>> restricted within certain frames rather than propagated to future 
>>> frames.
>>> 
>>> We are introducing below V4l2 Controls for this feature
>>> 1. V4L2_CID_MPEG_VIDEO_LTRCOUNT
>>>     a. This is used to query or configure the number of LTR frames.
>>>        This is a static control and is controlled by the client.
>>>     b. The LTR index varies from 0 to the max LTR-1.
>>>     c. If LTR Count is more than max supported LTR count (max LTR) by 
>>> driver, it will be rejected.
>>>     d. Auto Marking : If LTR count is non zero,
>>>         1) first LTR count frames would be mark as LTR automatically 
>>> after
>>>       	   every IDR frame (inclusive).
>>>         2) For multilayer encoding: first LTR count base layer 
>>> reference frames starting after
>>>            every IDR frame (inclusive) in encoding order would be 
>>> marked as LTR frames by the encoder.
>>>         3) Auto marking of LTR due to IDR should consider following 
>>> conditions:
>>>             1. The frame is not already set to be marked as LTR.
>>>             2. The frame is part of the base layer in the 
>>> hierarchical layer case.
>>>             3. The number of frames currently marked as LTR is less 
>>> than the maximum LTR frame index plus 1.
>>>     e. Encoder needs to handle explicit Mark/Use command when encoder 
>>> is still doing "auto" marking
> 
> I don't follow this, quite possibly due to lack of experience with 
> encoders.
> 
> I kind of would expect to see two modes: either automatic where 
> encoders can
> mark up to LTR_COUNT frames as long term reference, and userspace just 
> sets
> LTR_COUNT and doesn't have to do anything else.
> 
> Or it is manual mode where userspace explicitly marks long term 
> reference
> frames.
> 
> From the proposal above it looks like you can mix auto and manual 
> modes.
> 
> BTW, how do you 'unmark' long term reference frames?
> 
> This feature is for stateful encoders, right?
> 
>> 
>> Perhaps we are missing a LONG_TERM_REFERENCE_MODE ? I bet some encoder
>> can select by themself long term references and even some encoders may
>> not let the user decide.
>> 
>> (not huge han of LTR acronyme, but that could be fine too, assuming 
>> you
>> add more _).
>> 

Userspace sets LTR count which signifies the number of LTR frames 
encoder needs to generate or keep.
The encoder has to build-up its internal buffer reference list (aka DBP 
list or recon buffer list).
In order to achieve that encoder will fill It's LTR (long term 
references) list and STR (short term references) list
by auto marking n frames as LTR frames(n is equal to LTR count) based on 
auto-marking dictated by the encoder spec.
The client then can replace those automatically marked frames with new 
frames using V4L2_CID_MPEG_VIDEO_MARKLTRFRAME and can ask
encoder to refer the newly marked frame for encoding the next frame 
using V4L2_CID_MPEG_VIDEO_USELTRFRAME.

>>> 
>>> 2. V4L2_CID_MPEG_VIDEO_MARKLTRFRAME :
>>>     a. This signals to mark the current frame as LTR frame. It is a 
>>> dynamic control and also provide the LTR index to be used.
>>>     b. the LTR index provided by this control should never exceed the 
>>> max LTR-1. Else it will be rejected.
>> 
>> The "current" frame seems a bit loose. Perhaps you wanted to use 
>> buffer
>> flags ? A bit like what we have to signal TOP/BOTTOM fields in
>> alternate interlacing.
> 
> I was thinking the same thing. Using a control for this doesn't seem 
> right.
> 

the client sets this to replace automatically marked frames by the 
encoder with a particular frame.
this provides an index that ranges from 0 to LTR count-1 and then the 
particular frame will be marked with that index.
this can be achieved through request by associating this control with a 
specific buffer to make it synchronized.

>> 
>>> 
>>> 3. V4L2_CID_MPEG_VIDEO_USELTRFRAME :
>>>     a. This specifies the LTR frame(s) to be used for encoding the 
>>> current frame. This is a dynamic control.
>>>     b. LTR Use Bitmap : this consists of bits [0, 15]. A total of N 
>>> LSB bits of this field are valid,
>>>        where N is the maximum number of LTRs supported. All the other 
>>> bits are invalid and should be rejected.
>>>        The LSB corresponds to the LTR index 0. Bit N-1 from the LSB 
>>> corresponds to the LTR index max LTR-1.
> 
> How would userspace know this? Esp. with auto marking since userspace 
> would have
> to predict how auto marking works (if I understand this correctly).
> 

Client sets LTR count which tells about the number of LTR frames 
automatically marked by the encoder.
so client can use LTR index (0 to LTR count -1) to ask encoder to refer 
any particular
frame (marked automatically by driver or marked by client with 
V4L2_CID_MPEG_VIDEO_MARKLTRFRAME) as a reference to encode the next 
frame.

> For which HW encoder is this meant?
> 
This is primarily meant for H264 and HEVC.

Thanks,
Dikshita

>> 
>> Note, I haven't captured very well the userspace control flow, perhaps
>> this could be enhanced through writing some documentation.
>> 
>> As per all other generic encoder controls, we need to make sure it 
>> will
>> be usable and flexible enough for multiple HW blocks, as it can be
>> tedious to extend later otherwise. It is important that along with 
>> this
>> RFC you provide some comparisons with with other HW / SW APIs in order
>> to help justify the design decisions. I also think there should be
>> link made V4L2_CID_MPEG_VIDEO_GOP_* , number of B-Frames etc.
> 
> I agree with Nicolas.
> 
> Regards,
> 
> 	Hans
> 
>> 
>> regards,
>> Nicolas
>> 
>>> 
>>> Dikshita Agarwal (1):
>>>   media: v4l2-ctrls:  add control for ltr
>>> 
>>>  drivers/media/v4l2-core/v4l2-ctrls.c | 6 ++++++
>>>  include/uapi/linux/v4l2-controls.h   | 4 ++++
>>>  2 files changed, 10 insertions(+)
>>> 
>>
Dikshita Agarwal July 14, 2020, 7:21 a.m. UTC | #4
Hi,

A gentle reminder for the review.

Thanks,
Dikshita

On 2020-06-17 00:11, dikshita@codeaurora.org wrote:
> Hi Hans, Nicolas,
> 
> Thanks for your comments.
> 
> On 2020-06-12 14:41, Hans Verkuil wrote:
>> Hi Dikshita, Nicolas,
>> 
>> On 11/06/2020 16:22, Nicolas Dufresne wrote:
>>> Le jeudi 11 juin 2020 à 15:55 +0530, Dikshita Agarwal a écrit :
>>>> LTR (Long Term Reference) frames are the frames that are encoded 
>>>> sometime in the past
>>>> and stored in the DPB buffer list to be used as reference to encode 
>>>> future frames.
>>>> One usage of LTR encoding is to reduce error propagation for video 
>>>> transmission
>>>> in packet lossy networks.  For example, encoder may want to specify 
>>>> some key frames as
>>>> LTR pictures and use them as reference frames for encoding. With 
>>>> extra protection
>>>> selectively on these LTR frames or synchronization with the receiver 
>>>> of reception of
>>>> the LTR frames during transmission, decoder can receive reference 
>>>> frames more reliably
>>>> than other non-reference frames. As a result, transmission error can 
>>>> be effectively
>>>> restricted within certain frames rather than propagated to future 
>>>> frames.
>>>> 
>>>> We are introducing below V4l2 Controls for this feature
>>>> 1. V4L2_CID_MPEG_VIDEO_LTRCOUNT
>>>>     a. This is used to query or configure the number of LTR frames.
>>>>        This is a static control and is controlled by the client.
>>>>     b. The LTR index varies from 0 to the max LTR-1.
>>>>     c. If LTR Count is more than max supported LTR count (max LTR) 
>>>> by driver, it will be rejected.
>>>>     d. Auto Marking : If LTR count is non zero,
>>>>         1) first LTR count frames would be mark as LTR automatically 
>>>> after
>>>>       	   every IDR frame (inclusive).
>>>>         2) For multilayer encoding: first LTR count base layer 
>>>> reference frames starting after
>>>>            every IDR frame (inclusive) in encoding order would be 
>>>> marked as LTR frames by the encoder.
>>>>         3) Auto marking of LTR due to IDR should consider following 
>>>> conditions:
>>>>             1. The frame is not already set to be marked as LTR.
>>>>             2. The frame is part of the base layer in the 
>>>> hierarchical layer case.
>>>>             3. The number of frames currently marked as LTR is less 
>>>> than the maximum LTR frame index plus 1.
>>>>     e. Encoder needs to handle explicit Mark/Use command when 
>>>> encoder is still doing "auto" marking
>> 
>> I don't follow this, quite possibly due to lack of experience with 
>> encoders.
>> 
>> I kind of would expect to see two modes: either automatic where 
>> encoders can
>> mark up to LTR_COUNT frames as long term reference, and userspace just 
>> sets
>> LTR_COUNT and doesn't have to do anything else.
>> 
>> Or it is manual mode where userspace explicitly marks long term 
>> reference
>> frames.
>> 
>> From the proposal above it looks like you can mix auto and manual 
>> modes.
>> 
>> BTW, how do you 'unmark' long term reference frames?
>> 
>> This feature is for stateful encoders, right?
>> 
>>> 
>>> Perhaps we are missing a LONG_TERM_REFERENCE_MODE ? I bet some 
>>> encoder
>>> can select by themself long term references and even some encoders 
>>> may
>>> not let the user decide.
>>> 
>>> (not huge han of LTR acronyme, but that could be fine too, assuming 
>>> you
>>> add more _).
>>> 
> 
> Userspace sets LTR count which signifies the number of LTR frames
> encoder needs to generate or keep.
> The encoder has to build-up its internal buffer reference list (aka
> DBP list or recon buffer list).
> In order to achieve that encoder will fill It's LTR (long term
> references) list and STR (short term references) list
> by auto marking n frames as LTR frames(n is equal to LTR count) based
> on auto-marking dictated by the encoder spec.
> The client then can replace those automatically marked frames with new
> frames using V4L2_CID_MPEG_VIDEO_MARKLTRFRAME and can ask
> encoder to refer the newly marked frame for encoding the next frame
> using V4L2_CID_MPEG_VIDEO_USELTRFRAME.
> 
>>>> 
>>>> 2. V4L2_CID_MPEG_VIDEO_MARKLTRFRAME :
>>>>     a. This signals to mark the current frame as LTR frame. It is a 
>>>> dynamic control and also provide the LTR index to be used.
>>>>     b. the LTR index provided by this control should never exceed 
>>>> the max LTR-1. Else it will be rejected.
>>> 
>>> The "current" frame seems a bit loose. Perhaps you wanted to use 
>>> buffer
>>> flags ? A bit like what we have to signal TOP/BOTTOM fields in
>>> alternate interlacing.
>> 
>> I was thinking the same thing. Using a control for this doesn't seem 
>> right.
>> 
> 
> the client sets this to replace automatically marked frames by the
> encoder with a particular frame.
> this provides an index that ranges from 0 to LTR count-1 and then the
> particular frame will be marked with that index.
> this can be achieved through request by associating this control with
> a specific buffer to make it synchronized.
> 
>>> 
>>>> 
>>>> 3. V4L2_CID_MPEG_VIDEO_USELTRFRAME :
>>>>     a. This specifies the LTR frame(s) to be used for encoding the 
>>>> current frame. This is a dynamic control.
>>>>     b. LTR Use Bitmap : this consists of bits [0, 15]. A total of N 
>>>> LSB bits of this field are valid,
>>>>        where N is the maximum number of LTRs supported. All the 
>>>> other bits are invalid and should be rejected.
>>>>        The LSB corresponds to the LTR index 0. Bit N-1 from the LSB 
>>>> corresponds to the LTR index max LTR-1.
>> 
>> How would userspace know this? Esp. with auto marking since userspace 
>> would have
>> to predict how auto marking works (if I understand this correctly).
>> 
> 
> Client sets LTR count which tells about the number of LTR frames
> automatically marked by the encoder.
> so client can use LTR index (0 to LTR count -1) to ask encoder to
> refer any particular
> frame (marked automatically by driver or marked by client with
> V4L2_CID_MPEG_VIDEO_MARKLTRFRAME) as a reference to encode the next
> frame.
> 
>> For which HW encoder is this meant?
>> 
> This is primarily meant for H264 and HEVC.
> 
> Thanks,
> Dikshita
> 
>>> 
>>> Note, I haven't captured very well the userspace control flow, 
>>> perhaps
>>> this could be enhanced through writing some documentation.
>>> 
>>> As per all other generic encoder controls, we need to make sure it 
>>> will
>>> be usable and flexible enough for multiple HW blocks, as it can be
>>> tedious to extend later otherwise. It is important that along with 
>>> this
>>> RFC you provide some comparisons with with other HW / SW APIs in 
>>> order
>>> to help justify the design decisions. I also think there should be
>>> link made V4L2_CID_MPEG_VIDEO_GOP_* , number of B-Frames etc.
>> 
>> I agree with Nicolas.
>> 
>> Regards,
>> 
>> 	Hans
>> 
>>> 
>>> regards,
>>> Nicolas
>>> 
>>>> 
>>>> Dikshita Agarwal (1):
>>>>   media: v4l2-ctrls:  add control for ltr
>>>> 
>>>>  drivers/media/v4l2-core/v4l2-ctrls.c | 6 ++++++
>>>>  include/uapi/linux/v4l2-controls.h   | 4 ++++
>>>>  2 files changed, 10 insertions(+)
>>>> 
>>>
Hans Verkuil July 16, 2020, 8:55 a.m. UTC | #5
On 16/06/2020 20:41, dikshita@codeaurora.org wrote:
> Hi Hans, Nicolas,
> 
> Thanks for your comments.
> 
> On 2020-06-12 14:41, Hans Verkuil wrote:
>> Hi Dikshita, Nicolas,
>>
>> On 11/06/2020 16:22, Nicolas Dufresne wrote:
>>> Le jeudi 11 juin 2020 à 15:55 +0530, Dikshita Agarwal a écrit :
>>>> LTR (Long Term Reference) frames are the frames that are encoded 
>>>> sometime in the past
>>>> and stored in the DPB buffer list to be used as reference to encode 
>>>> future frames.
>>>> One usage of LTR encoding is to reduce error propagation for video 
>>>> transmission
>>>> in packet lossy networks.  For example, encoder may want to specify 
>>>> some key frames as
>>>> LTR pictures and use them as reference frames for encoding. With 
>>>> extra protection
>>>> selectively on these LTR frames or synchronization with the receiver 
>>>> of reception of
>>>> the LTR frames during transmission, decoder can receive reference 
>>>> frames more reliably
>>>> than other non-reference frames. As a result, transmission error can 
>>>> be effectively
>>>> restricted within certain frames rather than propagated to future 
>>>> frames.
>>>>
>>>> We are introducing below V4l2 Controls for this feature
>>>> 1. V4L2_CID_MPEG_VIDEO_LTRCOUNT
>>>>     a. This is used to query or configure the number of LTR frames.
>>>>        This is a static control and is controlled by the client.
>>>>     b. The LTR index varies from 0 to the max LTR-1.
>>>>     c. If LTR Count is more than max supported LTR count (max LTR) by 
>>>> driver, it will be rejected.
>>>>     d. Auto Marking : If LTR count is non zero,
>>>>         1) first LTR count frames would be mark as LTR automatically 
>>>> after
>>>>       	   every IDR frame (inclusive).
>>>>         2) For multilayer encoding: first LTR count base layer 
>>>> reference frames starting after
>>>>            every IDR frame (inclusive) in encoding order would be 
>>>> marked as LTR frames by the encoder.
>>>>         3) Auto marking of LTR due to IDR should consider following 
>>>> conditions:
>>>>             1. The frame is not already set to be marked as LTR.
>>>>             2. The frame is part of the base layer in the 
>>>> hierarchical layer case.
>>>>             3. The number of frames currently marked as LTR is less 
>>>> than the maximum LTR frame index plus 1.
>>>>     e. Encoder needs to handle explicit Mark/Use command when encoder 
>>>> is still doing "auto" marking
>>
>> I don't follow this, quite possibly due to lack of experience with 
>> encoders.
>>
>> I kind of would expect to see two modes: either automatic where 
>> encoders can
>> mark up to LTR_COUNT frames as long term reference, and userspace just 
>> sets
>> LTR_COUNT and doesn't have to do anything else.
>>
>> Or it is manual mode where userspace explicitly marks long term 
>> reference
>> frames.
>>
>> From the proposal above it looks like you can mix auto and manual 
>> modes.
>>
>> BTW, how do you 'unmark' long term reference frames?
>>
>> This feature is for stateful encoders, right?
>>
>>>
>>> Perhaps we are missing a LONG_TERM_REFERENCE_MODE ? I bet some encoder
>>> can select by themself long term references and even some encoders may
>>> not let the user decide.
>>>
>>> (not huge han of LTR acronyme, but that could be fine too, assuming 
>>> you
>>> add more _).
>>>
> 
> Userspace sets LTR count which signifies the number of LTR frames 
> encoder needs to generate or keep.
> The encoder has to build-up its internal buffer reference list (aka DBP 
> list or recon buffer list).
> In order to achieve that encoder will fill It's LTR (long term 
> references) list and STR (short term references) list
> by auto marking n frames as LTR frames(n is equal to LTR count) based on 
> auto-marking dictated by the encoder spec.
> The client then can replace those automatically marked frames with new 
> frames using V4L2_CID_MPEG_VIDEO_MARKLTRFRAME and can ask
> encoder to refer the newly marked frame for encoding the next frame 
> using V4L2_CID_MPEG_VIDEO_USELTRFRAME.
> 
>>>>
>>>> 2. V4L2_CID_MPEG_VIDEO_MARKLTRFRAME :
>>>>     a. This signals to mark the current frame as LTR frame. It is a 
>>>> dynamic control and also provide the LTR index to be used.
>>>>     b. the LTR index provided by this control should never exceed the 
>>>> max LTR-1. Else it will be rejected.
>>>
>>> The "current" frame seems a bit loose. Perhaps you wanted to use 
>>> buffer
>>> flags ? A bit like what we have to signal TOP/BOTTOM fields in
>>> alternate interlacing.
>>
>> I was thinking the same thing. Using a control for this doesn't seem 
>> right.
>>
> 
> the client sets this to replace automatically marked frames by the 
> encoder with a particular frame.
> this provides an index that ranges from 0 to LTR count-1 and then the 
> particular frame will be marked with that index.
> this can be achieved through request by associating this control with a 
> specific buffer to make it synchronized.
> 
>>>
>>>>
>>>> 3. V4L2_CID_MPEG_VIDEO_USELTRFRAME :
>>>>     a. This specifies the LTR frame(s) to be used for encoding the 
>>>> current frame. This is a dynamic control.
>>>>     b. LTR Use Bitmap : this consists of bits [0, 15]. A total of N 
>>>> LSB bits of this field are valid,
>>>>        where N is the maximum number of LTRs supported. All the other 
>>>> bits are invalid and should be rejected.
>>>>        The LSB corresponds to the LTR index 0. Bit N-1 from the LSB 
>>>> corresponds to the LTR index max LTR-1.
>>
>> How would userspace know this? Esp. with auto marking since userspace 
>> would have
>> to predict how auto marking works (if I understand this correctly).
>>
> 
> Client sets LTR count which tells about the number of LTR frames 
> automatically marked by the encoder.
> so client can use LTR index (0 to LTR count -1) to ask encoder to refer 
> any particular
> frame (marked automatically by driver or marked by client with 
> V4L2_CID_MPEG_VIDEO_MARKLTRFRAME) as a reference to encode the next 
> frame.
> 
>> For which HW encoder is this meant?
>>
> This is primarily meant for H264 and HEVC.

The venus encoder?

Some more questions:

1) How many LTR frames do h.264 and hevc allow?
2) Given N LTR frames, is there a ordering of those frames? E.g.
   the LTR frame with index 0 is processed/used differently from
   LTR frame with index 1? Or are they all equal in that it is just a pool
   of LTR frames that the encoder can use as it wishes?

Regards,

	Hans

> 
> Thanks,
> Dikshita
> 
>>>
>>> Note, I haven't captured very well the userspace control flow, perhaps
>>> this could be enhanced through writing some documentation.
>>>
>>> As per all other generic encoder controls, we need to make sure it 
>>> will
>>> be usable and flexible enough for multiple HW blocks, as it can be
>>> tedious to extend later otherwise. It is important that along with 
>>> this
>>> RFC you provide some comparisons with with other HW / SW APIs in order
>>> to help justify the design decisions. I also think there should be
>>> link made V4L2_CID_MPEG_VIDEO_GOP_* , number of B-Frames etc.
>>
>> I agree with Nicolas.
>>
>> Regards,
>>
>> 	Hans
>>
>>>
>>> regards,
>>> Nicolas
>>>
>>>>
>>>> Dikshita Agarwal (1):
>>>>   media: v4l2-ctrls:  add control for ltr
>>>>
>>>>  drivers/media/v4l2-core/v4l2-ctrls.c | 6 ++++++
>>>>  include/uapi/linux/v4l2-controls.h   | 4 ++++
>>>>  2 files changed, 10 insertions(+)
>>>>
>>>
Dikshita Agarwal July 20, 2020, 2:33 p.m. UTC | #6
On 2020-07-16 14:25, Hans Verkuil wrote:
> On 16/06/2020 20:41, dikshita@codeaurora.org wrote:
>> Hi Hans, Nicolas,
>> 
>> Thanks for your comments.
>> 
>> On 2020-06-12 14:41, Hans Verkuil wrote:
>>> Hi Dikshita, Nicolas,
>>> 
>>> On 11/06/2020 16:22, Nicolas Dufresne wrote:
>>>> Le jeudi 11 juin 2020 à 15:55 +0530, Dikshita Agarwal a écrit :
>>>>> LTR (Long Term Reference) frames are the frames that are encoded
>>>>> sometime in the past
>>>>> and stored in the DPB buffer list to be used as reference to encode
>>>>> future frames.
>>>>> One usage of LTR encoding is to reduce error propagation for video
>>>>> transmission
>>>>> in packet lossy networks.  For example, encoder may want to specify
>>>>> some key frames as
>>>>> LTR pictures and use them as reference frames for encoding. With
>>>>> extra protection
>>>>> selectively on these LTR frames or synchronization with the 
>>>>> receiver
>>>>> of reception of
>>>>> the LTR frames during transmission, decoder can receive reference
>>>>> frames more reliably
>>>>> than other non-reference frames. As a result, transmission error 
>>>>> can
>>>>> be effectively
>>>>> restricted within certain frames rather than propagated to future
>>>>> frames.
>>>>> 
>>>>> We are introducing below V4l2 Controls for this feature
>>>>> 1. V4L2_CID_MPEG_VIDEO_LTRCOUNT
>>>>>     a. This is used to query or configure the number of LTR frames.
>>>>>        This is a static control and is controlled by the client.
>>>>>     b. The LTR index varies from 0 to the max LTR-1.
>>>>>     c. If LTR Count is more than max supported LTR count (max LTR) 
>>>>> by
>>>>> driver, it will be rejected.
>>>>>     d. Auto Marking : If LTR count is non zero,
>>>>>         1) first LTR count frames would be mark as LTR 
>>>>> automatically
>>>>> after
>>>>>       	   every IDR frame (inclusive).
>>>>>         2) For multilayer encoding: first LTR count base layer
>>>>> reference frames starting after
>>>>>            every IDR frame (inclusive) in encoding order would be
>>>>> marked as LTR frames by the encoder.
>>>>>         3) Auto marking of LTR due to IDR should consider following
>>>>> conditions:
>>>>>             1. The frame is not already set to be marked as LTR.
>>>>>             2. The frame is part of the base layer in the
>>>>> hierarchical layer case.
>>>>>             3. The number of frames currently marked as LTR is less
>>>>> than the maximum LTR frame index plus 1.
>>>>>     e. Encoder needs to handle explicit Mark/Use command when 
>>>>> encoder
>>>>> is still doing "auto" marking
>>> 
>>> I don't follow this, quite possibly due to lack of experience with
>>> encoders.
>>> 
>>> I kind of would expect to see two modes: either automatic where
>>> encoders can
>>> mark up to LTR_COUNT frames as long term reference, and userspace 
>>> just
>>> sets
>>> LTR_COUNT and doesn't have to do anything else.
>>> 
>>> Or it is manual mode where userspace explicitly marks long term
>>> reference
>>> frames.
>>> 
>>> From the proposal above it looks like you can mix auto and manual
>>> modes.
>>> 
>>> BTW, how do you 'unmark' long term reference frames?
>>> 
>>> This feature is for stateful encoders, right?
>>> 
>>>> 
>>>> Perhaps we are missing a LONG_TERM_REFERENCE_MODE ? I bet some 
>>>> encoder
>>>> can select by themself long term references and even some encoders 
>>>> may
>>>> not let the user decide.
>>>> 
>>>> (not huge han of LTR acronyme, but that could be fine too, assuming
>>>> you
>>>> add more _).
>>>> 
>> 
>> Userspace sets LTR count which signifies the number of LTR frames
>> encoder needs to generate or keep.
>> The encoder has to build-up its internal buffer reference list (aka 
>> DBP
>> list or recon buffer list).
>> In order to achieve that encoder will fill It's LTR (long term
>> references) list and STR (short term references) list
>> by auto marking n frames as LTR frames(n is equal to LTR count) based 
>> on
>> auto-marking dictated by the encoder spec.
>> The client then can replace those automatically marked frames with new
>> frames using V4L2_CID_MPEG_VIDEO_MARKLTRFRAME and can ask
>> encoder to refer the newly marked frame for encoding the next frame
>> using V4L2_CID_MPEG_VIDEO_USELTRFRAME.
>> 
>>>>> 
>>>>> 2. V4L2_CID_MPEG_VIDEO_MARKLTRFRAME :
>>>>>     a. This signals to mark the current frame as LTR frame. It is a
>>>>> dynamic control and also provide the LTR index to be used.
>>>>>     b. the LTR index provided by this control should never exceed 
>>>>> the
>>>>> max LTR-1. Else it will be rejected.
>>>> 
>>>> The "current" frame seems a bit loose. Perhaps you wanted to use
>>>> buffer
>>>> flags ? A bit like what we have to signal TOP/BOTTOM fields in
>>>> alternate interlacing.
>>> 
>>> I was thinking the same thing. Using a control for this doesn't seem
>>> right.
>>> 
>> 
>> the client sets this to replace automatically marked frames by the
>> encoder with a particular frame.
>> this provides an index that ranges from 0 to LTR count-1 and then the
>> particular frame will be marked with that index.
>> this can be achieved through request by associating this control with 
>> a
>> specific buffer to make it synchronized.
>> 
>>>> 
>>>>> 
>>>>> 3. V4L2_CID_MPEG_VIDEO_USELTRFRAME :
>>>>>     a. This specifies the LTR frame(s) to be used for encoding the
>>>>> current frame. This is a dynamic control.
>>>>>     b. LTR Use Bitmap : this consists of bits [0, 15]. A total of N
>>>>> LSB bits of this field are valid,
>>>>>        where N is the maximum number of LTRs supported. All the 
>>>>> other
>>>>> bits are invalid and should be rejected.
>>>>>        The LSB corresponds to the LTR index 0. Bit N-1 from the LSB
>>>>> corresponds to the LTR index max LTR-1.
>>> 
>>> How would userspace know this? Esp. with auto marking since userspace
>>> would have
>>> to predict how auto marking works (if I understand this correctly).
>>> 
>> 
>> Client sets LTR count which tells about the number of LTR frames
>> automatically marked by the encoder.
>> so client can use LTR index (0 to LTR count -1) to ask encoder to 
>> refer
>> any particular
>> frame (marked automatically by driver or marked by client with
>> V4L2_CID_MPEG_VIDEO_MARKLTRFRAME) as a reference to encode the next
>> frame.
>> 
>>> For which HW encoder is this meant?
>>> 
>> This is primarily meant for H264 and HEVC.
> 
> The venus encoder?
yes
> 
> Some more questions:
> 
> 1) How many LTR frames do h.264 and hevc allow?
According to spec, MAX LTR allowed by H.264 is 15 and HEVC allows max 32 
LTR frames.
> 2) Given N LTR frames, is there a ordering of those frames? E.g.
>    the LTR frame with index 0 is processed/used differently from
>    LTR frame with index 1? Or are they all equal in that it is just a 
> pool
>    of LTR frames that the encoder can use as it wishes?
they are different frames marked to be used as LTR and stored in 
available indices.
Userspace notifies encoder which LTR frame index to use via USE LTR 
control.

Thanks,
Dikshita
> 
> Regards,
> 
> 	Hans
> 
>> 
>> Thanks,
>> Dikshita
>> 
>>>> 
>>>> Note, I haven't captured very well the userspace control flow, 
>>>> perhaps
>>>> this could be enhanced through writing some documentation.
>>>> 
>>>> As per all other generic encoder controls, we need to make sure it
>>>> will
>>>> be usable and flexible enough for multiple HW blocks, as it can be
>>>> tedious to extend later otherwise. It is important that along with
>>>> this
>>>> RFC you provide some comparisons with with other HW / SW APIs in 
>>>> order
>>>> to help justify the design decisions. I also think there should be
>>>> link made V4L2_CID_MPEG_VIDEO_GOP_* , number of B-Frames etc.
>>> 
>>> I agree with Nicolas.
>>> 
>>> Regards,
>>> 
>>> 	Hans
>>> 
>>>> 
>>>> regards,
>>>> Nicolas
>>>> 
>>>>> 
>>>>> Dikshita Agarwal (1):
>>>>>   media: v4l2-ctrls:  add control for ltr
>>>>> 
>>>>>  drivers/media/v4l2-core/v4l2-ctrls.c | 6 ++++++
>>>>>  include/uapi/linux/v4l2-controls.h   | 4 ++++
>>>>>  2 files changed, 10 insertions(+)
>>>>> 
>>>>
Hans Verkuil July 22, 2020, 2:26 p.m. UTC | #7
On 20/07/2020 16:33, dikshita@codeaurora.org wrote:
> On 2020-07-16 14:25, Hans Verkuil wrote:
>> On 16/06/2020 20:41, dikshita@codeaurora.org wrote:
>>> Hi Hans, Nicolas,
>>>
>>> Thanks for your comments.
>>>
>>> On 2020-06-12 14:41, Hans Verkuil wrote:
>>>> Hi Dikshita, Nicolas,
>>>>
>>>> On 11/06/2020 16:22, Nicolas Dufresne wrote:
>>>>> Le jeudi 11 juin 2020 à 15:55 +0530, Dikshita Agarwal a écrit :
>>>>>> LTR (Long Term Reference) frames are the frames that are encoded
>>>>>> sometime in the past
>>>>>> and stored in the DPB buffer list to be used as reference to encode
>>>>>> future frames.
>>>>>> One usage of LTR encoding is to reduce error propagation for video
>>>>>> transmission
>>>>>> in packet lossy networks.  For example, encoder may want to specify
>>>>>> some key frames as
>>>>>> LTR pictures and use them as reference frames for encoding. With
>>>>>> extra protection
>>>>>> selectively on these LTR frames or synchronization with the receiver
>>>>>> of reception of
>>>>>> the LTR frames during transmission, decoder can receive reference
>>>>>> frames more reliably
>>>>>> than other non-reference frames. As a result, transmission error can
>>>>>> be effectively
>>>>>> restricted within certain frames rather than propagated to future
>>>>>> frames.
>>>>>>
>>>>>> We are introducing below V4l2 Controls for this feature
>>>>>> 1. V4L2_CID_MPEG_VIDEO_LTRCOUNT
>>>>>>     a. This is used to query or configure the number of LTR frames.
>>>>>>        This is a static control and is controlled by the client.
>>>>>>     b. The LTR index varies from 0 to the max LTR-1.
>>>>>>     c. If LTR Count is more than max supported LTR count (max LTR) by
>>>>>> driver, it will be rejected.
>>>>>>     d. Auto Marking : If LTR count is non zero,
>>>>>>         1) first LTR count frames would be mark as LTR automatically
>>>>>> after
>>>>>>              every IDR frame (inclusive).
>>>>>>         2) For multilayer encoding: first LTR count base layer
>>>>>> reference frames starting after
>>>>>>            every IDR frame (inclusive) in encoding order would be
>>>>>> marked as LTR frames by the encoder.
>>>>>>         3) Auto marking of LTR due to IDR should consider following
>>>>>> conditions:
>>>>>>             1. The frame is not already set to be marked as LTR.
>>>>>>             2. The frame is part of the base layer in the
>>>>>> hierarchical layer case.
>>>>>>             3. The number of frames currently marked as LTR is less
>>>>>> than the maximum LTR frame index plus 1.
>>>>>>     e. Encoder needs to handle explicit Mark/Use command when encoder
>>>>>> is still doing "auto" marking
>>>>
>>>> I don't follow this, quite possibly due to lack of experience with
>>>> encoders.
>>>>
>>>> I kind of would expect to see two modes: either automatic where
>>>> encoders can
>>>> mark up to LTR_COUNT frames as long term reference, and userspace just
>>>> sets
>>>> LTR_COUNT and doesn't have to do anything else.
>>>>
>>>> Or it is manual mode where userspace explicitly marks long term
>>>> reference
>>>> frames.
>>>>
>>>> From the proposal above it looks like you can mix auto and manual
>>>> modes.
>>>>
>>>> BTW, how do you 'unmark' long term reference frames?
>>>>
>>>> This feature is for stateful encoders, right?
>>>>
>>>>>
>>>>> Perhaps we are missing a LONG_TERM_REFERENCE_MODE ? I bet some encoder
>>>>> can select by themself long term references and even some encoders may
>>>>> not let the user decide.
>>>>>
>>>>> (not huge han of LTR acronyme, but that could be fine too, assuming
>>>>> you
>>>>> add more _).
>>>>>
>>>
>>> Userspace sets LTR count which signifies the number of LTR frames
>>> encoder needs to generate or keep.
>>> The encoder has to build-up its internal buffer reference list (aka DBP
>>> list or recon buffer list).
>>> In order to achieve that encoder will fill It's LTR (long term
>>> references) list and STR (short term references) list
>>> by auto marking n frames as LTR frames(n is equal to LTR count) based on
>>> auto-marking dictated by the encoder spec.
>>> The client then can replace those automatically marked frames with new
>>> frames using V4L2_CID_MPEG_VIDEO_MARKLTRFRAME and can ask
>>> encoder to refer the newly marked frame for encoding the next frame
>>> using V4L2_CID_MPEG_VIDEO_USELTRFRAME.
>>>
>>>>>>
>>>>>> 2. V4L2_CID_MPEG_VIDEO_MARKLTRFRAME :
>>>>>>     a. This signals to mark the current frame as LTR frame. It is a
>>>>>> dynamic control and also provide the LTR index to be used.
>>>>>>     b. the LTR index provided by this control should never exceed the
>>>>>> max LTR-1. Else it will be rejected.
>>>>>
>>>>> The "current" frame seems a bit loose. Perhaps you wanted to use
>>>>> buffer
>>>>> flags ? A bit like what we have to signal TOP/BOTTOM fields in
>>>>> alternate interlacing.
>>>>
>>>> I was thinking the same thing. Using a control for this doesn't seem
>>>> right.
>>>>
>>>
>>> the client sets this to replace automatically marked frames by the
>>> encoder with a particular frame.
>>> this provides an index that ranges from 0 to LTR count-1 and then the
>>> particular frame will be marked with that index.
>>> this can be achieved through request by associating this control with a
>>> specific buffer to make it synchronized.
>>>
>>>>>
>>>>>>
>>>>>> 3. V4L2_CID_MPEG_VIDEO_USELTRFRAME :
>>>>>>     a. This specifies the LTR frame(s) to be used for encoding the
>>>>>> current frame. This is a dynamic control.
>>>>>>     b. LTR Use Bitmap : this consists of bits [0, 15]. A total of N
>>>>>> LSB bits of this field are valid,
>>>>>>        where N is the maximum number of LTRs supported. All the other
>>>>>> bits are invalid and should be rejected.
>>>>>>        The LSB corresponds to the LTR index 0. Bit N-1 from the LSB
>>>>>> corresponds to the LTR index max LTR-1.
>>>>
>>>> How would userspace know this? Esp. with auto marking since userspace
>>>> would have
>>>> to predict how auto marking works (if I understand this correctly).
>>>>
>>>
>>> Client sets LTR count which tells about the number of LTR frames
>>> automatically marked by the encoder.
>>> so client can use LTR index (0 to LTR count -1) to ask encoder to refer
>>> any particular
>>> frame (marked automatically by driver or marked by client with
>>> V4L2_CID_MPEG_VIDEO_MARKLTRFRAME) as a reference to encode the next
>>> frame.
>>>
>>>> For which HW encoder is this meant?
>>>>
>>> This is primarily meant for H264 and HEVC.
>>
>> The venus encoder?
> yes
>>
>> Some more questions:
>>
>> 1) How many LTR frames do h.264 and hevc allow?
> According to spec, MAX LTR allowed by H.264 is 15 and HEVC allows max 32 LTR frames.
>> 2) Given N LTR frames, is there a ordering of those frames? E.g.
>>    the LTR frame with index 0 is processed/used differently from
>>    LTR frame with index 1? Or are they all equal in that it is just a pool
>>    of LTR frames that the encoder can use as it wishes?
> they are different frames marked to be used as LTR and stored in available indices.
> Userspace notifies encoder which LTR frame index to use via USE LTR control.

One more question: if a frame is marked as a LTR frame, does that mean that userspace
can't reuse the buffer containing that frame? I assume that's the case.

One thing that I don't like about this API is that it introduces the LTR indices.
Would it be possible to instead use buffer indices? (i.e. the v4l2_buffer index field).
Then you can use a buffer flag to indicate that a buffer should be an LTR buffer,
or (for automarking) the driver can mark a buffer as an LTR buffer by setting this flag
before returning the buffer to userspace.

You would still need a V4L2_CID_MPEG_VIDEO_USELTRFRAME, but that can be a bitmask
using the buffer index.

This would avoid introducing a second method of referring to buffers (LTR index),
and I think that V4L2_CID_MPEG_VIDEO_MARKLTRFRAME can be dropped.

Regards,

	Hans

> 
> Thanks,
> Dikshita
>>
>> Regards,
>>
>>     Hans
>>
>>>
>>> Thanks,
>>> Dikshita
>>>
>>>>>
>>>>> Note, I haven't captured very well the userspace control flow, perhaps
>>>>> this could be enhanced through writing some documentation.
>>>>>
>>>>> As per all other generic encoder controls, we need to make sure it
>>>>> will
>>>>> be usable and flexible enough for multiple HW blocks, as it can be
>>>>> tedious to extend later otherwise. It is important that along with
>>>>> this
>>>>> RFC you provide some comparisons with with other HW / SW APIs in order
>>>>> to help justify the design decisions. I also think there should be
>>>>> link made V4L2_CID_MPEG_VIDEO_GOP_* , number of B-Frames etc.
>>>>
>>>> I agree with Nicolas.
>>>>
>>>> Regards,
>>>>
>>>>     Hans
>>>>
>>>>>
>>>>> regards,
>>>>> Nicolas
>>>>>
>>>>>>
>>>>>> Dikshita Agarwal (1):
>>>>>>   media: v4l2-ctrls:  add control for ltr
>>>>>>
>>>>>>  drivers/media/v4l2-core/v4l2-ctrls.c | 6 ++++++
>>>>>>  include/uapi/linux/v4l2-controls.h   | 4 ++++
>>>>>>  2 files changed, 10 insertions(+)
>>>>>>
>>>>>
Dikshita Agarwal Aug. 6, 2020, 11:05 a.m. UTC | #8
On 2020-07-22 19:56, Hans Verkuil wrote:
> On 20/07/2020 16:33, dikshita@codeaurora.org wrote:
>> On 2020-07-16 14:25, Hans Verkuil wrote:
>>> On 16/06/2020 20:41, dikshita@codeaurora.org wrote:
>>>> Hi Hans, Nicolas,
>>>> 
>>>> Thanks for your comments.
>>>> 
>>>> On 2020-06-12 14:41, Hans Verkuil wrote:
>>>>> Hi Dikshita, Nicolas,
>>>>> 
>>>>> On 11/06/2020 16:22, Nicolas Dufresne wrote:
>>>>>> Le jeudi 11 juin 2020 à 15:55 +0530, Dikshita Agarwal a écrit :
>>>>>>> LTR (Long Term Reference) frames are the frames that are encoded
>>>>>>> sometime in the past
>>>>>>> and stored in the DPB buffer list to be used as reference to 
>>>>>>> encode
>>>>>>> future frames.
>>>>>>> One usage of LTR encoding is to reduce error propagation for 
>>>>>>> video
>>>>>>> transmission
>>>>>>> in packet lossy networks.  For example, encoder may want to 
>>>>>>> specify
>>>>>>> some key frames as
>>>>>>> LTR pictures and use them as reference frames for encoding. With
>>>>>>> extra protection
>>>>>>> selectively on these LTR frames or synchronization with the 
>>>>>>> receiver
>>>>>>> of reception of
>>>>>>> the LTR frames during transmission, decoder can receive reference
>>>>>>> frames more reliably
>>>>>>> than other non-reference frames. As a result, transmission error 
>>>>>>> can
>>>>>>> be effectively
>>>>>>> restricted within certain frames rather than propagated to future
>>>>>>> frames.
>>>>>>> 
>>>>>>> We are introducing below V4l2 Controls for this feature
>>>>>>> 1. V4L2_CID_MPEG_VIDEO_LTRCOUNT
>>>>>>>     a. This is used to query or configure the number of LTR 
>>>>>>> frames.
>>>>>>>        This is a static control and is controlled by the client.
>>>>>>>     b. The LTR index varies from 0 to the max LTR-1.
>>>>>>>     c. If LTR Count is more than max supported LTR count (max 
>>>>>>> LTR) by
>>>>>>> driver, it will be rejected.
>>>>>>>     d. Auto Marking : If LTR count is non zero,
>>>>>>>         1) first LTR count frames would be mark as LTR 
>>>>>>> automatically
>>>>>>> after
>>>>>>>              every IDR frame (inclusive).
>>>>>>>         2) For multilayer encoding: first LTR count base layer
>>>>>>> reference frames starting after
>>>>>>>            every IDR frame (inclusive) in encoding order would be
>>>>>>> marked as LTR frames by the encoder.
>>>>>>>         3) Auto marking of LTR due to IDR should consider 
>>>>>>> following
>>>>>>> conditions:
>>>>>>>             1. The frame is not already set to be marked as LTR.
>>>>>>>             2. The frame is part of the base layer in the
>>>>>>> hierarchical layer case.
>>>>>>>             3. The number of frames currently marked as LTR is 
>>>>>>> less
>>>>>>> than the maximum LTR frame index plus 1.
>>>>>>>     e. Encoder needs to handle explicit Mark/Use command when 
>>>>>>> encoder
>>>>>>> is still doing "auto" marking
>>>>> 
>>>>> I don't follow this, quite possibly due to lack of experience with
>>>>> encoders.
>>>>> 
>>>>> I kind of would expect to see two modes: either automatic where
>>>>> encoders can
>>>>> mark up to LTR_COUNT frames as long term reference, and userspace 
>>>>> just
>>>>> sets
>>>>> LTR_COUNT and doesn't have to do anything else.
>>>>> 
>>>>> Or it is manual mode where userspace explicitly marks long term
>>>>> reference
>>>>> frames.
>>>>> 
>>>>> From the proposal above it looks like you can mix auto and manual
>>>>> modes.
>>>>> 
>>>>> BTW, how do you 'unmark' long term reference frames?
>>>>> 
>>>>> This feature is for stateful encoders, right?
>>>>> 
>>>>>> 
>>>>>> Perhaps we are missing a LONG_TERM_REFERENCE_MODE ? I bet some 
>>>>>> encoder
>>>>>> can select by themself long term references and even some encoders 
>>>>>> may
>>>>>> not let the user decide.
>>>>>> 
>>>>>> (not huge han of LTR acronyme, but that could be fine too, 
>>>>>> assuming
>>>>>> you
>>>>>> add more _).
>>>>>> 
>>>> 
>>>> Userspace sets LTR count which signifies the number of LTR frames
>>>> encoder needs to generate or keep.
>>>> The encoder has to build-up its internal buffer reference list (aka 
>>>> DBP
>>>> list or recon buffer list).
>>>> In order to achieve that encoder will fill It's LTR (long term
>>>> references) list and STR (short term references) list
>>>> by auto marking n frames as LTR frames(n is equal to LTR count) 
>>>> based on
>>>> auto-marking dictated by the encoder spec.
>>>> The client then can replace those automatically marked frames with 
>>>> new
>>>> frames using V4L2_CID_MPEG_VIDEO_MARKLTRFRAME and can ask
>>>> encoder to refer the newly marked frame for encoding the next frame
>>>> using V4L2_CID_MPEG_VIDEO_USELTRFRAME.
>>>> 
>>>>>>> 
>>>>>>> 2. V4L2_CID_MPEG_VIDEO_MARKLTRFRAME :
>>>>>>>     a. This signals to mark the current frame as LTR frame. It is 
>>>>>>> a
>>>>>>> dynamic control and also provide the LTR index to be used.
>>>>>>>     b. the LTR index provided by this control should never exceed 
>>>>>>> the
>>>>>>> max LTR-1. Else it will be rejected.
>>>>>> 
>>>>>> The "current" frame seems a bit loose. Perhaps you wanted to use
>>>>>> buffer
>>>>>> flags ? A bit like what we have to signal TOP/BOTTOM fields in
>>>>>> alternate interlacing.
>>>>> 
>>>>> I was thinking the same thing. Using a control for this doesn't 
>>>>> seem
>>>>> right.
>>>>> 
>>>> 
>>>> the client sets this to replace automatically marked frames by the
>>>> encoder with a particular frame.
>>>> this provides an index that ranges from 0 to LTR count-1 and then 
>>>> the
>>>> particular frame will be marked with that index.
>>>> this can be achieved through request by associating this control 
>>>> with a
>>>> specific buffer to make it synchronized.
>>>> 
>>>>>> 
>>>>>>> 
>>>>>>> 3. V4L2_CID_MPEG_VIDEO_USELTRFRAME :
>>>>>>>     a. This specifies the LTR frame(s) to be used for encoding 
>>>>>>> the
>>>>>>> current frame. This is a dynamic control.
>>>>>>>     b. LTR Use Bitmap : this consists of bits [0, 15]. A total of 
>>>>>>> N
>>>>>>> LSB bits of this field are valid,
>>>>>>>        where N is the maximum number of LTRs supported. All the 
>>>>>>> other
>>>>>>> bits are invalid and should be rejected.
>>>>>>>        The LSB corresponds to the LTR index 0. Bit N-1 from the 
>>>>>>> LSB
>>>>>>> corresponds to the LTR index max LTR-1.
>>>>> 
>>>>> How would userspace know this? Esp. with auto marking since 
>>>>> userspace
>>>>> would have
>>>>> to predict how auto marking works (if I understand this correctly).
>>>>> 
>>>> 
>>>> Client sets LTR count which tells about the number of LTR frames
>>>> automatically marked by the encoder.
>>>> so client can use LTR index (0 to LTR count -1) to ask encoder to 
>>>> refer
>>>> any particular
>>>> frame (marked automatically by driver or marked by client with
>>>> V4L2_CID_MPEG_VIDEO_MARKLTRFRAME) as a reference to encode the next
>>>> frame.
>>>> 
>>>>> For which HW encoder is this meant?
>>>>> 
>>>> This is primarily meant for H264 and HEVC.
>>> 
>>> The venus encoder?
>> yes
>>> 
>>> Some more questions:
>>> 
>>> 1) How many LTR frames do h.264 and hevc allow?
>> According to spec, MAX LTR allowed by H.264 is 15 and HEVC allows max 
>> 32 LTR frames.
>>> 2) Given N LTR frames, is there a ordering of those frames? E.g.
>>>    the LTR frame with index 0 is processed/used differently from
>>>    LTR frame with index 1? Or are they all equal in that it is just a 
>>> pool
>>>    of LTR frames that the encoder can use as it wishes?
>> they are different frames marked to be used as LTR and stored in 
>> available indices.
>> Userspace notifies encoder which LTR frame index to use via USE LTR 
>> control.
> 
> One more question: if a frame is marked as a LTR frame, does that mean
> that userspace
> can't reuse the buffer containing that frame? I assume that's the case.
> 
> One thing that I don't like about this API is that it introduces the
> LTR indices.
> Would it be possible to instead use buffer indices? (i.e. the
> v4l2_buffer index field).
> Then you can use a buffer flag to indicate that a buffer should be an
> LTR buffer,
> or (for automarking) the driver can mark a buffer as an LTR buffer by
> setting this flag
> before returning the buffer to userspace.
> 
> You would still need a V4L2_CID_MPEG_VIDEO_USELTRFRAME, but that can
> be a bitmask
> using the buffer index.
> 
> This would avoid introducing a second method of referring to buffers
> (LTR index),
> and I think that V4L2_CID_MPEG_VIDEO_MARKLTRFRAME can be dropped.
> 
> Regards,
> 
> 	Hans

Hi Hans,
LTR indices are part of the spec, we are not introducing anything new.
Also, there is no relation between v4l2 buffer index and LTR index. So 
we can't use V4l2 index as LTR index.

below are some details about LTR from spec:

Regarding LTR count:
max_num_ref_frames specifies the maximum number of short-term and 
long-term reference frames, complementary reference field pairs, and 
non-paired reference fields that may be used by the decoding process for 
inter prediction of any picture in the coded video sequence. The value 
of max_num_ref_frames shall be in the range of 0 to MaxDpbFrames (as 
specified in clause A.3.1 or A.3.2), inclusive.
max_long_term_frame_idx_plus1 minus 1 specifies the maximum value of 
long-term frame index allowed for long-term reference pictures (until 
receipt of another value of max_long_term_frame_idx_plus1). The value of 
max_long_term_frame_idx_plus1 shall be in the range of 0 to 
max_num_ref_frames, inclusive.


Regarding LTR index which will be specified by 
V4L2_CID_MPEG_VIDEO_MARKLTRFRAM :
long_term_pic_num specifies the long-term picture number of the picture 
being moved to the current index in the list. When decoding a coded 
frame, long_term_pic_num shall be equal to a LongTermPicNum assigned to 
one of the reference frames or complementary reference field pairs 
marked as "used for long-term reference".
long_term_frame_idx is used (with memory_management_control_operation 
equal to 3 or 6) to assign a long-term frame index to a picture. When 
the associated memory_management_control_operation is processed by the 
decoding process, the value of long_term_frame_idx shall be in the range 
of 0 to MaxLongTermFrameIdx, inclusive.


Also As there can be multiple LTR present. We can instruct encoder to 
use one and/or multiple of them using V4L2_CID_MPEG_VIDEO_USELTRFRAME 
with the bitmask.

Thanks,
Dikshita

> 
>> 
>> Thanks,
>> Dikshita
>>> 
>>> Regards,
>>> 
>>>     Hans
>>> 
>>>> 
>>>> Thanks,
>>>> Dikshita
>>>> 
>>>>>> 
>>>>>> Note, I haven't captured very well the userspace control flow, 
>>>>>> perhaps
>>>>>> this could be enhanced through writing some documentation.
>>>>>> 
>>>>>> As per all other generic encoder controls, we need to make sure it
>>>>>> will
>>>>>> be usable and flexible enough for multiple HW blocks, as it can be
>>>>>> tedious to extend later otherwise. It is important that along with
>>>>>> this
>>>>>> RFC you provide some comparisons with with other HW / SW APIs in 
>>>>>> order
>>>>>> to help justify the design decisions. I also think there should be
>>>>>> link made V4L2_CID_MPEG_VIDEO_GOP_* , number of B-Frames etc.
>>>>> 
>>>>> I agree with Nicolas.
>>>>> 
>>>>> Regards,
>>>>> 
>>>>>     Hans
>>>>> 
>>>>>> 
>>>>>> regards,
>>>>>> Nicolas
>>>>>> 
>>>>>>> 
>>>>>>> Dikshita Agarwal (1):
>>>>>>>   media: v4l2-ctrls:  add control for ltr
>>>>>>> 
>>>>>>>  drivers/media/v4l2-core/v4l2-ctrls.c | 6 ++++++
>>>>>>>  include/uapi/linux/v4l2-controls.h   | 4 ++++
>>>>>>>  2 files changed, 10 insertions(+)
>>>>>>> 
>>>>>>
Hans Verkuil Aug. 7, 2020, 8:39 a.m. UTC | #9
On 06/08/2020 13:05, dikshita@codeaurora.org wrote:
> On 2020-07-22 19:56, Hans Verkuil wrote:
>> On 20/07/2020 16:33, dikshita@codeaurora.org wrote:
>>> On 2020-07-16 14:25, Hans Verkuil wrote:
>>>> On 16/06/2020 20:41, dikshita@codeaurora.org wrote:
>>>>> Hi Hans, Nicolas,
>>>>>
>>>>> Thanks for your comments.
>>>>>
>>>>> On 2020-06-12 14:41, Hans Verkuil wrote:
>>>>>> Hi Dikshita, Nicolas,
>>>>>>
>>>>>> On 11/06/2020 16:22, Nicolas Dufresne wrote:
>>>>>>> Le jeudi 11 juin 2020 à 15:55 +0530, Dikshita Agarwal a écrit :
>>>>>>>> LTR (Long Term Reference) frames are the frames that are encoded
>>>>>>>> sometime in the past
>>>>>>>> and stored in the DPB buffer list to be used as reference to encode
>>>>>>>> future frames.
>>>>>>>> One usage of LTR encoding is to reduce error propagation for video
>>>>>>>> transmission
>>>>>>>> in packet lossy networks.  For example, encoder may want to specify
>>>>>>>> some key frames as
>>>>>>>> LTR pictures and use them as reference frames for encoding. With
>>>>>>>> extra protection
>>>>>>>> selectively on these LTR frames or synchronization with the receiver
>>>>>>>> of reception of
>>>>>>>> the LTR frames during transmission, decoder can receive reference
>>>>>>>> frames more reliably
>>>>>>>> than other non-reference frames. As a result, transmission error can
>>>>>>>> be effectively
>>>>>>>> restricted within certain frames rather than propagated to future
>>>>>>>> frames.
>>>>>>>>
>>>>>>>> We are introducing below V4l2 Controls for this feature
>>>>>>>> 1. V4L2_CID_MPEG_VIDEO_LTRCOUNT
>>>>>>>>     a. This is used to query or configure the number of LTR frames.
>>>>>>>>        This is a static control and is controlled by the client.
>>>>>>>>     b. The LTR index varies from 0 to the max LTR-1.
>>>>>>>>     c. If LTR Count is more than max supported LTR count (max LTR) by
>>>>>>>> driver, it will be rejected.
>>>>>>>>     d. Auto Marking : If LTR count is non zero,
>>>>>>>>         1) first LTR count frames would be mark as LTR automatically
>>>>>>>> after
>>>>>>>>              every IDR frame (inclusive).
>>>>>>>>         2) For multilayer encoding: first LTR count base layer
>>>>>>>> reference frames starting after
>>>>>>>>            every IDR frame (inclusive) in encoding order would be
>>>>>>>> marked as LTR frames by the encoder.
>>>>>>>>         3) Auto marking of LTR due to IDR should consider following
>>>>>>>> conditions:
>>>>>>>>             1. The frame is not already set to be marked as LTR.
>>>>>>>>             2. The frame is part of the base layer in the
>>>>>>>> hierarchical layer case.
>>>>>>>>             3. The number of frames currently marked as LTR is less
>>>>>>>> than the maximum LTR frame index plus 1.
>>>>>>>>     e. Encoder needs to handle explicit Mark/Use command when encoder
>>>>>>>> is still doing "auto" marking
>>>>>>
>>>>>> I don't follow this, quite possibly due to lack of experience with
>>>>>> encoders.
>>>>>>
>>>>>> I kind of would expect to see two modes: either automatic where
>>>>>> encoders can
>>>>>> mark up to LTR_COUNT frames as long term reference, and userspace just
>>>>>> sets
>>>>>> LTR_COUNT and doesn't have to do anything else.
>>>>>>
>>>>>> Or it is manual mode where userspace explicitly marks long term
>>>>>> reference
>>>>>> frames.
>>>>>>
>>>>>> From the proposal above it looks like you can mix auto and manual
>>>>>> modes.
>>>>>>
>>>>>> BTW, how do you 'unmark' long term reference frames?
>>>>>>
>>>>>> This feature is for stateful encoders, right?
>>>>>>
>>>>>>>
>>>>>>> Perhaps we are missing a LONG_TERM_REFERENCE_MODE ? I bet some encoder
>>>>>>> can select by themself long term references and even some encoders may
>>>>>>> not let the user decide.
>>>>>>>
>>>>>>> (not huge han of LTR acronyme, but that could be fine too, assuming
>>>>>>> you
>>>>>>> add more _).
>>>>>>>
>>>>>
>>>>> Userspace sets LTR count which signifies the number of LTR frames
>>>>> encoder needs to generate or keep.
>>>>> The encoder has to build-up its internal buffer reference list (aka DBP
>>>>> list or recon buffer list).
>>>>> In order to achieve that encoder will fill It's LTR (long term
>>>>> references) list and STR (short term references) list
>>>>> by auto marking n frames as LTR frames(n is equal to LTR count) based on
>>>>> auto-marking dictated by the encoder spec.
>>>>> The client then can replace those automatically marked frames with new
>>>>> frames using V4L2_CID_MPEG_VIDEO_MARKLTRFRAME and can ask
>>>>> encoder to refer the newly marked frame for encoding the next frame
>>>>> using V4L2_CID_MPEG_VIDEO_USELTRFRAME.
>>>>>
>>>>>>>>
>>>>>>>> 2. V4L2_CID_MPEG_VIDEO_MARKLTRFRAME :
>>>>>>>>     a. This signals to mark the current frame as LTR frame. It is a
>>>>>>>> dynamic control and also provide the LTR index to be used.
>>>>>>>>     b. the LTR index provided by this control should never exceed the
>>>>>>>> max LTR-1. Else it will be rejected.
>>>>>>>
>>>>>>> The "current" frame seems a bit loose. Perhaps you wanted to use
>>>>>>> buffer
>>>>>>> flags ? A bit like what we have to signal TOP/BOTTOM fields in
>>>>>>> alternate interlacing.
>>>>>>
>>>>>> I was thinking the same thing. Using a control for this doesn't seem
>>>>>> right.
>>>>>>
>>>>>
>>>>> the client sets this to replace automatically marked frames by the
>>>>> encoder with a particular frame.
>>>>> this provides an index that ranges from 0 to LTR count-1 and then the
>>>>> particular frame will be marked with that index.
>>>>> this can be achieved through request by associating this control with a
>>>>> specific buffer to make it synchronized.
>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>> 3. V4L2_CID_MPEG_VIDEO_USELTRFRAME :
>>>>>>>>     a. This specifies the LTR frame(s) to be used for encoding the
>>>>>>>> current frame. This is a dynamic control.
>>>>>>>>     b. LTR Use Bitmap : this consists of bits [0, 15]. A total of N
>>>>>>>> LSB bits of this field are valid,
>>>>>>>>        where N is the maximum number of LTRs supported. All the other
>>>>>>>> bits are invalid and should be rejected.
>>>>>>>>        The LSB corresponds to the LTR index 0. Bit N-1 from the LSB
>>>>>>>> corresponds to the LTR index max LTR-1.
>>>>>>
>>>>>> How would userspace know this? Esp. with auto marking since userspace
>>>>>> would have
>>>>>> to predict how auto marking works (if I understand this correctly).
>>>>>>
>>>>>
>>>>> Client sets LTR count which tells about the number of LTR frames
>>>>> automatically marked by the encoder.
>>>>> so client can use LTR index (0 to LTR count -1) to ask encoder to refer
>>>>> any particular
>>>>> frame (marked automatically by driver or marked by client with
>>>>> V4L2_CID_MPEG_VIDEO_MARKLTRFRAME) as a reference to encode the next
>>>>> frame.
>>>>>
>>>>>> For which HW encoder is this meant?
>>>>>>
>>>>> This is primarily meant for H264 and HEVC.
>>>>
>>>> The venus encoder?
>>> yes
>>>>
>>>> Some more questions:
>>>>
>>>> 1) How many LTR frames do h.264 and hevc allow?
>>> According to spec, MAX LTR allowed by H.264 is 15 and HEVC allows max 32 LTR frames.
>>>> 2) Given N LTR frames, is there a ordering of those frames? E.g.
>>>>    the LTR frame with index 0 is processed/used differently from
>>>>    LTR frame with index 1? Or are they all equal in that it is just a pool
>>>>    of LTR frames that the encoder can use as it wishes?
>>> they are different frames marked to be used as LTR and stored in available indices.
>>> Userspace notifies encoder which LTR frame index to use via USE LTR control.
>>
>> One more question: if a frame is marked as a LTR frame, does that mean
>> that userspace
>> can't reuse the buffer containing that frame? I assume that's the case.
>>
>> One thing that I don't like about this API is that it introduces the
>> LTR indices.
>> Would it be possible to instead use buffer indices? (i.e. the
>> v4l2_buffer index field).
>> Then you can use a buffer flag to indicate that a buffer should be an
>> LTR buffer,
>> or (for automarking) the driver can mark a buffer as an LTR buffer by
>> setting this flag
>> before returning the buffer to userspace.
>>
>> You would still need a V4L2_CID_MPEG_VIDEO_USELTRFRAME, but that can
>> be a bitmask
>> using the buffer index.
>>
>> This would avoid introducing a second method of referring to buffers
>> (LTR index),
>> and I think that V4L2_CID_MPEG_VIDEO_MARKLTRFRAME can be dropped.
>>
>> Regards,
>>
>>     Hans
> 
> Hi Hans,
> LTR indices are part of the spec, we are not introducing anything new.
> Also, there is no relation between v4l2 buffer index and LTR index. So we can't use V4l2 index as LTR index.

Ah, I think I start to understand the cause of my confusion: I thought that the
buffers containing LTR frames had to be kept around by userspace, but instead the
driver/HW makes copies and userspace does not need to do anything special with
those buffers.

OK, can you post a v2?

Document the controls in Documentation/userspace-api/media/v4l/ext-ctrls-codec.rst

Regards,

	Hans

> 
> below are some details about LTR from spec:
> 
> Regarding LTR count:
> max_num_ref_frames specifies the maximum number of short-term and long-term reference frames, complementary reference field pairs, and non-paired reference fields that may be used by the decoding
> process for inter prediction of any picture in the coded video sequence. The value of max_num_ref_frames shall be in the range of 0 to MaxDpbFrames (as specified in clause A.3.1 or A.3.2), inclusive.
> max_long_term_frame_idx_plus1 minus 1 specifies the maximum value of long-term frame index allowed for long-term reference pictures (until receipt of another value of max_long_term_frame_idx_plus1).
> The value of max_long_term_frame_idx_plus1 shall be in the range of 0 to max_num_ref_frames, inclusive.
> 
> 
> Regarding LTR index which will be specified by V4L2_CID_MPEG_VIDEO_MARKLTRFRAM :
> long_term_pic_num specifies the long-term picture number of the picture being moved to the current index in the list. When decoding a coded frame, long_term_pic_num shall be equal to a LongTermPicNum
> assigned to one of the reference frames or complementary reference field pairs marked as "used for long-term reference".
> long_term_frame_idx is used (with memory_management_control_operation equal to 3 or 6) to assign a long-term frame index to a picture. When the associated memory_management_control_operation is
> processed by the decoding process, the value of long_term_frame_idx shall be in the range of 0 to MaxLongTermFrameIdx, inclusive.
> 
> 
> Also As there can be multiple LTR present. We can instruct encoder to use one and/or multiple of them using V4L2_CID_MPEG_VIDEO_USELTRFRAME with the bitmask.
> 
> Thanks,
> Dikshita
> 
>>
>>>
>>> Thanks,
>>> Dikshita
>>>>
>>>> Regards,
>>>>
>>>>     Hans
>>>>
>>>>>
>>>>> Thanks,
>>>>> Dikshita
>>>>>
>>>>>>>
>>>>>>> Note, I haven't captured very well the userspace control flow, perhaps
>>>>>>> this could be enhanced through writing some documentation.
>>>>>>>
>>>>>>> As per all other generic encoder controls, we need to make sure it
>>>>>>> will
>>>>>>> be usable and flexible enough for multiple HW blocks, as it can be
>>>>>>> tedious to extend later otherwise. It is important that along with
>>>>>>> this
>>>>>>> RFC you provide some comparisons with with other HW / SW APIs in order
>>>>>>> to help justify the design decisions. I also think there should be
>>>>>>> link made V4L2_CID_MPEG_VIDEO_GOP_* , number of B-Frames etc.
>>>>>>
>>>>>> I agree with Nicolas.
>>>>>>
>>>>>> Regards,
>>>>>>
>>>>>>     Hans
>>>>>>
>>>>>>>
>>>>>>> regards,
>>>>>>> Nicolas
>>>>>>>
>>>>>>>>
>>>>>>>> Dikshita Agarwal (1):
>>>>>>>>   media: v4l2-ctrls:  add control for ltr
>>>>>>>>
>>>>>>>>  drivers/media/v4l2-core/v4l2-ctrls.c | 6 ++++++
>>>>>>>>  include/uapi/linux/v4l2-controls.h   | 4 ++++
>>>>>>>>  2 files changed, 10 insertions(+)
>>>>>>>>
>>>>>>>