[00/18] Add Hantro regmap and VC8000 h264 decode support

Message ID 20201012205957.889185-1-adrian.ratiu@collabora.com (mailing list archive)

Message

Adrian Ratiu Oct. 12, 2020, 8:59 p.m. UTC
Dear all,

This series introduces a regmap infrastructure for the Hantro driver
which is used to compensate for different HW-revision register layouts.
To justify it, h264 decoding capability is added for newer VC8000 chips.

This is a gradual conversion to the new infra - a complete conversion
would have been very big and I do not have all the HW yet to test (I'm
expecting a RK3399 shipment next week though ;). I think converting the
h264 decoder provides a nice blueprint for how the other codecs can be
converted and enabled for different HW revisions.

The end goal of this is to make the driver more generic and eliminate
entirely custom boilerplate like `struct hantro_reg` or headers with
core-specific bit manipulations like `hantro_g1_regs.h` and instead rely
on the well-tested albeit more verbose regmap subsystem.

To give just two examples of bugs which are easily discovered by using
more verbose regmap fields (very easy to compare with the datasheets)
instead of relying on bit-magic tricks: G1_REG_DEC_CTRL3_INIT_QP(x) was
off-by-1 and the wrong .clk_gate bit was set in hantro_postproc.c.

Anyway, this series also extends the MMIO regmap API to allow relaxed
writes for the theoretical reason that avoiding unnecessary membarriers
leads to less CPU usage and small improvements to battery life. However,
in practice I could not measure differences between relaxed/non-relaxed
IO, so I'm on the fence whether to keep or remove the relaxed calls.

What I could measure is the performance impact of adding more sub-reg
field accesses: a constant ~ 20 microsecond bump per G1 h264 frame. This
is acceptable considering the total time to decode a frame takes three
orders of magnitude longer, i.e. millisecond ranges, depending on the
frame size and bitstream params, so it is an acceptable trade-off to
have a more generic driver.

This has been tested on next-20201009 with imx8mq for G1 and an SoC with
VC8000 which has not yet been added (hopefully support lands soon).

Kind regards,
Adrian

Adrian Ratiu (18):
  media: hantro: document all int reg bits up to vc8000
  media: hantro: make consistent use of decimal register notation
  media: hantro: make G1_REG_SOFT_RESET Rockchip specific
  media: hantro: add reset controller support
  media: hantro: prepare clocks before variant inits are run
  media: hantro: imx8mq: simplify ctrlblk reset logic
  regmap: mmio: add config option to allow relaxed MMIO accesses
  media: hantro: add initial MMIO regmap infrastructure
  media: hantro: default regmap to relaxed MMIO
  media: hantro: convert G1 h264 decoder to regmap fields
  media: hantro: convert G1 postproc to regmap
  media: hantro: add VC8000D h264 decoding
  media: hantro: add VC8000D postproc support
  media: hantro: make PP enablement logic a bit smarter
  media: hantro: add user-selectable, platform-selectable H264 High10
  media: hantro: rename h264_dec as it's not G1 specific anymore
  media: hantro: add dump registers debug option before decode start
  media: hantro: document encoder reg fields

 drivers/base/regmap/regmap-mmio.c             |   34 +-
 drivers/staging/media/hantro/Makefile         |    3 +-
 drivers/staging/media/hantro/hantro.h         |   79 +-
 drivers/staging/media/hantro/hantro_drv.c     |   41 +-
 drivers/staging/media/hantro/hantro_g1_regs.h |   92 +-
 ...hantro_g1_h264_dec.c => hantro_h264_dec.c} |  237 +++-
 drivers/staging/media/hantro/hantro_hw.h      |   23 +-
 .../staging/media/hantro/hantro_postproc.c    |  144 ++-
 drivers/staging/media/hantro/hantro_regmap.c  | 1015 +++++++++++++++++
 drivers/staging/media/hantro/hantro_regmap.h  |  295 +++++
 drivers/staging/media/hantro/hantro_v4l2.c    |    3 +-
 drivers/staging/media/hantro/imx8m_vpu_hw.c   |   75 +-
 drivers/staging/media/hantro/rk3288_vpu_hw.c  |    5 +-
 include/linux/regmap.h                        |    5 +
 14 files changed, 1795 insertions(+), 256 deletions(-)
 rename drivers/staging/media/hantro/{hantro_g1_h264_dec.c => hantro_h264_dec.c} (58%)
 create mode 100644 drivers/staging/media/hantro/hantro_regmap.c
 create mode 100644 drivers/staging/media/hantro/hantro_regmap.h

Comments

Jonas Karlman Oct. 12, 2020, 11:39 p.m. UTC | #1
Hi,

On 2020-10-12 22:59, Adrian Ratiu wrote:
> Dear all,
> 
> This series introduces a regmap infrastructure for the Hantro driver
> which is used to compensate for different HW-revision register layouts.
> To justify it h264 decoding capability is added for newer VC8000 chips.
> 
> This is a gradual conversion to the new infra - a complete conversion
> would have been very big and I do not have all the HW yet to test (I'm
> expecting a RK3399 shipment next week though ;). I think converting the
> h264 decoder provides a nice blueprint for how the other codecs can be
> converted and enabled for different HW revisions.
> 
> The end goal of this is to make the driver more generic and eliminate
> entirely custom boilerplate like `struct hantro_reg` or headers with
> core-specific bit manipulations like `hantro_g1_regs.h` and instead rely
> on the well-tested albeit more verbose regmap subsystem.
> 
> To give just two examples of bugs which are easily discovered by using
> more verbose regmap fields (very easy to compare with the datasheets)
> instead of relying on bit-magic tricks: G1_REG_DEC_CTRL3_INIT_QP(x) was
> off-by-1 and the wrong .clk_gate bit was set in hantro_postproc.c.
> 
> Anyway, this series also extends the MMIO regmap API to allow relaxed
> writes for the theoretical reason that avoiding unnecessary membarriers
> leads to less CPU usage and small improvements to battery life. However,
> in practice I could not measure differences between relaxed/non-relaxed
> IO, so I'm on the fence whether to keep or remove the relaxed calls.
> 
> What I could measure is the performance impact of adding more sub-reg
> field accesses: a constant ~ 20 microsecond bump per G1 h264 frame. This
> is acceptable considering the total time to decode a frame takes three
> orders of magnitude longer, i.e. millisecond ranges, depending on the
> frame size and bitstream params, so it is an acceptable trade-off to
> have a more generic driver.

In the RK3399 variant all fields use completely different positions, so
in order to make the driver fully generic all of the roughly 145 sub-reg
fields used for h264 need to be converted; see [1] for a quick generation
of the field mappings used for h264 decoding.

Any indication on how the performance will be impacted with 145 fields
compared to around 20 fields used in this series?

Another issue with the RK3399 variant is that some fields use different
positions depending on the codec used, e.g. the two dec_ref_frames in [2].
Should we use codec-specific field maps? Or is there any other suggestion
on how we can handle such a case?

[1] https://github.com/Kwiboo/rockchip-vpu-regtool/commit/8b88d94d2ed966c7d88d9a735c0c97368eb6c92d
[2] https://github.com/Kwiboo/rockchip-vpu-regtool/blob/master/rk3399_dec_regs.c#L1065
[3] https://github.com/Kwiboo/rockchip-vpu-regtool/commit/9498326296445a9ce153b585cc48e0cea05d3c93

Best regards,
Jonas

Adrian Ratiu Oct. 13, 2020, 6:48 a.m. UTC | #2
Hi Jonas,

On Mon, 12 Oct 2020, Jonas Karlman <jonas@kwiboo.se> wrote:
> Hi, 
> 
> On 2020-10-12 22:59, Adrian Ratiu wrote: 
>> Dear all,
>>
>> This series introduces a regmap infrastructure for the Hantro driver
>> which is used to compensate for different HW-revision register layouts.
>> To justify it h264 decoding capability is added for newer VC8000 chips.
>>
>> This is a gradual conversion to the new infra - a complete conversion
>> would have been very big and I do not have all the HW yet to test (I'm
>> expecting a RK3399 shipment next week though ;). I think converting the
>> h264 decoder provides a nice blueprint for how the other codecs can be
>> converted and enabled for different HW revisions.
>>
>> The end goal of this is to make the driver more generic and eliminate
>> entirely custom boilerplate like `struct hantro_reg` or headers with
>> core-specific bit manipulations like `hantro_g1_regs.h` and instead rely
>> on the well-tested albeit more verbose regmap subsystem.
>>
>> To give just two examples of bugs which are easily discovered by using
>> more verbose regmap fields (very easy to compare with the datasheets)
>> instead of relying on bit-magic tricks: G1_REG_DEC_CTRL3_INIT_QP(x) was
>> off-by-1 and the wrong .clk_gate bit was set in hantro_postproc.c.
>>
>> Anyway, this series also extends the MMIO regmap API to allow relaxed
>> writes for the theoretical reason that avoiding unnecessary membarriers
>> leads to less CPU usage and small improvements to battery life. However,
>> in practice I could not measure differences between relaxed/non-relaxed
>> IO, so I'm on the fence whether to keep or remove the relaxed calls.
>>
>> What I could measure is the performance impact of adding more sub-reg
>> field accesses: a constant ~ 20 microsecond bump per G1 h264 frame. This
>> is acceptable considering the total time to decode a frame takes three
>> orders of magnitude longer, i.e. millisecond ranges, depending on the
>> frame size and bitstream params, so it is an acceptable trade-off to
>> have a more generic driver.
> 
> In the RK3399 variant all fields use completely different 
> positions so in order to make the driver fully generic all 
> around 145 sub-reg fields used for h264 needs to be converted, 
> see [1] for a quick generation of field mappings used for h264 
> decoding. 
> 
> Any indication on how the performance will be impacted with 145 
> fields compared to around 20 fields used in this series? 

I'm aware of the RK3399 bigger layout divergence and have some 
commits converting more of the reg fields, but not all that is 
required for h264 on rk3399. I haven't seen a huge perf 
degradation but more measurements are needed, basically it depends 
on how often we go from writing a reg once to multiple times due 
to splitting.

I tried some benchmarks using regmap caching (both the default 
backends provided by the regmap subsystem, and a custom one I 
wrote) but they were not helping, perhaps if we had more fields 
then that would have more of an impact.

(btw some good news is I'm having a RK3399 SoC in the mail for an 
unrelated project and expect to receive it soon :D)

IMO there will always be a trade-off between optimizing the driver 
to squeeze the most perf out of the HW, eg optimize reg writes at 
low microsec level (which I think here is unnecessary) and making 
it more generic to support more HW.

In this case a fundamental question we need to ask ourselves is if 
the RK3399 "looks like another/different-enough HW" due to its 
bigger reg shuffling to warrant a separate driver or
driver-within-a-driver architecture instead of trying to bring it
into the fold with the others, possibly degrading perf for
everyone. I guess we'll have to see some benchmark numbers and an 
actual h264 implementation before deciding how to proceed with 
RK3399.

> 
> Another issue with RK3399 variant is that some fields use 
> different position depending on the codec used, e.g. two 
> dec_ref_frames in [2].  Should we use codec specific field maps? 
> or any other suggestion on how we can handle such case?

Yes, codec specific fields would be one idea, but I'd try to avoid 
it if possible to avoid unnecessary field definitions.

The regmap field API and config we currently use are just flat
structs (see hantro_regmap.[h|c]) but it doesn't have to be like
that. Maybe we could organize it a bit better and in the future 
have some codec-level configs going on due to the regmap subsystem 
allowing de-coupling of the API (struct regmap_field) from the reg 
defs/configs (struct reg_field).
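
To make the de-coupling idea concrete, here is a minimal userspace sketch,
not driver code. It models the split between a logical field ID (the role
played by struct regmap_field) and a per-variant or per-codec table of
positions (the role played by struct reg_field). All names below
(vpu_field, field_def, layout_a, layout_b_h264, field_write) and all bit
positions are hypothetical and chosen only for illustration; none are taken
from the series or from a datasheet.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical logical field IDs shared by every variant and codec. */
enum vpu_field { FLD_PIC_WIDTH, FLD_REF_FRAMES, FLD_COUNT };

/* Where a field lives: register index and inclusive bit range. */
struct field_def {
	unsigned int reg, lsb, msb;
};

/* Illustrative layouts only; the positions are made up for this sketch. */
static const struct field_def layout_a[FLD_COUNT] = {
	[FLD_PIC_WIDTH]  = { 4, 23, 31 },
	[FLD_REF_FRAMES] = { 4, 0, 4 },
};

static const struct field_def layout_b_h264[FLD_COUNT] = {
	[FLD_PIC_WIDTH]  = { 120, 0, 8 },
	[FLD_REF_FRAMES] = { 54, 16, 20 },
};

/* Read-modify-write one logical field using whichever layout is active. */
void field_write(uint32_t *regs, const struct field_def *layout,
		 enum vpu_field id, uint32_t val)
{
	const struct field_def *f;
	unsigned int width;
	uint32_t mask;

	assert(id < FLD_COUNT);
	f = &layout[id];
	assert(f->lsb <= f->msb && f->msb < 32);

	width = f->msb - f->lsb + 1;
	mask = (width == 32 ? 0xffffffffu : ((1u << width) - 1)) << f->lsb;
	regs[f->reg] = (regs[f->reg] & ~mask) | ((val << f->lsb) & mask);
}
```

The codec code only ever names FLD_REF_FRAMES; selecting a different table
per variant (or per codec on the same variant) changes where the bits land
without touching the callers.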

That is just an idea off the top of my head :) Will have to think a
bit more about how to handle that specific use case in the 
future. Thanks!

>
> [1] https://github.com/Kwiboo/rockchip-vpu-regtool/commit/8b88d94d2ed966c7d88d9a735c0c97368eb6c92d
> [2] https://github.com/Kwiboo/rockchip-vpu-regtool/blob/master/rk3399_dec_regs.c#L1065
> [3] https://github.com/Kwiboo/rockchip-vpu-regtool/commit/9498326296445a9ce153b585cc48e0cea05d3c93
>
> Best regards,
> Jonas
>
Ezequiel Garcia Oct. 29, 2020, 12:38 p.m. UTC | #3
On Mon, 2020-10-12 at 23:39 +0000, Jonas Karlman wrote:
> Hi,
> 
> On 2020-10-12 22:59, Adrian Ratiu wrote:
> > Dear all,
> > 
> > This series introduces a regmap infrastructure for the Hantro driver
> > which is used to compensate for different HW-revision register layouts.
> > To justify it h264 decoding capability is added for newer VC8000 chips.
> > 
> > This is a gradual conversion to the new infra - a complete conversion
> > would have been very big and I do not have all the HW yet to test (I'm
> > expecting a RK3399 shipment next week though ;). I think converting the
> > h264 decoder provides a nice blueprint for how the other codecs can be
> > converted and enabled for different HW revisions.
> > 
> > The end goal of this is to make the driver more generic and eliminate
> > entirely custom boilerplate like `struct hantro_reg` or headers with
> > core-specific bit manipulations like `hantro_g1_regs.h` and instead rely
> > on the well-tested albeit more verbose regmap subsystem.
> > 
> > To give just two examples of bugs which are easily discovered by using
> > more verbose regmap fields (very easy to compare with the datasheets)
> > instead of relying on bit-magic tricks: G1_REG_DEC_CTRL3_INIT_QP(x) was
> > off-by-1 and the wrong .clk_gate bit was set in hantro_postproc.c.
> > 
> > Anyway, this series also extends the MMIO regmap API to allow relaxed
> > writes for the theoretical reason that avoiding unnecessary membarriers
> > leads to less CPU usage and small improvements to battery life. However,
> > in practice I could not measure differences between relaxed/non-relaxed
> > IO, so I'm on the fence whether to keep or remove the relaxed calls.
> > 
> > What I could measure is the performance impact of adding more sub-reg
> > field accesses: a constant ~ 20 microsecond bump per G1 h264 frame. This
> > is acceptable considering the total time to decode a frame takes three
> > orders of magnitude longer, i.e. millisecond ranges, depending on the
> > frame size and bitstream params, so it is an acceptable trade-off to
> > have a more generic driver.
> 
> In the RK3399 variant all fields use completely different positions so
> in order to make the driver fully generic all around 145 sub-reg fields
> used for h264 needs to be converted, see [1] for a quick generation of
> field mappings used for h264 decoding.
> 

Currently, we've only decided to support H.264 decoding via the RKVDEC
core on RK3399.

What are your thoughts here, Jonas? Have you tested H.264 on RK3399 with
the G1 core? If it works, what benefits do we get from enabling both
cores?

Thanks!
Ezequiel

> Any indication on how the performance will be impacted with 145 fields
> compared to around 20 fields used in this series?
> 
> Another issue with RK3399 variant is that some fields use different
> position depending on the codec used, e.g. two dec_ref_frames in [2].
> Should we use codec specific field maps? or any other suggestion on
> how we can handle such case?
> 
> [1] https://github.com/Kwiboo/rockchip-vpu-regtool/commit/8b88d94d2ed966c7d88d9a735c0c97368eb6c92d
> [2] https://github.com/Kwiboo/rockchip-vpu-regtool/blob/master/rk3399_dec_regs.c#L1065
> [3] https://github.com/Kwiboo/rockchip-vpu-regtool/commit/9498326296445a9ce153b585cc48e0cea05d3c93
> 
> Best regards,
> Jonas
> 
Ezequiel Garcia Oct. 29, 2020, 1:07 p.m. UTC | #4
Hello Adrian,

On Mon, 2020-10-12 at 23:59 +0300, Adrian Ratiu wrote:
> Dear all,
> 
> This series introduces a regmap infrastructure for the Hantro driver
> which is used to compensate for different HW-revision register layouts.
> To justify it h264 decoding capability is added for newer VC8000 chips.
> 
> This is a gradual conversion to the new infra - a complete conversion
> would have been very big and I do not have all the HW yet to test (I'm
> expecting a RK3399 shipment next week though ;). I think converting the
> h264 decoder provides a nice blueprint for how the other codecs can be
> converted and enabled for different HW revisions.
> 
> The end goal of this is to make the driver more generic and eliminate
> entirely custom boilerplate like `struct hantro_reg` or headers with
> core-specific bit manipulations like `hantro_g1_regs.h` and instead rely
> on the well-tested albeit more verbose regmap subsystem.
> 
> To give just two examples of bugs which are easily discovered by using
> more verbose regmap fields (very easy to compare with the datasheets)
> instead of relying on bit-magic tricks: G1_REG_DEC_CTRL3_INIT_QP(x) was
> off-by-1 and the wrong .clk_gate bit was set in hantro_postproc.c.
> 
> Anyway, this series also extends the MMIO regmap API to allow relaxed
> writes for the theoretical reason that avoiding unnecessary membarriers
> leads to less CPU usage and small improvements to battery life. However,
> in practice I could not measure differences between relaxed/non-relaxed
> IO, so I'm on the fence whether to keep or remove the relaxed calls.
> 
> What I could measure is the performance impact of adding more sub-reg
> field accesses: a constant ~ 20 microsecond bump per G1 h264 frame. This
> is acceptable considering the total time to decode a frame takes three
> orders of magnitude longer, i.e. millisecond ranges, depending on the
> frame size and bitstream params, so it is an acceptable trade-off to
> have a more generic driver.
> 

Before going forward with using regmap, I would like to have a sense
of the footprint it adds, and see if we can avoid that 20 us penalty.

I'd also like to try another approach, something that has less
memory footprint and less runtime penalty.

How about something like this:

#define G1_PIC_WIDTH 4, 0xff8, 23
#define ...

struct hantro_swreg {
        u32 value[399 /*whatever size goes here*/];
};

void hantro_reg_write(struct hantro_swreg *r,
                      unsigned int swreg, u32 mask, u32 offset, u32 new_val)
{
        r->value[swreg] = (r->value[swreg] & ~(mask)) |
                          ((new_val << offset) & mask);
}

Which you can then use in a very similar way as the current proposal:

hantro_reg_write(&swreg, G1_PIC_WIDTH, width);

The first advantage here is that we no longer have any
footprint for the fields.
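
For readers who want to try the idea outside the kernel, here is a minimal
userspace sketch of the same shadow-register write. It assumes the mask is
given already shifted into place, so the illustrative value 0xff800000 (a
9-bit field starting at bit 23) is used for G1_PIC_WIDTH here; that value,
the HANTRO_NUM_SWREGS name and the bounds check are assumptions of this
sketch, not part of the proposal above.

```c
#include <assert.h>
#include <stdint.h>

#define HANTRO_NUM_SWREGS 399	/* placeholder size, as in the sketch above */

/* Register index, pre-shifted mask, bit offset (illustrative values). */
#define G1_PIC_WIDTH 4, 0xff800000u, 23

struct hantro_swreg {
	uint32_t value[HANTRO_NUM_SWREGS];
};

/* Update one field in the shadow copy; no hardware access happens here. */
void hantro_reg_write(struct hantro_swreg *r, unsigned int swreg,
		      uint32_t mask, uint32_t offset, uint32_t new_val)
{
	assert(swreg < HANTRO_NUM_SWREGS);
	r->value[swreg] = (r->value[swreg] & ~mask) |
			  ((new_val << offset) & mask);
}
```

Because G1_PIC_WIDTH expands to three comma-separated arguments, a call
such as hantro_reg_write(&swreg, G1_PIC_WIDTH, width) resolves at compile
time and no per-field descriptor is stored at runtime.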

The ugly macros for "4, 0xff8, 23" can be auto-generated from
existing vendor headers, when possible, so that shouldn't
bother us.

The register set is "flushed" using _relaxed, but it
could still be costly.

If that is indeed costly, perhaps we can avoid writing
the entire set by having a dirty bit somewhere.
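
One way the dirty-bit idea could look, as a userspace sketch under stated
assumptions: a bitmap with one bit per register is set whenever a field
write changes the shadow value, and the flush copies only the marked
registers. The names shadow_regs, shadow_write and shadow_flush are
hypothetical, NUM_SWREGS is kept small for illustration, and the hw array
stands in for the MMIO window that a real driver would write with
writel_relaxed().

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define NUM_SWREGS  64				/* hypothetical size */
#define DIRTY_WORDS ((NUM_SWREGS + 31) / 32)

struct shadow_regs {
	uint32_t value[NUM_SWREGS];
	uint32_t dirty[DIRTY_WORDS];		/* one bit per register */
};

/* Update a field in the shadow copy and mark the register if it changed. */
void shadow_write(struct shadow_regs *s, unsigned int reg,
		  uint32_t mask, uint32_t offset, uint32_t val)
{
	uint32_t updated;

	assert(reg < NUM_SWREGS);
	updated = (s->value[reg] & ~mask) | ((val << offset) & mask);
	if (updated != s->value[reg]) {
		s->value[reg] = updated;
		s->dirty[reg / 32] |= 1u << (reg % 32);
	}
}

/* Copy only dirty registers to hw; returns how many writes were issued. */
unsigned int shadow_flush(struct shadow_regs *s, volatile uint32_t *hw)
{
	unsigned int reg, writes = 0;

	for (reg = 0; reg < NUM_SWREGS; reg++) {
		if (!(s->dirty[reg / 32] & (1u << (reg % 32))))
			continue;
		hw[reg] = s->value[reg];
		writes++;
	}
	memset(s->dirty, 0, sizeof(s->dirty));
	return writes;
}
```

Skipping unchanged values only works if the shadow copy starts out equal to
the hardware state, and self-clearing bits (a start/enable bit, for
example) would need to be marked dirty unconditionally.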

In any case, it's worth exploring our options first, I think.

PS: Another option is to just fork RK3399 to its
own driver and call it a day, given how different it is :)

Thanks!
Ezequiel
Robin Murphy Oct. 29, 2020, 2:15 p.m. UTC | #5
On 2020-10-29 13:07, Ezequiel Garcia wrote:
> Hello Adrian,
> 
> On Mon, 2020-10-12 at 23:59 +0300, Adrian Ratiu wrote:
>> Dear all,
>>
>> This series introduces a regmap infrastructure for the Hantro driver
>> which is used to compensate for different HW-revision register layouts.
>> To justify it h264 decoding capability is added for newer VC8000 chips.
>>
>> This is a gradual conversion to the new infra - a complete conversion
>> would have been very big and I do not have all the HW yet to test (I'm
>> expecting a RK3399 shipment next week though ;). I think converting the
>> h264 decoder provides a nice blueprint for how the other codecs can be
>> converted and enabled for different HW revisions.
>>
>> The end goal of this is to make the driver more generic and eliminate
>> entirely custom boilerplate like `struct hantro_reg` or headers with
>> core-specific bit manipulations like `hantro_g1_regs.h` and instead rely
>> on the well-tested albeit more verbose regmap subsystem.
>>
>> To give just two examples of bugs which are easily discovered by using
>> more verbose regmap fields (very easy to compare with the datasheets)
>> instead of relying on bit-magic tricks: G1_REG_DEC_CTRL3_INIT_QP(x) was
>> off-by-1 and the wrong .clk_gate bit was set in hantro_postproc.c.
>>
>> Anyway, this series also extends the MMIO regmap API to allow relaxed
>> writes for the theoretical reason that avoiding unnecessary membarriers
>> leads to less CPU usage and small improvements to battery life. However,
>> in practice I could not measure differences between relaxed/non-relaxed
>> IO, so I'm on the fence whether to keep or remove the relaxed calls.
>>
>> What I could measure is the performance impact of adding more sub-reg
>> field accesses: a constant ~ 20 microsecond bump per G1 h264 frame. This
>> is acceptable considering the total time to decode a frame takes three
>> orders of magnitude longer, i.e. millisecond ranges, depending on the
>> frame size and bitstream params, so it is an acceptable trade-off to
>> have a more generic driver.
>>
> 
> Before going forward with using regmap, I would like to have a sense
> of the footprint it adds, and see if we can avoid that 20 us penalty.
> 
> I'd also like to try another approach, something that has less
> memory footprint and less runtime penalty.
> 
> How about something like this:
> 
> #define G1_PIC_WIDTH 4, 0xff8, 23
> #define ...
>
> struct hantro_swreg {
>          u32 value[399 /*whatever size goes here*/];
> };
>
> void hantro_reg_write(struct hantro_swreg *r,
>                        unsigned int swreg, u32 mask, u32 offset, u32 new_val)
> {
>          r->value[swreg] = (r->value[swreg] & ~(mask)) |
>                            ((new_val << offset) & mask);
> }
> 
> Which you can then use in a very similar way as the current proposal:
> 
> hantro_reg_write(&swreg, G1_PIC_WIDTH, width);
> 
> The first advantage here is that we no longer have any
> footprint for the fields.
> 
> The ugly macros for "4, 0xff8, 23" can be auto-generated from
> existing vendor headers, when possible, so that shouldn't
> bother us.
> 
> The register set is "flushed" using _relaxed, but it
> could still be costly.
> 
> If that is indeed costly, perhaps we can avoid writing
> the entire set by having a dirty bit somewhere.
> 
> In any case, it's worth exploring our options first, I think.

Or maybe the regmap API itself deserves extending with a "deferred" 
operating mode where updates to the cached state can be separated from 
committing that state to the underlying hardware.

...which, after a brief code search out of curiosity, apparently already 
exists in the form of regcache_cache_only()/regcache_sync(), so there's 
probably no need to reinvent it :)

Robin.

> 
> PS: Another option is to just fork RK3399 to its
> own driver and call it a day, given how different it is :)
> 
> Thanks!
> Ezequiel
Mark Brown Oct. 29, 2020, 2:48 p.m. UTC | #6
On Thu, Oct 29, 2020 at 02:15:10PM +0000, Robin Murphy wrote:

> Or maybe the regmap API itself deserves extending with a "deferred"
> operating mode where updates to the cached state can be separated from
> committing that state to the underlying hardware.

> ...which, after a brief code search out of curiosity, apparently already
> exists in the form of regcache_cache_only()/regcache_sync(), so there's
> probably no need to reinvent it :)

Yes, exactly.  One of the big use cases for regmap on MMIO devices is
being able to access the register map without the hardware being there,
this would be another application of the cache stuff.
Jonas Karlman Oct. 29, 2020, 4:21 p.m. UTC | #7
On 2020-10-29 13:38, Ezequiel Garcia wrote:
> On Mon, 2020-10-12 at 23:39 +0000, Jonas Karlman wrote:
>> Hi,
>>
>> On 2020-10-12 22:59, Adrian Ratiu wrote:
>>> Dear all,
>>>
>>> This series introduces a regmap infrastructure for the Hantro driver
>>> which is used to compensate for different HW-revision register layouts.
>>> To justify it h264 decoding capability is added for newer VC8000 chips.
>>>
>>> This is a gradual conversion to the new infra - a complete conversion
>>> would have been very big and I do not have all the HW yet to test (I'm
>>> expecting a RK3399 shipment next week though ;). I think converting the
>>> h264 decoder provides a nice blueprint for how the other codecs can be
>>> converted and enabled for different HW revisions.
>>>
>>> The end goal of this is to make the driver more generic and eliminate
>>> entirely custom boilerplate like `struct hantro_reg` or headers with
>>> core-specific bit manipulations like `hantro_g1_regs.h` and instead rely
>>> on the well-tested albeit more verbose regmap subsystem.
>>>
>>> To give just two examples of bugs which are easily discovered by using
>>> more verbose regmap fields (very easy to compare with the datasheets)
>>> instead of relying on bit-magic tricks: G1_REG_DEC_CTRL3_INIT_QP(x) was
>>> off-by-1 and the wrong .clk_gate bit was set in hantro_postproc.c.
>>>
>>> Anyway, this series also extends the MMIO regmap API to allow relaxed
>>> writes for the theoretical reason that avoiding unnecessary membarriers
>>> leads to less CPU usage and small improvements to battery life. However,
>>> in practice I could not measure differences between relaxed/non-relaxed
>>> IO, so I'm on the fence whether to keep or remove the relaxed calls.
>>>
>>> What I could measure is the performance impact of adding more sub-reg
>>> field accesses: a constant ~ 20 microsecond bump per G1 h264 frame. This
>>> is acceptable considering the total time to decode a frame takes three
>>> orders of magnitude longer, i.e. in the millisecond range, depending on the
>>> frame size and bitstream params, so it is an acceptable trade-off to
>>> have a more generic driver.
>>
>> In the RK3399 variant all fields use completely different positions, so
>> in order to make the driver fully generic all of the roughly 145 sub-reg
>> fields used for h264 need to be converted, see [1] for a quick generation
>> of field mappings used for h264 decoding.
>>
> 
> Currently, we've only decided to support H.264 decoding via the RKVDEC
> core on RK3399.
> 
> What are your thoughts here, Jonas? Have you tested H.264 on RK3399 with
> the G1 core? If it works, what benefits do we get from enabling both
> cores?

The G1 core was working back in Dec/Jan/Feb and was used for H.264 decoding in
LibreELEC nightly images until the rkvdec h264 driver was submitted/merged.

For RK3399 and other SoCs that contain both RKVDEC and VDPU2 IP it may not be
much of a benefit. Possibly for decoding multiple videos in parallel, though
it is unclear to me whether both IPs can be used at the same time.

There are however SoCs that only have VDPU2 IP (px30/rk3326 and rk1808)
that could benefit from adding support for the VDPU2 IP, see [1].

Should I submit the rk3399 variant in a similar style to the rk3399 mpeg2 decoder?
Or should I try to adapt it to be based on this series and use regmap?

[1] https://github.com/HermanChen/mpp/blob/develop/osal/mpp_platform.cpp#L80-L82

Best regards,
Jonas

> 
> Thanks!
> Ezequiel
> 
>> Any indication on how the performance will be impacted with 145 fields
>> compared to around 20 fields used in this series?
>>
>> Another issue with RK3399 variant is that some fields use different
>> position depending on the codec used, e.g. two dec_ref_frames in [2].
>> Should we use codec specific field maps? or any other suggestion on
>> how we can handle such case?
>>
>> [1] https://github.com/Kwiboo/rockchip-vpu-regtool/commit/8b88d94d2ed966c7d88d9a735c0c97368eb6c92d
>> [2] https://github.com/Kwiboo/rockchip-vpu-regtool/blob/master/rk3399_dec_regs.c#L1065
>> [3] https://github.com/Kwiboo/rockchip-vpu-regtool/commit/9498326296445a9ce153b585cc48e0cea05d3c93
>>
>> Best regards,
>> Jonas
>>
>>> This has been tested on next-20201009 with imx8mq for G1 and an SoC with
>>> VC8000 which has not yet been added (hopefully support lands soon).
>>>
>>> Kind regards,
>>> Adrian
>>>
>>> Adrian Ratiu (18):
>>>   media: hantro: document all int reg bits up to vc8000
>>>   media: hantro: make consistent use of decimal register notation
>>>   media: hantro: make G1_REG_SOFT_RESET Rockchip specific
>>>   media: hantro: add reset controller support
>>>   media: hantro: prepare clocks before variant inits are run
>>>   media: hantro: imx8mq: simplify ctrlblk reset logic
>>>   regmap: mmio: add config option to allow relaxed MMIO accesses
>>>   media: hantro: add initial MMIO regmap infrastructure
>>>   media: hantro: default regmap to relaxed MMIO
>>>   media: hantro: convert G1 h264 decoder to regmap fields
>>>   media: hantro: convert G1 postproc to regmap
>>>   media: hantro: add VC8000D h264 decoding
>>>   media: hantro: add VC8000D postproc support
>>>   media: hantro: make PP enablement logic a bit smarter
>>>   media: hantro: add user-selectable, platform-selectable H264 High10
>>>   media: hantro: rename h264_dec as it's not G1 specific anymore
>>>   media: hantro: add dump registers debug option before decode start
>>>   media: hantro: document encoder reg fields
>>>
>>>  drivers/base/regmap/regmap-mmio.c             |   34 +-
>>>  drivers/staging/media/hantro/Makefile         |    3 +-
>>>  drivers/staging/media/hantro/hantro.h         |   79 +-
>>>  drivers/staging/media/hantro/hantro_drv.c     |   41 +-
>>>  drivers/staging/media/hantro/hantro_g1_regs.h |   92 +-
>>>  ...hantro_g1_h264_dec.c => hantro_h264_dec.c} |  237 +++-
>>>  drivers/staging/media/hantro/hantro_hw.h      |   23 +-
>>>  .../staging/media/hantro/hantro_postproc.c    |  144 ++-
>>>  drivers/staging/media/hantro/hantro_regmap.c  | 1015 +++++++++++++++++
>>>  drivers/staging/media/hantro/hantro_regmap.h  |  295 +++++
>>>  drivers/staging/media/hantro/hantro_v4l2.c    |    3 +-
>>>  drivers/staging/media/hantro/imx8m_vpu_hw.c   |   75 +-
>>>  drivers/staging/media/hantro/rk3288_vpu_hw.c  |    5 +-
>>>  include/linux/regmap.h                        |    5 +
>>>  14 files changed, 1795 insertions(+), 256 deletions(-)
>>>  rename drivers/staging/media/hantro/{hantro_g1_h264_dec.c => hantro_h264_dec.c} (58%)
>>>  create mode 100644 drivers/staging/media/hantro/hantro_regmap.c
>>>  create mode 100644 drivers/staging/media/hantro/hantro_regmap.h
>>>
> 
>
Ezequiel Garcia Oct. 29, 2020, 4:27 p.m. UTC | #8
Hi Robin,

On Thu, 2020-10-29 at 14:15 +0000, Robin Murphy wrote:
> On 2020-10-29 13:07, Ezequiel Garcia wrote:
> > Hello Adrian,
> > 
> > On Mon, 2020-10-12 at 23:59 +0300, Adrian Ratiu wrote:
> > > Dear all,
> > > 
> > > This series introduces a regmap infrastructure for the Hantro driver
> > > which is used to compensate for different HW-revision register layouts.
> > > To justify it h264 decoding capability is added for newer VC8000 chips.
> > > 
> > > This is a gradual conversion to the new infra - a complete conversion
> > > would have been very big and I do not have all the HW yet to test (I'm
> > > expecting a RK3399 shipment next week though ;). I think converting the
> > > h264 decoder provides a nice blueprint for how the other codecs can be
> > > converted and enabled for different HW revisions.
> > > 
> > > The end goal of this is to make the driver more generic and eliminate
> > > entirely custom boilerplate like `struct hantro_reg` or headers with
> > > core-specific bit manipulations like `hantro_g1_regs.h` and instead rely
> > > on the well-tested albeit more verbose regmap subsystem.
> > > 
> > > To give just two examples of bugs which are easily discovered by using
> > > more verbose regmap fields (very easy to compare with the datasheets)
> > > instead of relying on bit-magic tricks: G1_REG_DEC_CTRL3_INIT_QP(x) was
> > > off-by-1 and the wrong .clk_gate bit was set in hantro_postproc.c.
> > > 
> > > Anyway, this series also extends the MMIO regmap API to allow relaxed
> > > writes for the theoretical reason that avoiding unnecessary membarriers
> > > leads to less CPU usage and small improvements to battery life. However,
> > > in practice I could not measure differences between relaxed/non-relaxed
> > > IO, so I'm on the fence whether to keep or remove the relaxed calls.
> > > 
> > > What I could measure is the performance impact of adding more sub-reg
> > > field accesses: a constant ~ 20 microsecond bump per G1 h264 frame. This
> > > is acceptable considering the total time to decode a frame takes three
> > > orders of magnitude longer, i.e. in the millisecond range, depending on the
> > > frame size and bitstream params, so it is an acceptable trade-off to
> > > have a more generic driver.
> > > 
> > 
> > Before going forward with using regmap, I would like to have a sense
> > of the footprint it adds, and see if we can avoid that 20 us penalty.
> > 
> > I'd also like to try another approach, something that has less
> > memory footprint and less runtime penalty.
> > 
> > How about something like this:
> > 
> > #define G1_PIC_WIDTH 4, 0xff8, 23
> > #define ...
> > 
> > struct hantro_swreg {
> >          u32 value[399 /*whatever size goes here*/];
> > };
> > 
> > void hantro_reg_write(struct hantro_swreg *r,
> >                        unsigned int swreg, u32 mask, u32 offset, u32 new_val)
> > {
> >          r->value[swreg] = (r->value[swreg] & ~(mask)) |
> >                            ((new_val << offset) & mask);
> > }
> > 
> > Which you can then use in a very similar way as the current proposal:
> > 
> > hantro_reg_write(&swreg, G1_PIC_WIDTH, width);
> > 
> > The first advantage here is that we no longer have any
> > footprint for the fields.
> > 
> > The ugly macros for "4, 0xff8, 23" can be auto-generated from
> > existing vendor headers, when possible, so that shouldn't
> > bother us.
> > 
> > The register set is "flushed" using _relaxed, but it
> > could still be costly.
> > 
> > If that is indeed costly, perhaps we can avoid writing
> > the entire set by having a dirty bit somewhere.
> > 
> > In any case, it's worth exploring our options first, I think.
> 
> Or maybe the regmap API itself deserves extending with a "deferred" 
> operating mode where updates to the cached state can be separated from 
> committing that state to the underlying hardware.
> 
> ...which, after a brief code search out of curiosity, apparently already 
> exists in the form of regcache_cache_only()/regcache_sync(), so there's 
> probably no need to reinvent it :)
> 

To be fair, and although it might seem like an anti-pattern, this particular
wheel is so tiny and trivial that I'm starting to seriously consider
reinventing it.

I've been thinking about this for a while, but I just can't see exactly
what benefit we're getting from using MMIO regmaps here,
as opposed to just a simple macro with an index, a mask, and an offset.

Ezequiel
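
For reference, the "index, mask, offset" approach can be tried out in plain
userspace C. This is a minimal sketch under stated assumptions, not driver
code: the EX_* field positions are made up for illustration (they are not
taken from a datasheet), the masks are chosen to line up with their shifts,
hantro_reg_read() is a hypothetical helper added here, and the array size is
the placeholder from the sketch quoted above.

```c
#include <assert.h>
#include <stdint.h>

#define HANTRO_NUM_SWREGS 399 /* placeholder size from the quoted sketch */

/*
 * Hypothetical field descriptors: register index, mask, shift.
 * Each mask covers exactly the bits its shift moves the value into.
 */
#define EX_PIC_WIDTH  4, 0xff800000u, 23 /* bits 31:23 of swreg 4 */
#define EX_PIC_HEIGHT 4, 0x0007f800u, 11 /* bits 18:11 of swreg 4 */

/* Shadow copy of the register file; nothing here touches hardware. */
struct hantro_swreg {
	uint32_t value[HANTRO_NUM_SWREGS];
};

/* Read-modify-write one field in the shadow copy. */
static void hantro_reg_write(struct hantro_swreg *r, unsigned int swreg,
			     uint32_t mask, unsigned int shift,
			     uint32_t new_val)
{
	assert(swreg < HANTRO_NUM_SWREGS);
	r->value[swreg] = (r->value[swreg] & ~mask) |
			  ((new_val << shift) & mask);
}

/* Extract one field from the shadow copy. */
static uint32_t hantro_reg_read(const struct hantro_swreg *r,
				unsigned int swreg, uint32_t mask,
				unsigned int shift)
{
	assert(swreg < HANTRO_NUM_SWREGS);
	return (r->value[swreg] & mask) >> shift;
}
```

A call then looks like hantro_reg_write(&regs, EX_PIC_WIDTH, width): the
descriptor macro expands into the three middle arguments, which works because
hantro_reg_write() is a real function rather than a function-like macro.
Values wider than the field are silently truncated by the mask.
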
Mark Brown Oct. 29, 2020, 5:59 p.m. UTC | #9
On Thu, Oct 29, 2020 at 01:27:08PM -0300, Ezequiel Garcia wrote:
> On Thu, 2020-10-29 at 14:15 +0000, Robin Murphy wrote:

> > Or maybe the regmap API itself deserves extending with a "deferred" 
> > operating mode where updates to the cached state can be separated from 
> > committing that state to the underlying hardware.

> > ...which, after a brief code search out of curiosity, apparently already 
> > exists in the form of regcache_cache_only()/regcache_sync(), so there's 
> > probably no need to reinvent it :)

> To be fair, and although it might seem like an anti-pattern, this particular
> wheel is so tiny and trivial that I'm starting to seriously consider
> reinventing it.

> I've been thinking about this for a while, but I just can't see exactly
> what benefit we're getting from using MMIO regmaps here,
> as opposed to just a simple macro with an index, a mask, and an offset.

As a rule of thumb, if you're not using a cache or fitting into some
other higher-level framework that uses regmap, then I wouldn't
bother for MMIO devices.
Ezequiel Garcia Nov. 3, 2020, 3:27 p.m. UTC | #10
On Thu, 2020-10-29 at 16:21 +0000, Jonas Karlman wrote:
> On 2020-10-29 13:38, Ezequiel Garcia wrote:
> > On Mon, 2020-10-12 at 23:39 +0000, Jonas Karlman wrote:
> > > Hi,
> > > 
> > > On 2020-10-12 22:59, Adrian Ratiu wrote:
> > > > Dear all,
> > > > 
> > > > This series introduces a regmap infrastructure for the Hantro driver
> > > > which is used to compensate for different HW-revision register layouts.
> > > > To justify it h264 decoding capability is added for newer VC8000 chips.
> > > > 
> > > > This is a gradual conversion to the new infra - a complete conversion
> > > > would have been very big and I do not have all the HW yet to test (I'm
> > > > expecting a RK3399 shipment next week though ;). I think converting the
> > > > h264 decoder provides a nice blueprint for how the other codecs can be
> > > > converted and enabled for different HW revisions.
> > > > 
> > > > The end goal of this is to make the driver more generic and eliminate
> > > > entirely custom boilerplate like `struct hantro_reg` or headers with
> > > > core-specific bit manipulations like `hantro_g1_regs.h` and instead rely
> > > > on the well-tested albeit more verbose regmap subsystem.
> > > > 
> > > > To give just two examples of bugs which are easily discovered by using
> > > > more verbose regmap fields (very easy to compare with the datasheets)
> > > > instead of relying on bit-magic tricks: G1_REG_DEC_CTRL3_INIT_QP(x) was
> > > > off-by-1 and the wrong .clk_gate bit was set in hantro_postproc.c.
> > > > 
> > > > Anyway, this series also extends the MMIO regmap API to allow relaxed
> > > > writes for the theoretical reason that avoiding unnecessary membarriers
> > > > leads to less CPU usage and small improvements to battery life. However,
> > > > in practice I could not measure differences between relaxed/non-relaxed
> > > > IO, so I'm on the fence whether to keep or remove the relaxed calls.
> > > > 
> > > > What I could measure is the performance impact of adding more sub-reg
> > > > field accesses: a constant ~ 20 microsecond bump per G1 h264 frame. This
> > > > is acceptable considering the total time to decode a frame takes three
> > > > orders of magnitude longer, i.e. in the millisecond range, depending on the
> > > > frame size and bitstream params, so it is an acceptable trade-off to
> > > > have a more generic driver.
> > > 
> > > In the RK3399 variant all fields use completely different positions, so
> > > in order to make the driver fully generic all of the roughly 145 sub-reg
> > > fields used for h264 need to be converted, see [1] for a quick generation
> > > of field mappings used for h264 decoding.
> > > 
> > 
> > Currently, we've only decided to support H.264 decoding via the RKVDEC
> > core on RK3399.
> > 
> > What are your thoughts here, Jonas? Have you tested H.264 on RK3399 with
> > the G1 core? If it works, what benefits do we get from enabling both
> > cores?
> 
> The G1 core was working back in Dec/Jan/Feb and was used for H.264 decoding in
> LibreELEC nightly images until the rkvdec h264 driver was submitted/merged.
> 
> For RK3399 and other SoCs that contain both RKVDEC and VDPU2 IP it may not be
> much of a benefit. Possibly for decoding multiple videos in parallel, though
> it is unclear to me whether both IPs can be used at the same time.
> 
> There are however SoCs that only have VDPU2 IP (px30/rk3326 and rk1808)
> that could benefit from adding support for the VDPU2 IP, see [1].
> 
> Should I submit the rk3399 variant in a similar style to the rk3399 mpeg2 decoder?
> Or should I try to adapt it to be based on this series and use regmap?
> 

I'm inclined to take a patch that is as non-invasive as possible first.
If it's easy enough to submit something that adds just the minimum for
VDPU2, then please go for it.

We can do the cleanup later.

As a stretch, if you could add px30 (which seems to be an alias for
rk3326, in terms of codec support), that would be nice as well.
(I have ordered an Odroid with rk3326, but the shipment is taking ages.)

Thanks,
Ezequiel

> [1] https://github.com/HermanChen/mpp/blob/develop/osal/mpp_platform.cpp#L80-L82
> 
> Best regards,
> Jonas
> 
> > Thanks!
> > Ezequiel
> > 
> > > Any indication on how the performance will be impacted with 145 fields
> > > compared to around 20 fields used in this series?
> > > 
> > > Another issue with RK3399 variant is that some fields use different
> > > position depending on the codec used, e.g. two dec_ref_frames in [2].
> > > Should we use codec specific field maps? or any other suggestion on
> > > how we can handle such case?
> > > 
> > > [1] https://github.com/Kwiboo/rockchip-vpu-regtool/commit/8b88d94d2ed966c7d88d9a735c0c97368eb6c92d
> > > [2] https://github.com/Kwiboo/rockchip-vpu-regtool/blob/master/rk3399_dec_regs.c#L1065
> > > [3] https://github.com/Kwiboo/rockchip-vpu-regtool/commit/9498326296445a9ce153b585cc48e0cea05d3c93
> > > 
> > > Best regards,
> > > Jonas
> > > 
> > > > This has been tested on next-20201009 with imx8mq for G1 and an SoC with
> > > > VC8000 which has not yet been added (hopefully support lands soon).
> > > > 
> > > > Kind regards,
> > > > Adrian
> > > > 
> > > > Adrian Ratiu (18):
> > > >   media: hantro: document all int reg bits up to vc8000
> > > >   media: hantro: make consistent use of decimal register notation
> > > >   media: hantro: make G1_REG_SOFT_RESET Rockchip specific
> > > >   media: hantro: add reset controller support
> > > >   media: hantro: prepare clocks before variant inits are run
> > > >   media: hantro: imx8mq: simplify ctrlblk reset logic
> > > >   regmap: mmio: add config option to allow relaxed MMIO accesses
> > > >   media: hantro: add initial MMIO regmap infrastructure
> > > >   media: hantro: default regmap to relaxed MMIO
> > > >   media: hantro: convert G1 h264 decoder to regmap fields
> > > >   media: hantro: convert G1 postproc to regmap
> > > >   media: hantro: add VC8000D h264 decoding
> > > >   media: hantro: add VC8000D postproc support
> > > >   media: hantro: make PP enablement logic a bit smarter
> > > >   media: hantro: add user-selectable, platform-selectable H264 High10
> > > >   media: hantro: rename h264_dec as it's not G1 specific anymore
> > > >   media: hantro: add dump registers debug option before decode start
> > > >   media: hantro: document encoder reg fields
> > > > 
> > > >  drivers/base/regmap/regmap-mmio.c             |   34 +-
> > > >  drivers/staging/media/hantro/Makefile         |    3 +-
> > > >  drivers/staging/media/hantro/hantro.h         |   79 +-
> > > >  drivers/staging/media/hantro/hantro_drv.c     |   41 +-
> > > >  drivers/staging/media/hantro/hantro_g1_regs.h |   92 +-
> > > >  ...hantro_g1_h264_dec.c => hantro_h264_dec.c} |  237 +++-
> > > >  drivers/staging/media/hantro/hantro_hw.h      |   23 +-
> > > >  .../staging/media/hantro/hantro_postproc.c    |  144 ++-
> > > >  drivers/staging/media/hantro/hantro_regmap.c  | 1015 +++++++++++++++++
> > > >  drivers/staging/media/hantro/hantro_regmap.h  |  295 +++++
> > > >  drivers/staging/media/hantro/hantro_v4l2.c    |    3 +-
> > > >  drivers/staging/media/hantro/imx8m_vpu_hw.c   |   75 +-
> > > >  drivers/staging/media/hantro/rk3288_vpu_hw.c  |    5 +-
> > > >  include/linux/regmap.h                        |    5 +
> > > >  14 files changed, 1795 insertions(+), 256 deletions(-)
> > > >  rename drivers/staging/media/hantro/{hantro_g1_h264_dec.c => hantro_h264_dec.c} (58%)
> > > >  create mode 100644 drivers/staging/media/hantro/hantro_regmap.c
> > > >  create mode 100644 drivers/staging/media/hantro/hantro_regmap.h
> > > >