mbox series

[v5,00/14] Add BLK_CTL support for i.MX8MP

Message ID 1604402306-5348-1-git-send-email-abel.vesa@nxp.com (mailing list archive)
Headers show
Series Add BLK_CTL support for i.MX8MP | expand

Message

Abel Vesa Nov. 3, 2020, 11:18 a.m. UTC
The BLK_CTL according to HW design is basically the wrapper of the entire
function specific group of IPs and holds GPRs that usually cannot be placed
into one specific IP from that group. Some of these GPRs are used to control
some clocks, other some resets, others some very specific function that does
not fit into clocks or resets. Since the clocks are registered using the i.MX
clock subsystem API, the driver is placed into the clock subsystem, but it
also registers the resets. For the other functionalities that other GPRs might
have, the syscon is used.

Changes since v4:
 * added back the bus_blk_clk in the imx8mp blk_ctl driver (media_blk_ctl)
 * added the R-b tag from Rob to the documentation patch

Abel Vesa (14):
  dt-bindings: clocks: imx8mp: Rename audiomix ids clocks to
    audio_blk_ctl
  dt-bindings: reset: imx8mp: Add audio blk_ctl reset IDs
  dt-bindings: clock: imx8mp: Add ids for the audio shared gate
  dt-bindings: clock: imx8mp: Add media blk_ctl clock IDs
  dt-bindings: reset: imx8mp: Add media blk_ctl reset IDs
  dt-bindings: clock: imx8mp: Add hdmi blk_ctl clock IDs
  dt-bindings: reset: imx8mp: Add hdmi blk_ctl reset IDs
  clk: imx8mp: Add audio shared gate
  Documentation: bindings: clk: Add bindings for i.MX BLK_CTL
  clk: imx: Add generic blk-ctl driver
  clk: imx: Add blk-ctl driver for i.MX8MP
  arm64: dts: imx8mp: Add audio_blk_ctl node
  arm64: dts: imx8mp: Add media_blk_ctl node
  arm64: dts: imx8mp: Add hdmi_blk_ctl node

 .../devicetree/bindings/clock/fsl,imx-blk-ctl.yaml |  60 ++++
 arch/arm64/boot/dts/freescale/imx8mp.dtsi          |  37 +++
 drivers/clk/imx/Makefile                           |   2 +-
 drivers/clk/imx/clk-blk-ctl-imx8mp.c               | 317 +++++++++++++++++++++
 drivers/clk/imx/clk-blk-ctl.c                      | 302 ++++++++++++++++++++
 drivers/clk/imx/clk-blk-ctl.h                      |  80 ++++++
 drivers/clk/imx/clk-imx8mp.c                       |  12 +-
 include/dt-bindings/clock/imx8mp-clock.h           | 200 +++++++++----
 include/dt-bindings/reset/imx8mp-reset.h           |  45 +++
 9 files changed, 992 insertions(+), 63 deletions(-)
 create mode 100644 Documentation/devicetree/bindings/clock/fsl,imx-blk-ctl.yaml
 create mode 100644 drivers/clk/imx/clk-blk-ctl-imx8mp.c
 create mode 100644 drivers/clk/imx/clk-blk-ctl.c
 create mode 100644 drivers/clk/imx/clk-blk-ctl.h

Comments

Abel Vesa March 3, 2021, 10:47 a.m. UTC | #1
On 20-11-03 13:18:12, Abel Vesa wrote:
> The BLK_CTL according to HW design is basically the wrapper of the entire
> function specific group of IPs and holds GPRs that usually cannot be placed
> into one specific IP from that group. Some of these GPRs are used to control
> some clocks, other some resets, others some very specific function that does
> not fit into clocks or resets. Since the clocks are registered using the i.MX
> clock subsystem API, the driver is placed into the clock subsystem, but it
> also registers the resets. For the other functionalities that other GPRs might
> have, the syscon is used.
> 

This approach seems to be introducing a possible ABBA deadlock due to
the core clock and genpd locking. Here is a backtrace I got from Pete
Zhang (he reported the issue on the internal mailing list):

[   11.667711][  T108] -> #1 (&genpd->mlock){+.+.}-{3:3}:
[   11.675041][  T108]        __lock_acquire+0xae4/0xef8
[   11.680093][  T108]        lock_acquire+0xfc/0x2f8
[   11.684888][  T108]        __mutex_lock+0x90/0x870
[   11.689685][  T108]        mutex_lock_nested+0x44/0x50
[   11.694826][  T108]        genpd_lock_mtx+0x18/0x24
[   11.699706][  T108]        genpd_runtime_resume+0x90/0x214 (hold genpd->mlock)
[   11.705194][  T108]        __rpm_callback+0x80/0x2c0
[   11.710160][  T108]        rpm_resume+0x468/0x650
[   11.714866][  T108]        __pm_runtime_resume+0x60/0x88
[   11.720180][  T108]        clk_pm_runtime_get+0x28/0x9c
[   11.725410][  T108]        clk_disable_unused_subtree+0x8c/0x144
[   11.731420][  T108]        clk_disable_unused_subtree+0x124/0x144
[   11.737518][  T108]        clk_disable_unused+0xa4/0x11c (hold prepare_lock)
[   11.742833][  T108]        do_one_initcall+0x98/0x178
[   11.747888][  T108]        do_initcall_level+0x9c/0xb8
[   11.753028][  T108]        do_initcalls+0x54/0x94
[   11.757736][  T108]        do_basic_setup+0x24/0x30
[   11.762614][  T108]        kernel_init_freeable+0x70/0xa4
[   11.768014][  T108]        kernel_init+0x14/0x18c
[   11.772722][  T108]        ret_from_fork+0x10/0x18

[   11.777512][  T108] -> #0 (prepare_lock){+.+.}-{3:3}:
[   11.784749][  T108]        check_noncircular+0x134/0x13c
[   11.790064][  T108]        validate_chain+0x590/0x2a04
[   11.795204][  T108]        __lock_acquire+0xae4/0xef8
[   11.800258][  T108]        lock_acquire+0xfc/0x2f8
[   11.805050][  T108]        __mutex_lock+0x90/0x870
[   11.809841][  T108]        mutex_lock_nested+0x44/0x50
[   11.814983][  T108]        clk_unprepare+0x5c/0x100 ((hold prepare_lock))
[   11.819864][  T108]        imx8m_pd_power_off+0xac/0x110
[   11.825179][  T108]        genpd_power_off+0x1b4/0x2dc
[   11.830318][  T108]        genpd_power_off_work_fn+0x38/0x58 (hold genpd->mlock)
[   11.835981][  T108]        process_one_work+0x270/0x444
[   11.841208][  T108]        worker_thread+0x280/0x4e4
[   11.846176][  T108]        kthread+0x13c/0x14
[   11.850621][  T108]        ret_from_fork+0x10/0x18

Now, this has been reproduced only on the NXP internal tree, but I think
it is pretty obvious this could happen in upstream too, with this
patchset applied.

First, my thought was to change the prepare_lock/enable_lock in clock
core, from a global approach to a per clock basis. But that doesn't
actually fix the issue.

The usecase seen above is due to clk_disable_unused, but the same could
happen when a clock consumer calls prepare/unprepare on a clock.

I guess the conclusion is that the current state of the clock core and
genpd implementation does not support a usecase where a clock controller
has a PD which in turn uses another clock (from another clock controller).

Jacky, Pete, did I miss anything here ?

> Changes since v4:
>  * added back the bus_blk_clk in the imx8mp blk_ctl driver (media_blk_ctl)
>  * added the R-b tag from Rob to the documentation patch
> 
> Abel Vesa (14):
>   dt-bindings: clocks: imx8mp: Rename audiomix ids clocks to
>     audio_blk_ctl
>   dt-bindings: reset: imx8mp: Add audio blk_ctl reset IDs
>   dt-bindings: clock: imx8mp: Add ids for the audio shared gate
>   dt-bindings: clock: imx8mp: Add media blk_ctl clock IDs
>   dt-bindings: reset: imx8mp: Add media blk_ctl reset IDs
>   dt-bindings: clock: imx8mp: Add hdmi blk_ctl clock IDs
>   dt-bindings: reset: imx8mp: Add hdmi blk_ctl reset IDs
>   clk: imx8mp: Add audio shared gate
>   Documentation: bindings: clk: Add bindings for i.MX BLK_CTL
>   clk: imx: Add generic blk-ctl driver
>   clk: imx: Add blk-ctl driver for i.MX8MP
>   arm64: dts: imx8mp: Add audio_blk_ctl node
>   arm64: dts: imx8mp: Add media_blk_ctl node
>   arm64: dts: imx8mp: Add hdmi_blk_ctl node
> 
>  .../devicetree/bindings/clock/fsl,imx-blk-ctl.yaml |  60 ++++
>  arch/arm64/boot/dts/freescale/imx8mp.dtsi          |  37 +++
>  drivers/clk/imx/Makefile                           |   2 +-
>  drivers/clk/imx/clk-blk-ctl-imx8mp.c               | 317 +++++++++++++++++++++
>  drivers/clk/imx/clk-blk-ctl.c                      | 302 ++++++++++++++++++++
>  drivers/clk/imx/clk-blk-ctl.h                      |  80 ++++++
>  drivers/clk/imx/clk-imx8mp.c                       |  12 +-
>  include/dt-bindings/clock/imx8mp-clock.h           | 200 +++++++++----
>  include/dt-bindings/reset/imx8mp-reset.h           |  45 +++
>  9 files changed, 992 insertions(+), 63 deletions(-)
>  create mode 100644 Documentation/devicetree/bindings/clock/fsl,imx-blk-ctl.yaml
>  create mode 100644 drivers/clk/imx/clk-blk-ctl-imx8mp.c
>  create mode 100644 drivers/clk/imx/clk-blk-ctl.c
>  create mode 100644 drivers/clk/imx/clk-blk-ctl.h
> 
> -- 
> 2.7.4
>
Marek Vasut March 3, 2021, 10:54 a.m. UTC | #2
On 3/3/21 11:47 AM, Abel Vesa wrote:
> On 20-11-03 13:18:12, Abel Vesa wrote:
>> The BLK_CTL according to HW design is basically the wrapper of the entire
>> function specific group of IPs and holds GPRs that usually cannot be placed
>> into one specific IP from that group. Some of these GPRs are used to control
>> some clocks, other some resets, others some very specific function that does
>> not fit into clocks or resets. Since the clocks are registered using the i.MX
>> clock subsystem API, the driver is placed into the clock subsystem, but it
>> also registers the resets. For the other functionalities that other GPRs might
>> have, the syscon is used.
>>
> 
> This approach seems to be introducing a possible ABBA deadlock due to
> the core clock and genpd locking. Here is a backtrace I got from Pete
> Zhang (he reported the issue on the internal mailing list):
> 
> [   11.667711][  T108] -> #1 (&genpd->mlock){+.+.}-{3:3}:
> [   11.675041][  T108]        __lock_acquire+0xae4/0xef8
> [   11.680093][  T108]        lock_acquire+0xfc/0x2f8
> [   11.684888][  T108]        __mutex_lock+0x90/0x870
> [   11.689685][  T108]        mutex_lock_nested+0x44/0x50
> [   11.694826][  T108]        genpd_lock_mtx+0x18/0x24
> [   11.699706][  T108]        genpd_runtime_resume+0x90/0x214 (hold genpd->mlock)
> [   11.705194][  T108]        __rpm_callback+0x80/0x2c0
> [   11.710160][  T108]        rpm_resume+0x468/0x650
> [   11.714866][  T108]        __pm_runtime_resume+0x60/0x88
> [   11.720180][  T108]        clk_pm_runtime_get+0x28/0x9c
> [   11.725410][  T108]        clk_disable_unused_subtree+0x8c/0x144
> [   11.731420][  T108]        clk_disable_unused_subtree+0x124/0x144
> [   11.737518][  T108]        clk_disable_unused+0xa4/0x11c (hold prepare_lock)
> [   11.742833][  T108]        do_one_initcall+0x98/0x178
> [   11.747888][  T108]        do_initcall_level+0x9c/0xb8
> [   11.753028][  T108]        do_initcalls+0x54/0x94
> [   11.757736][  T108]        do_basic_setup+0x24/0x30
> [   11.762614][  T108]        kernel_init_freeable+0x70/0xa4
> [   11.768014][  T108]        kernel_init+0x14/0x18c
> [   11.772722][  T108]        ret_from_fork+0x10/0x18
> 
> [   11.777512][  T108] -> #0 (prepare_lock){+.+.}-{3:3}:
> [   11.784749][  T108]        check_noncircular+0x134/0x13c
> [   11.790064][  T108]        validate_chain+0x590/0x2a04
> [   11.795204][  T108]        __lock_acquire+0xae4/0xef8
> [   11.800258][  T108]        lock_acquire+0xfc/0x2f8
> [   11.805050][  T108]        __mutex_lock+0x90/0x870
> [   11.809841][  T108]        mutex_lock_nested+0x44/0x50
> [   11.814983][  T108]        clk_unprepare+0x5c/0x100 ((hold prepare_lock))
> [   11.819864][  T108]        imx8m_pd_power_off+0xac/0x110
> [   11.825179][  T108]        genpd_power_off+0x1b4/0x2dc
> [   11.830318][  T108]        genpd_power_off_work_fn+0x38/0x58 (hold genpd->mlock)
> [   11.835981][  T108]        process_one_work+0x270/0x444
> [   11.841208][  T108]        worker_thread+0x280/0x4e4
> [   11.846176][  T108]        kthread+0x13c/0x14
> [   11.850621][  T108]        ret_from_fork+0x10/0x18
> 
> Now, this has been reproduced only on the NXP internal tree, but I think
> it is pretty obvious this could happen in upstream too, with this
> patchset applied.
> 
> First, my thought was to change the prepare_lock/enable_lock in clock
> core, from a global approach to a per clock basis. But that doesn't
> actually fix the issue.
> 
> The usecase seen above is due to clk_disable_unused, but the same could
> happen when a clock consumer calls prepare/unprepare on a clock.
> 
> I guess the conclusion is that the current state of the clock core and
> genpd implementation does not support a usecase where a clock controller
> has a PD which in turn uses another clock (from another clock controller).
> 
> Jacky, Pete, did I miss anything here ?

Just for completeness, I have a feeling I already managed to trigger 
this and discussed this with Lucas before, so this concern is certainly 
valid.
Adam Ford March 22, 2021, 11:49 p.m. UTC | #3
On Wed, Mar 3, 2021 at 4:54 AM Marek Vasut <marex@denx.de> wrote:
>
> On 3/3/21 11:47 AM, Abel Vesa wrote:
> > On 20-11-03 13:18:12, Abel Vesa wrote:
> >> The BLK_CTL according to HW design is basically the wrapper of the entire
> >> function specific group of IPs and holds GPRs that usually cannot be placed
> >> into one specific IP from that group. Some of these GPRs are used to control
> >> some clocks, other some resets, others some very specific function that does
> >> not fit into clocks or resets. Since the clocks are registered using the i.MX
> >> clock subsystem API, the driver is placed into the clock subsystem, but it
> >> also registers the resets. For the other functionalities that other GPRs might
> >> have, the syscon is used.
> >>
> >
> > This approach seems to be introducing a possible ABBA deadlock due to
> > the core clock and genpd locking. Here is a backtrace I got from Pete
> > Zhang (he reported the issue on the internal mailing list):
> >
> > [   11.667711][  T108] -> #1 (&genpd->mlock){+.+.}-{3:3}:
> > [   11.675041][  T108]        __lock_acquire+0xae4/0xef8
> > [   11.680093][  T108]        lock_acquire+0xfc/0x2f8
> > [   11.684888][  T108]        __mutex_lock+0x90/0x870
> > [   11.689685][  T108]        mutex_lock_nested+0x44/0x50
> > [   11.694826][  T108]        genpd_lock_mtx+0x18/0x24
> > [   11.699706][  T108]        genpd_runtime_resume+0x90/0x214 (hold genpd->mlock)
> > [   11.705194][  T108]        __rpm_callback+0x80/0x2c0
> > [   11.710160][  T108]        rpm_resume+0x468/0x650
> > [   11.714866][  T108]        __pm_runtime_resume+0x60/0x88
> > [   11.720180][  T108]        clk_pm_runtime_get+0x28/0x9c
> > [   11.725410][  T108]        clk_disable_unused_subtree+0x8c/0x144
> > [   11.731420][  T108]        clk_disable_unused_subtree+0x124/0x144
> > [   11.737518][  T108]        clk_disable_unused+0xa4/0x11c (hold prepare_lock)
> > [   11.742833][  T108]        do_one_initcall+0x98/0x178
> > [   11.747888][  T108]        do_initcall_level+0x9c/0xb8
> > [   11.753028][  T108]        do_initcalls+0x54/0x94
> > [   11.757736][  T108]        do_basic_setup+0x24/0x30
> > [   11.762614][  T108]        kernel_init_freeable+0x70/0xa4
> > [   11.768014][  T108]        kernel_init+0x14/0x18c
> > [   11.772722][  T108]        ret_from_fork+0x10/0x18
> >
> > [   11.777512][  T108] -> #0 (prepare_lock){+.+.}-{3:3}:
> > [   11.784749][  T108]        check_noncircular+0x134/0x13c
> > [   11.790064][  T108]        validate_chain+0x590/0x2a04
> > [   11.795204][  T108]        __lock_acquire+0xae4/0xef8
> > [   11.800258][  T108]        lock_acquire+0xfc/0x2f8
> > [   11.805050][  T108]        __mutex_lock+0x90/0x870
> > [   11.809841][  T108]        mutex_lock_nested+0x44/0x50
> > [   11.814983][  T108]        clk_unprepare+0x5c/0x100 ((hold prepare_lock))
> > [   11.819864][  T108]        imx8m_pd_power_off+0xac/0x110
> > [   11.825179][  T108]        genpd_power_off+0x1b4/0x2dc
> > [   11.830318][  T108]        genpd_power_off_work_fn+0x38/0x58 (hold genpd->mlock)
> > [   11.835981][  T108]        process_one_work+0x270/0x444
> > [   11.841208][  T108]        worker_thread+0x280/0x4e4
> > [   11.846176][  T108]        kthread+0x13c/0x14
> > [   11.850621][  T108]        ret_from_fork+0x10/0x18
> >
> > Now, this has been reproduced only on the NXP internal tree, but I think
> > it is pretty obvious this could happen in upstream too, with this
> > patchset applied.
> >
> > First, my thought was to change the prepare_lock/enable_lock in clock
> > core, from a global approach to a per clock basis. But that doesn't
> > actually fix the issue.
> >
> > The usecase seen above is due to clk_disable_unused, but the same could
> > happen when a clock consumer calls prepare/unprepare on a clock.
> >
> > I guess the conclusion is that the current state of the clock core and
> > genpd implementation does not support a usecase where a clock controller
> > has a PD which in turn uses another clock (from another clock controller).
> >
> > Jacky, Pete, did I miss anything here ?
>
> Just for completeness, I have a feeling I already managed to trigger
> this and discussed this with Lucas before, so this concern is certainly
> valid.

I know it may not be ideal, someone tied a soft-reset and soft-enable
to the driver of the Hantro VPU on the IMXMQ [1], and I wonder if
something similar could be done for the drivers who are consumers of
the clocks.

For example:

lcdif could request the power domain.
The power domain soft-resets and enables bus clock (vis syscon)
After successful enabling of power-domain, the LCDIF requests the soft
reset and respective clock bits (also via syscon) similar to how [1]
and [2] do it for the Hantro VPU.

The syscon node could be a common node similar to what was done in
[2], and multiple consumers could control when each soft-reset and
clock-enable get activated.  I know it's probably more of an abuse of
the syscon system, but having the individual drivers control the order
of operations might be safer than trying to create a one-size-fits-all
driver.

adam
[1] - https://patchwork.kernel.org/project/linux-arm-kernel/patch/20210318082046.51546-4-benjamin.gaignard@collabora.com/
[2] - https://patchwork.kernel.org/project/linux-arm-kernel/patch/20210318082046.51546-14-benjamin.gaignard@collabora.com/