Message ID | 20250303143629.400583-5-m.wilczynski@samsung.com (mailing list archive) |
---|---|
State | Handled Elsewhere |
Headers | show |
Series | [v1,1/4] dt-bindings: clock: thead: Add TH1520 VO clock controller | expand |
Context | Check | Description |
---|---|---|
bjorn/pre-ci_am | success | Success |
bjorn/build-rv32-defconfig | success | build-rv32-defconfig |
bjorn/build-rv64-clang-allmodconfig | success | build-rv64-clang-allmodconfig |
bjorn/build-rv64-gcc-allmodconfig | success | build-rv64-gcc-allmodconfig |
bjorn/build-rv64-nommu-k210-defconfig | success | build-rv64-nommu-k210-defconfig |
bjorn/build-rv64-nommu-k210-virt | success | build-rv64-nommu-k210-virt |
bjorn/checkpatch | success | checkpatch |
bjorn/dtb-warn-rv64 | success | dtb-warn-rv64 |
bjorn/header-inline | success | header-inline |
bjorn/kdoc | success | kdoc |
bjorn/module-param | success | module-param |
bjorn/verify-fixes | success | verify-fixes |
bjorn/verify-signedoff | success | verify-signedoff |
Quoting Michal Wilczynski (2025-03-03 06:36:29) > The T-HEAD TH1520 has three GPU clocks: core, cfg, and mem. The mem > clock gate is marked as "Reserved" in hardware, while core and cfg are > configurable. In order for these clock gates to work properly, the > CLKGEN reset must be managed in a specific sequence. > > Move the CLKGEN reset handling to the clock driver since it's > fundamentally a clock-related workaround [1]. This ensures that clk_enabled > GPU clocks stay physically enabled without external interference from > the reset driver. The reset is now deasserted only when both core and > cfg clocks are enabled, and asserted when either of them is disabled. > > The mem clock is configured to use nop operations since it cannot be > controlled. > > Link: https://lore.kernel.org/all/945fb7e913a9c3dcb40697328b7e9842b75fea5c.camel@pengutronix.de [1] > > Signed-off-by: Michal Wilczynski <m.wilczynski@samsung.com> [...] > diff --git a/drivers/clk/thead/clk-th1520-ap.c b/drivers/clk/thead/clk-th1520-ap.c > index ea96d007aecd..1dfcde867233 100644 > --- a/drivers/clk/thead/clk-th1520-ap.c > +++ b/drivers/clk/thead/clk-th1520-ap.c > @@ -862,17 +863,70 @@ static CCU_GATE(CLK_SRAM1, sram1_clk, "sram1", axi_aclk_pd, 0x20c, BIT(3), 0); [...] > > static CCU_GATE_CLK_OPS(CLK_GPU_MEM, gpu_mem_clk, "gpu-mem-clk", > video_pll_clk_pd, 0x0, BIT(2), 0, clk_nop_ops); > +static CCU_GATE_CLK_OPS(CLK_GPU_CORE, gpu_core_clk, "gpu-core-clk", > + video_pll_clk_pd, 0x0, BIT(3), 0, ccu_gate_gpu_ops); > +static CCU_GATE_CLK_OPS(CLK_GPU_CFG_ACLK, gpu_cfg_aclk, "gpu-cfg-aclk", > + video_pll_clk_pd, 0x0, BIT(4), 0, ccu_gate_gpu_ops); > + > +static void ccu_gpu_clk_disable(struct clk_hw *hw) > +{ > + struct ccu_gate *cg = hw_to_ccu_gate(hw); > + unsigned long flags; > + > + spin_lock_irqsave(&gpu_reset_lock, flags); > + > + ccu_disable_helper(&cg->common, cg->enable); > + > + if ((cg == &gpu_core_clk && > + !clk_hw_is_enabled(&gpu_cfg_aclk.common.hw)) || > + (cg == &gpu_cfg_aclk && > + !clk_hw_is_enabled(&gpu_core_clk.common.hw))) > + reset_control_assert(gpu_reset); Why can't the clk consumer control the reset itself? Doing this here is not ideal because we hold the clk lock when we try to grab the reset lock. These are all spinlocks that should be small in lines of code where the lock is held, but we're calling into an entire other framework under a spinlock. If an (unrelated) reset driver tries to grab the clk lock it will deadlock. I see the commit text talks about this being a workaround. I'm not sure what the workaround is though. I've seen designs where the reset doesn't work unless the clk is enabled because the flops have to be clocking for the reset to propagate a few cycles, or the clk has to be disabled so that the reset controller can do the clocking, or vice versa for the clk not working unless the reset is deasserted. Long story short, it's different between SoCs. Likely the reset and clk control should be coordinated in a PM domain for the device so that when the device is active, the clks are enabled and the reset is deasserted in the correct order for the SoC. Can you do that? > + > + spin_unlock_irqrestore(&gpu_reset_lock, flags); > +}
On 3/6/25 00:47, Stephen Boyd wrote: > Quoting Michal Wilczynski (2025-03-03 06:36:29) >> The T-HEAD TH1520 has three GPU clocks: core, cfg, and mem. The mem >> clock gate is marked as "Reserved" in hardware, while core and cfg are >> configurable. In order for these clock gates to work properly, the >> CLKGEN reset must be managed in a specific sequence. >> >> Move the CLKGEN reset handling to the clock driver since it's >> fundamentally a clock-related workaround [1]. This ensures that clk_enabled >> GPU clocks stay physically enabled without external interference from >> the reset driver. The reset is now deasserted only when both core and >> cfg clocks are enabled, and asserted when either of them is disabled. >> >> The mem clock is configured to use nop operations since it cannot be >> controlled. >> >> Link: https://lore.kernel.org/all/945fb7e913a9c3dcb40697328b7e9842b75fea5c.camel@pengutronix.de [1] >> >> Signed-off-by: Michal Wilczynski <m.wilczynski@samsung.com> > [...] >> diff --git a/drivers/clk/thead/clk-th1520-ap.c b/drivers/clk/thead/clk-th1520-ap.c >> index ea96d007aecd..1dfcde867233 100644 >> --- a/drivers/clk/thead/clk-th1520-ap.c >> +++ b/drivers/clk/thead/clk-th1520-ap.c >> @@ -862,17 +863,70 @@ static CCU_GATE(CLK_SRAM1, sram1_clk, "sram1", axi_aclk_pd, 0x20c, BIT(3), 0); > [...] >> >> static CCU_GATE_CLK_OPS(CLK_GPU_MEM, gpu_mem_clk, "gpu-mem-clk", >> video_pll_clk_pd, 0x0, BIT(2), 0, clk_nop_ops); >> +static CCU_GATE_CLK_OPS(CLK_GPU_CORE, gpu_core_clk, "gpu-core-clk", >> + video_pll_clk_pd, 0x0, BIT(3), 0, ccu_gate_gpu_ops); >> +static CCU_GATE_CLK_OPS(CLK_GPU_CFG_ACLK, gpu_cfg_aclk, "gpu-cfg-aclk", >> + video_pll_clk_pd, 0x0, BIT(4), 0, ccu_gate_gpu_ops); >> + >> +static void ccu_gpu_clk_disable(struct clk_hw *hw) >> +{ >> + struct ccu_gate *cg = hw_to_ccu_gate(hw); >> + unsigned long flags; >> + >> + spin_lock_irqsave(&gpu_reset_lock, flags); >> + >> + ccu_disable_helper(&cg->common, cg->enable); >> + >> + if ((cg == &gpu_core_clk && >> + !clk_hw_is_enabled(&gpu_cfg_aclk.common.hw)) || >> + (cg == &gpu_cfg_aclk && >> + !clk_hw_is_enabled(&gpu_core_clk.common.hw))) >> + reset_control_assert(gpu_reset); > > Why can't the clk consumer control the reset itself? Doing this here is > not ideal because we hold the clk lock when we try to grab the reset > lock. These are all spinlocks that should be small in lines of code > where the lock is held, but we're calling into an entire other framework > under a spinlock. If an (unrelated) reset driver tries to grab the clk > lock it will deadlock. So in our case the clk consumer is the drm/imagination driver. Here is the comment from the maintainer for my previous attempt to use a reset driver to abstract the GPU init sequence [1]: "Do you know what this resets? From our side, the GPU only has a single reset line (which I assume to be GPU_RESET)." "I don't love that this procedure appears in the platform reset driver. I appreciate it may not be clear from the SoC TRM, but this is the standard reset procedure for all IMG Rogue GPUs. The currently supported TI SoC handles this in silicon, when power up/down requests are sent so we never needed to encode it in the driver before. Strictly speaking, the 32 cycle delay is required between power and clocks being enabled and the reset line being deasserted. If nothing here touches power or clocks (which I don't think it should), the delay could potentially be lifted to the GPU driver." From the drm/imagination maintainers point of view their hardware has only one reset, the extra CLKGEN reset is SoC specific. Also the reset driver maintainer didn't like my way of abstracting two resets ("GPU" and and SoC specific"CLKGEN") into one reset to make it seem to the consumer driver drm/imagination like there is only one reset and suggested and attempt to code the re-setting in the clock driver [2]. Even though he suggested a different way of achieving that: "In my mind it shouldn't be much: the GPU clocks could all share the same refcounted implementation. The first clock to get enabled would ungate both GPU_CORE and GPU_CFG_ACLK gates and deassert GPU_SW_CLKGEN_RST, all in one place. The remaining enable(s) would be no-ops. Would that work?" The above would have similar effect, but I felt like enabling both clocks in single .enable callback could be confusing so I ended up with the current approach. This could be easily re-done if you feel this would be better. I agree that using spinlocks here is dangerous, but looking at the code of the reset_control_deassert and reset_control_assert, it doesn't seem like any spinlocks are acquired/relased in that code flow, unless the driver ops would introduce that. So in this specific case deadlock shouldn't happen ? [1] - https://lore.kernel.org/all/816db99d-7088-4c1a-af03-b9a825ac09dc@imgtec.com/ [2] - https://lore.kernel.org/all/945fb7e913a9c3dcb40697328b7e9842b75fea5c.camel@pengutronix.de/ > > I see the commit text talks about this being a workaround. I'm not sure > what the workaround is though. I've seen designs where the reset doesn't > work unless the clk is enabled because the flops have to be clocking for > the reset to propagate a few cycles, or the clk has to be disabled so > that the reset controller can do the clocking, or vice versa for the clk > not working unless the reset is deasserted. Long story short, it's > different between SoCs. OK, so this is how GPU initialization needs to happen for this specific SoC: drm/imagination consumer driver: pvr_power_device_resume(): clk_prepare_enable(pvr_dev->core_clk); clk_prepare_enable(pvr_dev->sys_clk); clk_prepare_enable(pvr_dev->mem_clk); // Then deassert reset introduced in [3] // [3] - https://lore.kernel.org/all/20250128194816.2185326-11-m.wilczynski@samsung.com/ // Eventually this would get called in the SoC specific reset driver static void th1520_rst_gpu_enable(struct regmap *reg, struct mutex *gpu_seq_lock) { int val; mutex_lock(gpu_seq_lock); /* if the GPU is not in a reset state it, put it into one */ regmap_read(reg, TH1520_GPU_RST_CFG, &val); if (val) regmap_update_bits(reg, TH1520_GPU_RST_CFG, TH1520_GPU_RST_CFG_MASK, 0x0); /* rst gpu clkgen */ regmap_set_bits(reg, TH1520_GPU_RST_CFG, TH1520_GPU_SW_CLKGEN_RST); /* * According to the hardware manual, a delay of at least 32 clock * cycles is required between de-asserting the clkgen reset and * de-asserting the GPU reset. Assuming a worst-case scenario with * a very high GPU clock frequency, a delay of 1 microsecond is * sufficient to ensure this requirement is met across all * feasible GPU clock speeds. */ udelay(1); /* rst gpu */ regmap_set_bits(reg, TH1520_GPU_RST_CFG, TH1520_GPU_SW_GPU_RST); mutex_unlock(gpu_seq_lock); } Based on my testing this is exactly how the resets should happen, else the GPU fails to initialize, and the drm/imagination driver hangs. To reiterate: first power ON the GPU using power-domain driver. Then drm/imagination enable all three clocks, then deassert reset of the GPU CLKGEN (SoC specific), delay for 32 cycles, and then deassert the GPU reset. > > Likely the reset and clk control should be coordinated in a PM domain > for the device so that when the device is active, the clks are enabled > and the reset is deasserted in the correct order for the SoC. Can you do > that? Obviously this would work, but I'm worried, the drm/imagination driver maintainers would reject it, as for them this is a special case, from their perspective there is only one reset line to their hardware, and there are three clocks which they manage in driver already. And the PM maintainers probably wouldn't be happy to take this as well. At the very end maybe we could go back to abstracting the resets in the reset driver, as the reset maintainer seemed to be open to it, if the alternatives proves to be too problematic [4]. "I won't object to carry this in the reset driver if the clock implementation turns out to be unreasonably complex, but I currently don't expect that to be the case. Please give it a shot." [4] - https://lore.kernel.org/all/945fb7e913a9c3dcb40697328b7e9842b75fea5c.camel@pengutronix.de/ > >> + >> + spin_unlock_irqrestore(&gpu_reset_lock, flags); >> +} >
On Do, 2025-03-06 at 17:43 +0100, Michal Wilczynski wrote: > > On 3/6/25 00:47, Stephen Boyd wrote: > > Quoting Michal Wilczynski (2025-03-03 06:36:29) > > > The T-HEAD TH1520 has three GPU clocks: core, cfg, and mem. The mem > > > clock gate is marked as "Reserved" in hardware, while core and cfg are > > > configurable. In order for these clock gates to work properly, the > > > CLKGEN reset must be managed in a specific sequence. > > > > > > Move the CLKGEN reset handling to the clock driver since it's > > > fundamentally a clock-related workaround [1]. This ensures that clk_enabled > > > GPU clocks stay physically enabled without external interference from > > > the reset driver. The reset is now deasserted only when both core and > > > cfg clocks are enabled, and asserted when either of them is disabled. > > > > > > The mem clock is configured to use nop operations since it cannot be > > > controlled. > > > > > > Link: https://lore.kernel.org/all/945fb7e913a9c3dcb40697328b7e9842b75fea5c.camel@pengutronix.de [1] > > > > > > Signed-off-by: Michal Wilczynski <m.wilczynski@samsung.com> > > [...] > > > diff --git a/drivers/clk/thead/clk-th1520-ap.c b/drivers/clk/thead/clk-th1520-ap.c > > > index ea96d007aecd..1dfcde867233 100644 > > > --- a/drivers/clk/thead/clk-th1520-ap.c > > > +++ b/drivers/clk/thead/clk-th1520-ap.c > > > @@ -862,17 +863,70 @@ static CCU_GATE(CLK_SRAM1, sram1_clk, "sram1", axi_aclk_pd, 0x20c, BIT(3), 0); > > [...] > > > > > > static CCU_GATE_CLK_OPS(CLK_GPU_MEM, gpu_mem_clk, "gpu-mem-clk", > > > video_pll_clk_pd, 0x0, BIT(2), 0, clk_nop_ops); > > > +static CCU_GATE_CLK_OPS(CLK_GPU_CORE, gpu_core_clk, "gpu-core-clk", > > > + video_pll_clk_pd, 0x0, BIT(3), 0, ccu_gate_gpu_ops); > > > +static CCU_GATE_CLK_OPS(CLK_GPU_CFG_ACLK, gpu_cfg_aclk, "gpu-cfg-aclk", > > > + video_pll_clk_pd, 0x0, BIT(4), 0, ccu_gate_gpu_ops); > > > + > > > +static void ccu_gpu_clk_disable(struct clk_hw *hw) > > > +{ > > > + struct ccu_gate *cg = hw_to_ccu_gate(hw); > > > + unsigned long flags; > > > + > > > + spin_lock_irqsave(&gpu_reset_lock, flags); > > > + > > > + ccu_disable_helper(&cg->common, cg->enable); > > > + > > > + if ((cg == &gpu_core_clk && > > > + !clk_hw_is_enabled(&gpu_cfg_aclk.common.hw)) || > > > + (cg == &gpu_cfg_aclk && > > > + !clk_hw_is_enabled(&gpu_core_clk.common.hw))) > > > + reset_control_assert(gpu_reset); > > > > Why can't the clk consumer control the reset itself? Doing this here is > > not ideal because we hold the clk lock when we try to grab the reset > > lock. These are all spinlocks that should be small in lines of code > > where the lock is held, but we're calling into an entire other framework > > under a spinlock. If an (unrelated) reset driver tries to grab the clk > > lock it will deadlock. > > So in our case the clk consumer is the drm/imagination driver. Here is > the comment from the maintainer for my previous attempt to use a reset > driver to abstract the GPU init sequence [1]: > > "Do you know what this resets? From our side, the GPU only has a single > reset line (which I assume to be GPU_RESET)." > > "I don't love that this procedure appears in the platform reset driver. > I appreciate it may not be clear from the SoC TRM, but this is the > standard reset procedure for all IMG Rogue GPUs. The currently > supported TI SoC handles this in silicon, when power up/down requests > are sent so we never needed to encode it in the driver before. > > Strictly speaking, the 32 cycle delay is required between power and > clocks being enabled and the reset line being deasserted. If nothing > here touches power or clocks (which I don't think it should), the delay > could potentially be lifted to the GPU driver." > > From the drm/imagination maintainers point of view their hardware has > only one reset, the extra CLKGEN reset is SoC specific. If I am understanding correctly, the CLKGEN reset doesn't reset anything in the GPU itself, but holds the GPU clock generator block in reset, effectively disabling the three GPU clocks as a workaround for the always-ungated GPU_MEM clock. > Also the reset driver maintainer didn't like my way of abstracting two > resets ("GPU" and and SoC specific"CLKGEN") into one reset That is one part of it. The other is that (according to my understanding as laid out above), the combined GPU+CLKGEN reset would effectively disable all three GPU clocks for a while, after the GPU driver has already requested them to be enabled. > to make it > seem to the consumer driver drm/imagination like there is only one > reset and suggested and attempt to code the re-setting in the clock > driver [2]. Even though he suggested a different way of achieving that: > > "In my mind it shouldn't be much: the GPU clocks could all share the > same refcounted implementation. The first clock to get enabled would > ungate both GPU_CORE and GPU_CFG_ACLK gates and deassert > GPU_SW_CLKGEN_RST, all in one place. The remaining enable(s) would be > no-ops. Would that work?" > > The above would have similar effect, but I felt like enabling both > clocks in single .enable callback could be confusing so I ended up with > the current approach. This could be easily re-done if you feel this > would be better. > > I agree that using spinlocks here is dangerous, but looking at the code > of the reset_control_deassert and reset_control_assert, it doesn't seem > like any spinlocks are acquired/relased in that code flow, unless the > driver ops would introduce that. So in this specific case deadlock > shouldn't happen ? There are no spinlocks in the reset_control_(de)assert paths in the reset framework, but in general there could be in the driver. The thead driver [1], uses regmap_update_bits() on a regmap with .fast_io = true, which uses the internal struct regmap::spinlock. [1] https://lore.kernel.org/all/20250303152511.494405-3-m.wilczynski@samsung.com/ regards Philipp
On 3/13/25 10:25, Philipp Zabel wrote: > On Do, 2025-03-06 at 17:43 +0100, Michal Wilczynski wrote: >> >> On 3/6/25 00:47, Stephen Boyd wrote: >>> Quoting Michal Wilczynski (2025-03-03 06:36:29) >>>> The T-HEAD TH1520 has three GPU clocks: core, cfg, and mem. The mem >>>> clock gate is marked as "Reserved" in hardware, while core and cfg are >>>> configurable. In order for these clock gates to work properly, the >>>> CLKGEN reset must be managed in a specific sequence. >>>> >>>> Move the CLKGEN reset handling to the clock driver since it's >>>> fundamentally a clock-related workaround [1]. This ensures that clk_enabled >>>> GPU clocks stay physically enabled without external interference from >>>> the reset driver. The reset is now deasserted only when both core and >>>> cfg clocks are enabled, and asserted when either of them is disabled. >>>> >>>> The mem clock is configured to use nop operations since it cannot be >>>> controlled. >>>> >>>> Link: https://lore.kernel.org/all/945fb7e913a9c3dcb40697328b7e9842b75fea5c.camel@pengutronix.de [1] >>>> >>>> Signed-off-by: Michal Wilczynski <m.wilczynski@samsung.com> >>> [...] >>>> diff --git a/drivers/clk/thead/clk-th1520-ap.c b/drivers/clk/thead/clk-th1520-ap.c >>>> index ea96d007aecd..1dfcde867233 100644 >>>> --- a/drivers/clk/thead/clk-th1520-ap.c >>>> +++ b/drivers/clk/thead/clk-th1520-ap.c >>>> @@ -862,17 +863,70 @@ static CCU_GATE(CLK_SRAM1, sram1_clk, "sram1", axi_aclk_pd, 0x20c, BIT(3), 0); >>> [...] >>>> >>>> static CCU_GATE_CLK_OPS(CLK_GPU_MEM, gpu_mem_clk, "gpu-mem-clk", >>>> video_pll_clk_pd, 0x0, BIT(2), 0, clk_nop_ops); >>>> +static CCU_GATE_CLK_OPS(CLK_GPU_CORE, gpu_core_clk, "gpu-core-clk", >>>> + video_pll_clk_pd, 0x0, BIT(3), 0, ccu_gate_gpu_ops); >>>> +static CCU_GATE_CLK_OPS(CLK_GPU_CFG_ACLK, gpu_cfg_aclk, "gpu-cfg-aclk", >>>> + video_pll_clk_pd, 0x0, BIT(4), 0, ccu_gate_gpu_ops); >>>> + >>>> +static void ccu_gpu_clk_disable(struct clk_hw *hw) >>>> +{ >>>> + struct ccu_gate *cg = hw_to_ccu_gate(hw); >>>> + unsigned long flags; >>>> + >>>> + spin_lock_irqsave(&gpu_reset_lock, flags); >>>> + >>>> + ccu_disable_helper(&cg->common, cg->enable); >>>> + >>>> + if ((cg == &gpu_core_clk && >>>> + !clk_hw_is_enabled(&gpu_cfg_aclk.common.hw)) || >>>> + (cg == &gpu_cfg_aclk && >>>> + !clk_hw_is_enabled(&gpu_core_clk.common.hw))) >>>> + reset_control_assert(gpu_reset); >>> >>> Why can't the clk consumer control the reset itself? Doing this here is >>> not ideal because we hold the clk lock when we try to grab the reset >>> lock. These are all spinlocks that should be small in lines of code >>> where the lock is held, but we're calling into an entire other framework >>> under a spinlock. If an (unrelated) reset driver tries to grab the clk >>> lock it will deadlock. >> >> So in our case the clk consumer is the drm/imagination driver. Here is >> the comment from the maintainer for my previous attempt to use a reset >> driver to abstract the GPU init sequence [1]: >> >> "Do you know what this resets? From our side, the GPU only has a single >> reset line (which I assume to be GPU_RESET)." >> >> "I don't love that this procedure appears in the platform reset driver. >> I appreciate it may not be clear from the SoC TRM, but this is the >> standard reset procedure for all IMG Rogue GPUs. The currently >> supported TI SoC handles this in silicon, when power up/down requests >> are sent so we never needed to encode it in the driver before. >> >> Strictly speaking, the 32 cycle delay is required between power and >> clocks being enabled and the reset line being deasserted. If nothing >> here touches power or clocks (which I don't think it should), the delay >> could potentially be lifted to the GPU driver." >> >> From the drm/imagination maintainers point of view their hardware has >> only one reset, the extra CLKGEN reset is SoC specific. > > If I am understanding correctly, the CLKGEN reset doesn't reset > anything in the GPU itself, but holds the GPU clock generator block in > reset, effectively disabling the three GPU clocks as a workaround for > the always-ungated GPU_MEM clock. > >> Also the reset driver maintainer didn't like my way of abstracting two >> resets ("GPU" and and SoC specific"CLKGEN") into one reset > > That is one part of it. The other is that (according to my > understanding as laid out above), the combined GPU+CLKGEN reset would > effectively disable all three GPU clocks for a while, after the GPU > driver has already requested them to be enabled. Thank you for your comments Philipp, it seems like we're on the same page here. I was wondering whether there is anything I can do to move the patches forward. Stephen, if the current patch is a no go from your perspective could you please advise whether there is a way to solve this in a clock that would be acceptable to you. Thanks, Michał > >> to make it >> seem to the consumer driver drm/imagination like there is only one >> reset and suggested and attempt to code the re-setting in the clock >> driver [2]. Even though he suggested a different way of achieving that: >> >> "In my mind it shouldn't be much: the GPU clocks could all share the >> same refcounted implementation. The first clock to get enabled would >> ungate both GPU_CORE and GPU_CFG_ACLK gates and deassert >> GPU_SW_CLKGEN_RST, all in one place. The remaining enable(s) would be >> no-ops. Would that work?" >> >> The above would have similar effect, but I felt like enabling both >> clocks in single .enable callback could be confusing so I ended up with >> the current approach. This could be easily re-done if you feel this >> would be better. >> >> I agree that using spinlocks here is dangerous, but looking at the code >> of the reset_control_deassert and reset_control_assert, it doesn't seem >> like any spinlocks are acquired/relased in that code flow, unless the >> driver ops would introduce that. So in this specific case deadlock >> shouldn't happen ? > > There are no spinlocks in the reset_control_(de)assert paths in the > reset framework, but in general there could be in the driver. The thead > driver [1], uses regmap_update_bits() on a regmap with .fast_io = true, > which uses the internal struct regmap::spinlock. > > [1] https://lore.kernel.org/all/20250303152511.494405-3-m.wilczynski@samsung.com/ > > > regards > Philipp >
Quoting Michal Wilczynski (2025-03-19 02:22:11) > > > On 3/13/25 10:25, Philipp Zabel wrote: > > On Do, 2025-03-06 at 17:43 +0100, Michal Wilczynski wrote: > >> > >> On 3/6/25 00:47, Stephen Boyd wrote: > >>> Quoting Michal Wilczynski (2025-03-03 06:36:29) > >>>> The T-HEAD TH1520 has three GPU clocks: core, cfg, and mem. The mem > >>>> clock gate is marked as "Reserved" in hardware, while core and cfg are > >>>> configurable. In order for these clock gates to work properly, the > >>>> CLKGEN reset must be managed in a specific sequence. > >>>> > >>>> Move the CLKGEN reset handling to the clock driver since it's > >>>> fundamentally a clock-related workaround [1]. This ensures that clk_enabled > >>>> GPU clocks stay physically enabled without external interference from > >>>> the reset driver. The reset is now deasserted only when both core and > >>>> cfg clocks are enabled, and asserted when either of them is disabled. > >>>> > >>>> The mem clock is configured to use nop operations since it cannot be > >>>> controlled. > >>>> > >>>> Link: https://lore.kernel.org/all/945fb7e913a9c3dcb40697328b7e9842b75fea5c.camel@pengutronix.de [1] > >>>> > >>>> Signed-off-by: Michal Wilczynski <m.wilczynski@samsung.com> > >>> [...] > >>>> diff --git a/drivers/clk/thead/clk-th1520-ap.c b/drivers/clk/thead/clk-th1520-ap.c > >>>> index ea96d007aecd..1dfcde867233 100644 > >>>> --- a/drivers/clk/thead/clk-th1520-ap.c > >>>> +++ b/drivers/clk/thead/clk-th1520-ap.c > >>>> @@ -862,17 +863,70 @@ static CCU_GATE(CLK_SRAM1, sram1_clk, "sram1", axi_aclk_pd, 0x20c, BIT(3), 0); > >>> [...] > >>>> > >>>> static CCU_GATE_CLK_OPS(CLK_GPU_MEM, gpu_mem_clk, "gpu-mem-clk", > >>>> video_pll_clk_pd, 0x0, BIT(2), 0, clk_nop_ops); > >>>> +static CCU_GATE_CLK_OPS(CLK_GPU_CORE, gpu_core_clk, "gpu-core-clk", > >>>> + video_pll_clk_pd, 0x0, BIT(3), 0, ccu_gate_gpu_ops); > >>>> +static CCU_GATE_CLK_OPS(CLK_GPU_CFG_ACLK, gpu_cfg_aclk, "gpu-cfg-aclk", > >>>> + video_pll_clk_pd, 0x0, BIT(4), 0, ccu_gate_gpu_ops); > >>>> + > >>>> +static void ccu_gpu_clk_disable(struct clk_hw *hw) > >>>> +{ > >>>> + struct ccu_gate *cg = hw_to_ccu_gate(hw); > >>>> + unsigned long flags; > >>>> + > >>>> + spin_lock_irqsave(&gpu_reset_lock, flags); > >>>> + > >>>> + ccu_disable_helper(&cg->common, cg->enable); > >>>> + > >>>> + if ((cg == &gpu_core_clk && > >>>> + !clk_hw_is_enabled(&gpu_cfg_aclk.common.hw)) || > >>>> + (cg == &gpu_cfg_aclk && > >>>> + !clk_hw_is_enabled(&gpu_core_clk.common.hw))) > >>>> + reset_control_assert(gpu_reset); > >>> > >>> Why can't the clk consumer control the reset itself? Doing this here is > >>> not ideal because we hold the clk lock when we try to grab the reset > >>> lock. These are all spinlocks that should be small in lines of code > >>> where the lock is held, but we're calling into an entire other framework > >>> under a spinlock. If an (unrelated) reset driver tries to grab the clk > >>> lock it will deadlock. > >> > >> So in our case the clk consumer is the drm/imagination driver. Here is > >> the comment from the maintainer for my previous attempt to use a reset > >> driver to abstract the GPU init sequence [1]: > >> > >> "Do you know what this resets? From our side, the GPU only has a single > >> reset line (which I assume to be GPU_RESET)." > >> > >> "I don't love that this procedure appears in the platform reset driver. > >> I appreciate it may not be clear from the SoC TRM, but this is the > >> standard reset procedure for all IMG Rogue GPUs. The currently > >> supported TI SoC handles this in silicon, when power up/down requests > >> are sent so we never needed to encode it in the driver before. > >> > >> Strictly speaking, the 32 cycle delay is required between power and > >> clocks being enabled and the reset line being deasserted. If nothing > >> here touches power or clocks (which I don't think it should), the delay > >> could potentially be lifted to the GPU driver." > >> > >> From the drm/imagination maintainers point of view their hardware has > >> only one reset, the extra CLKGEN reset is SoC specific. > > > > If I am understanding correctly, the CLKGEN reset doesn't reset > > anything in the GPU itself, but holds the GPU clock generator block in > > reset, effectively disabling the three GPU clocks as a workaround for > > the always-ungated GPU_MEM clock. > > > >> Also the reset driver maintainer didn't like my way of abstracting two > >> resets ("GPU" and and SoC specific"CLKGEN") into one reset > > > > That is one part of it. The other is that (according to my > > understanding as laid out above), the combined GPU+CLKGEN reset would > > effectively disable all three GPU clocks for a while, after the GPU > > driver has already requested them to be enabled. > > Thank you for your comments Philipp, it seems like we're on the same > page here. I was wondering whether there is anything I can do to move the > patches forward. > > Stephen, if the current patch is a no go from your perspective could you > please advise whether there is a way to solve this in a clock that would > be acceptable to you. It looks like the SoC glue makes the interactions between the clk and reset frameworks complicated because GPU clks don't work if a reset is asserted. You're trying to find a place to coordinate the clk and reset. Am I right? I'd advise managing the clks and resets in a generic power domain that is attached to the GPU device. In that power domain, coordinate the clk and reset sequencing so that the reset is deasserted before the clks are enabled (or whatever the actual requirement is). If the GPU driver _must_ have a clk and reset pointer to use, implement one that either does nothing or flag to the GPU driver that the power domain is managing all this for it so it should just use runtime PM and system PM hooks to turn on the clks and take the GPU out of reset. From what I can tell, the GPU driver maintainer doesn't want to think about the wrapper that likely got placed around the hardware block shipped by IMG. This wrapper is the SoC glue that needs to go into a generic power domain so that the different PM resources, reset, clk, etc. can be coordinated based on the GPU device's power state. It's either that, or go the dwc3 route and have SoC glue platform drivers that manage this stuff and create a child device to represent the hard macro shipped by the vendor like Synopsys/Imagination. Doing the parent device design isn't as flexible as PM domains because you can only have one parent device and the child device state can be ignored vs. many PM domains attached in a graph to a device that are more directly influenced by the device using runtime PM. Maybe you'll be heartened to know this problem isn't unique and has existed for decades :) I don't know what state the graphics driver is in but they'll likely be interested in solving this problem in a way that doesn't "pollute" their driver with SoC specific details. It's all a question of where you put the code. The reset framework wants to focus on resets, the clk framework wants to focus on clks, and the graphics driver wants to focus on graphics. BTW, we went through a similar discussion with regulators and clks years ago and ended up handling that with OPPs and power domains. I believe a PM domain is the right place for this kind of stuff, and I actually presented on this topic at OSSEU[1], but I don't maintain that code. Ulf does. [1] https://osseu2024.sched.com/event/1ej38/the-case-for-an-soc-power-management-driver-stephen-boyd-google
On Tue, 25 Mar 2025 at 23:40, Stephen Boyd <sboyd@kernel.org> wrote: > > Quoting Michal Wilczynski (2025-03-19 02:22:11) > > > > > > On 3/13/25 10:25, Philipp Zabel wrote: > > > On Do, 2025-03-06 at 17:43 +0100, Michal Wilczynski wrote: > > >> > > >> On 3/6/25 00:47, Stephen Boyd wrote: > > >>> Quoting Michal Wilczynski (2025-03-03 06:36:29) > > >>>> The T-HEAD TH1520 has three GPU clocks: core, cfg, and mem. The mem > > >>>> clock gate is marked as "Reserved" in hardware, while core and cfg are > > >>>> configurable. In order for these clock gates to work properly, the > > >>>> CLKGEN reset must be managed in a specific sequence. > > >>>> > > >>>> Move the CLKGEN reset handling to the clock driver since it's > > >>>> fundamentally a clock-related workaround [1]. This ensures that clk_enabled > > >>>> GPU clocks stay physically enabled without external interference from > > >>>> the reset driver. The reset is now deasserted only when both core and > > >>>> cfg clocks are enabled, and asserted when either of them is disabled. > > >>>> > > >>>> The mem clock is configured to use nop operations since it cannot be > > >>>> controlled. > > >>>> > > >>>> Link: https://lore.kernel.org/all/945fb7e913a9c3dcb40697328b7e9842b75fea5c.camel@pengutronix.de [1] > > >>>> > > >>>> Signed-off-by: Michal Wilczynski <m.wilczynski@samsung.com> > > >>> [...] > > >>>> diff --git a/drivers/clk/thead/clk-th1520-ap.c b/drivers/clk/thead/clk-th1520-ap.c > > >>>> index ea96d007aecd..1dfcde867233 100644 > > >>>> --- a/drivers/clk/thead/clk-th1520-ap.c > > >>>> +++ b/drivers/clk/thead/clk-th1520-ap.c > > >>>> @@ -862,17 +863,70 @@ static CCU_GATE(CLK_SRAM1, sram1_clk, "sram1", axi_aclk_pd, 0x20c, BIT(3), 0); > > >>> [...] > > >>>> > > >>>> static CCU_GATE_CLK_OPS(CLK_GPU_MEM, gpu_mem_clk, "gpu-mem-clk", > > >>>> video_pll_clk_pd, 0x0, BIT(2), 0, clk_nop_ops); > > >>>> +static CCU_GATE_CLK_OPS(CLK_GPU_CORE, gpu_core_clk, "gpu-core-clk", > > >>>> + video_pll_clk_pd, 0x0, BIT(3), 0, ccu_gate_gpu_ops); > > >>>> +static CCU_GATE_CLK_OPS(CLK_GPU_CFG_ACLK, gpu_cfg_aclk, "gpu-cfg-aclk", > > >>>> + video_pll_clk_pd, 0x0, BIT(4), 0, ccu_gate_gpu_ops); > > >>>> + > > >>>> +static void ccu_gpu_clk_disable(struct clk_hw *hw) > > >>>> +{ > > >>>> + struct ccu_gate *cg = hw_to_ccu_gate(hw); > > >>>> + unsigned long flags; > > >>>> + > > >>>> + spin_lock_irqsave(&gpu_reset_lock, flags); > > >>>> + > > >>>> + ccu_disable_helper(&cg->common, cg->enable); > > >>>> + > > >>>> + if ((cg == &gpu_core_clk && > > >>>> + !clk_hw_is_enabled(&gpu_cfg_aclk.common.hw)) || > > >>>> + (cg == &gpu_cfg_aclk && > > >>>> + !clk_hw_is_enabled(&gpu_core_clk.common.hw))) > > >>>> + reset_control_assert(gpu_reset); > > >>> > > >>> Why can't the clk consumer control the reset itself? Doing this here is > > >>> not ideal because we hold the clk lock when we try to grab the reset > > >>> lock. These are all spinlocks that should be small in lines of code > > >>> where the lock is held, but we're calling into an entire other framework > > >>> under a spinlock. If an (unrelated) reset driver tries to grab the clk > > >>> lock it will deadlock. > > >> > > >> So in our case the clk consumer is the drm/imagination driver. Here is > > >> the comment from the maintainer for my previous attempt to use a reset > > >> driver to abstract the GPU init sequence [1]: > > >> > > >> "Do you know what this resets? From our side, the GPU only has a single > > >> reset line (which I assume to be GPU_RESET)." > > >> > > >> "I don't love that this procedure appears in the platform reset driver. > > >> I appreciate it may not be clear from the SoC TRM, but this is the > > >> standard reset procedure for all IMG Rogue GPUs. The currently > > >> supported TI SoC handles this in silicon, when power up/down requests > > >> are sent so we never needed to encode it in the driver before. > > >> > > >> Strictly speaking, the 32 cycle delay is required between power and > > >> clocks being enabled and the reset line being deasserted. If nothing > > >> here touches power or clocks (which I don't think it should), the delay > > >> could potentially be lifted to the GPU driver." > > >> > > >> From the drm/imagination maintainers point of view their hardware has > > >> only one reset, the extra CLKGEN reset is SoC specific. > > > > > > If I am understanding correctly, the CLKGEN reset doesn't reset > > > anything in the GPU itself, but holds the GPU clock generator block in > > > reset, effectively disabling the three GPU clocks as a workaround for > > > the always-ungated GPU_MEM clock. > > > > > >> Also the reset driver maintainer didn't like my way of abstracting two > > >> resets ("GPU" and and SoC specific"CLKGEN") into one reset > > > > > > That is one part of it. The other is that (according to my > > > understanding as laid out above), the combined GPU+CLKGEN reset would > > > effectively disable all three GPU clocks for a while, after the GPU > > > driver has already requested them to be enabled. > > > > Thank you for your comments Philipp, it seems like we're on the same > > page here. I was wondering whether there is anything I can do to move the > > patches forward. > > > > Stephen, if the current patch is a no go from your perspective could you > > please advise whether there is a way to solve this in a clock that would > > be acceptable to you. > > It looks like the SoC glue makes the interactions between the clk and > reset frameworks complicated because GPU clks don't work if a reset is > asserted. You're trying to find a place to coordinate the clk and reset. > Am I right? > > I'd advise managing the clks and resets in a generic power domain that > is attached to the GPU device. In that power domain, coordinate the clk > and reset sequencing so that the reset is deasserted before the clks are > enabled (or whatever the actual requirement is). If the GPU driver > _must_ have a clk and reset pointer to use, implement one that either > does nothing or flag to the GPU driver that the power domain is managing > all this for it so it should just use runtime PM and system PM hooks to > turn on the clks and take the GPU out of reset. > > From what I can tell, the GPU driver maintainer doesn't want to think > about the wrapper that likely got placed around the hardware block > shipped by IMG. This wrapper is the SoC glue that needs to go into a > generic power domain so that the different PM resources, reset, clk, > etc. can be coordinated based on the GPU device's power state. It's > either that, or go the dwc3 route and have SoC glue platform drivers > that manage this stuff and create a child device to represent the hard > macro shipped by the vendor like Synopsys/Imagination. Doing the parent > device design isn't as flexible as PM domains because you can only have > one parent device and the child device state can be ignored vs. many PM > domains attached in a graph to a device that are more directly > influenced by the device using runtime PM. > > Maybe you'll be heartened to know this problem isn't unique and has > existed for decades :) I don't know what state the graphics driver is in > but they'll likely be interested in solving this problem in a way that > doesn't "pollute" their driver with SoC specific details. It's all a > question of where you put the code. The reset framework wants to focus > on resets, the clk framework wants to focus on clks, and the graphics > driver wants to focus on graphics. BTW, we went through a similar > discussion with regulators and clks years ago and ended up handling that > with OPPs and power domains. Right, power-domain providers are mostly implementing SoC specific code. In some cases, power-domain providers also handle per device SoC specific constraints/sequences, which seems what you are discussing here. For that, genpd has a couple of callbacks that could be interesting to have a look at, such as: genpd->attach|detach_dev() - for probe/remove genpd.dev_ops->start|stop() - for runtime/system PM That said, maybe just using the regular genpd->power_on|off() callback is sufficient here, depending on how you decide to model things. > > I believe a PM domain is the right place for this kind of stuff, and I > actually presented on this topic at OSSEU[1], but I don't maintain that > code. Ulf does. > > [1] https://osseu2024.sched.com/event/1ej38/the-case-for-an-soc-power-management-driver-stephen-boyd-google Kind regards Uffe
On 3/26/25 12:24, Ulf Hansson wrote: > On Tue, 25 Mar 2025 at 23:40, Stephen Boyd <sboyd@kernel.org> wrote: >> >> Quoting Michal Wilczynski (2025-03-19 02:22:11) >>> >>> >>> On 3/13/25 10:25, Philipp Zabel wrote: >>>> On Do, 2025-03-06 at 17:43 +0100, Michal Wilczynski wrote: >>>>> >>>>> On 3/6/25 00:47, Stephen Boyd wrote: >>>>>> Quoting Michal Wilczynski (2025-03-03 06:36:29) >>>>>>> The T-HEAD TH1520 has three GPU clocks: core, cfg, and mem. The mem >>>>>>> clock gate is marked as "Reserved" in hardware, while core and cfg are >>>>>>> configurable. In order for these clock gates to work properly, the >>>>>>> CLKGEN reset must be managed in a specific sequence. >>>>>>> >>>>>>> Move the CLKGEN reset handling to the clock driver since it's >>>>>>> fundamentally a clock-related workaround [1]. This ensures that clk_enabled >>>>>>> GPU clocks stay physically enabled without external interference from >>>>>>> the reset driver. The reset is now deasserted only when both core and >>>>>>> cfg clocks are enabled, and asserted when either of them is disabled. >>>>>>> >>>>>>> The mem clock is configured to use nop operations since it cannot be >>>>>>> controlled. >>>>>>> >>>>>>> Link: https://lore.kernel.org/all/945fb7e913a9c3dcb40697328b7e9842b75fea5c.camel@pengutronix.de [1] >>>>>>> >>>>>>> Signed-off-by: Michal Wilczynski <m.wilczynski@samsung.com> >>>>>> [...] >>>>>>> diff --git a/drivers/clk/thead/clk-th1520-ap.c b/drivers/clk/thead/clk-th1520-ap.c >>>>>>> index ea96d007aecd..1dfcde867233 100644 >>>>>>> --- a/drivers/clk/thead/clk-th1520-ap.c >>>>>>> +++ b/drivers/clk/thead/clk-th1520-ap.c >>>>>>> @@ -862,17 +863,70 @@ static CCU_GATE(CLK_SRAM1, sram1_clk, "sram1", axi_aclk_pd, 0x20c, BIT(3), 0); >>>>>> [...] >>>>>>> >>>>>>> static CCU_GATE_CLK_OPS(CLK_GPU_MEM, gpu_mem_clk, "gpu-mem-clk", >>>>>>> video_pll_clk_pd, 0x0, BIT(2), 0, clk_nop_ops); >>>>>>> +static CCU_GATE_CLK_OPS(CLK_GPU_CORE, gpu_core_clk, "gpu-core-clk", >>>>>>> + video_pll_clk_pd, 0x0, BIT(3), 0, ccu_gate_gpu_ops); >>>>>>> +static CCU_GATE_CLK_OPS(CLK_GPU_CFG_ACLK, gpu_cfg_aclk, "gpu-cfg-aclk", >>>>>>> + video_pll_clk_pd, 0x0, BIT(4), 0, ccu_gate_gpu_ops); >>>>>>> + >>>>>>> +static void ccu_gpu_clk_disable(struct clk_hw *hw) >>>>>>> +{ >>>>>>> + struct ccu_gate *cg = hw_to_ccu_gate(hw); >>>>>>> + unsigned long flags; >>>>>>> + >>>>>>> + spin_lock_irqsave(&gpu_reset_lock, flags); >>>>>>> + >>>>>>> + ccu_disable_helper(&cg->common, cg->enable); >>>>>>> + >>>>>>> + if ((cg == &gpu_core_clk && >>>>>>> + !clk_hw_is_enabled(&gpu_cfg_aclk.common.hw)) || >>>>>>> + (cg == &gpu_cfg_aclk && >>>>>>> + !clk_hw_is_enabled(&gpu_core_clk.common.hw))) >>>>>>> + reset_control_assert(gpu_reset); >>>>>> >>>>>> Why can't the clk consumer control the reset itself? Doing this here is >>>>>> not ideal because we hold the clk lock when we try to grab the reset >>>>>> lock. These are all spinlocks that should be small in lines of code >>>>>> where the lock is held, but we're calling into an entire other framework >>>>>> under a spinlock. If an (unrelated) reset driver tries to grab the clk >>>>>> lock it will deadlock. >>>>> >>>>> So in our case the clk consumer is the drm/imagination driver. Here is >>>>> the comment from the maintainer for my previous attempt to use a reset >>>>> driver to abstract the GPU init sequence [1]: >>>>> >>>>> "Do you know what this resets? From our side, the GPU only has a single >>>>> reset line (which I assume to be GPU_RESET)." >>>>> >>>>> "I don't love that this procedure appears in the platform reset driver. >>>>> I appreciate it may not be clear from the SoC TRM, but this is the >>>>> standard reset procedure for all IMG Rogue GPUs. The currently >>>>> supported TI SoC handles this in silicon, when power up/down requests >>>>> are sent so we never needed to encode it in the driver before. >>>>> >>>>> Strictly speaking, the 32 cycle delay is required between power and >>>>> clocks being enabled and the reset line being deasserted. If nothing >>>>> here touches power or clocks (which I don't think it should), the delay >>>>> could potentially be lifted to the GPU driver." >>>>> >>>>> From the drm/imagination maintainers point of view their hardware has >>>>> only one reset, the extra CLKGEN reset is SoC specific. >>>> >>>> If I am understanding correctly, the CLKGEN reset doesn't reset >>>> anything in the GPU itself, but holds the GPU clock generator block in >>>> reset, effectively disabling the three GPU clocks as a workaround for >>>> the always-ungated GPU_MEM clock. >>>> >>>>> Also the reset driver maintainer didn't like my way of abstracting two >>>>> resets ("GPU" and and SoC specific"CLKGEN") into one reset >>>> >>>> That is one part of it. The other is that (according to my >>>> understanding as laid out above), the combined GPU+CLKGEN reset would >>>> effectively disable all three GPU clocks for a while, after the GPU >>>> driver has already requested them to be enabled. >>> >>> Thank you for your comments Philipp, it seems like we're on the same >>> page here. I was wondering whether there is anything I can do to move the >>> patches forward. >>> >>> Stephen, if the current patch is a no go from your perspective could you >>> please advise whether there is a way to solve this in a clock that would >>> be acceptable to you. >> >> It looks like the SoC glue makes the interactions between the clk and >> reset frameworks complicated because GPU clks don't work if a reset is >> asserted. You're trying to find a place to coordinate the clk and reset. >> Am I right? >> >> I'd advise managing the clks and resets in a generic power domain that >> is attached to the GPU device. In that power domain, coordinate the clk >> and reset sequencing so that the reset is deasserted before the clks are >> enabled (or whatever the actual requirement is). If the GPU driver >> _must_ have a clk and reset pointer to use, implement one that either >> does nothing or flag to the GPU driver that the power domain is managing >> all this for it so it should just use runtime PM and system PM hooks to >> turn on the clks and take the GPU out of reset. >> >> From what I can tell, the GPU driver maintainer doesn't want to think >> about the wrapper that likely got placed around the hardware block >> shipped by IMG. This wrapper is the SoC glue that needs to go into a >> generic power domain so that the different PM resources, reset, clk, >> etc. can be coordinated based on the GPU device's power state. It's >> either that, or go the dwc3 route and have SoC glue platform drivers >> that manage this stuff and create a child device to represent the hard >> macro shipped by the vendor like Synopsys/Imagination. Doing the parent >> device design isn't as flexible as PM domains because you can only have >> one parent device and the child device state can be ignored vs. many PM >> domains attached in a graph to a device that are more directly >> influenced by the device using runtime PM. >> >> Maybe you'll be heartened to know this problem isn't unique and has >> existed for decades :) I don't know what state the graphics driver is in >> but they'll likely be interested in solving this problem in a way that >> doesn't "pollute" their driver with SoC specific details. It's all a >> question of where you put the code. The reset framework wants to focus >> on resets, the clk framework wants to focus on clks, and the graphics >> driver wants to focus on graphics. BTW, we went through a similar >> discussion with regulators and clks years ago and ended up handling that >> with OPPs and power domains. > > Right, power-domain providers are mostly implementing SoC specific code. > > In some cases, power-domain providers also handle per device SoC > specific constraints/sequences, which seems what you are discussing > here. For that, genpd has a couple of callbacks that could be > interesting to have a look at, such as: > > genpd->attach|detach_dev() - for probe/remove > genpd.dev_ops->start|stop() - for runtime/system PM > > That said, maybe just using the regular genpd->power_on|off() callback > is sufficient here, depending on how you decide to model things. Thanks Stephen, Ulf ! So the way forward I see: 1) The reset driver can be merged as-is, if Philipp is fine with this code [2]. 2) I will cook up the update to the thead power-domain driver which will handle reset and clock management. 3) I think it would be best to convince the drm/imagination maintainers to make the clock management in their consumer driver optional. This way if there is a SoC specific sequence the clocks/resets will be managed from generic PM driver which is SoC specific. Will talk to them. 4) Will remove the reset management from this series, and re-send. [2] - https://lore.kernel.org/all/4205b786-fb65-468c-a3d8-bce807dd829a@samsung.com/ > >> >> I believe a PM domain is the right place for this kind of stuff, and I >> actually presented on this topic at OSSEU[1], but I don't maintain that >> code. Ulf does. >> >> [1] https://osseu2024.sched.com/event/1ej38/the-case-for-an-soc-power-management-driver-stephen-boyd-google Thanks ! Watched the presentation, very interesting. Hopefully I'll be able to attend in person in Amsterdam this year if you're presenting :-) > > Kind regards > Uffe >
diff --git a/drivers/clk/thead/clk-th1520-ap.c b/drivers/clk/thead/clk-th1520-ap.c index ea96d007aecd..1dfcde867233 100644 --- a/drivers/clk/thead/clk-th1520-ap.c +++ b/drivers/clk/thead/clk-th1520-ap.c @@ -12,6 +12,7 @@ #include <linux/module.h> #include <linux/platform_device.h> #include <linux/regmap.h> +#include <linux/reset.h> #define TH1520_PLL_POSTDIV2 GENMASK(26, 24) #define TH1520_PLL_POSTDIV1 GENMASK(22, 20) @@ -862,17 +863,70 @@ static CCU_GATE(CLK_SRAM1, sram1_clk, "sram1", axi_aclk_pd, 0x20c, BIT(3), 0); static CCU_GATE(CLK_SRAM2, sram2_clk, "sram2", axi_aclk_pd, 0x20c, BIT(2), 0); static CCU_GATE(CLK_SRAM3, sram3_clk, "sram3", axi_aclk_pd, 0x20c, BIT(1), 0); +static struct reset_control *gpu_reset; +static DEFINE_SPINLOCK(gpu_reset_lock); /* protect GPU reset sequence */ + +static void ccu_gpu_clk_disable(struct clk_hw *hw); +static int ccu_gpu_clk_enable(struct clk_hw *hw); + +static const struct clk_ops ccu_gate_gpu_ops = { + .disable = ccu_gpu_clk_disable, + .enable = ccu_gpu_clk_enable +}; + static const struct clk_ops clk_nop_ops = {}; static CCU_GATE_CLK_OPS(CLK_GPU_MEM, gpu_mem_clk, "gpu-mem-clk", video_pll_clk_pd, 0x0, BIT(2), 0, clk_nop_ops); +static CCU_GATE_CLK_OPS(CLK_GPU_CORE, gpu_core_clk, "gpu-core-clk", + video_pll_clk_pd, 0x0, BIT(3), 0, ccu_gate_gpu_ops); +static CCU_GATE_CLK_OPS(CLK_GPU_CFG_ACLK, gpu_cfg_aclk, "gpu-cfg-aclk", + video_pll_clk_pd, 0x0, BIT(4), 0, ccu_gate_gpu_ops); + +static void ccu_gpu_clk_disable(struct clk_hw *hw) +{ + struct ccu_gate *cg = hw_to_ccu_gate(hw); + unsigned long flags; + + spin_lock_irqsave(&gpu_reset_lock, flags); + + ccu_disable_helper(&cg->common, cg->enable); + + if ((cg == &gpu_core_clk && + !clk_hw_is_enabled(&gpu_cfg_aclk.common.hw)) || + (cg == &gpu_cfg_aclk && + !clk_hw_is_enabled(&gpu_core_clk.common.hw))) + reset_control_assert(gpu_reset); + + spin_unlock_irqrestore(&gpu_reset_lock, flags); +} + +static int ccu_gpu_clk_enable(struct clk_hw *hw) +{ + struct ccu_gate *cg = hw_to_ccu_gate(hw); + unsigned long flags; + int ret; + + spin_lock_irqsave(&gpu_reset_lock, flags); + + ret = ccu_enable_helper(&cg->common, cg->enable); + if (ret) { + spin_unlock_irqrestore(&gpu_reset_lock, flags); + return ret; + } + + if ((cg == &gpu_core_clk && + clk_hw_is_enabled(&gpu_cfg_aclk.common.hw)) || + (cg == &gpu_cfg_aclk && clk_hw_is_enabled(&gpu_core_clk.common.hw))) + ret = reset_control_deassert(gpu_reset); + + spin_unlock_irqrestore(&gpu_reset_lock, flags); + + return ret; +} static CCU_GATE(CLK_AXI4_VO_ACLK, axi4_vo_aclk, "axi4-vo-aclk", video_pll_clk_pd, 0x0, BIT(0), 0); -static CCU_GATE(CLK_GPU_CORE, gpu_core_clk, "gpu-core-clk", video_pll_clk_pd, - 0x0, BIT(3), 0); -static CCU_GATE(CLK_GPU_CFG_ACLK, gpu_cfg_aclk, "gpu-cfg-aclk", - video_pll_clk_pd, 0x0, BIT(4), 0); static CCU_GATE(CLK_DPU_PIXELCLK0, dpu0_pixelclk, "dpu0-pixelclk", video_pll_clk_pd, 0x0, BIT(5), 0); static CCU_GATE(CLK_DPU_PIXELCLK1, dpu1_pixelclk, "dpu1-pixelclk", @@ -1046,8 +1100,6 @@ static struct ccu_common *th1520_gate_clks[] = { static struct ccu_common *th1520_vo_gate_clks[] = { &axi4_vo_aclk.common, - &gpu_core_clk.common, - &gpu_cfg_aclk.common, &dpu0_pixelclk.common, &dpu1_pixelclk.common, &dpu_hclk.common, @@ -1150,6 +1202,13 @@ static int th1520_clk_probe(struct platform_device *pdev) if (IS_ERR(map)) return PTR_ERR(map); + if (plat_data == &th1520_vo_platdata) { + gpu_reset = devm_reset_control_get_exclusive(dev, NULL); + if (IS_ERR(gpu_reset)) + return dev_err_probe(dev, PTR_ERR(gpu_reset), + "GPU reset is required for VO clock controller\n"); + } + for (i = 0; i < plat_data->nr_pll_clks; i++) { struct ccu_pll *cp = hw_to_ccu_pll(&plat_data->th1520_pll_clks[i]->hw); @@ -1226,11 +1285,27 @@ static int th1520_clk_probe(struct platform_device *pdev) if (ret) return ret; } else if (plat_data == &th1520_vo_platdata) { + /* GPU clocks need to be treated differently, as MEM clock + * is non-configurable, and the reset needs to be de-asserted + * after enabling CORE and CFG clocks. + */ ret = devm_clk_hw_register(dev, &gpu_mem_clk.common.hw); if (ret) return ret; gpu_mem_clk.common.map = map; priv->hws[CLK_GPU_MEM] = &gpu_mem_clk.common.hw; + + ret = devm_clk_hw_register(dev, &gpu_core_clk.common.hw); + if (ret) + return ret; + gpu_core_clk.common.map = map; + priv->hws[CLK_GPU_CORE] = &gpu_core_clk.common.hw; + + ret = devm_clk_hw_register(dev, &gpu_cfg_aclk.common.hw); + if (ret) + return ret; + gpu_cfg_aclk.common.map = map; + priv->hws[CLK_GPU_CFG_ACLK] = &gpu_cfg_aclk.common.hw; } ret = devm_of_clk_add_hw_provider(dev, of_clk_hw_onecell_get, priv);
The T-HEAD TH1520 has three GPU clocks: core, cfg, and mem. The mem clock gate is marked as "Reserved" in hardware, while core and cfg are configurable. In order for these clock gates to work properly, the CLKGEN reset must be managed in a specific sequence. Move the CLKGEN reset handling to the clock driver since it's fundamentally a clock-related workaround [1]. This ensures that clk_enabled GPU clocks stay physically enabled without external interference from the reset driver. The reset is now deasserted only when both core and cfg clocks are enabled, and asserted when either of them is disabled. The mem clock is configured to use nop operations since it cannot be controlled. Link: https://lore.kernel.org/all/945fb7e913a9c3dcb40697328b7e9842b75fea5c.camel@pengutronix.de [1] Signed-off-by: Michal Wilczynski <m.wilczynski@samsung.com> --- drivers/clk/thead/clk-th1520-ap.c | 87 ++++++++++++++++++++++++++++--- 1 file changed, 81 insertions(+), 6 deletions(-)