mbox series

[0/1] clk: Meson8/8b/8m2: fix the mali clock flags

Message ID 20191215210153.1449067-1-martin.blumenstingl@googlemail.com (mailing list archive)
Headers show
Series clk: Meson8/8b/8m2: fix the mali clock flags | expand

Message

Martin Blumenstingl Dec. 15, 2019, 9:01 p.m. UTC
While playing with devfreq support for the lima driver I experienced
sporadic (random) system lockups. It turned out that this was in
certain cases when changing the mali clock.

The Amlogic vendor GPU platform driver (which is responsible for
changing the clock frequency) uses the following pattern when updating
the mali clock rate:
- at initialization: initialize the two mali_0 and mali_1 clock trees
  with a default setting and enable both clocks
- when changing the clock frequency:
-- set HHI_MALI_CLK_CNTL[31] to temporarily use the mali_1 clock output
-- update the mali_0 clock tree (set the mux, divider, etc.)
-- clear HHI_MALI_CLK_CNTL[31] to temporarily use the mali_0 clock
   output again

With the common clock framework we can even do better:
by setting CLK_SET_RATE_PARENT for the mali_0 and mali_1 output gates
we can force the common clock framework to update the "inactive" clock
and then switch to it's output.

I only tested this patch for a limited time only (approx. 2 hours).
So far I couldn't reproduce the sporadic system lockups with it.
However, broader testing would be great so I would like this to be
applied for -next.


Martin Blumenstingl (1):
  clk: meson: meson8b: make the CCF use the glitch-free "mali" mux

 drivers/clk/meson/meson8b.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

Comments

Jerome Brunet Dec. 16, 2019, 9:13 a.m. UTC | #1
On Sun 15 Dec 2019 at 22:01, Martin Blumenstingl <martin.blumenstingl@googlemail.com> wrote:

> While playing with devfreq support for the lima driver I experienced
> sporadic (random) system lockups. It turned out that this was in
> certain cases when changing the mali clock.
>
> The Amlogic vendor GPU platform driver (which is responsible for
> changing the clock frequency) uses the following pattern when updating
> the mali clock rate:
> - at initialization: initialize the two mali_0 and mali_1 clock trees
>   with a default setting and enable both clocks
> - when changing the clock frequency:
> -- set HHI_MALI_CLK_CNTL[31] to temporarily use the mali_1 clock output
> -- update the mali_0 clock tree (set the mux, divider, etc.)
> -- clear HHI_MALI_CLK_CNTL[31] to temporarily use the mali_0 clock
                                      ^ no final setting then ? :P
>    output again
>
> With the common clock framework we can even do better:
> by setting CLK_SET_RATE_PARENT for the mali_0 and mali_1 output gates
                ^
From your patch, I guess you mean CLK_SET_RATE_GATE ?

> we can force the common clock framework to update the "inactive" clock
> and then switch to it's output.
>
> I only tested this patch for a limited time only (approx. 2 hours).
> So far I couldn't reproduce the sporadic system lockups with it.
> However, broader testing would be great so I would like this to be
> applied for -next.

CLK_SET_RATE_GATE guarantees that a clock cannot be updated while in
use. While it works at your advantage here, I'm not sure CCF guarantees
the assumption this implementation is based on. Some explanation below:

In your case, if it works as you expect when calling set_rate() on the
top clock, it goes as this:

- mali0 is use with rate X:
- => set_rate(mali_top, Y)
- mali0 is in use, cannot change, will round rate Y to X
- mali1 is not in use, can provide Y
- mali1 is determined to be the new best parent for mali top

So far so good.

- CCF pick the mali1 subtree
  *start updating the clock from the root to the leaf*

So the mali top mux, which choose between mali0 and mali1, will be
*updated last* which crucial to your use case.

I just wonder if this crucial part something CCF guarantee and you can
rely on it ... or if it might break in the future.

Stephen, any thoughts on this ?

PS: If CCF does guarantee "root-to-leaf" updates, I think this
implementation is a clever trick to solve this usual glitch free clock
update issue ... much more elegant that the notifier solution we have
been using so far.

>
>
> Martin Blumenstingl (1):
>   clk: meson: meson8b: make the CCF use the glitch-free "mali" mux
>
>  drivers/clk/meson/meson8b.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
Stephen Boyd Dec. 16, 2019, 5:50 p.m. UTC | #2
Quoting Jerome Brunet (2019-12-16 01:13:31)
> 
> On Sun 15 Dec 2019 at 22:01, Martin Blumenstingl <martin.blumenstingl@googlemail.com> wrote:
> 
> > While playing with devfreq support for the lima driver I experienced
> > sporadic (random) system lockups. It turned out that this was in
> > certain cases when changing the mali clock.
> >
> > The Amlogic vendor GPU platform driver (which is responsible for
> > changing the clock frequency) uses the following pattern when updating
> > the mali clock rate:
> > - at initialization: initialize the two mali_0 and mali_1 clock trees
> >   with a default setting and enable both clocks
> > - when changing the clock frequency:
> > -- set HHI_MALI_CLK_CNTL[31] to temporarily use the mali_1 clock output
> > -- update the mali_0 clock tree (set the mux, divider, etc.)
> > -- clear HHI_MALI_CLK_CNTL[31] to temporarily use the mali_0 clock
>                                       ^ no final setting then ? :P
> >    output again
> >
> > With the common clock framework we can even do better:
> > by setting CLK_SET_RATE_PARENT for the mali_0 and mali_1 output gates
>                 ^
> From your patch, I guess you mean CLK_SET_RATE_GATE ?
> 
> > we can force the common clock framework to update the "inactive" clock
> > and then switch to it's output.
> >
> > I only tested this patch for a limited time only (approx. 2 hours).
> > So far I couldn't reproduce the sporadic system lockups with it.
> > However, broader testing would be great so I would like this to be
> > applied for -next.
> 
> CLK_SET_RATE_GATE guarantees that a clock cannot be updated while in
> use. While it works at your advantage here, I'm not sure CCF guarantees
> the assumption this implementation is based on. Some explanation below:
> 
> In your case, if it works as you expect when calling set_rate() on the
> top clock, it goes as this:
> 
> - mali0 is use with rate X:
> - => set_rate(mali_top, Y)
> - mali0 is in use, cannot change, will round rate Y to X
> - mali1 is not in use, can provide Y
> - mali1 is determined to be the new best parent for mali top
> 
> So far so good.
> 
> - CCF pick the mali1 subtree
>   *start updating the clock from the root to the leaf*
> 
> So the mali top mux, which choose between mali0 and mali1, will be
> *updated last* which crucial to your use case.
> 
> I just wonder if this crucial part something CCF guarantee and you can
> rely on it ... or if it might break in the future.
> 
> Stephen, any thoughts on this ?

We have problems with the order in which we call the set_rate clk_op.
Sometimes clk providers want us to call from leaf to root but instead we
call from root to leaf because of implementation reasons. Controlling
the order in which clk operations are done is an unsolved problem. But
yes, in the future I'd like to see us introduce the vaporware that is
coordinated clk rates that would allow clk providers to decide what this
order should be, instead of having to do this "root-to-leaf" update.
Doing so would help us with the clk dividers that have some parent
changing rate that causes the downstream device to be overclocked while
we change the parent before the divider.

If there are more assumptions like this about how the CCF is implemented
then we'll have to be extra careful to not disturb the "normal" order of
operations when introducing something that allows clk providers to
modify it.

Also, isn't CLK_SET_RATE_GATE broken in the case that clk_set_rate()
isn't called on that particular clk? I seem to recall that the flag only
matters when it's applied to the "leaf" or entry point into the CCF from
a consumer API. I've wanted to fix that but never gotten around to it.
The whole flag sort of irks me because I don't understand what consumers
are supposed to do when this flag is set on a clk. How do they discover
it? They're supposed to "just know" and turn off the clk first and then
call clk_set_rate()? Why can't the framework do this all in the
clk_set_rate() call?

> 
> PS: If CCF does guarantee "root-to-leaf" updates, I think this
> implementation is a clever trick to solve this usual glitch free clock
> update issue ... much more elegant that the notifier solution we have
> been using so far.
Jerome Brunet Dec. 16, 2019, 7:17 p.m. UTC | #3
On Mon 16 Dec 2019 at 18:50, Stephen Boyd <sboyd@kernel.org> wrote:

> Quoting Jerome Brunet (2019-12-16 01:13:31)
>> 
>> On Sun 15 Dec 2019 at 22:01, Martin Blumenstingl <martin.blumenstingl@googlemail.com> wrote:
>> 
>> > While playing with devfreq support for the lima driver I experienced
>> > sporadic (random) system lockups. It turned out that this was in
>> > certain cases when changing the mali clock.
>> >
>> > The Amlogic vendor GPU platform driver (which is responsible for
>> > changing the clock frequency) uses the following pattern when updating
>> > the mali clock rate:
>> > - at initialization: initialize the two mali_0 and mali_1 clock trees
>> >   with a default setting and enable both clocks
>> > - when changing the clock frequency:
>> > -- set HHI_MALI_CLK_CNTL[31] to temporarily use the mali_1 clock output
>> > -- update the mali_0 clock tree (set the mux, divider, etc.)
>> > -- clear HHI_MALI_CLK_CNTL[31] to temporarily use the mali_0 clock
>>                                       ^ no final setting then ? :P
>> >    output again
>> >
>> > With the common clock framework we can even do better:
>> > by setting CLK_SET_RATE_PARENT for the mali_0 and mali_1 output gates
>>                 ^
>> From your patch, I guess you mean CLK_SET_RATE_GATE ?
>> 
>> > we can force the common clock framework to update the "inactive" clock
>> > and then switch to it's output.
>> >
>> > I only tested this patch for a limited time only (approx. 2 hours).
>> > So far I couldn't reproduce the sporadic system lockups with it.
>> > However, broader testing would be great so I would like this to be
>> > applied for -next.
>> 
>> CLK_SET_RATE_GATE guarantees that a clock cannot be updated while in
>> use. While it works at your advantage here, I'm not sure CCF guarantees
>> the assumption this implementation is based on. Some explanation below:
>> 
>> In your case, if it works as you expect when calling set_rate() on the
>> top clock, it goes as this:
>> 
>> - mali0 is use with rate X:
>> - => set_rate(mali_top, Y)
>> - mali0 is in use, cannot change, will round rate Y to X
>> - mali1 is not in use, can provide Y
>> - mali1 is determined to be the new best parent for mali top
>> 
>> So far so good.
>> 
>> - CCF pick the mali1 subtree
>>   *start updating the clock from the root to the leaf*
>> 
>> So the mali top mux, which choose between mali0 and mali1, will be
>> *updated last* which crucial to your use case.
>> 
>> I just wonder if this crucial part something CCF guarantee and you can
>> rely on it ... or if it might break in the future.
>> 
>> Stephen, any thoughts on this ?
>
> We have problems with the order in which we call the set_rate clk_op.
> Sometimes clk providers want us to call from leaf to root but instead we
> call from root to leaf because of implementation reasons. Controlling
> the order in which clk operations are done is an unsolved problem. But
> yes, in the future I'd like to see us introduce the vaporware that is
> coordinated clk rates that would allow clk providers to decide what this
> order should be, instead of having to do this "root-to-leaf" update.
> Doing so would help us with the clk dividers that have some parent
> changing rate that causes the downstream device to be overclocked while
> we change the parent before the divider.
>
> If there are more assumptions like this about how the CCF is implemented
> then we'll have to be extra careful to not disturb the "normal" order of
> operations when introducing something that allows clk providers to
> modify it.

I understand that CCR would, in theory, allow to define that sort of
details. Still defining (and documenting) the default behavior would be
nice.

So the question is:
 * Can we rely set_rate() doing a root-to-leaf update until CCR comes
   around ?
 * If not, for use cases like the one described by Martin, I guess we
   are stuck with the notifier ? Or would you have something else to
   propose ?
   
>
> Also, isn't CLK_SET_RATE_GATE broken in the case that clk_set_rate()
> isn't called on that particular clk? I seem to recall that the flag only
> matters when it's applied to the "leaf" or entry point into the CCF from
> a consumer API.

It did but not anymore

> I've wanted to fix that but never gotten around to it.

I fixed that already :P
CLK_SET_RATE_GATE is a special case of clock protect. The clock is
protecting itself so it is going down through the tree.


> The whole flag sort of irks me because I don't understand what consumers
> are supposed to do when this flag is set on a clk. How do they discover
> it?

Actually (ATM) the consumer is not even aware of it. If a clock with
CLK_SET_RATE_GATE is enabled, it will return the current rate to
.round_rate() and .set_rate() ... as if it was fixed.

> They're supposed to "just know" and turn off the clk first and then
> call clk_set_rate()?

ATM, yes ... if CCF cannot switch to another "unlocked" subtree (the
case here)

> Why can't the framework do this all in the clk_set_rate() call?

When there is multiple consumers the behavior would become a bit
difficult to predict and drivers may have troubles anticipating that,
maybe, the clock is locked.

>
>> 
>> PS: If CCF does guarantee "root-to-leaf" updates, I think this
>> implementation is a clever trick to solve this usual glitch free clock
>> update issue ... much more elegant that the notifier solution we have
>> been using so far.
Stephen Boyd Dec. 24, 2019, 3:36 a.m. UTC | #4
Quoting Jerome Brunet (2019-12-16 11:17:21)
> 
> On Mon 16 Dec 2019 at 18:50, Stephen Boyd <sboyd@kernel.org> wrote:
> 
> > Quoting Jerome Brunet (2019-12-16 01:13:31)
> >> 
> >> *updated last* which crucial to your use case.
> >> 
> >> I just wonder if this crucial part something CCF guarantee and you can
> >> rely on it ... or if it might break in the future.
> >> 
> >> Stephen, any thoughts on this ?
> >
> > We have problems with the order in which we call the set_rate clk_op.
> > Sometimes clk providers want us to call from leaf to root but instead we
> > call from root to leaf because of implementation reasons. Controlling
> > the order in which clk operations are done is an unsolved problem. But
> > yes, in the future I'd like to see us introduce the vaporware that is
> > coordinated clk rates that would allow clk providers to decide what this
> > order should be, instead of having to do this "root-to-leaf" update.
> > Doing so would help us with the clk dividers that have some parent
> > changing rate that causes the downstream device to be overclocked while
> > we change the parent before the divider.
> >
> > If there are more assumptions like this about how the CCF is implemented
> > then we'll have to be extra careful to not disturb the "normal" order of
> > operations when introducing something that allows clk providers to
> > modify it.
> 
> I understand that CCR would, in theory, allow to define that sort of
> details. Still defining (and documenting) the default behavior would be
> nice.
> 
> So the question is:
>  * Can we rely set_rate() doing a root-to-leaf update until CCR comes
>    around ?
>  * If not, for use cases like the one described by Martin, I guess we
>    are stuck with the notifier ? Or would you have something else to
>    propose ?

I suppose we should just state that clk_set_rate() should do a
root-to-leaf update. It's not like anyone is interested in changing
this behavior. The notifier is not ideal. I've wanted to add a new
clk_op that would cover some amount of the notifier users by having a
'pre_set_rate' clk op that can mux the clk over to something safe or
setup a divider to something that is known to be safe and work. Then we
can avoid having to register for a notifier just to do something right
before the root-to-leaf update happens.

>    
> >
> > Also, isn't CLK_SET_RATE_GATE broken in the case that clk_set_rate()
> > isn't called on that particular clk? I seem to recall that the flag only
> > matters when it's applied to the "leaf" or entry point into the CCF from
> > a consumer API.
> 
> It did but not anymore
> 
> > I've wanted to fix that but never gotten around to it.
> 
> I fixed that already :P
> CLK_SET_RATE_GATE is a special case of clock protect. The clock is
> protecting itself so it is going down through the tree.
> 

Ahaha ok. As you can see I'm trying to forget clock protect ;-)


> 
> > The whole flag sort of irks me because I don't understand what consumers
> > are supposed to do when this flag is set on a clk. How do they discover
> > it?
> 
> Actually (ATM) the consumer is not even aware of it. If a clock with
> CLK_SET_RATE_GATE is enabled, it will return the current rate to
> .round_rate() and .set_rate() ... as if it was fixed.

And then when the clk is disabled it will magically "unstick" and start
to accept the same rate request again?

> 
> > They're supposed to "just know" and turn off the clk first and then
> > call clk_set_rate()?
> 
> ATM, yes ... if CCF cannot switch to another "unlocked" subtree (the
> case here)
> 
> > Why can't the framework do this all in the clk_set_rate() call?
> 
> When there is multiple consumers the behavior would become a bit
> difficult to predict and drivers may have troubles anticipating that,
> maybe, the clock is locked.

Fun times!
Jerome Brunet Dec. 26, 2019, 9:06 a.m. UTC | #5
On Tue 24 Dec 2019 at 04:36, Stephen Boyd <sboyd@kernel.org> wrote:

> Quoting Jerome Brunet (2019-12-16 11:17:21)
>> 
>> On Mon 16 Dec 2019 at 18:50, Stephen Boyd <sboyd@kernel.org> wrote:
>> 
>> > Quoting Jerome Brunet (2019-12-16 01:13:31)
>> >> 
>> >> *updated last* which crucial to your use case.
>> >> 
>> >> I just wonder if this crucial part something CCF guarantee and you can
>> >> rely on it ... or if it might break in the future.
>> >> 
>> >> Stephen, any thoughts on this ?
>> >
>> > We have problems with the order in which we call the set_rate clk_op.
>> > Sometimes clk providers want us to call from leaf to root but instead we
>> > call from root to leaf because of implementation reasons. Controlling
>> > the order in which clk operations are done is an unsolved problem. But
>> > yes, in the future I'd like to see us introduce the vaporware that is
>> > coordinated clk rates that would allow clk providers to decide what this
>> > order should be, instead of having to do this "root-to-leaf" update.
>> > Doing so would help us with the clk dividers that have some parent
>> > changing rate that causes the downstream device to be overclocked while
>> > we change the parent before the divider.
>> >
>> > If there are more assumptions like this about how the CCF is implemented
>> > then we'll have to be extra careful to not disturb the "normal" order of
>> > operations when introducing something that allows clk providers to
>> > modify it.
>> 
>> I understand that CCR would, in theory, allow to define that sort of
>> details. Still defining (and documenting) the default behavior would be
>> nice.
>> 
>> So the question is:
>>  * Can we rely set_rate() doing a root-to-leaf update until CCR comes
>>    around ?
>>  * If not, for use cases like the one described by Martin, I guess we
>>    are stuck with the notifier ? Or would you have something else to
>>    propose ?
>
> I suppose we should just state that clk_set_rate() should do a
> root-to-leaf update. It's not like anyone is interested in changing
> this behavior. The notifier is not ideal. I've wanted to add a new
> clk_op that would cover some amount of the notifier users by having a
> 'pre_set_rate' clk op that can mux the clk over to something safe or
> setup a divider to something that is known to be safe and work. Then we
> can avoid having to register for a notifier just to do something right
> before the root-to-leaf update happens.
>

Martin,

It looks like a green light to me ;) Just add a detailed comment on the
mali top clock explaining things and it should be alright.

>>    
>> >
>> > Also, isn't CLK_SET_RATE_GATE broken in the case that clk_set_rate()
>> > isn't called on that particular clk? I seem to recall that the flag only
>> > matters when it's applied to the "leaf" or entry point into the CCF from
>> > a consumer API.
>> 
>> It did but not anymore
>> 
>> > I've wanted to fix that but never gotten around to it.
>> 
>> I fixed that already :P
>> CLK_SET_RATE_GATE is a special case of clock protect. The clock is
>> protecting itself so it is going down through the tree.
>> 
>
> Ahaha ok. As you can see I'm trying to forget clock protect ;-)
>
>
>> 
>> > The whole flag sort of irks me because I don't understand what consumers
>> > are supposed to do when this flag is set on a clk. How do they discover
>> > it?
>> 
>> Actually (ATM) the consumer is not even aware of it. If a clock with
>> CLK_SET_RATE_GATE is enabled, it will return the current rate to
>> .round_rate() and .set_rate() ... as if it was fixed.
>
> And then when the clk is disabled it will magically "unstick" and start
> to accept the same rate request again?
>

Exactly

>> 
>> > They're supposed to "just know" and turn off the clk first and then
>> > call clk_set_rate()?
>> 
>> ATM, yes ... if CCF cannot switch to another "unlocked" subtree (the
>> case here)
>> 
>> > Why can't the framework do this all in the clk_set_rate() call?
>> 
>> When there is multiple consumers the behavior would become a bit
>> difficult to predict and drivers may have troubles anticipating that,
>> maybe, the clock is locked.
>
> Fun times!