diff mbox

clk: Propagate prepare and enable when reparenting orphans

Message ID 1415386312-23741-1-git-send-email-dianders@chromium.org (mailing list archive)
State New, archived
Headers show

Commit Message

Douglas Anderson Nov. 7, 2014, 6:51 p.m. UTC
With the existing code, if you find a parent for an orhpan that has
already been prepared / enabled, you won't enable the parent.  That
can cause later problems since the clock tree isn't in a consistent
state.  Fix by propagating the prepare and enable.

NOTE: this does bring up the question about whether the enable of the
orphan actually made sense.  If the orphan's parent wasn't enabled by
default (by the bootloader or the default state of the hardware) then
the original enable of the orphan probably didn't do what the caller
though it would.  Some users of the orphan might have preferred an
EPROBE_DEFER be returned until we had a full path to a root clock.
This patch doesn't address those concerns and really just syncs up the
state.

Tested on rk3288-evb-rk808 by temporarily considering "sclk_tsadc" as
a critical clock (to simulate a driver enabling it at bootup).

Before:

   clock                         enable_cnt  prepare_cnt        rate   accuracy   phase
----------------------------------------------------------------------------------------
 xin32k                                   0            0       32768          0 0
    sclk_hdmi_cec                         0            0       32768          0 0
    sclk_otg_adp                          0            0       32768          0 0
    sclk_tsadc                            1            1         993          0 0

After:

   clock                         enable_cnt  prepare_cnt        rate   accuracy   phase
----------------------------------------------------------------------------------------
 xin32k                                   1            1       32768          0 0
    sclk_hdmi_cec                         0            0       32768          0 0
    sclk_otg_adp                          0            0       32768          0 0
    sclk_tsadc                            1            1         993          0 0

Note that xin32k on rk808 is a clock that cannot be disabled in
hardware (it's an always on clock), so really all we needed to do was
to sync up the state.

Signed-off-by: Doug Anderson <dianders@chromium.org>
---
 drivers/clk/clk.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

Comments

Russell King - ARM Linux Nov. 7, 2014, 6:58 p.m. UTC | #1
On Fri, Nov 07, 2014 at 10:51:52AM -0800, Doug Anderson wrote:
> diff --git a/drivers/clk/clk.c b/drivers/clk/clk.c
> index 4896ae9..a3d3d58 100644
> --- a/drivers/clk/clk.c
> +++ b/drivers/clk/clk.c
> @@ -1650,6 +1650,17 @@ void __clk_reparent(struct clk *clk, struct clk *new_parent)
>  	clk_reparent(clk, new_parent);
>  	__clk_recalc_accuracies(clk);
>  	__clk_recalc_rates(clk, POST_RATE_CHANGE);
> +
> +	if (clk->prepare_count) {
> +		unsigned long flags;
> +
> +		__clk_prepare(new_parent);
> +
> +		flags = clk_enable_lock();
> +		if (clk->enable_count)
> +			__clk_enable(new_parent);
> +		clk_enable_unlock(flags);
> +	}

I really don't understand why this isn't already done - I said this was
necessary a /long/ time ago.

However, the above isn't sufficient.  Think about the old parent - this
should be disabled and unprepared if it was prepared and enabled by the
child.
Douglas Anderson Nov. 7, 2014, 10:52 p.m. UTC | #2
Russell,

On Fri, Nov 7, 2014 at 10:58 AM, Russell King - ARM Linux
<linux@arm.linux.org.uk> wrote:
> On Fri, Nov 07, 2014 at 10:51:52AM -0800, Doug Anderson wrote:
>> diff --git a/drivers/clk/clk.c b/drivers/clk/clk.c
>> index 4896ae9..a3d3d58 100644
>> --- a/drivers/clk/clk.c
>> +++ b/drivers/clk/clk.c
>> @@ -1650,6 +1650,17 @@ void __clk_reparent(struct clk *clk, struct clk *new_parent)
>>       clk_reparent(clk, new_parent);
>>       __clk_recalc_accuracies(clk);
>>       __clk_recalc_rates(clk, POST_RATE_CHANGE);
>> +
>> +     if (clk->prepare_count) {
>> +             unsigned long flags;
>> +
>> +             __clk_prepare(new_parent);
>> +
>> +             flags = clk_enable_lock();
>> +             if (clk->enable_count)
>> +                     __clk_enable(new_parent);
>> +             clk_enable_unlock(flags);
>> +     }
>
> I really don't understand why this isn't already done - I said this was
> necessary a /long/ time ago.
>
> However, the above isn't sufficient.  Think about the old parent - this
> should be disabled and unprepared if it was prepared and enabled by the
> child.

You may be referring of a different bug than I am addressing.  I can
think about the old parent, but it always a tear to my eyes since
these clocks are orphans and had no old parents (unless you count the
matron at the orphanage, but I doubt she was either prepared or
enabled).  ;)

Ah, but I see!  There are other users of this function that are not
part of "clk.c".  Doh!  Since this was a "__" function with no
documentation I assumed it was "static", but I see that it is not.  I
see two callers that are not part of the orphan code.

I'll happily move this code down so it's only called by the orphan
code and not touch the two callers of __clk_reparent(), assuming that
they don't need to deal with this scenario.


NOTE: As far as I can tell, the standard exposed API call is
clk_set_parent().  From reading comments that does move the prepared /
enabled state, but I haven't confirmed that functionality.

-Doug
Russell King - ARM Linux Nov. 7, 2014, 11:36 p.m. UTC | #3
On Fri, Nov 07, 2014 at 02:52:38PM -0800, Doug Anderson wrote:
> Russell,
> 
> On Fri, Nov 7, 2014 at 10:58 AM, Russell King - ARM Linux
> <linux@arm.linux.org.uk> wrote:
> > On Fri, Nov 07, 2014 at 10:51:52AM -0800, Doug Anderson wrote:
> >> diff --git a/drivers/clk/clk.c b/drivers/clk/clk.c
> >> index 4896ae9..a3d3d58 100644
> >> --- a/drivers/clk/clk.c
> >> +++ b/drivers/clk/clk.c
> >> @@ -1650,6 +1650,17 @@ void __clk_reparent(struct clk *clk, struct clk *new_parent)
> >>       clk_reparent(clk, new_parent);
> >>       __clk_recalc_accuracies(clk);
> >>       __clk_recalc_rates(clk, POST_RATE_CHANGE);
> >> +
> >> +     if (clk->prepare_count) {
> >> +             unsigned long flags;
> >> +
> >> +             __clk_prepare(new_parent);
> >> +
> >> +             flags = clk_enable_lock();
> >> +             if (clk->enable_count)
> >> +                     __clk_enable(new_parent);
> >> +             clk_enable_unlock(flags);
> >> +     }
> >
> > I really don't understand why this isn't already done - I said this was
> > necessary a /long/ time ago.
> >
> > However, the above isn't sufficient.  Think about the old parent - this
> > should be disabled and unprepared if it was prepared and enabled by the
> > child.
> 
> You may be referring of a different bug than I am addressing.  I can
> think about the old parent, but it always a tear to my eyes since
> these clocks are orphans and had no old parents (unless you count the
> matron at the orphanage, but I doubt she was either prepared or
> enabled).  ;)
> 
> Ah, but I see!  There are other users of this function that are not
> part of "clk.c".  Doh!  Since this was a "__" function with no
> documentation I assumed it was "static", but I see that it is not.  I
> see two callers that are not part of the orphan code.
> 
> I'll happily move this code down so it's only called by the orphan
> code and not touch the two callers of __clk_reparent(), assuming that
> they don't need to deal with this scenario.

What I am saying is as follows.  Take this diagram - a mux.  clkc can
be sourced from either clkp1 or clkp2.  Initially, it is set to clkp1:

clkp1 -----o
            \
             o--------> clkc

clkp2 -----o

Let's assume that none of these clocks are requested, prepared or
enabled.

Now, if clkc is requested, and then prepared, clkp1 will be prepared,
but not clkp2.

When clkc is re-parented to clkp2 in this state, there are three things
which must happen:

1. clkp2 needs to be prepared.
2. clkc needs to be switched from clkp1 to clkp2.
3. clkp1 needs to be unprepared.

(the order is debatable.)

The reason for step 3 is because of what happens if we unprepare clkc,
or switch back to clkp1.

If we unprepare clkc, we _should_ end up with clkp1, clkp2 and clkc
_all_ back in their initial states - in other words, all unprepared.
clkp1 should not be left prepared by this sequence of events.

If we switch back to clkp1, then the same three things need to happen
(just with the appropriate parent clocks):

1. clkp1 needs to be prepared.
2. clkc needs to be switched from clkp2 to clkp1.
3. clkp2 needs to be unprepared.

And, having done that, we can see that we are in exactly the same state
as we were when we first prepared clkc in the beginning.

If we omit the unprepare stage, then at this point, we will have prepared
clkp1 _twice_ and clkp2 _once_, which means when clkc is unprepared, both
clkp1 and clkp2 are left with a preparation count of one - which is
effectively a refcount leak.

Fixing the lack of prepare may fix the "clock not running" problem, but
without addressing the unprepare side, you are introducing a new bug
while fixing an existing bug.  Both issues need to be resolved together.
Douglas Anderson Nov. 8, 2014, 12:14 a.m. UTC | #4
Russell,

On Fri, Nov 7, 2014 at 3:36 PM, Russell King - ARM Linux
<linux@arm.linux.org.uk> wrote:
> What I am saying is as follows.  Take this diagram - a mux.  clkc can
> be sourced from either clkp1 or clkp2.  Initially, it is set to clkp1:
>
> clkp1 -----o
>             \
>              o--------> clkc
>
> clkp2 -----o

OK.  This isn't my case at all.  In my case the clock being parented
is an orphan.  By definition it had no previous parent.  ...but let's
think about your scenario too.


> Let's assume that none of these clocks are requested, prepared or
> enabled.
>
> Now, if clkc is requested, and then prepared, clkp1 will be prepared,
> but not clkp2.
>
> When clkc is re-parented to clkp2 in this state, there are three things
> which must happen:
>
> 1. clkp2 needs to be prepared.
> 2. clkc needs to be switched from clkp1 to clkp2.
> 3. clkp1 needs to be unprepared.
>
> (the order is debatable.)
>
> The reason for step 3 is because of what happens if we unprepare clkc,
> or switch back to clkp1.
>
> If we unprepare clkc, we _should_ end up with clkp1, clkp2 and clkc
> _all_ back in their initial states - in other words, all unprepared.
> clkp1 should not be left prepared by this sequence of events.
>
> If we switch back to clkp1, then the same three things need to happen
> (just with the appropriate parent clocks):
>
> 1. clkp1 needs to be prepared.
> 2. clkc needs to be switched from clkp2 to clkp1.
> 3. clkp2 needs to be unprepared.
>
> And, having done that, we can see that we are in exactly the same state
> as we were when we first prepared clkc in the beginning.
>
> If we omit the unprepare stage, then at this point, we will have prepared
> clkp1 _twice_ and clkp2 _once_, which means when clkc is unprepared, both
> clkp1 and clkp2 are left with a preparation count of one - which is
> effectively a refcount leak.

All of the above is clear and matches my understanding of how
clk_set_parent() works.  You don't think it does?  ...or are you
talking about some other API call?

I see:

clk_set_parent()
-> __clk_set_parent()
----> __clk_set_parent_before()
------> prepare new parent
------> enable new parent
------> enable clk
------> actually do the reparent in CCF
----> call clk->ops->set_parent()
----> clk_set_parent_after()
------> disable clk
------> disable old parent
------> unprepare old parent

clk_set_parent() is documented to temporarily enable clk during its operation.


> Fixing the lack of prepare may fix the "clock not running" problem, but
> without addressing the unprepare side, you are introducing a new bug
> while fixing an existing bug.  Both issues need to be resolved together.

I guess I'm still confused.  My patch continues to be about orphans
and I don't see the bug you are pointing to.

-Doug
Russell King - ARM Linux Nov. 8, 2014, 12:23 a.m. UTC | #5
On Fri, Nov 07, 2014 at 04:14:23PM -0800, Doug Anderson wrote:
> Russell,
> I guess I'm still confused.  My patch continues to be about orphans
> and I don't see the bug you are pointing to.

Ah, in which case, the question changes: how can an orphaned clock be
succesfully prepared and enabled?

Drivers expect that a clock for which clk_enable() has returned
successfully _will_ at that point be supplying the clock.  If we don't
yet know it's parent, how do we know that it will be supplying that
clock?

What about a driver calling clk_set_rate() on an orphaned clock?

From what I can see (__clk_reparent will re-set the child's clock when
reparenting) having a driver able to claim an orphaned clock, let
alone prepare and enable it, looks rather buggy to me.
Douglas Anderson Nov. 8, 2014, 12:46 a.m. UTC | #6
Russell,

On Fri, Nov 7, 2014 at 4:23 PM, Russell King - ARM Linux
<linux@arm.linux.org.uk> wrote:
> On Fri, Nov 07, 2014 at 04:14:23PM -0800, Doug Anderson wrote:
>> Russell,
>> I guess I'm still confused.  My patch continues to be about orphans
>> and I don't see the bug you are pointing to.
>
> Ah, in which case, the question changes: how can an orphaned clock be
> succesfully prepared and enabled?
>
> Drivers expect that a clock for which clk_enable() has returned
> successfully _will_ at that point be supplying the clock.  If we don't
> yet know it's parent, how do we know that it will be supplying that
> clock?
>
> What about a driver calling clk_set_rate() on an orphaned clock?
>
> From what I can see (__clk_reparent will re-set the child's clock when
> reparenting) having a driver able to claim an orphaned clock, let
> alone prepare and enable it, looks rather buggy to me.

Yes, it is pretty questionable.  I discussed some of this in my
comment message in this patch.  Specifically, I said:

> NOTE: this does bring up the question about whether the enable of the
> orphan actually made sense.  If the orphan's parent wasn't enabled by
> default (by the bootloader or the default state of the hardware) then
> the original enable of the orphan probably didn't do what the caller
> though it would.  Some users of the orphan might have preferred an
> EPROBE_DEFER be returned until we had a full path to a root clock.
> This patch doesn't address those concerns and really just syncs up the
> state.

I'm not sure I want to go all the way doing the above and adding the
EPROBE_DEFER because I think that there are probably lots of users out
there that are assuming that they can enable/disable an orphaned clock
and I can't myself commit to fixing all of them.  If you want to
propose such a patch and can get it landed then my patch would
certainly not be necessarily.

Also see the note in the original commit message:

> Note that xin32k on rk808 is a clock that cannot be disabled in
> hardware (it's an always on clock), so really all we needed to do was
> to sync up the state.

-Doug
diff mbox

Patch

diff --git a/drivers/clk/clk.c b/drivers/clk/clk.c
index 4896ae9..a3d3d58 100644
--- a/drivers/clk/clk.c
+++ b/drivers/clk/clk.c
@@ -1650,6 +1650,17 @@  void __clk_reparent(struct clk *clk, struct clk *new_parent)
 	clk_reparent(clk, new_parent);
 	__clk_recalc_accuracies(clk);
 	__clk_recalc_rates(clk, POST_RATE_CHANGE);
+
+	if (clk->prepare_count) {
+		unsigned long flags;
+
+		__clk_prepare(new_parent);
+
+		flags = clk_enable_lock();
+		if (clk->enable_count)
+			__clk_enable(new_parent);
+		clk_enable_unlock(flags);
+	}
 }
 
 /**