[v3,05/10] clk: add support for clock protection

Message ID	20170612194438.12298-6-jbrunet@baylibre.com (mailing list archive)
State	Not Applicable
Headers	show Return-Path: <linux-amlogic-bounces+patchwork-linux-amlogic=patchwork.kernel.org@lists.infradead.org> From: Jerome Brunet <jbrunet@baylibre.com> To: Michael Turquette <mturquette@baylibre.com>, Stephen Boyd <sboyd@codeaurora.org>, linux-clk@vger.kernel.org Subject: [PATCH v3 05/10] clk: add support for clock protection Date: Mon, 12 Jun 2017 21:44:33 +0200 Message-Id: <20170612194438.12298-6-jbrunet@baylibre.com> In-Reply-To: <20170612194438.12298-1-jbrunet@baylibre.com> References: <20170612194438.12298-1-jbrunet@baylibre.com> Precedence: list Cc: Boris Brezillon <boris.brezillon@free-electrons.com>, Kevin Hilman <khilman@baylibre.com>, Linus Walleij <linus.walleij@linaro.org>, Russell King <linux@armlinux.org.uk>, linux-amlogic@lists.infradead.org, Jerome Brunet <jbrunet@baylibre.com> MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Sender: "linux-amlogic" <linux-amlogic-bounces@lists.infradead.org> Errors-To: linux-amlogic-bounces+patchwork-linux-amlogic=patchwork.kernel.org@lists.infradead.org

Jerome Brunet June 12, 2017, 7:44 p.m. UTC

The patch adds clk_protect and clk_unprotect to the CCF API. These
functions allow a consumer to inform the system that the rate of clock is
critical to for its operations and it can't tolerate other consumers
changing the rate or introducing glitches while the clock is protected.

Signed-off-by: Jerome Brunet <jbrunet@baylibre.com>
---
 drivers/clk/clk.c            | 205 +++++++++++++++++++++++++++++++++++++++++--
 include/linux/clk-provider.h |   1 +
 include/linux/clk.h          |  29 ++++++
 3 files changed, 229 insertions(+), 6 deletions(-)

Stephen Boyd July 26, 2017, 12:12 a.m. UTC | #1

On 06/12, Jerome Brunet wrote:
> The patch adds clk_protect and clk_unprotect to the CCF API. These
> functions allow a consumer to inform the system that the rate of clock is
> critical to for its operations and it can't tolerate other consumers

s/for//

> changing the rate or introducing glitches while the clock is protected.

> diff --git a/drivers/clk/clk.c b/drivers/clk/clk.c
> index 163cb9832f10..d688b8f59a59 100644
> --- a/drivers/clk/clk.c
> +++ b/drivers/clk/clk.c
> +
> +/**
> + * clk_rate_unprotect - unprotect the rate of a clock source
> + * @clk: the clk being unprotected
> + *
> + * clk_unprotect completes a critical section during which the clock
> + * consumer cannot tolerate any change to the clock rate. If no other clock
> + * consumers have protected clocks in the parent chain, then calls to this
> + * function will allow the clocks in the parent chain to change rates
> + * freely.
> + *
> + * Unlike the clk_set_rate_range method, which allows the rate to change
> + * within a given range, protected clocks cannot have their rate changed,
> + * either directly or indirectly due to changes further up the parent chain
> + * of clocks.
> + *
> + * Calls to clk_unprotect must be balanced with calls to clk_protect. Calls
> + * to this function may sleep, and do not return error status.
> + */
> +void clk_rate_unprotect(struct clk *clk)
> +{
> +	if (!clk)
> +		return;
> +
> +	clk_prepare_lock();
> +
> +	/*
> +	 * if there is something wrong with this consumer protect count, stop
> +	 * here before messing with the provider
> +	 */
> +	if (WARN_ON(clk->protect_count <= 0))
> +		goto out;
> +
> +	clk_core_rate_unprotect(clk->core);

Can we make this stuff non-recursive? I know that this is
basically a copy paste of prepare/unprepare code and recursion is
nice and elegant, but we really don't need to do it when we could
have a loop that's the same and doesn't blow up our stack frame
usage. I'll send a patch for prepare/enable so you get the idea.

> +	clk->protect_count--;
> +out:
> +	clk_prepare_unlock();
> +}
> +EXPORT_SYMBOL_GPL(clk_rate_unprotect);
[..]
> +
> @@ -2952,6 +3134,17 @@ void __clk_put(struct clk *clk)
>  
>  	clk_prepare_lock();
>  
> +	/*
> +	 * Before calling clk_put, all calls to clk_rate_protect from a given
> +	 * user must be balanced with calls to clk_rate_unprotect and by that
> +	 * same user
> +	 */
> +	WARN_ON(clk->protect_count);
> +
> +	/* We voiced our concern, let's sanitize the situation */
> +	for (; clk->protect_count; clk->protect_count--)
> +		clk_core_rate_unprotect(clk->core);

Does this do anything different than:

	clk->core->protect_count -= clk->protect_count;
	clk->protect_count = 1;
	clk_core_rate_unprotect(clk->core);

Just seems better to not do a loop here.

> diff --git a/include/linux/clk.h b/include/linux/clk.h
> index 91bd464f4c9b..b60c36f2e6b0 100644
> --- a/include/linux/clk.h
> +++ b/include/linux/clk.h
> @@ -331,6 +331,30 @@ struct clk *devm_clk_get(struct device *dev, const char *id);
>   */
>  struct clk *devm_get_clk_from_child(struct device *dev,
>  				    struct device_node *np, const char *con_id);
> +/**
> + * clk_rate_protect - inform the system when the clock rate must be protected.
> + * @clk: clock source
> + *
> + * This function informs the system that the consumer protecting the clock
> + * depends on the rate of the clock source and can't tolerate any glitches
> + * introduced by further clock rate change or re-parenting of the clock source.
> + *
> + * Must not be called from within atomic context.
> + */
> +void clk_rate_protect(struct clk *clk);

Is there any plan to use this clk_rate_protect() API? It seems
inherently racy for a clk consumer to call clk_set_rate() and
then this clk_rate_protect() API after that to lock the rate in.
How about we leave this out of the consumer API until a user
needs it?

I'm tempted to say that we could do this rate locking stuff with
clk_set_rate_range(), but with more thought that doesn't seem
possible because there's a subtle difference. The range API is
willing to accept a range of frequencies, and calling
clk_set_rate_range() with some exact frequency should fail if
that exact frequency can't be met. With this API and the
subsequent clk_set_rate_protect() API we're willing to accept
that the rate we call clk_set_rate_protect() with could be
different than the rate we actually get.

Finally, When does a consumer want the rate of a clk to change
after they call clk_set_rate() on it? I would guess that very few
consumers would be willing to accept that. Which begs the
question, if anyone will keep calling clk_set_rate() after this
API (and the clk_set_rate_protect() API) is added. It almost
seems like we would want it to be opt-out, instead of opt-in, so
that consumers would call clk_set_rate() and expect it to be a
stable clk rate after that, and they would call
clk_set_rate_trample_on_me() or something properly named when
they don't care what the rate is after they call the API.

> +
> +/**
> + * clk_rate_unprotect - release the protection of the clock source.
> + * @clk: clock source
> + *
> + * This function informs the system that the consumer previously protecting the
> + * clock rate can now deal with other consumer altering the clock source rate

other consumers

> + *
> + * The caller must balance the number of rate_protect and rate_unprotect calls.

Please say clk_rate_protect() and clk_rate_unprotect() here.

Jerome Brunet July 26, 2017, 5:18 p.m. UTC | #2

On Tue, 2017-07-25 at 17:12 -0700, Stephen Boyd wrote:
> On 06/12, Jerome Brunet wrote:
> > The patch adds clk_protect and clk_unprotect to the CCF API. These
> > functions allow a consumer to inform the system that the rate of clock is
> > critical to for its operations and it can't tolerate other consumers
> 
> s/for//
> 
> > changing the rate or introducing glitches while the clock is protected.
> 
> 
> > diff --git a/drivers/clk/clk.c b/drivers/clk/clk.c
> > index 163cb9832f10..d688b8f59a59 100644
> > --- a/drivers/clk/clk.c
> > +++ b/drivers/clk/clk.c
> > +
> > +/**
> > + * clk_rate_unprotect - unprotect the rate of a clock source
> > + * @clk: the clk being unprotected
> > + *
> > + * clk_unprotect completes a critical section during which the clock
> > + * consumer cannot tolerate any change to the clock rate. If no other clock
> > + * consumers have protected clocks in the parent chain, then calls to this
> > + * function will allow the clocks in the parent chain to change rates
> > + * freely.
> > + *
> > + * Unlike the clk_set_rate_range method, which allows the rate to change
> > + * within a given range, protected clocks cannot have their rate changed,
> > + * either directly or indirectly due to changes further up the parent chain
> > + * of clocks.
> > + *
> > + * Calls to clk_unprotect must be balanced with calls to clk_protect. Calls
> > + * to this function may sleep, and do not return error status.
> > + */
> > +void clk_rate_unprotect(struct clk *clk)
> > +{
> > +	if (!clk)
> > +		return;
> > +
> > +	clk_prepare_lock();
> > +
> > +	/*
> > +	 * if there is something wrong with this consumer protect count,
> > stop
> > +	 * here before messing with the provider
> > +	 */
> > +	if (WARN_ON(clk->protect_count <= 0))
> > +		goto out;
> > +
> > +	clk_core_rate_unprotect(clk->core);
> 
> Can we make this stuff non-recursive? I know that this is
> basically a copy paste of prepare/unprepare code and recursion is
> nice and elegant, but we really don't need to do it when we could
> have a loop that's the same and doesn't blow up our stack frame
> usage. I'll send a patch for prepare/enable so you get the idea.

I think we should not be doing the same thing differently inside the framework.
If enable and disable are recursive, protect should be too. The stack can handle
enable and prepare, why would it not handle protect ?

Of course, If your patch to "stack-ify" enable and prepare merges first, I'll
copy the mechanism. The important thing for me is keep things consistent.

On a more general note, your idea is no light change ...
Are the trees really that tall that it is becoming a problem ?
I see in your patch comment that there is still a few question to be answered,
right ?

Could we take this one step at time and keep the recursion for now?
When the whole CCF moves to a stack approach, I'll be happy to help.

> 
> > +	clk->protect_count--;
> > +out:
> > +	clk_prepare_unlock();
> > +}
> > +EXPORT_SYMBOL_GPL(clk_rate_unprotect);
> 
> [..]
> > +
> > @@ -2952,6 +3134,17 @@ void __clk_put(struct clk *clk)
> >  
> >  	clk_prepare_lock();
> >  
> > +	/*
> > +	 * Before calling clk_put, all calls to clk_rate_protect from a
> > given
> > +	 * user must be balanced with calls to clk_rate_unprotect and by
> > that
> > +	 * same user
> > +	 */
> > +	WARN_ON(clk->protect_count);
> > +
> > +	/* We voiced our concern, let's sanitize the situation */
> > +	for (; clk->protect_count; clk->protect_count--)
> > +		clk_core_rate_unprotect(clk->core);
> 
> Does this do anything different than:
> 
> 	clk->core->protect_count -= clk->protect_count;
> 	clk->protect_count = 1;

this looks really hacky !

> 	clk_core_rate_unprotect(clk->core);
> 
> Just seems better to not do a loop here.

A loop a easy to understand.
This code should actually never run, so readability looks more important to me
than optimisation.
If you really insist on this, I'll yield but I'm not a big fan.

> 
> > diff --git a/include/linux/clk.h b/include/linux/clk.h
> > index 91bd464f4c9b..b60c36f2e6b0 100644
> > --- a/include/linux/clk.h
> > +++ b/include/linux/clk.h
> > @@ -331,6 +331,30 @@ struct clk *devm_clk_get(struct device *dev, const char
> > *id);
> >   */
> >  struct clk *devm_get_clk_from_child(struct device *dev,
> >  				    struct device_node *np, const char
> > *con_id);
> > +/**
> > + * clk_rate_protect - inform the system when the clock rate must be
> > protected.
> > + * @clk: clock source
> > + *
> > + * This function informs the system that the consumer protecting the clock
> > + * depends on the rate of the clock source and can't tolerate any glitches
> > + * introduced by further clock rate change or re-parenting of the clock
> > source.
> > + *
> > + * Must not be called from within atomic context.
> > + */
> > +void clk_rate_protect(struct clk *clk);
> 
> Is there any plan to use this clk_rate_protect() API? It seems
> inherently racy for a clk consumer to call clk_set_rate() and
> then this clk_rate_protect() API after that to lock the rate in.
> How about we leave this out of the consumer API until a user
> needs it?

Having this API available is whole reason I've been working on this for so long.
By the now, you may have forgot but I explained the use-case in first RFC [0]
Here is an example (wip) of usage [1]

[0]: http://lkml.kernel.org/r/20170302173835.18313-1-jbrunet@baylibre.co
[1]: https://github.com/jeromebrunet/linux/commits/amlogic/wip/audio-clk-lock

> 
> I'm tempted to say that we could do this rate locking stuff with
> clk_set_rate_range(), but with more thought that doesn't seem
> possible because there's a subtle difference. The range API is
> willing to accept a range of frequencies, and calling
> clk_set_rate_range() with some exact frequency should fail if
> that exact frequency can't be met. With this API and the
> subsequent clk_set_rate_protect() API we're willing to accept
> that the rate we call clk_set_rate_protect() with could be
> different than the rate we actually get.

Indeed and this point is important for PLLs which (almost) certainly never meet
the exact requested rate.

Also rate_range would give no guarantee regarding gliches and reparenting.
Keep audio in mind here, glitches are the enemy.

Say you have 2 clocks fed by a divider (SET_RATE_PARENT of course)
# clock #1 is a PLL locked to 48000kHz
# clock #2 request a new rate

If the clock #1 is protected by rate_range mechanism, the divider could change,
provided that PLL is able to relock at 48000kHz ... but we'll get a glitch

This won't happen with clock protect, clock #2 will have to work with what the
divider provides at the moment, or reparent to better suited (free) parent
clock.

> 
> Finally, When does a consumer want the rate of a clk to change
> after they call clk_set_rate() on it? I would guess that very few
> consumers would be willing to accept that. Which begs the
> question, if anyone will keep calling clk_set_rate() after this
> API (and the clk_set_rate_protect() API) is added. It almost
> seems like we would want it to be opt-out, instead of opt-in, so
> that consumers would call clk_set_rate() and expect it to be a
> stable clk rate after that, and they would call
> clk_set_rate_trample_on_me() or something properly named when
> they don't care what the rate is after they call the API.
> 

Indeed, we generally don't want our rate to change, but:
- This is mostly for leaf clocks, the internal path would generally not care, as
long as the leaf are happy.
- Even a leaf may be open (be able to deal with) to small glitches, pll relock,
re parenting

Actually, if all the clock could not tolerate any glitches, there would have
been a lot of complains about CCF by now, it does not prevent glitches at all.

If you go over the initial RFC, the point is also for the CCF to favor other
(unused parent) when several (sometimes with the same capabilities) are
available. I think this is also a fairly common use case. That's something
rate_range won't do either, as far as I understood.

Last but not least, it allows consumer to set the rate in a sort of critical
section and have the guarantee that nobody will be able to change the rate
between the clk_set_rate() call and prepare_enable(). That's something we don't
have at the moment.

So I think this API bring quite a few things

> > +
> > +/**
> > + * clk_rate_unprotect - release the protection of the clock source.
> > + * @clk: clock source
> > + *
> > + * This function informs the system that the consumer previously protecting
> > the
> > + * clock rate can now deal with other consumer altering the clock source
> > rate
> 
> other consumers
> 
> > + *
> > + * The caller must balance the number of rate_protect and rate_unprotect
> > calls.
> 
> Please say clk_rate_protect() and clk_rate_unprotect() here.
>

Stephen Boyd Aug. 4, 2017, 12:18 a.m. UTC | #3

On 07/26, Jerome Brunet wrote:
> On Tue, 2017-07-25 at 17:12 -0700, Stephen Boyd wrote:
> > On 06/12, Jerome Brunet wrote:
> > > +	if (WARN_ON(clk->protect_count <= 0))
> > > +		goto out;
> > > +
> > > +	clk_core_rate_unprotect(clk->core);
> > 
> > Can we make this stuff non-recursive? I know that this is
> > basically a copy paste of prepare/unprepare code and recursion is
> > nice and elegant, but we really don't need to do it when we could
> > have a loop that's the same and doesn't blow up our stack frame
> > usage. I'll send a patch for prepare/enable so you get the idea.
> 
> I think we should not be doing the same thing differently inside the framework.
> If enable and disable are recursive, protect should be too. The stack can handle
> enable and prepare, why would it not handle protect ?
> 
> Of course, If your patch to "stack-ify" enable and prepare merges first, I'll
> copy the mechanism. The important thing for me is keep things consistent.
> 
> On a more general note, your idea is no light change ...
> Are the trees really that tall that it is becoming a problem ?
> I see in your patch comment that there is still a few question to be answered,
> right ?

I see it more of a problem when set_rate in the middle of
recursion needs to call enable/prepare and then goes and stacks a
bunch more frames again, and that has already been called deep in
a stack by a driver. I admit I haven't seen a problem, but I'd
rather not find out later that it was a problem.

> 
> Could we take this one step at time and keep the recursion for now?
> When the whole CCF moves to a stack approach, I'll be happy to help.

Sounds fine. We can take this up on another thread.

> 
> > 
> > > +	clk->protect_count--;
> > > +out:
> > > +	clk_prepare_unlock();
> > > +}
> > > +EXPORT_SYMBOL_GPL(clk_rate_unprotect);
> > 
> > [..]
> > > +
> > > @@ -2952,6 +3134,17 @@ void __clk_put(struct clk *clk)
> > >  
> > >  	clk_prepare_lock();
> > >  
> > > +	/*
> > > +	 * Before calling clk_put, all calls to clk_rate_protect from a
> > > given
> > > +	 * user must be balanced with calls to clk_rate_unprotect and by
> > > that
> > > +	 * same user
> > > +	 */
> > > +	WARN_ON(clk->protect_count);
> > > +
> > > +	/* We voiced our concern, let's sanitize the situation */
> > > +	for (; clk->protect_count; clk->protect_count--)
> > > +		clk_core_rate_unprotect(clk->core);
> > 
> > Does this do anything different than:
> > 
> > 	clk->core->protect_count -= clk->protect_count;
> > 	clk->protect_count = 1;
> 
> this looks really hacky !
> 
> > 	clk_core_rate_unprotect(clk->core);
> > 
> > Just seems better to not do a loop here.
> 
> A loop a easy to understand.
> This code should actually never run, so readability looks more important to me
> than optimisation.
> If you really insist on this, I'll yield but I'm not a big fan.

Well I'm suggesting this because the same loop looked to be
repeated in a couple of other places, and in those cases it was
always run, so looping through it all the time to decrement is
sort of odd. Maybe it could be a small function?

	int clk_core_rate_unprotect_force(struct clk_core *core)
	{
		int count = core->protect_count;

		if (count) {
			core->protect_count = 1;
			clk_core_rate_unprotect(core);
		}

		return count;
	}

And then call-site would be

	if (clk_core_rate_unprotect_force(clk->core))
		clk->protect_count = 0;

and the other call-site where we temporarily remove it we can
have unprotect force and then protect_force restore the value
returned.

> 
> > 
> > > diff --git a/include/linux/clk.h b/include/linux/clk.h
> > > index 91bd464f4c9b..b60c36f2e6b0 100644
> > > --- a/include/linux/clk.h
> > > +++ b/include/linux/clk.h
> > > @@ -331,6 +331,30 @@ struct clk *devm_clk_get(struct device *dev, const char
> > > *id);
> > >   */
> > >  struct clk *devm_get_clk_from_child(struct device *dev,
> > >  				    struct device_node *np, const char
> > > *con_id);
> > > +/**
> > > + * clk_rate_protect - inform the system when the clock rate must be
> > > protected.
> > > + * @clk: clock source
> > > + *
> > > + * This function informs the system that the consumer protecting the clock
> > > + * depends on the rate of the clock source and can't tolerate any glitches
> > > + * introduced by further clock rate change or re-parenting of the clock
> > > source.
> > > + *
> > > + * Must not be called from within atomic context.
> > > + */
> > > +void clk_rate_protect(struct clk *clk);
> > 
> > Is there any plan to use this clk_rate_protect() API? It seems
> > inherently racy for a clk consumer to call clk_set_rate() and
> > then this clk_rate_protect() API after that to lock the rate in.
> > How about we leave this out of the consumer API until a user
> > needs it?
> 
> Having this API available is whole reason I've been working on this for so long.
> By the now, you may have forgot but I explained the use-case in first RFC [0]
> Here is an example (wip) of usage [1]
> 
> [0]: http://lkml.kernel.org/r/20170302173835.18313-1-jbrunet@baylibre.com
> [1]: https://github.com/jeromebrunet/linux/commits/amlogic/wip/audio-clk-lock

If we're forgetting why something is introduced then it means the
commit text is missing information. Please clearly describe the
need for the API in the commit text for the patch that introduces
it.

> 
> > 
> > Finally, When does a consumer want the rate of a clk to change
> > after they call clk_set_rate() on it? I would guess that very few
> > consumers would be willing to accept that. Which begs the
> > question, if anyone will keep calling clk_set_rate() after this
> > API (and the clk_set_rate_protect() API) is added. It almost
> > seems like we would want it to be opt-out, instead of opt-in, so
> > that consumers would call clk_set_rate() and expect it to be a
> > stable clk rate after that, and they would call
> > clk_set_rate_trample_on_me() or something properly named when
> > they don't care what the rate is after they call the API.
> > 
> 
> Indeed, we generally don't want our rate to change, but:
> - This is mostly for leaf clocks, the internal path would generally not care, as
> long as the leaf are happy.
> - Even a leaf may be open (be able to deal with) to small glitches, pll relock,
> re parenting
> 
> Actually, if all the clock could not tolerate any glitches, there would have
> been a lot of complains about CCF by now, it does not prevent glitches at all.

Well some devices handle glitches in the hardware, so the details
of glitch free rate changes are hidden from clk consumers, and
the software in general, on those devices.

> 
> If you go over the initial RFC, the point is also for the CCF to favor other
> (unused parent) when several (sometimes with the same capabilities) are
> available. I think this is also a fairly common use case. That's something
> rate_range won't do either, as far as I understood.

Fair enough, but why do we want consumers to need to know that
there are sometimes unused parents that aren't getting chosen for
a particular frequency? I see this as exposing the internals of
the clk tree to consumers when they shouldn't need to care. Of
course, sometimes clk consumers really do care about internals,
for example if some PLL is used for a display controller and it's
also routed out of the chip on the display phy pins to encode
data or something. Then perhaps we really want to use one
particular PLL instead of a generic one that may also be a parent
of the display controller clk. Making sure clk_set_rate() doesn't
trample on clks deep in the tree seems different than this
though.

Going back to your RFC series cover letter though, I see three
PLLs (mppl0,1,2) and two users of the PLLs (amclk and mclk_i958).
Is anything else using these PLLs in the system? Why are we doing
all this stuff instead of hard-coding the parents for these clks
to be different PLLs? If we want it to be flexible we could
assign parents to the cts_amclk_sel and cts_mclk_i958_sel to be
different PLLs in DT via assigned clock parents. Or it could just
be hard-coded in the clk driver during probe.

If there's really sharing going on, and you can't hardcode the
parents, please indicate why that's the case in the commit text.
I don't want to introduce another consumer API just because we
didn't want to restrict the available parents for a couple clks.

> 
> Last but not least, it allows consumer to set the rate in a sort of critical
> section and have the guarantee that nobody will be able to change the rate
> between the clk_set_rate() call and prepare_enable(). That's something we don't
> have at the moment.
> 

Right, but clk_rate_protect() doesn't close the critical section
between clk_set_rate() and clk_rate_protect() if another
consumers changes the frequency, or if that consumer changes the
frequency of some parent of the clk. This is why I'm asking what
the use of this API is for. Shouldn't we want consumers to use
clk_set_rate_protect() so they can be sure they got the rate they
wanted, instead of hoping that something else hasn't come in
between the set_rate and the protect calls and changed the
frequency?

Michael Turquette Aug. 8, 2017, 10:37 p.m. UTC | #4

Hi Stephen,

Quoting Stephen Boyd (2017-08-03 17:18:36)
> On 07/26, Jerome Brunet wrote:
> > > > +void clk_rate_protect(struct clk *clk);
> > > 
> > > Is there any plan to use this clk_rate_protect() API? It seems
> > > inherently racy for a clk consumer to call clk_set_rate() and
> > > then this clk_rate_protect() API after that to lock the rate in.
> > > How about we leave this out of the consumer API until a user
> > > needs it?
> > 
> > Having this API available is whole reason I've been working on this for so long.
> > By the now, you may have forgot but I explained the use-case in first RFC [0]
> > Here is an example (wip) of usage [1]
> > 
> > [0]: http://lkml.kernel.org/r/20170302173835.18313-1-jbrunet@baylibre.com
> > [1]: https://github.com/jeromebrunet/linux/commits/amlogic/wip/audio-clk-lock

Indeed, something like rate protection or "lock rate" has been discussed
since the birth of CCF. I remember a whiteboarding session between you,
Paul W. and myself about it probably in 2012. Peter might have been
there too.

> 
> If we're forgetting why something is introduced then it means the
> commit text is missing information. Please clearly describe the
> need for the API in the commit text for the patch that introduces
> it.

My $0.02 is that the "pick an unused PLL" thing is a benefit of this new
api that is internal to the clock controller driver, and is not the
driving force behind the series. If a simple, easy to understand
justification for this patch series is needed, might I suggest something
like the following for the next commit log/coverletter:

"Some clock consumers require that a clock rate must not deviate from
its selected frequency. There can be several reasons for this, not least
of which is that some hardware may not be able to handle or recover from
a glitch caused by changing the clock rate while the hardware is in
operation. The ability to lock a clock's rate, and release that lock, is
a fundamental clock rate control primitive. It's absence is a bug that
is fixed by this patch series."

That's the short and sweet version. If more verbosity is needed as to
why rate_range doesn't need this, there are some good reasons:

1) simplicity: some consumers don't care about their rate, but do
care that they rate doesn't change. clk_rate_{un}protect is much simpler
than forcing those consumers to call clk_get_rate, then cache that value
for future use and then call clk_set_rate_range.

2) expressiveness / debug: trying to find out why a clock rate is locked
searching through every use of clk_set_rate_range is sort of lame,
especially if variables are used to pass in min/max instead of
hard-coded values. It's way way easier to just grep for
clk_rate_protect.

> 
> > 
> > > 
> > > Finally, When does a consumer want the rate of a clk to change
> > > after they call clk_set_rate() on it? I would guess that very few
> > > consumers would be willing to accept that. Which begs the
> > > question, if anyone will keep calling clk_set_rate() after this
> > > API (and the clk_set_rate_protect() API) is added. It almost
> > > seems like we would want it to be opt-out, instead of opt-in, so
> > > that consumers would call clk_set_rate() and expect it to be a
> > > stable clk rate after that, and they would call
> > > clk_set_rate_trample_on_me() or something properly named when
> > > they don't care what the rate is after they call the API.
> > > 
> > 
> > Indeed, we generally don't want our rate to change, but:
> > - This is mostly for leaf clocks, the internal path would generally not care, as
> > long as the leaf are happy.
> > - Even a leaf may be open (be able to deal with) to small glitches, pll relock,
> > re parenting
> > 
> > Actually, if all the clock could not tolerate any glitches, there would have
> > been a lot of complains about CCF by now, it does not prevent glitches at all.
> 
> Well some devices handle glitches in the hardware, so the details
> of glitch free rate changes are hidden from clk consumers, and
> the software in general, on those devices.

On the hardware that I am familiar with, the problem of glitches lies
not in the clock hardware, but with the downstream peripheral logic / ip
block that consumes that clock signal. So it seems to me that having a
consumer api for locking the rate makes perfect sense.

> 
> > 
> > If you go over the initial RFC, the point is also for the CCF to favor other
> > (unused parent) when several (sometimes with the same capabilities) are
> > available. I think this is also a fairly common use case. That's something
> > rate_range won't do either, as far as I understood.
> 
> Fair enough, but why do we want consumers to need to know that
> there are sometimes unused parents that aren't getting chosen for
> a particular frequency? I see this as exposing the internals of
> the clk tree to consumers when they shouldn't need to care. Of
> course, sometimes clk consumers really do care about internals,

Right, sometimes they do, and we need to strike a balance. I think that
this api has been needed for some time. It very likely could have been
included in the initial version of the CCF that was merged years back if
I hadn't been trying very hard to stick only to the existing clk.h.

clk_set_rate has always been a "last write wins" api, across a shared
resource, and we've always known that this is not a great situation.
This patch series does a good job of solving that issue, in conjunction
with the existing range stuff.

> for example if some PLL is used for a display controller and it's
> also routed out of the chip on the display phy pins to encode
> data or something. Then perhaps we really want to use one
> particular PLL instead of a generic one that may also be a parent
> of the display controller clk. Making sure clk_set_rate() doesn't
> trample on clks deep in the tree seems different than this
> though.
> 
> Going back to your RFC series cover letter though, I see three
> PLLs (mppl0,1,2) and two users of the PLLs (amclk and mclk_i958).
> Is anything else using these PLLs in the system? Why are we doing
> all this stuff instead of hard-coding the parents for these clks
> to be different PLLs? If we want it to be flexible we could
> assign parents to the cts_amclk_sel and cts_mclk_i958_sel to be
> different PLLs in DT via assigned clock parents. Or it could just
> be hard-coded in the clk driver during probe.
> 
> If there's really sharing going on, and you can't hardcode the
> parents, please indicate why that's the case in the commit text.
> I don't want to introduce another consumer API just because we
> didn't want to restrict the available parents for a couple clks.

I think the PLL sharing thing is a big distraction. Giving consumers the
ability to guarantee that their rates are locked in using a simple
critical section call just makes sense to me. If it helps with other
cases then yay.

> 
> > 
> > Last but not least, it allows consumer to set the rate in a sort of critical
> > section and have the guarantee that nobody will be able to change the rate
> > between the clk_set_rate() call and prepare_enable(). That's something we don't
> > have at the moment.
> > 
> 
> Right, but clk_rate_protect() doesn't close the critical section
> between clk_set_rate() and clk_rate_protect() if another
> consumers changes the frequency, or if that consumer changes the
> frequency of some parent of the clk. This is why I'm asking what

Right, clk_set_rate_protect does this in the next patch in this series.

> the use of this API is for. Shouldn't we want consumers to use
> clk_set_rate_protect() so they can be sure they got the rate they
> wanted, instead of hoping that something else hasn't come in
> between the set_rate and the protect calls and changed the
> frequency?

We can remove clk_rate_protect if you really want, but there is always
the case that a consumer driver does not set the rate, but needs to
guarantee that the rate will not change during operation.

Any driver that must both set the rate AND guarantee it will not change
must use clk_set_rate_protect.

Regards,
Mike

> 
> -- 
> Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
> a Linux Foundation Collaborative Project

Stephen Boyd Aug. 9, 2017, 2:19 a.m. UTC | #5

On 08/08, Michael Turquette wrote:
> Hi Stephen,
> 
> Quoting Stephen Boyd (2017-08-03 17:18:36)
> > On 07/26, Jerome Brunet wrote:
> > > > > +void clk_rate_protect(struct clk *clk);
> > > > 
> > > > Is there any plan to use this clk_rate_protect() API? It seems
> > > > inherently racy for a clk consumer to call clk_set_rate() and
> > > > then this clk_rate_protect() API after that to lock the rate in.
> > > > How about we leave this out of the consumer API until a user
> > > > needs it?
> > > 
> > > Having this API available is whole reason I've been working on this for so long.
> > > By the now, you may have forgot but I explained the use-case in first RFC [0]
> > > Here is an example (wip) of usage [1]
> > > 
> > > [0]: http://lkml.kernel.org/r/20170302173835.18313-1-jbrunet@baylibre.com
> > > [1]: https://github.com/jeromebrunet/linux/commits/amlogic/wip/audio-clk-lock
> 
> Indeed, something like rate protection or "lock rate" has been discussed
> since the birth of CCF. I remember a whiteboarding session between you,
> Paul W. and myself about it probably in 2012. Peter might have been
> there too.

I seem to recall this being about something with how OMAP could
change rates only high up in the tree and that would trickle down
to multiple different clks that all had to use that rate and then
divide it themselves to achieve the rate they actually wanted. Is
that right? I also vaguely remember Paul saying that
clk_round_rate() could return something and then clk_set_rate()
after that would fail because what was calculated during the rate
speculation/rounding phase would be different if some other
consumer goes and changes some rate high up in the tree.

> 
> > 
> > If we're forgetting why something is introduced then it means the
> > commit text is missing information. Please clearly describe the
> > need for the API in the commit text for the patch that introduces
> > it.
> 
> My $0.02 is that the "pick an unused PLL" thing is a benefit of this new
> api that is internal to the clock controller driver, and is not the
> driving force behind the series. If a simple, easy to understand
> justification for this patch series is needed, might I suggest something
> like the following for the next commit log/coverletter:
> 
> "Some clock consumers require that a clock rate must not deviate from
> its selected frequency. There can be several reasons for this, not least
> of which is that some hardware may not be able to handle or recover from
> a glitch caused by changing the clock rate while the hardware is in
> operation. The ability to lock a clock's rate, and release that lock, is
> a fundamental clock rate control primitive. It's absence is a bug that
> is fixed by this patch series."

This sounds like we're fixing the CLK_SET_RATE_GATE flag. I'd
prefer we also describe how adding that flag to some clk doesn't
solve the problem. I'm all more commit text when introducing this
API. Please don't go super verbose, but at least describe the
real problem.

> 
> That's the short and sweet version. If more verbosity is needed as to
> why rate_range doesn't need this, there are some good reasons:

I'm not really concerned about rate range handling this problem
here. Let's try to untangle the two in this discussion please.

> 
> > 
> > > 
> > > If you go over the initial RFC, the point is also for the CCF to favor other
> > > (unused parent) when several (sometimes with the same capabilities) are
> > > available. I think this is also a fairly common use case. That's something
> > > rate_range won't do either, as far as I understood.
> > 
> > Fair enough, but why do we want consumers to need to know that
> > there are sometimes unused parents that aren't getting chosen for
> > a particular frequency? I see this as exposing the internals of
> > the clk tree to consumers when they shouldn't need to care. Of
> > course, sometimes clk consumers really do care about internals,
> 
> Right, sometimes they do, and we need to strike a balance. I think that
> this api has been needed for some time. It very likely could have been
> included in the initial version of the CCF that was merged years back if
> I hadn't been trying very hard to stick only to the existing clk.h.
> 
> clk_set_rate has always been a "last write wins" api, across a shared
> resource, and we've always known that this is not a great situation.
> This patch series does a good job of solving that issue, in conjunction
> with the existing range stuff.
> 
> > for example if some PLL is used for a display controller and it's
> > also routed out of the chip on the display phy pins to encode
> > data or something. Then perhaps we really want to use one
> > particular PLL instead of a generic one that may also be a parent
> > of the display controller clk. Making sure clk_set_rate() doesn't
> > trample on clks deep in the tree seems different than this
> > though.
> > 
> > Going back to your RFC series cover letter though, I see three
> > PLLs (mppl0,1,2) and two users of the PLLs (amclk and mclk_i958).
> > Is anything else using these PLLs in the system? Why are we doing
> > all this stuff instead of hard-coding the parents for these clks
> > to be different PLLs? If we want it to be flexible we could
> > assign parents to the cts_amclk_sel and cts_mclk_i958_sel to be
> > different PLLs in DT via assigned clock parents. Or it could just
> > be hard-coded in the clk driver during probe.
> > 
> > If there's really sharing going on, and you can't hardcode the
> > parents, please indicate why that's the case in the commit text.
> > I don't want to introduce another consumer API just because we
> > didn't want to restrict the available parents for a couple clks.
> 
> I think the PLL sharing thing is a big distraction. Giving consumers the
> ability to guarantee that their rates are locked in using a simple
> critical section call just makes sense to me. If it helps with other
> cases then yay.

Ok. Can we get details on where we can't fix it in the provider
layer? I'm fine to ignore the PLL sharing thing, if we can
clearly describe a real situation where a clk consumer needs to
lock in a rate, and that isn't the only consumer of a clk that is
changing that rate or causing a glitch by changing some other
rate up the tree. So far we haven't come across this case,
presumably because clk providers are working around the problem
by hardcoding parents that can change rate during
determine_rate(), or blocking rate changes going up the tree for
certain clks, or the hardware isn't designed in a way that
something is shared while two devices are active, or maybe nobody
has complained (likely!).

I'm all for consumers locking in rates in a critical section,
just that I haven't seen a need for it so far. I've mostly seen
that hardware doesn't design it this way, or that there's some
sort of hardware plan to make sure that desired frequencies can
be achieved when there's some shared resource. And the glitching
thing hasn't been a problem because consumers are the only ones
causing the glitch when they call clk_set_rate() and so far they
have only called clk_set_rate() when their hardware is able to
handle the glitch that comes down (e.g. audio turns off speaker
output, or device is held in reset).

TL;DR: Please distract me from the PLL sharing thing with the
real problem that's being solved.

> 
> > 
> > > 
> > > Last but not least, it allows consumer to set the rate in a sort of critical
> > > section and have the guarantee that nobody will be able to change the rate
> > > between the clk_set_rate() call and prepare_enable(). That's something we don't
> > > have at the moment.
> > > 
> > 
> > Right, but clk_rate_protect() doesn't close the critical section
> > between clk_set_rate() and clk_rate_protect() if another
> > consumers changes the frequency, or if that consumer changes the
> > frequency of some parent of the clk. This is why I'm asking what
> 
> Right, clk_set_rate_protect does this in the next patch in this series.
> 
> > the use of this API is for. Shouldn't we want consumers to use
> > clk_set_rate_protect() so they can be sure they got the rate they
> > wanted, instead of hoping that something else hasn't come in
> > between the set_rate and the protect calls and changed the
> > frequency?
> 
> We can remove clk_rate_protect if you really want, but there is always
> the case that a consumer driver does not set the rate, but needs to
> guarantee that the rate will not change during operation.
> 
> Any driver that must both set the rate AND guarantee it will not change
> must use clk_set_rate_protect.
> 

No need to remove the API. This is a good description of a case
where a consumer would just call clk_rate_protect() (or perhaps
clk_dont_glitch()) without caring what the rate is. I was missing
this part of the argument because I got hung up on the critical
section part. Maybe this can be added to the function
documentation so we don't forget that some consumers don't set a
rate but still care to make sure it doesn't glitch for them.

Russell King (Oracle) Aug. 9, 2017, 11:45 a.m. UTC | #6

On Tue, Aug 08, 2017 at 07:19:06PM -0700, Stephen Boyd wrote:
> I also vaguely remember Paul saying that
> clk_round_rate() could return something and then clk_set_rate()
> after that would fail because what was calculated during the rate
> speculation/rounding phase would be different if some other
> consumer goes and changes some rate high up in the tree.

That's probably because people tend to get this stuff wrong.  It is
_not_ supposed to be:

	rounded_rate = clk_round_rate(clk, requested_rate);

	clk_set_rate(clk, rounded_rate);

but:

	rounded_rate = clk_round_rate(clk, requested_rate);

	clk_set_rate(clk, requested_rate);

The former is wrong for two reasons:

1. it's completely wasteful of CPU resources to do all the calculations
   that need to be done within clk_set_rate().

2. it's racy - there is no guarantee that you'll end up with "rounded_rate"

The API definition of clk_round_rate() explicitly mentions that it is
equivalent to clk_set_rate() followed by clk_get_rate() with the
exception that it doesn't affect the hardware.

I'm not sure that the clock rate protection API is really the right
solution - if we're trying to stop others from changing the clock rate,
that implies we have multiple different threads potentially changing
the rate at any time.  If a driver does this:

	clk_set_rate(clk, foo);
	clk_rate_protect(clk);

what prevents another thread from changing the clock rate between these
two calls?  The only way to do this safely would be something like:

	r = clk_round_rate(clk, foo);
	while (1) {
		err = clk_set_rate(clk, foo);
		clk_rate_protect(clk);
		if (err < 0)
			break;

		if (r == clk_get_rate(clk)) /* success */
			break;

		clk_rate_unprotect(clk);
	}

	if (err)
		failed;

That's rather a lot of code to add to every driver, and given the
number of times I've seen people get the clk_round_rate() vs
clk_set_rate() thing wrong, I've zero confidence that folk will get
this right either.

So, I'd suggest _not_ adding this clk_rate_protect() thing, but
instead an API that simultaneously sets and protects the rate, so
driver authors don't have to get involved in details like the above.

Jerome Brunet Aug. 9, 2017, 12:18 p.m. UTC | #7

On Tue, 2017-08-08 at 15:37 -0700, Michael Turquette wrote:
> Hi Stephen,
> 
> Quoting Stephen Boyd (2017-08-03 17:18:36)
> > On 07/26, Jerome Brunet wrote:
> > > > > +void clk_rate_protect(struct clk *clk);
> > > > 
> > > > Is there any plan to use this clk_rate_protect() API? It seems
> > > > inherently racy for a clk consumer to call clk_set_rate() and
> > > > then this clk_rate_protect() API after that to lock the rate in.
> > > > How about we leave this out of the consumer API until a user
> > > > needs it?
> > > 
> > > Having this API available is whole reason I've been working on this for so
> > > long.
> > > By the now, you may have forgot but I explained the use-case in first RFC
> > > [0]
> > > Here is an example (wip) of usage [1]
> > > 
> > > [0]: http://lkml.kernel.org/r/20170302173835.18313-1-jbrunet@baylibre.com
> > > [1]: https://github.com/jeromebrunet/linux/commits/amlogic/wip/audio-clk-l
> > > ock
> 
> Indeed, something like rate protection or "lock rate" has been discussed
> since the birth of CCF. I remember a whiteboarding session between you,
> Paul W. and myself about it probably in 2012. Peter might have been
> there too.
> 
> > 
> > If we're forgetting why something is introduced then it means the
> > commit text is missing information. Please clearly describe the
> > need for the API in the commit text for the patch that introduces
> > it.
> 
> My $0.02 is that the "pick an unused PLL" thing is a benefit of this new
> api that is internal to the clock controller driver, and is not the
> driving force behind the series. If a simple, easy to understand
> justification for this patch series is needed, might I suggest something
> like the following for the next commit log/coverletter:
> 
> "Some clock consumers require that a clock rate must not deviate from
> its selected frequency. There can be several reasons for this, not least
> of which is that some hardware may not be able to handle or recover from
> a glitch caused by changing the clock rate while the hardware is in
> operation. The ability to lock a clock's rate, and release that lock, is
> a fundamental clock rate control primitive. It's absence is a bug that
> is fixed by this patch series."
> 
> That's the short and sweet version. If more verbosity is needed as to
> why rate_range doesn't need this, there are some good reasons:
> 
> 1) simplicity: some consumers don't care about their rate, but do
> care that they rate doesn't change. clk_rate_{un}protect is much simpler
> than forcing those consumers to call clk_get_rate, then cache that value
> for future use and then call clk_set_rate_range.
> 
> 2) expressiveness / debug: trying to find out why a clock rate is locked
> searching through every use of clk_set_rate_range is sort of lame,
> especially if variables are used to pass in min/max instead of
> hard-coded values. It's way way easier to just grep for
> clk_rate_protect.
> 
> > 
> > > 
> > > > 
> > > > Finally, When does a consumer want the rate of a clk to change
> > > > after they call clk_set_rate() on it? I would guess that very few
> > > > consumers would be willing to accept that. Which begs the
> > > > question, if anyone will keep calling clk_set_rate() after this
> > > > API (and the clk_set_rate_protect() API) is added. It almost
> > > > seems like we would want it to be opt-out, instead of opt-in, so
> > > > that consumers would call clk_set_rate() and expect it to be a
> > > > stable clk rate after that, and they would call
> > > > clk_set_rate_trample_on_me() or something properly named when
> > > > they don't care what the rate is after they call the API.
> > > > 
> > > 
> > > Indeed, we generally don't want our rate to change, but:
> > > - This is mostly for leaf clocks, the internal path would generally not
> > > care, as
> > > long as the leaf are happy.
> > > - Even a leaf may be open (be able to deal with) to small glitches, pll
> > > relock,
> > > re parenting
> > > 
> > > Actually, if all the clock could not tolerate any glitches, there would
> > > have
> > > been a lot of complains about CCF by now, it does not prevent glitches at
> > > all.
> > 
> > Well some devices handle glitches in the hardware, so the details
> > of glitch free rate changes are hidden from clk consumers, and
> > the software in general, on those devices.
> 
> On the hardware that I am familiar with, the problem of glitches lies
> not in the clock hardware, but with the downstream peripheral logic / ip
> block that consumes that clock signal. So it seems to me that having a
> consumer api for locking the rate makes perfect sense.
> 
> > 
> > > 
> > > If you go over the initial RFC, the point is also for the CCF to favor
> > > other
> > > (unused parent) when several (sometimes with the same capabilities) are
> > > available. I think this is also a fairly common use case. That's something
> > > rate_range won't do either, as far as I understood.
> > 
> > Fair enough, but why do we want consumers to need to know that
> > there are sometimes unused parents that aren't getting chosen for
> > a particular frequency? I see this as exposing the internals of
> > the clk tree to consumers when they shouldn't need to care. Of
> > course, sometimes clk consumers really do care about internals,
> 
> Right, sometimes they do, and we need to strike a balance. I think that
> this api has been needed for some time. It very likely could have been
> included in the initial version of the CCF that was merged years back if
> I hadn't been trying very hard to stick only to the existing clk.h.
> 
> clk_set_rate has always been a "last write wins" api, across a shared
> resource, and we've always known that this is not a great situation.
> This patch series does a good job of solving that issue, in conjunction
> with the existing range stuff.
> 
> > for example if some PLL is used for a display controller and it's
> > also routed out of the chip on the display phy pins to encode
> > data or something. Then perhaps we really want to use one
> > particular PLL instead of a generic one that may also be a parent
> > of the display controller clk. Making sure clk_set_rate() doesn't
> > trample on clks deep in the tree seems different than this
> > though.
> > 
> > Going back to your RFC series cover letter though, I see three
> > PLLs (mppl0,1,2) and two users of the PLLs (amclk and mclk_i958).
> > Is anything else using these PLLs in the system? Why are we doing
> > all this stuff instead of hard-coding the parents for these clks
> > to be different PLLs? If we want it to be flexible we could
> > assign parents to the cts_amclk_sel and cts_mclk_i958_sel to be
> > different PLLs in DT via assigned clock parents. Or it could just
> > be hard-coded in the clk driver during probe.
> > 
> > If there's really sharing going on, and you can't hardcode the
> > parents, please indicate why that's the case in the commit text.
> > I don't want to introduce another consumer API just because we
> > didn't want to restrict the available parents for a couple clks.
> 
> I think the PLL sharing thing is a big distraction. Giving consumers the
> ability to guarantee that their rates are locked in using a simple
> critical section call just makes sense to me. If it helps with other
> cases then yay.

It guarantee that what is being protected, won't be broken ... for sure it helps
with sharing issues.

> 
> > 
> > > 
> > > Last but not least, it allows consumer to set the rate in a sort of
> > > critical
> > > section and have the guarantee that nobody will be able to change the rate
> > > between the clk_set_rate() call and prepare_enable(). That's something we
> > > don't
> > > have at the moment.
> > > 
> > 
> > Right, but clk_rate_protect() doesn't close the critical section
> > between clk_set_rate() and clk_rate_protect() if another
> > consumers changes the frequency, or if that consumer changes the
> > frequency of some parent of the clk. This is why I'm asking what
> 
> Right, clk_set_rate_protect does this in the next patch in this series.

Hum, I disagree here. For sure, if you call clk_set_rate() before
clk_rate_protect() you have not guarantee at all, but you are doing it wrong.
The way to use this it to call clk_rate_protect() then clk_set_rate().

As explained earlier, as long as the clock is protected only once, the
protecting consumer is still able to change the rate, because it is the only one
depending on clock.

This is a good fit for driver which heavily depends on rate being but need to
change the rate during during their operations. It is not necessary to unprotect
the clock before calling clk_set_rate().

clk_set_rate_protect() is only introduce to help driver which could get in the
situation where 2 consumers protect the clock w/o getting a chance to set the
rate, exhausting  the ressources. To re-set the rate, clk_rate_unprotect() must
be called before calling clk_set_rate_protect() again.

> 
> > the use of this API is for. Shouldn't we want consumers to use
> > clk_set_rate_protect() so they can be sure they got the rate they
> > wanted, instead of hoping that something else hasn't come in
> > between the set_rate and the protect calls and changed the
> > frequency?
> 
> We can remove clk_rate_protect if you really want, but there is always
> the case that a consumer driver does not set the rate, but needs to
> guarantee that the rate will not change during operation.
> 
> Any driver that must both set the rate AND guarantee it will not change
> must use clk_set_rate_protect.
> 
> Regards,
> Mike
> 
> > 
> > -- 
> > Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
> > a Linux Foundation Collaborative Project

Jerome Brunet Aug. 9, 2017, 1:07 p.m. UTC | #8

On Tue, 2017-08-08 at 19:19 -0700, Stephen Boyd wrote:
> On 08/08, Michael Turquette wrote:
> > Hi Stephen,
> > 
> > Quoting Stephen Boyd (2017-08-03 17:18:36)
> > > On 07/26, Jerome Brunet wrote:
> > > > > > +void clk_rate_protect(struct clk *clk);
> > > > > 
> > > > > Is there any plan to use this clk_rate_protect() API? It seems
> > > > > inherently racy for a clk consumer to call clk_set_rate() and
> > > > > then this clk_rate_protect() API after that to lock the rate in.
> > > > > How about we leave this out of the consumer API until a user
> > > > > needs it?
> > > > 
> > > > Having this API available is whole reason I've been working on this for
> > > > so long.
> > > > By the now, you may have forgot but I explained the use-case in first
> > > > RFC [0]
> > > > Here is an example (wip) of usage [1]
> > > > 
> > > > [0]: http://lkml.kernel.org/r/20170302173835.18313-1-jbrunet@baylibre.co
> > > > m
> > > > [1]: https://github.com/jeromebrunet/linux/commits/amlogic/wip/audio-clk
> > > > -lock
> > 
> > Indeed, something like rate protection or "lock rate" has been discussed
> > since the birth of CCF. I remember a whiteboarding session between you,
> > Paul W. and myself about it probably in 2012. Peter might have been
> > there too.
> 
> I seem to recall this being about something with how OMAP could
> change rates only high up in the tree and that would trickle down
> to multiple different clks that all had to use that rate and then
> divide it themselves to achieve the rate they actually wanted. Is
> that right? I also vaguely remember Paul saying that
> clk_round_rate() could return something and then clk_set_rate()
> after that would fail because what was calculated during the rate
> speculation/rounding phase would be different if some other
> consumer goes and changes some rate high up in the tree.
> 
> > 
> > > 
> > > If we're forgetting why something is introduced then it means the
> > > commit text is missing information. Please clearly describe the
> > > need for the API in the commit text for the patch that introduces
> > > it.
> > 
> > My $0.02 is that the "pick an unused PLL" thing is a benefit of this new
> > api that is internal to the clock controller driver, and is not the
> > driving force behind the series. If a simple, easy to understand
> > justification for this patch series is needed, might I suggest something
> > like the following for the next commit log/coverletter:
> > 
> > "Some clock consumers require that a clock rate must not deviate from
> > its selected frequency. There can be several reasons for this, not least
> > of which is that some hardware may not be able to handle or recover from
> > a glitch caused by changing the clock rate while the hardware is in
> > operation. The ability to lock a clock's rate, and release that lock, is
> > a fundamental clock rate control primitive. It's absence is a bug that
> > is fixed by this patch series."
> 
> This sounds like we're fixing the CLK_SET_RATE_GATE flag. I'd
> prefer we also describe how adding that flag to some clk doesn't
> solve the problem. I'm all more commit text when introducing this
> API. Please don't go super verbose, but at least describe the
> real problem.

Yes we are, with another patch later in this series.
CLK_SET_RATE_GATE is a particular case of clock protection, where the clock rate
is protected by provider (kind of the clock protecting itself) instead of being
protected by the consumer.

> 
> > 
> > That's the short and sweet version. If more verbosity is needed as to
> > why rate_range doesn't need this, there are some good reasons:
> 
> I'm not really concerned about rate range handling this problem
> here. Let's try to untangle the two in this discussion please.
> 
> > 
> > > 
> > > > 
> > > > If you go over the initial RFC, the point is also for the CCF to favor
> > > > other
> > > > (unused parent) when several (sometimes with the same capabilities) are
> > > > available. I think this is also a fairly common use case. That's
> > > > something
> > > > rate_range won't do either, as far as I understood.
> > > 
> > > Fair enough, but why do we want consumers to need to know that
> > > there are sometimes unused parents that aren't getting chosen for
> > > a particular frequency? I see this as exposing the internals of
> > > the clk tree to consumers when they shouldn't need to care. Of
> > > course, sometimes clk consumers really do care about internals,
> > 
> > Right, sometimes they do, and we need to strike a balance. I think that
> > this api has been needed for some time. It very likely could have been
> > included in the initial version of the CCF that was merged years back if
> > I hadn't been trying very hard to stick only to the existing clk.h.
> > 
> > clk_set_rate has always been a "last write wins" api, across a shared
> > resource, and we've always known that this is not a great situation.
> > This patch series does a good job of solving that issue, in conjunction
> > with the existing range stuff.
> > 
> > > for example if some PLL is used for a display controller and it's
> > > also routed out of the chip on the display phy pins to encode
> > > data or something. Then perhaps we really want to use one
> > > particular PLL instead of a generic one that may also be a parent
> > > of the display controller clk. Making sure clk_set_rate() doesn't
> > > trample on clks deep in the tree seems different than this
> > > though.
> > > 
> > > Going back to your RFC series cover letter though, I see three
> > > PLLs (mppl0,1,2) and two users of the PLLs (amclk and mclk_i958).
> > > Is anything else using these PLLs in the system? Why are we doing
> > > all this stuff instead of hard-coding the parents for these clks
> > > to be different PLLs? If we want it to be flexible we could
> > > assign parents to the cts_amclk_sel and cts_mclk_i958_sel to be
> > > different PLLs in DT via assigned clock parents. Or it could just
> > > be hard-coded in the clk driver during probe.
> > > 
> > > If there's really sharing going on, and you can't hardcode the
> > > parents, please indicate why that's the case in the commit text.
> > > I don't want to introduce another consumer API just because we
> > > didn't want to restrict the available parents for a couple clks.
> > 
> > I think the PLL sharing thing is a big distraction. Giving consumers the
> > ability to guarantee that their rates are locked in using a simple
> > critical section call just makes sense to me. If it helps with other
> > cases then yay.
> 
> Ok. Can we get details on where we can't fix it in the provider
> layer? I'm fine to ignore the PLL sharing thing, if we can
> clearly describe a real situation where a clk consumer needs to
> lock in a rate, and that isn't the only consumer of a clk that is
> changing that rate or causing a glitch by changing some other
> rate up the tree. So far we haven't come across this case,
> presumably because clk providers are working around the problem
> by hardcoding parents that can change rate during
> determine_rate(), or blocking rate changes going up the tree for
> certain clks, or the hardware isn't designed in a way that
> something is shared while two devices are active, or maybe nobody
> has complained (likely!).
> 
> I'm all for consumers locking in rates in a critical section,
> just that I haven't seen a need for it so far. I've mostly seen
> that hardware doesn't design it this way, or that there's some
> sort of hardware plan to make sure that desired frequencies can
> be achieved when there's some shared resource. And the glitching
> thing hasn't been a problem because consumers are the only ones
> causing the glitch when they call clk_set_rate() and so far they
> have only called clk_set_rate() when their hardware is able to
> handle the glitch that comes down (e.g. audio turns off speaker
> output, or device is held in reset).
> 
> TL;DR: Please distract me from the PLL sharing thing with the
> real problem that's being solved.

Well, the problem is not sharing "per-se" but constraints resolution.
When walking the tree, CCF check what each clock can provide and decide what the
best setting is to provide the clock down the line.

The fact the several clocks may share/depend on the same parent is another
constraint in the tree. When clock can't change or glitch, it adds constraints
along the tree. This is what clock_protect() does.

A use-case of this, without "clock sharing" is the one I already explained is
this thread:

Say you have 2 clocks fed by a divider (SET_RATE_PARENT of course)
# clock #1 is a PLL locked to 48000kHz
# clock #2 is another clock (whatever)

clock #2 request a new rate which may change the parent divider...

With the current CCF code, the divider would change and just screw up the PLL
(which would end up with another rate)
Say, we could automatically relock the PLL at the initial rate (with clk_range
for example), we would still get a glitch.
This would not happen with clock protect, clock #2 would have to work with what
the divider provides at the moment ... or "determine" a better suited parent if
available.

So what it solves is quite simple: 
* Once consumer protect a clock it has the guarantee from CCF that no-one will
be messing with what comes out of the said clock.
* For CCF, it provides a more accurate view of the clock tree constraints when
trying to determine a new clock rate.

> 
> > 
> > > 
> > > > 
> > > > Last but not least, it allows consumer to set the rate in a sort of
> > > > critical
> > > > section and have the guarantee that nobody will be able to change the
> > > > rate
> > > > between the clk_set_rate() call and prepare_enable(). That's something
> > > > we don't
> > > > have at the moment.
> > > > 
> > > 
> > > Right, but clk_rate_protect() doesn't close the critical section
> > > between clk_set_rate() and clk_rate_protect() if another
> > > consumers changes the frequency, or if that consumer changes the
> > > frequency of some parent of the clk. This is why I'm asking what
> > 
> > Right, clk_set_rate_protect does this in the next patch in this series.
> > 
> > > the use of this API is for. Shouldn't we want consumers to use
> > > clk_set_rate_protect() so they can be sure they got the rate they
> > > wanted, instead of hoping that something else hasn't come in
> > > between the set_rate and the protect calls and changed the
> > > frequency?
> > 
> > We can remove clk_rate_protect if you really want, but there is always
> > the case that a consumer driver does not set the rate, but needs to
> > guarantee that the rate will not change during operation.
> > 
> > Any driver that must both set the rate AND guarantee it will not change
> > must use clk_set_rate_protect.
> > 
> 
> No need to remove the API. This is a good description of a case
> where a consumer would just call clk_rate_protect() (or perhaps
> clk_dont_glitch()) without caring what the rate is. I was missing
> this part of the argument because I got hung up on the critical
> section part. Maybe this can be added to the function
> documentation so we don't forget that some consumers don't set a
> rate but still care to make sure it doesn't glitch for them.

Sure, no problem.

>

Jerome Brunet Aug. 9, 2017, 1:34 p.m. UTC | #9

On Wed, 2017-08-09 at 12:45 +0100, Russell King - ARM Linux wrote:
> On Tue, Aug 08, 2017 at 07:19:06PM -0700, Stephen Boyd wrote:
> > I also vaguely remember Paul saying that
> > clk_round_rate() could return something and then clk_set_rate()
> > after that would fail because what was calculated during the rate
> > speculation/rounding phase would be different if some other
> > consumer goes and changes some rate high up in the tree.
> 
> That's probably because people tend to get this stuff wrong.  It is
> _not_ supposed to be:
> 
> 	rounded_rate = clk_round_rate(clk, requested_rate);
> 
> 	clk_set_rate(clk, rounded_rate);
> 
> but:
> 
> 	rounded_rate = clk_round_rate(clk, requested_rate);
> 
> 	clk_set_rate(clk, requested_rate);
> 
> The former is wrong for two reasons:
> 
> 1. it's completely wasteful of CPU resources to do all the calculations
>    that need to be done within clk_set_rate().
> 
> 2. it's racy - there is no guarantee that you'll end up with "rounded_rate"
> 
> The API definition of clk_round_rate() explicitly mentions that it is
> equivalent to clk_set_rate() followed by clk_get_rate() with the
> exception that it doesn't affect the hardware.
> 
> I'm not sure that the clock rate protection API is really the right
> solution - if we're trying to stop others from changing the clock rate,
> that implies we have multiple different threads potentially changing
> the rate at any time.  If a driver does this:
> 
> 	clk_set_rate(clk, foo);
> 	clk_rate_protect(clk);
> 
> what prevents another thread from changing the clock rate between these
> two calls?  The only way to do this safely would be something like:
> 
> 	r = clk_round_rate(clk, foo);
> 	while (1) {
> 		err = clk_set_rate(clk, foo);
> 		clk_rate_protect(clk);
> 		if (err < 0)
> 			break;
> 
> 		if (r == clk_get_rate(clk)) /* success */
> 			break;
> 
> 		clk_rate_unprotect(clk);
> 	}
> 
> 	if (err)
> 		failed;

Russell,
I think you have missed one subtle thing, when trying any clock altering
operation, if the consumer is protecting the clock, it will temporarily release 
the protection once, under the prepare_lock (to guarantee safe operation). This
is explained in the cover letter:

"""
With this series there is 3 use-case:
 - the provider is not protected: nothing changes
 - the provider is protected by only 1 consumer (and only once), then only
   this consumer will be able to alter the rate of the clock, as it is the
   only one depending on it.
 - If the provider is protected more than once, or by the provider itself,
   the rate is basically locked and protected against reparenting.
"""

So what you should do if you have to protect the clock is:

clk_rate_protect(clk);
err = clk_set_rate(clk, foo);

[...]
clk_rate_unprotect(clk);

If the clock rate could be changed (protected only once) err = 0;
If the clock rate could not be changed: err = -EBUSY;

if you wish to combine clk_rate_protect() and clk_set_rate(), you may use
clk_set_rate_protect() which is added in another patch of this series.

> 
> That's rather a lot of code to add to every driver, and given the
> number of times I've seen people get the clk_round_rate() vs
> clk_set_rate() thing wrong, I've zero confidence that folk will get
> this right either.
> 
> So, I'd suggest _not_ adding this clk_rate_protect() thing, but
> instead an API that simultaneously sets and protects the rate, so
> driver authors don't have to get involved in details like the above.
>

Russell King (Oracle) Aug. 9, 2017, 1:40 p.m. UTC | #10

On Wed, Aug 09, 2017 at 03:34:48PM +0200, Jerome Brunet wrote:
> On Wed, 2017-08-09 at 12:45 +0100, Russell King - ARM Linux wrote:
> > On Tue, Aug 08, 2017 at 07:19:06PM -0700, Stephen Boyd wrote:
> > > I also vaguely remember Paul saying that
> > > clk_round_rate() could return something and then clk_set_rate()
> > > after that would fail because what was calculated during the rate
> > > speculation/rounding phase would be different if some other
> > > consumer goes and changes some rate high up in the tree.
> > 
> > That's probably because people tend to get this stuff wrong.  It is
> > _not_ supposed to be:
> > 
> > 	rounded_rate = clk_round_rate(clk, requested_rate);
> > 
> > 	clk_set_rate(clk, rounded_rate);
> > 
> > but:
> > 
> > 	rounded_rate = clk_round_rate(clk, requested_rate);
> > 
> > 	clk_set_rate(clk, requested_rate);
> > 
> > The former is wrong for two reasons:
> > 
> > 1. it's completely wasteful of CPU resources to do all the calculations
> >    that need to be done within clk_set_rate().
> > 
> > 2. it's racy - there is no guarantee that you'll end up with "rounded_rate"
> > 
> > The API definition of clk_round_rate() explicitly mentions that it is
> > equivalent to clk_set_rate() followed by clk_get_rate() with the
> > exception that it doesn't affect the hardware.
> > 
> > I'm not sure that the clock rate protection API is really the right
> > solution - if we're trying to stop others from changing the clock rate,
> > that implies we have multiple different threads potentially changing
> > the rate at any time.  If a driver does this:
> > 
> > 	clk_set_rate(clk, foo);
> > 	clk_rate_protect(clk);
> > 
> > what prevents another thread from changing the clock rate between these
> > two calls?  The only way to do this safely would be something like:
> > 
> > 	r = clk_round_rate(clk, foo);
> > 	while (1) {
> > 		err = clk_set_rate(clk, foo);
> > 		clk_rate_protect(clk);
> > 		if (err < 0)
> > 			break;
> > 
> > 		if (r == clk_get_rate(clk)) /* success */
> > 			break;
> > 
> > 		clk_rate_unprotect(clk);
> > 	}
> > 
> > 	if (err)
> > 		failed;
> 
> Russell,
> I think you have missed one subtle thing, when trying any clock altering
> operation, if the consumer is protecting the clock, it will temporarily release 
> the protection once, under the prepare_lock (to guarantee safe operation). This
> is explained in the cover letter:
> 
> """
> With this series there is 3 use-case:
>  - the provider is not protected: nothing changes
>  - the provider is protected by only 1 consumer (and only once), then only
>    this consumer will be able to alter the rate of the clock, as it is the
>    only one depending on it.
>  - If the provider is protected more than once, or by the provider itself,
>    the rate is basically locked and protected against reparenting.
> """
> 
> So what you should do if you have to protect the clock is:
> 
> clk_rate_protect(clk);
> err = clk_set_rate(clk, foo);
> 
> [...]
> clk_rate_unprotect(clk);

So here you drop the protection, which means anyone can alter the clock
again.

Either that or "clk_rate_unprotect" is inappropriately named and doesn't
do what it says it does.

Jerome Brunet Aug. 9, 2017, 1:45 p.m. UTC | #11

On Wed, 2017-08-09 at 14:40 +0100, Russell King - ARM Linux wrote:
> On Wed, Aug 09, 2017 at 03:34:48PM +0200, Jerome Brunet wrote:
> > On Wed, 2017-08-09 at 12:45 +0100, Russell King - ARM Linux wrote:
> > > On Tue, Aug 08, 2017 at 07:19:06PM -0700, Stephen Boyd wrote:
> > > > I also vaguely remember Paul saying that
> > > > clk_round_rate() could return something and then clk_set_rate()
> > > > after that would fail because what was calculated during the rate
> > > > speculation/rounding phase would be different if some other
> > > > consumer goes and changes some rate high up in the tree.
> > > 
> > > That's probably because people tend to get this stuff wrong.  It is
> > > _not_ supposed to be:
> > > 
> > > 	rounded_rate = clk_round_rate(clk, requested_rate);
> > > 
> > > 	clk_set_rate(clk, rounded_rate);
> > > 
> > > but:
> > > 
> > > 	rounded_rate = clk_round_rate(clk, requested_rate);
> > > 
> > > 	clk_set_rate(clk, requested_rate);
> > > 
> > > The former is wrong for two reasons:
> > > 
> > > 1. it's completely wasteful of CPU resources to do all the calculations
> > >    that need to be done within clk_set_rate().
> > > 
> > > 2. it's racy - there is no guarantee that you'll end up with
> > > "rounded_rate"
> > > 
> > > The API definition of clk_round_rate() explicitly mentions that it is
> > > equivalent to clk_set_rate() followed by clk_get_rate() with the
> > > exception that it doesn't affect the hardware.
> > > 
> > > I'm not sure that the clock rate protection API is really the right
> > > solution - if we're trying to stop others from changing the clock rate,
> > > that implies we have multiple different threads potentially changing
> > > the rate at any time.  If a driver does this:
> > > 
> > > 	clk_set_rate(clk, foo);
> > > 	clk_rate_protect(clk);
> > > 
> > > what prevents another thread from changing the clock rate between these
> > > two calls?  The only way to do this safely would be something like:
> > > 
> > > 	r = clk_round_rate(clk, foo);
> > > 	while (1) {
> > > 		err = clk_set_rate(clk, foo);
> > > 		clk_rate_protect(clk);
> > > 		if (err < 0)
> > > 			break;
> > > 
> > > 		if (r == clk_get_rate(clk)) /* success */
> > > 			break;
> > > 
> > > 		clk_rate_unprotect(clk);
> > > 	}
> > > 
> > > 	if (err)
> > > 		failed;
> > 
> > Russell,
> > I think you have missed one subtle thing, when trying any clock altering
> > operation, if the consumer is protecting the clock, it will temporarily
> > release 
> > the protection once, under the prepare_lock (to guarantee safe operation).
> > This
> > is explained in the cover letter:
> > 
> > """
> > With this series there is 3 use-case:
> >  - the provider is not protected: nothing changes
> >  - the provider is protected by only 1 consumer (and only once), then only
> >    this consumer will be able to alter the rate of the clock, as it is the
> >    only one depending on it.
> >  - If the provider is protected more than once, or by the provider itself,
> >    the rate is basically locked and protected against reparenting.
> > """
> > 
> > So what you should do if you have to protect the clock is:
> > 
> > clk_rate_protect(clk);
> > err = clk_set_rate(clk, foo);
> > 
> > [...]
> > clk_rate_unprotect(clk);
> 
> So here you drop the protection, which means anyone can alter the clock
> again.
> 

That's just an example. The rate is set after clk_set_rate() if no other
consumer depends on the clock.

I just added clk_rate_unprotect() here to illustrate that the calls should be
balanced, as documented.

> Either that or "clk_rate_unprotect" is inappropriately named and doesn't
> do what it says it does.
>

Michael Turquette Aug. 10, 2017, 4:46 p.m. UTC | #12

Hi Russell,

Quoting Russell King - ARM Linux (2017-08-09 04:45:48)
> I'm not sure that the clock rate protection API is really the right
> solution - if we're trying to stop others from changing the clock rate,
> that implies we have multiple different threads potentially changing
> the rate at any time.  If a driver does this:
> 
>         clk_set_rate(clk, foo);
>         clk_rate_protect(clk);
> 
> what prevents another thread from changing the clock rate between these
> two calls?  The only way to do this safely would be something like:

I agree that ordering the calls in your example above is racy. It is an
incorrect use of this new api. The two correct ways to use this api are:

1) Use clk_set_rate_protect, introduced in patch #6 of this series

2) Call clk_rate_protect() first, and if that call succeeds then call
clk_set_rate(). Thanks to per-consumer struct clk objects, we can know
that *this* consumer is the exclusive owner of the clk rate, and is
allowed to change the rate within the critical section.

I think this subtlety is not well described in the commitlog nor the
kerneldoc, since most reviewing including myself have missed it: the
consumer that successfully protects the clk can change the rate within
the critical section, by design.

> 
>         r = clk_round_rate(clk, foo);
>         while (1) {
>                 err = clk_set_rate(clk, foo);
>                 clk_rate_protect(clk);
>                 if (err < 0)
>                         break;
> 
>                 if (r == clk_get_rate(clk)) /* success */
>                         break;
> 
>                 clk_rate_unprotect(clk);
>         }
> 
>         if (err)
>                 failed;
> 
> That's rather a lot of code to add to every driver, and given the
> number of times I've seen people get the clk_round_rate() vs
> clk_set_rate() thing wrong, I've zero confidence that folk will get
> this right either.
> 
> So, I'd suggest _not_ adding this clk_rate_protect() thing, but
> instead an API that simultaneously sets and protects the rate, so
> driver authors don't have to get involved in details like the above.

As mentioned above, patch #6 does exactly this. Furthermore if you
simply flip your use of clk_set_rate and clk_rate_protect, then things
Just Work.

Thanks,
Mike

> 
> -- 
> RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
> FTTC broadband for 0.8mile line in suburbia: sync at 8.8Mbps down 630kbps up
> According to speedtest.net: 8.21Mbps down 510kbps up

Michael Turquette Aug. 10, 2017, 4:48 p.m. UTC | #13

Quoting Russell King - ARM Linux (2017-08-09 06:40:36)
> On Wed, Aug 09, 2017 at 03:34:48PM +0200, Jerome Brunet wrote:
> > On Wed, 2017-08-09 at 12:45 +0100, Russell King - ARM Linux wrote:
> > > On Tue, Aug 08, 2017 at 07:19:06PM -0700, Stephen Boyd wrote:
> > > > I also vaguely remember Paul saying that
> > > > clk_round_rate() could return something and then clk_set_rate()
> > > > after that would fail because what was calculated during the rate
> > > > speculation/rounding phase would be different if some other
> > > > consumer goes and changes some rate high up in the tree.
> > > 
> > > That's probably because people tend to get this stuff wrong.  It is
> > > _not_ supposed to be:
> > > 
> > >     rounded_rate = clk_round_rate(clk, requested_rate);
> > > 
> > >     clk_set_rate(clk, rounded_rate);
> > > 
> > > but:
> > > 
> > >     rounded_rate = clk_round_rate(clk, requested_rate);
> > > 
> > >     clk_set_rate(clk, requested_rate);
> > > 
> > > The former is wrong for two reasons:
> > > 
> > > 1. it's completely wasteful of CPU resources to do all the calculations
> > >    that need to be done within clk_set_rate().
> > > 
> > > 2. it's racy - there is no guarantee that you'll end up with "rounded_rate"
> > > 
> > > The API definition of clk_round_rate() explicitly mentions that it is
> > > equivalent to clk_set_rate() followed by clk_get_rate() with the
> > > exception that it doesn't affect the hardware.
> > > 
> > > I'm not sure that the clock rate protection API is really the right
> > > solution - if we're trying to stop others from changing the clock rate,
> > > that implies we have multiple different threads potentially changing
> > > the rate at any time.  If a driver does this:
> > > 
> > >     clk_set_rate(clk, foo);
> > >     clk_rate_protect(clk);
> > > 
> > > what prevents another thread from changing the clock rate between these
> > > two calls?  The only way to do this safely would be something like:
> > > 
> > >     r = clk_round_rate(clk, foo);
> > >     while (1) {
> > >             err = clk_set_rate(clk, foo);
> > >             clk_rate_protect(clk);
> > >             if (err < 0)
> > >                     break;
> > > 
> > >             if (r == clk_get_rate(clk)) /* success */
> > >                     break;
> > > 
> > >             clk_rate_unprotect(clk);
> > >     }
> > > 
> > >     if (err)
> > >             failed;
> > 
> > Russell,
> > I think you have missed one subtle thing, when trying any clock altering
> > operation, if the consumer is protecting the clock, it will temporarily release 
> > the protection once, under the prepare_lock (to guarantee safe operation). This
> > is explained in the cover letter:
> > 
> > """
> > With this series there is 3 use-case:
> >  - the provider is not protected: nothing changes
> >  - the provider is protected by only 1 consumer (and only once), then only
> >    this consumer will be able to alter the rate of the clock, as it is the
> >    only one depending on it.
> >  - If the provider is protected more than once, or by the provider itself,
> >    the rate is basically locked and protected against reparenting.
> > """
> > 
> > So what you should do if you have to protect the clock is:
> > 
> > clk_rate_protect(clk);
> > err = clk_set_rate(clk, foo);
> > 
> > [...]
> > clk_rate_unprotect(clk);
> 
> So here you drop the protection, which means anyone can alter the clock
> again.

Correct. It would be appropriate to use clk_rate_unprotect in two ways:

1) module unload
2) when a clk consumer driver completes a critical section of work and
no longer cares about clk rate. I would expect this same consumer driver
to call clk_disable_unprepare as well.

Regards,
Mike

> 
> Either that or "clk_rate_unprotect" is inappropriately named and doesn't
> do what it says it does.
> 
> -- 
> RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
> FTTC broadband for 0.8mile line in suburbia: sync at 8.8Mbps down 630kbps up
> According to speedtest.net: 8.21Mbps down 510kbps up

Michael Turquette Aug. 10, 2017, 4:54 p.m. UTC | #14

Quoting Jerome Brunet (2017-08-09 05:18:22)
> On Tue, 2017-08-08 at 15:37 -0700, Michael Turquette wrote:
> > Hi Stephen,
> > 
> > Quoting Stephen Boyd (2017-08-03 17:18:36)
> > > On 07/26, Jerome Brunet wrote:
> > > > > > +void clk_rate_protect(struct clk *clk);
> > > > > 
> > > > > Is there any plan to use this clk_rate_protect() API? It seems
> > > > > inherently racy for a clk consumer to call clk_set_rate() and
> > > > > then this clk_rate_protect() API after that to lock the rate in.
> > > > > How about we leave this out of the consumer API until a user
> > > > > needs it?
> > > > 
> > > > Having this API available is whole reason I've been working on this for so
> > > > long.
> > > > By the now, you may have forgot but I explained the use-case in first RFC
> > > > [0]
> > > > Here is an example (wip) of usage [1]
> > > > 
> > > > [0]: http://lkml.kernel.org/r/20170302173835.18313-1-jbrunet@baylibre.com
> > > > [1]: https://github.com/jeromebrunet/linux/commits/amlogic/wip/audio-clk-l
> > > > ock
> > 
> > Indeed, something like rate protection or "lock rate" has been discussed
> > since the birth of CCF. I remember a whiteboarding session between you,
> > Paul W. and myself about it probably in 2012. Peter might have been
> > there too.
> > 
> > > 
> > > If we're forgetting why something is introduced then it means the
> > > commit text is missing information. Please clearly describe the
> > > need for the API in the commit text for the patch that introduces
> > > it.
> > 
> > My $0.02 is that the "pick an unused PLL" thing is a benefit of this new
> > api that is internal to the clock controller driver, and is not the
> > driving force behind the series. If a simple, easy to understand
> > justification for this patch series is needed, might I suggest something
> > like the following for the next commit log/coverletter:
> > 
> > "Some clock consumers require that a clock rate must not deviate from
> > its selected frequency. There can be several reasons for this, not least
> > of which is that some hardware may not be able to handle or recover from
> > a glitch caused by changing the clock rate while the hardware is in
> > operation. The ability to lock a clock's rate, and release that lock, is
> > a fundamental clock rate control primitive. It's absence is a bug that
> > is fixed by this patch series."
> > 
> > That's the short and sweet version. If more verbosity is needed as to
> > why rate_range doesn't need this, there are some good reasons:
> > 
> > 1) simplicity: some consumers don't care about their rate, but do
> > care that they rate doesn't change. clk_rate_{un}protect is much simpler
> > than forcing those consumers to call clk_get_rate, then cache that value
> > for future use and then call clk_set_rate_range.
> > 
> > 2) expressiveness / debug: trying to find out why a clock rate is locked
> > searching through every use of clk_set_rate_range is sort of lame,
> > especially if variables are used to pass in min/max instead of
> > hard-coded values. It's way way easier to just grep for
> > clk_rate_protect.
> > 
> > > 
> > > > 
> > > > > 
> > > > > Finally, When does a consumer want the rate of a clk to change
> > > > > after they call clk_set_rate() on it? I would guess that very few
> > > > > consumers would be willing to accept that. Which begs the
> > > > > question, if anyone will keep calling clk_set_rate() after this
> > > > > API (and the clk_set_rate_protect() API) is added. It almost
> > > > > seems like we would want it to be opt-out, instead of opt-in, so
> > > > > that consumers would call clk_set_rate() and expect it to be a
> > > > > stable clk rate after that, and they would call
> > > > > clk_set_rate_trample_on_me() or something properly named when
> > > > > they don't care what the rate is after they call the API.
> > > > > 
> > > > 
> > > > Indeed, we generally don't want our rate to change, but:
> > > > - This is mostly for leaf clocks, the internal path would generally not
> > > > care, as
> > > > long as the leaf are happy.
> > > > - Even a leaf may be open (be able to deal with) to small glitches, pll
> > > > relock,
> > > > re parenting
> > > > 
> > > > Actually, if all the clock could not tolerate any glitches, there would
> > > > have
> > > > been a lot of complains about CCF by now, it does not prevent glitches at
> > > > all.
> > > 
> > > Well some devices handle glitches in the hardware, so the details
> > > of glitch free rate changes are hidden from clk consumers, and
> > > the software in general, on those devices.
> > 
> > On the hardware that I am familiar with, the problem of glitches lies
> > not in the clock hardware, but with the downstream peripheral logic / ip
> > block that consumes that clock signal. So it seems to me that having a
> > consumer api for locking the rate makes perfect sense.
> > 
> > > 
> > > > 
> > > > If you go over the initial RFC, the point is also for the CCF to favor
> > > > other
> > > > (unused parent) when several (sometimes with the same capabilities) are
> > > > available. I think this is also a fairly common use case. That's something
> > > > rate_range won't do either, as far as I understood.
> > > 
> > > Fair enough, but why do we want consumers to need to know that
> > > there are sometimes unused parents that aren't getting chosen for
> > > a particular frequency? I see this as exposing the internals of
> > > the clk tree to consumers when they shouldn't need to care. Of
> > > course, sometimes clk consumers really do care about internals,
> > 
> > Right, sometimes they do, and we need to strike a balance. I think that
> > this api has been needed for some time. It very likely could have been
> > included in the initial version of the CCF that was merged years back if
> > I hadn't been trying very hard to stick only to the existing clk.h.
> > 
> > clk_set_rate has always been a "last write wins" api, across a shared
> > resource, and we've always known that this is not a great situation.
> > This patch series does a good job of solving that issue, in conjunction
> > with the existing range stuff.
> > 
> > > for example if some PLL is used for a display controller and it's
> > > also routed out of the chip on the display phy pins to encode
> > > data or something. Then perhaps we really want to use one
> > > particular PLL instead of a generic one that may also be a parent
> > > of the display controller clk. Making sure clk_set_rate() doesn't
> > > trample on clks deep in the tree seems different than this
> > > though.
> > > 
> > > Going back to your RFC series cover letter though, I see three
> > > PLLs (mppl0,1,2) and two users of the PLLs (amclk and mclk_i958).
> > > Is anything else using these PLLs in the system? Why are we doing
> > > all this stuff instead of hard-coding the parents for these clks
> > > to be different PLLs? If we want it to be flexible we could
> > > assign parents to the cts_amclk_sel and cts_mclk_i958_sel to be
> > > different PLLs in DT via assigned clock parents. Or it could just
> > > be hard-coded in the clk driver during probe.
> > > 
> > > If there's really sharing going on, and you can't hardcode the
> > > parents, please indicate why that's the case in the commit text.
> > > I don't want to introduce another consumer API just because we
> > > didn't want to restrict the available parents for a couple clks.
> > 
> > I think the PLL sharing thing is a big distraction. Giving consumers the
> > ability to guarantee that their rates are locked in using a simple
> > critical section call just makes sense to me. If it helps with other
> > cases then yay.
> 
> It guarantee that what is being protected, won't be broken ... for sure it helps
> with sharing issues.
> 
> > 
> > > 
> > > > 
> > > > Last but not least, it allows consumer to set the rate in a sort of
> > > > critical
> > > > section and have the guarantee that nobody will be able to change the rate
> > > > between the clk_set_rate() call and prepare_enable(). That's something we
> > > > don't
> > > > have at the moment.
> > > > 
> > > 
> > > Right, but clk_rate_protect() doesn't close the critical section
> > > between clk_set_rate() and clk_rate_protect() if another
> > > consumers changes the frequency, or if that consumer changes the
> > > frequency of some parent of the clk. This is why I'm asking what
> > 
> > Right, clk_set_rate_protect does this in the next patch in this series.
> 
> Hum, I disagree here. For sure, if you call clk_set_rate() before
> clk_rate_protect() you have not guarantee at all, but you are doing it wrong.
> The way to use this it to call clk_rate_protect() then clk_set_rate().
> 
> As explained earlier, as long as the clock is protected only once, the
> protecting consumer is still able to change the rate, because it is the only one
> depending on clock.

I honestly do not see the difference between calling
clk_set_rate_protect() versus calling clk_rate_protect() followed by
clk_set_rate() within the rate-protected critical section.

Yes, you can always call clk_set_rate() later on, but I'm really
surprised that calling clk_set_rate_protect() is somehow *wrong*. I'm
quite sure that if this api gets merged that users will think of
clk_set_rate_protect() in the same way that they think of using
clk_prepare_enable() ... it's a useful helper that becomes the default
choice of driver authors.

> 
> This is a good fit for driver which heavily depends on rate being but need to
> change the rate during during their operations. It is not necessary to unprotect
> the clock before calling clk_set_rate().

Sure, and drivers can aggressively call clk_{en,dis}able within the
clk_prepare critical section, but very few do that in practice. From
experience, driver authors will not micro-optimize this stuff.

Regarding my comment below to remove clk_rate_protect(), I rescind that
statement. We definitely need the api, but it's also clear to me that
most users will just opt-in to use clk_set_rate_protect() by default.

Regards,
Mike

> 
> clk_set_rate_protect() is only introduce to help driver which could get in the
> situation where 2 consumers protect the clock w/o getting a chance to set the
> rate, exhausting  the ressources. To re-set the rate, clk_rate_unprotect() must
> be called before calling clk_set_rate_protect() again.
> 
> > 
> > > the use of this API is for. Shouldn't we want consumers to use
> > > clk_set_rate_protect() so they can be sure they got the rate they
> > > wanted, instead of hoping that something else hasn't come in
> > > between the set_rate and the protect calls and changed the
> > > frequency?
> > 
> > We can remove clk_rate_protect if you really want, but there is always
> > the case that a consumer driver does not set the rate, but needs to
> > guarantee that the rate will not change during operation.
> > 
> > Any driver that must both set the rate AND guarantee it will not change
> > must use clk_set_rate_protect.
> > 
> > Regards,
> > Mike
> > 
> > > 
> > > -- 
> > > Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
> > > a Linux Foundation Collaborative Project
>

[v3,05/10] clk: add support for clock protection

Commit Message

Comments

Patch