[v2] clk: samsung: Prevent potential endless loop in the PLL set_rate ops

Message ID	20200811112507.24418-1-s.nawrocki@samsung.com (mailing list archive)
State	Not Applicable
Headers	show Return-Path: <SRS0=cAz1=BV=vger.kernel.org=linux-samsung-soc-owner@kernel.org> From: Sylwester Nawrocki <s.nawrocki@samsung.com> To: linux-clk@vger.kernel.org Cc: tomasz.figa@gmail.com, cw00.choi@samsung.com, sboyd@kernel.org, mturquette@baylibre.com, linux-samsung-soc@vger.kernel.org, linux-kernel@vger.kernel.org, b.zolnierkie@samsung.com, m.szyprowski@samsung.com, Sylwester Nawrocki <s.nawrocki@samsung.com> Subject: [PATCH v2] clk: samsung: Prevent potential endless loop in the PLL set_rate ops Date: Tue, 11 Aug 2020 13:25:07 +0200 Message-Id: <20200811112507.24418-1-s.nawrocki@samsung.com> Content-Type: text/plain; charset="utf-8" CMS-TYPE: 201P References: <CGME20200811112518eucas1p2a751f664a907ac7cd8e1dd235dc2fa54@eucas1p2.samsung.com> Sender: linux-samsung-soc-owner@vger.kernel.org Precedence: bulk
Series	[v2] clk: samsung: Prevent potential endless loop in the PLL set_rate ops \| expand [v2] clk: samsung: Prevent potential endless loop in the PLL set_rate ops

Sylwester Nawrocki/Kernel \(PLT\) /SRPOL/Staff Engineer/Samsung Electronics Aug. 11, 2020, 11:25 a.m. UTC

In the .set_rate callback for some PLLs there is a loop polling state
of the PLL lock bit and it may become an endless loop when something
goes wrong with the PLL. For some PLLs there is already (a duplicated)
code for polling with timeout. This patch replaces that code with
the readl_relaxed_poll_timeout_atomic() macro and moves it to a common
helper function, which is then used for all the PLLs. The downside
of switching to the common macro is that we drop the cpu_relax() call.
Using a common helper function rather than the macro directly allows
to avoid repeating the error message in the code and to avoid the object
code size increase due to inlining.

Signed-off-by: Sylwester Nawrocki <s.nawrocki@samsung.com>
---
Changes for v2:
 - use common readl_relaxed_poll_timeout_atomic() macro
---
 drivers/clk/samsung/clk-pll.c | 92 +++++++++++++++----------------------------
 1 file changed, 32 insertions(+), 60 deletions(-)

Tomasz Figa Aug. 11, 2020, 12:59 p.m. UTC | #1

Hi Sylwester,

2020年8月11日(火) 13:25 Sylwester Nawrocki <s.nawrocki@samsung.com>:
>
> In the .set_rate callback for some PLLs there is a loop polling state
> of the PLL lock bit and it may become an endless loop when something
> goes wrong with the PLL. For some PLLs there is already (a duplicated)
> code for polling with timeout. This patch replaces that code with
> the readl_relaxed_poll_timeout_atomic() macro and moves it to a common
> helper function, which is then used for all the PLLs. The downside
> of switching to the common macro is that we drop the cpu_relax() call.

Tbh. I'm not sure what effect was exactly expected from cpu_relax() in
the functions which already had timeout handling. Could someone shed
some light on this?

> Using a common helper function rather than the macro directly allows
> to avoid repeating the error message in the code and to avoid the object
> code size increase due to inlining.
>
> Signed-off-by: Sylwester Nawrocki <s.nawrocki@samsung.com>
> ---
> Changes for v2:
>  - use common readl_relaxed_poll_timeout_atomic() macro
> ---
>  drivers/clk/samsung/clk-pll.c | 92 +++++++++++++++----------------------------
>  1 file changed, 32 insertions(+), 60 deletions(-)
>
> diff --git a/drivers/clk/samsung/clk-pll.c b/drivers/clk/samsung/clk-pll.c
> index ac70ad7..c3c1efe 100644
> --- a/drivers/clk/samsung/clk-pll.c
> +++ b/drivers/clk/samsung/clk-pll.c
> @@ -9,13 +9,14 @@
>  #include <linux/errno.h>
>  #include <linux/hrtimer.h>
>  #include <linux/delay.h>
> +#include <linux/iopoll.h>
>  #include <linux/slab.h>
>  #include <linux/clk-provider.h>
>  #include <linux/io.h>
>  #include "clk.h"
>  #include "clk-pll.h"
>
> -#define PLL_TIMEOUT_MS         10
> +#define PLL_TIMEOUT_US         10000U

I'm also wondering if 10ms is the universal value that would cover the
oldest PLLs as well, but my loose recollection is that they should
still lock much faster than that. Could you double check that in the
documentation?

Otherwise the patch looks good to me, thanks!

Reviewed-by: Tomasz Figa <tomasz.figa@gmail.com>

Best regards,
Tomasz

>
>  struct samsung_clk_pll {
>         struct clk_hw           hw;
> @@ -63,6 +64,22 @@ static long samsung_pll_round_rate(struct clk_hw *hw,
>         return rate_table[i - 1].rate;
>  }
>
> +static int samsung_pll_lock_wait(struct samsung_clk_pll *pll,
> +                                unsigned int reg_mask)
> +{
> +       u32 val;
> +       int ret;
> +
> +       /* Wait until the PLL is in steady locked state */
> +       ret = readl_relaxed_poll_timeout_atomic(pll->con_reg, val,
> +                                       val & reg_mask, 0, PLL_TIMEOUT_US);
> +       if (ret < 0)
> +               pr_err("%s: Could not lock PLL %s\n",
> +                      __func__, clk_hw_get_name(&pll->hw));
> +
> +       return ret;
> +}
> +
>  static int samsung_pll3xxx_enable(struct clk_hw *hw)
>  {
>         struct samsung_clk_pll *pll = to_clk_pll(hw);
> @@ -241,12 +258,9 @@ static int samsung_pll35xx_set_rate(struct clk_hw *hw, unsigned long drate,
>         writel_relaxed(tmp, pll->con_reg);
>
>         /* Wait until the PLL is locked if it is enabled. */
> -       if (tmp & BIT(pll->enable_offs)) {
> -               do {
> -                       cpu_relax();
> -                       tmp = readl_relaxed(pll->con_reg);
> -               } while (!(tmp & BIT(pll->lock_offs)));
> -       }
> +       if (tmp & BIT(pll->enable_offs))
> +               return samsung_pll_lock_wait(pll, BIT(pll->lock_offs));
> +
>         return 0;
>  }
>
> @@ -318,7 +332,7 @@ static int samsung_pll36xx_set_rate(struct clk_hw *hw, unsigned long drate,
>                                         unsigned long parent_rate)
>  {
>         struct samsung_clk_pll *pll = to_clk_pll(hw);
> -       u32 tmp, pll_con0, pll_con1;
> +       u32 pll_con0, pll_con1;
>         const struct samsung_pll_rate_table *rate;
>
>         rate = samsung_get_pll_settings(pll, drate);
> @@ -356,13 +370,8 @@ static int samsung_pll36xx_set_rate(struct clk_hw *hw, unsigned long drate,
>         pll_con1 |= rate->kdiv << PLL36XX_KDIV_SHIFT;
>         writel_relaxed(pll_con1, pll->con_reg + 4);
>
> -       /* wait_lock_time */
> -       if (pll_con0 & BIT(pll->enable_offs)) {
> -               do {
> -                       cpu_relax();
> -                       tmp = readl_relaxed(pll->con_reg);
> -               } while (!(tmp & BIT(pll->lock_offs)));
> -       }
> +       if (pll_con0 & BIT(pll->enable_offs))
> +               return samsung_pll_lock_wait(pll, BIT(pll->lock_offs));
>
>         return 0;
>  }
> @@ -437,7 +446,6 @@ static int samsung_pll45xx_set_rate(struct clk_hw *hw, unsigned long drate,
>         struct samsung_clk_pll *pll = to_clk_pll(hw);
>         const struct samsung_pll_rate_table *rate;
>         u32 con0, con1;
> -       ktime_t start;
>
>         /* Get required rate settings from table */
>         rate = samsung_get_pll_settings(pll, drate);
> @@ -489,20 +497,7 @@ static int samsung_pll45xx_set_rate(struct clk_hw *hw, unsigned long drate,
>         writel_relaxed(con0, pll->con_reg);
>
>         /* Wait for locking. */
> -       start = ktime_get();
> -       while (!(readl_relaxed(pll->con_reg) & PLL45XX_LOCKED)) {
> -               ktime_t delta = ktime_sub(ktime_get(), start);
> -
> -               if (ktime_to_ms(delta) > PLL_TIMEOUT_MS) {
> -                       pr_err("%s: could not lock PLL %s\n",
> -                                       __func__, clk_hw_get_name(hw));
> -                       return -EFAULT;
> -               }
> -
> -               cpu_relax();
> -       }
> -
> -       return 0;
> +       return samsung_pll_lock_wait(pll, PLL45XX_LOCKED);
>  }
>
>  static const struct clk_ops samsung_pll45xx_clk_ops = {
> @@ -588,7 +583,6 @@ static int samsung_pll46xx_set_rate(struct clk_hw *hw, unsigned long drate,
>         struct samsung_clk_pll *pll = to_clk_pll(hw);
>         const struct samsung_pll_rate_table *rate;
>         u32 con0, con1, lock;
> -       ktime_t start;
>
>         /* Get required rate settings from table */
>         rate = samsung_get_pll_settings(pll, drate);
> @@ -648,20 +642,7 @@ static int samsung_pll46xx_set_rate(struct clk_hw *hw, unsigned long drate,
>         writel_relaxed(con1, pll->con_reg + 0x4);
>
>         /* Wait for locking. */
> -       start = ktime_get();
> -       while (!(readl_relaxed(pll->con_reg) & PLL46XX_LOCKED)) {
> -               ktime_t delta = ktime_sub(ktime_get(), start);
> -
> -               if (ktime_to_ms(delta) > PLL_TIMEOUT_MS) {
> -                       pr_err("%s: could not lock PLL %s\n",
> -                                       __func__, clk_hw_get_name(hw));
> -                       return -EFAULT;
> -               }
> -
> -               cpu_relax();
> -       }
> -
> -       return 0;
> +       return samsung_pll_lock_wait(pll, PLL46XX_LOCKED);
>  }
>
>  static const struct clk_ops samsung_pll46xx_clk_ops = {
> @@ -1035,14 +1016,9 @@ static int samsung_pll2550xx_set_rate(struct clk_hw *hw, unsigned long drate,
>                         (rate->sdiv << PLL2550XX_S_SHIFT);
>         writel_relaxed(tmp, pll->con_reg);
>
> -       /* wait_lock_time */
> -       do {
> -               cpu_relax();
> -               tmp = readl_relaxed(pll->con_reg);
> -       } while (!(tmp & (PLL2550XX_LOCK_STAT_MASK
> -                       << PLL2550XX_LOCK_STAT_SHIFT)));
> -
> -       return 0;
> +       /* Wait for locking. */
> +       return samsung_pll_lock_wait(pll,
> +                       PLL2550XX_LOCK_STAT_MASK << PLL2550XX_LOCK_STAT_SHIFT);
>  }
>
>  static const struct clk_ops samsung_pll2550xx_clk_ops = {
> @@ -1132,13 +1108,9 @@ static int samsung_pll2650x_set_rate(struct clk_hw *hw, unsigned long drate,
>         con1 |= ((rate->kdiv & PLL2650X_K_MASK) << PLL2650X_K_SHIFT);
>         writel_relaxed(con1, pll->con_reg + 4);
>
> -       do {
> -               cpu_relax();
> -               con0 = readl_relaxed(pll->con_reg);
> -       } while (!(con0 & (PLL2650X_LOCK_STAT_MASK
> -                       << PLL2650X_LOCK_STAT_SHIFT)));
> -
> -       return 0;
> +       /* Wait for locking. */
> +       return samsung_pll_lock_wait(pll,
> +                       PLL2650X_LOCK_STAT_MASK << PLL2650X_LOCK_STAT_SHIFT);
>  }
>
>  static const struct clk_ops samsung_pll2650x_clk_ops = {
> --
> 2.7.4
>

Krzysztof Kozlowski Aug. 11, 2020, 4:23 p.m. UTC | #2

On Tue, Aug 11, 2020 at 02:59:07PM +0200, Tomasz Figa wrote:
> Hi Sylwester,
> 
> 2020年8月11日(火) 13:25 Sylwester Nawrocki <s.nawrocki@samsung.com>:
> >
> > In the .set_rate callback for some PLLs there is a loop polling state
> > of the PLL lock bit and it may become an endless loop when something
> > goes wrong with the PLL. For some PLLs there is already (a duplicated)
> > code for polling with timeout. This patch replaces that code with
> > the readl_relaxed_poll_timeout_atomic() macro and moves it to a common
> > helper function, which is then used for all the PLLs. The downside
> > of switching to the common macro is that we drop the cpu_relax() call.
> 
> Tbh. I'm not sure what effect was exactly expected from cpu_relax() in
> the functions which already had timeout handling. Could someone shed
> some light on this?

For us, it should not matter much, except:
1. when on A9 with ARM_ERRATA_754327, but we do not enable it on our
   platforms,
2. it is a generic pattern for busy loops.

On other architectures it could mean something (e.g. yield to other
hyper-threading CPU).

Looks good.

Acked-by: Krzysztof Kozlowski <krzk@kernel.org>

Best regards,
Krzysztof

Tomasz Figa Aug. 11, 2020, 4:28 p.m. UTC | #3

2020年8月11日(火) 18:24 Krzysztof Kozlowski <krzk@kernel.org>:
>
> On Tue, Aug 11, 2020 at 02:59:07PM +0200, Tomasz Figa wrote:
> > Hi Sylwester,
> >
> > 2020年8月11日(火) 13:25 Sylwester Nawrocki <s.nawrocki@samsung.com>:
> > >
> > > In the .set_rate callback for some PLLs there is a loop polling state
> > > of the PLL lock bit and it may become an endless loop when something
> > > goes wrong with the PLL. For some PLLs there is already (a duplicated)
> > > code for polling with timeout. This patch replaces that code with
> > > the readl_relaxed_poll_timeout_atomic() macro and moves it to a common
> > > helper function, which is then used for all the PLLs. The downside
> > > of switching to the common macro is that we drop the cpu_relax() call.
> >
> > Tbh. I'm not sure what effect was exactly expected from cpu_relax() in
> > the functions which already had timeout handling. Could someone shed
> > some light on this?
>
> For us, it should not matter much, except:
> 1. when on A9 with ARM_ERRATA_754327, but we do not enable it on our
>    platforms,
> 2. it is a generic pattern for busy loops.
>
> On other architectures it could mean something (e.g. yield to other
> hyper-threading CPU).

Okay, thanks for confirming that it doesn't matter for us.

Now, I wonder if the readx_poll_*() helpers are supposed to take all
of those into account or on systems which would benefit from such
operations, it would be the caller's responsibility.

Best regards,
Tomasz

Krzysztof Kozlowski Aug. 11, 2020, 4:34 p.m. UTC | #4

On Tue, Aug 11, 2020 at 06:28:18PM +0200, Tomasz Figa wrote:
> 2020年8月11日(火) 18:24 Krzysztof Kozlowski <krzk@kernel.org>:
> >
> > On Tue, Aug 11, 2020 at 02:59:07PM +0200, Tomasz Figa wrote:
> > > Hi Sylwester,
> > >
> > > 2020年8月11日(火) 13:25 Sylwester Nawrocki <s.nawrocki@samsung.com>:
> > > >
> > > > In the .set_rate callback for some PLLs there is a loop polling state
> > > > of the PLL lock bit and it may become an endless loop when something
> > > > goes wrong with the PLL. For some PLLs there is already (a duplicated)
> > > > code for polling with timeout. This patch replaces that code with
> > > > the readl_relaxed_poll_timeout_atomic() macro and moves it to a common
> > > > helper function, which is then used for all the PLLs. The downside
> > > > of switching to the common macro is that we drop the cpu_relax() call.
> > >
> > > Tbh. I'm not sure what effect was exactly expected from cpu_relax() in
> > > the functions which already had timeout handling. Could someone shed
> > > some light on this?
> >
> > For us, it should not matter much, except:
> > 1. when on A9 with ARM_ERRATA_754327, but we do not enable it on our
> >    platforms,
> > 2. it is a generic pattern for busy loops.
> >
> > On other architectures it could mean something (e.g. yield to other
> > hyper-threading CPU).
> 
> Okay, thanks for confirming that it doesn't matter for us.
> 
> Now, I wonder if the readx_poll_*() helpers are supposed to take all
> of those into account or on systems which would benefit from such
> operations, it would be the caller's responsibility.

That's a very good point. In case of ARM_ERRATA_754327, busy waiting
should have a barrier thus cpu_relax() is desired. I guess the generic
macro for busy waiting therefore should use them.

Best regards,
Krzysztof

Tomasz Figa Aug. 11, 2020, 4:41 p.m. UTC | #5

2020年8月11日(火) 18:34 Krzysztof Kozlowski <krzk@kernel.org>:
>
> On Tue, Aug 11, 2020 at 06:28:18PM +0200, Tomasz Figa wrote:
> > 2020年8月11日(火) 18:24 Krzysztof Kozlowski <krzk@kernel.org>:
> > >
> > > On Tue, Aug 11, 2020 at 02:59:07PM +0200, Tomasz Figa wrote:
> > > > Hi Sylwester,
> > > >
> > > > 2020年8月11日(火) 13:25 Sylwester Nawrocki <s.nawrocki@samsung.com>:
> > > > >
> > > > > In the .set_rate callback for some PLLs there is a loop polling state
> > > > > of the PLL lock bit and it may become an endless loop when something
> > > > > goes wrong with the PLL. For some PLLs there is already (a duplicated)
> > > > > code for polling with timeout. This patch replaces that code with
> > > > > the readl_relaxed_poll_timeout_atomic() macro and moves it to a common
> > > > > helper function, which is then used for all the PLLs. The downside
> > > > > of switching to the common macro is that we drop the cpu_relax() call.
> > > >
> > > > Tbh. I'm not sure what effect was exactly expected from cpu_relax() in
> > > > the functions which already had timeout handling. Could someone shed
> > > > some light on this?
> > >
> > > For us, it should not matter much, except:
> > > 1. when on A9 with ARM_ERRATA_754327, but we do not enable it on our
> > >    platforms,
> > > 2. it is a generic pattern for busy loops.
> > >
> > > On other architectures it could mean something (e.g. yield to other
> > > hyper-threading CPU).
> >
> > Okay, thanks for confirming that it doesn't matter for us.
> >
> > Now, I wonder if the readx_poll_*() helpers are supposed to take all
> > of those into account or on systems which would benefit from such
> > operations, it would be the caller's responsibility.
>
> That's a very good point. In case of ARM_ERRATA_754327, busy waiting
> should have a barrier thus cpu_relax() is desired. I guess the generic
> macro for busy waiting therefore should use them.

Is there yet another macro available somewhere or you mean
read_poll_timeout_atomic()? The latter doesn't include cpu_relax().
Given that udelay() likely already does this kind of an idle call,
perhaps it could be as simple as this?

        if (__delay_us) \
                udelay(__delay_us); \
+       else \
+               cpu_relax(); \

On the other hand, I wonder if there are cases where a call to
cpu_relax() is not desirable.

Best regards,
Tomasz

Sylwester Nawrocki/Kernel \(PLT\) /SRPOL/Staff Engineer/Samsung Electronics Aug. 11, 2020, 4:45 p.m. UTC | #6

Hi Tomasz,

On 11.08.2020 14:59, Tomasz Figa wrote:
> 2020年8月11日(火) 13:25 Sylwester Nawrocki <s.nawrocki@samsung.com>:
>>
>> In the .set_rate callback for some PLLs there is a loop polling state
>> of the PLL lock bit and it may become an endless loop when something
>> goes wrong with the PLL. For some PLLs there is already (a duplicated)
>> code for polling with timeout. This patch replaces that code with
>> the readl_relaxed_poll_timeout_atomic() macro and moves it to a common
>> helper function, which is then used for all the PLLs. The downside
>> of switching to the common macro is that we drop the cpu_relax() call.
> 
> Tbh. I'm not sure what effect was exactly expected from cpu_relax() in
> the functions which already had timeout handling. Could someone shed
> some light on this?
> 
>> Using a common helper function rather than the macro directly allows
>> to avoid repeating the error message in the code and to avoid the object
>> code size increase due to inlining.
>>
>> Signed-off-by: Sylwester Nawrocki <s.nawrocki@samsung.com>
>> ---
>> Changes for v2:
>>  - use common readl_relaxed_poll_timeout_atomic() macro
>> ---
>>  drivers/clk/samsung/clk-pll.c | 92 +++++++++++++++----------------------------
>>  1 file changed, 32 insertions(+), 60 deletions(-)
>>
>> diff --git a/drivers/clk/samsung/clk-pll.c b/drivers/clk/samsung/clk-pll.c
>> index ac70ad7..c3c1efe 100644
>> --- a/drivers/clk/samsung/clk-pll.c
>> +++ b/drivers/clk/samsung/clk-pll.c
>> @@ -9,13 +9,14 @@

>> -#define PLL_TIMEOUT_MS         10
>> +#define PLL_TIMEOUT_US         10000U
> 
> I'm also wondering if 10ms is the universal value that would cover the
> oldest PLLs as well, but my loose recollection is that they should
> still lock much faster than that. Could you double check that in the
> documentation?

Thanks for your comments.

The oldest PLLs have a hard coded 300 us waiting time for PLL lock and 
are not affected by the patch.
I have checked some of the PLLs and maximum observed lock time was around 
370 us and most of the time it was just a few us.

We calculate the lock time in each set_rate op, in the oscillator cycle
units, as a product of current P divider value and a constant PLL type 
specific LOCK_FACTOR. Maximum possible P value is 64, maximum possible
LOCK_FACTOR is 3000. Assuming minimum VCO frequency of 24 MHz (which 
I think will usually be much higher than that) maximum lock time
would be (64 x 3000) / 24 MHz = 8 ms. I think we can leave the current 
10 ms value.

But there is other issue, it seems we can't really use the ktime API
in the set_rate callbacks, as these could be called early, before the
clocksource is initialized and ktime doesn't work yet. Below trace 
is from a dump_stack() added to the samsung_pll_lock_wait() callback.
The PLL rate setting is triggered by assigned-clock* properties in 
the clock supplier node.
I think we need to switch to a simple udelay() loop, as is done in 
clk-tegra210 driver for instance.

[    0.000000] Hardware name: Samsung Exynos (Flattened Device Tree)
[    0.000000] [<c0111e9c>] (unwind_backtrace) from [<c010d0ec>] (show_stack+0x10/0x14)
[    0.000000] [<c010d0ec>] (show_stack) from [<c051d890>] (dump_stack+0xac/0xd8)
[    0.000000] [<c051d890>] (dump_stack) from [<c0578d94>] (samsung_pll_lock_wait+0x14/0x174)
[    0.000000] [<c0578d94>] (samsung_pll_lock_wait) from [<c057319c>] (clk_change_rate+0x1a8/0x8ac)
[    0.000000] [<c057319c>] (clk_change_rate) from [<c0573aec>] (clk_core_set_rate_nolock+0x24c/0x268)
[    0.000000] [<c0573aec>] (clk_core_set_rate_nolock) from [<c0573b38>] (clk_set_rate+0x30/0x64)
[    0.000000] [<c0573b38>] (clk_set_rate) from [<c0577df8>] (of_clk_set_defaults+0x214/0x384)
[    0.000000] [<c0577df8>] (of_clk_set_defaults) from [<c0572f34>] (of_clk_add_hw_provider+0x98/0xd8)
[    0.000000] [<c0572f34>] (of_clk_add_hw_provider) from [<c1120278>] (samsung_clk_of_add_provider+0x1c/0x30)
[    0.000000] [<c1120278>] (samsung_clk_of_add_provider) from [<c1121844>] (exynos5250_clk_of_clk_init_driver+0x1f4/0x240)
[    0.000000] [<c1121844>] (exynos5250_clk_of_clk_init_driver) from [<c11200d0>] (of_clk_init+0x16c/0x218)
[    0.000000] [<c11200d0>] (of_clk_init) from [<c1104bdc>] (time_init+0x24/0x30)
[    0.000000] [<c1104bdc>] (time_init) from [<c1100d20>] (start_kernel+0x3b0/0x520)
[    0.000000] [<c1100d20>] (start_kernel) from [<00000000>] (0x0)
[    0.000000] samsung_pll_lock_wait: PLL fout_epll, lock time: 0 us, ret: 0
[    0.000000] Exynos5250: clock setup completed, armclk=1700000000
[    0.000000] Switching to timer-based delay loop, resolution 41ns
[    0.000000] clocksource: mct-frc: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 79635851949 ns
[    0.000003] sched_clock: 32 bits at 24MHz, resolution 41ns, wraps every 89478484971ns
[    0.000032] genirq: irq_chip COMBINER did not update eff. affinity mask of irq 49
[    0.000523] arch_timer: cp15 timer(s) running at 24.00MHz (virt).
[    0.000536] clocksource: arch_sys_counter: mask: 0xffffffffffffff max_cycles: 0x588fe9dc0, max_idle_ns: 440795202592 ns
[    0.000551] sched_clock: 56 bits at 24MHz, resolution 41ns, wraps every 4398046511097ns

Krzysztof Kozlowski Aug. 11, 2020, 4:46 p.m. UTC | #7

On Tue, Aug 11, 2020 at 06:41:20PM +0200, Tomasz Figa wrote:
> 2020年8月11日(火) 18:34 Krzysztof Kozlowski <krzk@kernel.org>:
> >
> > On Tue, Aug 11, 2020 at 06:28:18PM +0200, Tomasz Figa wrote:
> > > 2020年8月11日(火) 18:24 Krzysztof Kozlowski <krzk@kernel.org>:
> > > >
> > > > On Tue, Aug 11, 2020 at 02:59:07PM +0200, Tomasz Figa wrote:
> > > > > Hi Sylwester,
> > > > >
> > > > > 2020年8月11日(火) 13:25 Sylwester Nawrocki <s.nawrocki@samsung.com>:
> > > > > >
> > > > > > In the .set_rate callback for some PLLs there is a loop polling state
> > > > > > of the PLL lock bit and it may become an endless loop when something
> > > > > > goes wrong with the PLL. For some PLLs there is already (a duplicated)
> > > > > > code for polling with timeout. This patch replaces that code with
> > > > > > the readl_relaxed_poll_timeout_atomic() macro and moves it to a common
> > > > > > helper function, which is then used for all the PLLs. The downside
> > > > > > of switching to the common macro is that we drop the cpu_relax() call.
> > > > >
> > > > > Tbh. I'm not sure what effect was exactly expected from cpu_relax() in
> > > > > the functions which already had timeout handling. Could someone shed
> > > > > some light on this?
> > > >
> > > > For us, it should not matter much, except:
> > > > 1. when on A9 with ARM_ERRATA_754327, but we do not enable it on our
> > > >    platforms,
> > > > 2. it is a generic pattern for busy loops.
> > > >
> > > > On other architectures it could mean something (e.g. yield to other
> > > > hyper-threading CPU).
> > >
> > > Okay, thanks for confirming that it doesn't matter for us.
> > >
> > > Now, I wonder if the readx_poll_*() helpers are supposed to take all
> > > of those into account or on systems which would benefit from such
> > > operations, it would be the caller's responsibility.
> >
> > That's a very good point. In case of ARM_ERRATA_754327, busy waiting
> > should have a barrier thus cpu_relax() is desired. I guess the generic
> > macro for busy waiting therefore should use them.
> 
> Is there yet another macro available somewhere or you mean
> read_poll_timeout_atomic()? The latter doesn't include cpu_relax().

Yes, I meant the generic read_poll_timeout_atomic().

> Given that udelay() likely already does this kind of an idle call,
> perhaps it could be as simple as this?
> 
>         if (__delay_us) \
>                 udelay(__delay_us); \
> +       else \
> +               cpu_relax(); \
> 

I think udelay does not have it. Delaying by some simple operations
(e.g. multiplication) does not require IO barriers.

> On the other hand, I wonder if there are cases where a call to
> cpu_relax() is not desirable.

Hmmm, it is really a generic pattern all over the kernel, so I doubt
that generic macros should target such case.

Best regards,
Krzysztof

Tomasz Figa Aug. 11, 2020, 4:53 p.m. UTC | #8

2020年8月11日(火) 18:45 Sylwester Nawrocki <s.nawrocki@samsung.com>:
>
> Hi Tomasz,
>
> On 11.08.2020 14:59, Tomasz Figa wrote:
> > 2020年8月11日(火) 13:25 Sylwester Nawrocki <s.nawrocki@samsung.com>:
> >>
> >> In the .set_rate callback for some PLLs there is a loop polling state
> >> of the PLL lock bit and it may become an endless loop when something
> >> goes wrong with the PLL. For some PLLs there is already (a duplicated)
> >> code for polling with timeout. This patch replaces that code with
> >> the readl_relaxed_poll_timeout_atomic() macro and moves it to a common
> >> helper function, which is then used for all the PLLs. The downside
> >> of switching to the common macro is that we drop the cpu_relax() call.
> >
> > Tbh. I'm not sure what effect was exactly expected from cpu_relax() in
> > the functions which already had timeout handling. Could someone shed
> > some light on this?
> >
> >> Using a common helper function rather than the macro directly allows
> >> to avoid repeating the error message in the code and to avoid the object
> >> code size increase due to inlining.
> >>
> >> Signed-off-by: Sylwester Nawrocki <s.nawrocki@samsung.com>
> >> ---
> >> Changes for v2:
> >>  - use common readl_relaxed_poll_timeout_atomic() macro
> >> ---
> >>  drivers/clk/samsung/clk-pll.c | 92 +++++++++++++++----------------------------
> >>  1 file changed, 32 insertions(+), 60 deletions(-)
> >>
> >> diff --git a/drivers/clk/samsung/clk-pll.c b/drivers/clk/samsung/clk-pll.c
> >> index ac70ad7..c3c1efe 100644
> >> --- a/drivers/clk/samsung/clk-pll.c
> >> +++ b/drivers/clk/samsung/clk-pll.c
> >> @@ -9,13 +9,14 @@
>
> >> -#define PLL_TIMEOUT_MS         10
> >> +#define PLL_TIMEOUT_US         10000U
> >
> > I'm also wondering if 10ms is the universal value that would cover the
> > oldest PLLs as well, but my loose recollection is that they should
> > still lock much faster than that. Could you double check that in the
> > documentation?
>
> Thanks for your comments.
>
> The oldest PLLs have a hard coded 300 us waiting time for PLL lock and
> are not affected by the patch.
> I have checked some of the PLLs and maximum observed lock time was around
> 370 us and most of the time it was just a few us.
>
> We calculate the lock time in each set_rate op, in the oscillator cycle
> units, as a product of current P divider value and a constant PLL type
> specific LOCK_FACTOR. Maximum possible P value is 64, maximum possible
> LOCK_FACTOR is 3000. Assuming minimum VCO frequency of 24 MHz (which
> I think will usually be much higher than that) maximum lock time
> would be (64 x 3000) / 24 MHz = 8 ms. I think we can leave the current
> 10 ms value.

Sounds good to me. Thanks!

>
> But there is other issue, it seems we can't really use the ktime API
> in the set_rate callbacks, as these could be called early, before the
> clocksource is initialized and ktime doesn't work yet. Below trace
> is from a dump_stack() added to the samsung_pll_lock_wait() callback.
> The PLL rate setting is triggered by assigned-clock* properties in
> the clock supplier node.
> I think we need to switch to a simple udelay() loop, as is done in
> clk-tegra210 driver for instance.
>
> [    0.000000] Hardware name: Samsung Exynos (Flattened Device Tree)
> [    0.000000] [<c0111e9c>] (unwind_backtrace) from [<c010d0ec>] (show_stack+0x10/0x14)
> [    0.000000] [<c010d0ec>] (show_stack) from [<c051d890>] (dump_stack+0xac/0xd8)
> [    0.000000] [<c051d890>] (dump_stack) from [<c0578d94>] (samsung_pll_lock_wait+0x14/0x174)
> [    0.000000] [<c0578d94>] (samsung_pll_lock_wait) from [<c057319c>] (clk_change_rate+0x1a8/0x8ac)
> [    0.000000] [<c057319c>] (clk_change_rate) from [<c0573aec>] (clk_core_set_rate_nolock+0x24c/0x268)
> [    0.000000] [<c0573aec>] (clk_core_set_rate_nolock) from [<c0573b38>] (clk_set_rate+0x30/0x64)
> [    0.000000] [<c0573b38>] (clk_set_rate) from [<c0577df8>] (of_clk_set_defaults+0x214/0x384)
> [    0.000000] [<c0577df8>] (of_clk_set_defaults) from [<c0572f34>] (of_clk_add_hw_provider+0x98/0xd8)
> [    0.000000] [<c0572f34>] (of_clk_add_hw_provider) from [<c1120278>] (samsung_clk_of_add_provider+0x1c/0x30)
> [    0.000000] [<c1120278>] (samsung_clk_of_add_provider) from [<c1121844>] (exynos5250_clk_of_clk_init_driver+0x1f4/0x240)
> [    0.000000] [<c1121844>] (exynos5250_clk_of_clk_init_driver) from [<c11200d0>] (of_clk_init+0x16c/0x218)
> [    0.000000] [<c11200d0>] (of_clk_init) from [<c1104bdc>] (time_init+0x24/0x30)
> [    0.000000] [<c1104bdc>] (time_init) from [<c1100d20>] (start_kernel+0x3b0/0x520)

Yeah... I should've thought about this. Interestingly enough, some of
the existing implementations in drivers/clk/samsung/clk-pll.c use the
ktime API. I guess they are lucky enough not to be called too early,
i.e. are not needed for the initialization of timers.

Best regards,
Tomasz


> [    0.000000] [<c1100d20>] (start_kernel) from [<00000000>] (0x0)
> [    0.000000] samsung_pll_lock_wait: PLL fout_epll, lock time: 0 us, ret: 0
> [    0.000000] Exynos5250: clock setup completed, armclk=1700000000
> [    0.000000] Switching to timer-based delay loop, resolution 41ns
> [    0.000000] clocksource: mct-frc: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 79635851949 ns
> [    0.000003] sched_clock: 32 bits at 24MHz, resolution 41ns, wraps every 89478484971ns
> [    0.000032] genirq: irq_chip COMBINER did not update eff. affinity mask of irq 49
> [    0.000523] arch_timer: cp15 timer(s) running at 24.00MHz (virt).
> [    0.000536] clocksource: arch_sys_counter: mask: 0xffffffffffffff max_cycles: 0x588fe9dc0, max_idle_ns: 440795202592 ns
> [    0.000551] sched_clock: 56 bits at 24MHz, resolution 41ns, wraps every 4398046511097ns
>
> --
> Regards,
> Sylwester

Sylwester Nawrocki/Kernel \(PLT\) /SRPOL/Staff Engineer/Samsung Electronics Aug. 11, 2020, 5:26 p.m. UTC | #9

On 11.08.2020 18:53, Tomasz Figa wrote:
> Yeah... I should've thought about this. Interestingly enough, some of
> the existing implementations in drivers/clk/samsung/clk-pll.c use the
> ktime API. I guess they are lucky enough not to be called too early,
> i.e. are not needed for the initialization of timers.

In normal conditions nothing bad really happens, the loop exits as soon
as the lock bit is asserted. Just the timeout condition will never be
detected - if there is something wrong with the PLL's setup and it never
locks we will never stop polling.

[v2] clk: samsung: Prevent potential endless loop in the PLL set_rate ops

Commit Message

Comments

Patch