Message ID | 20210714064129.1321277-1-victor.liu@nxp.com (mailing list archive) |
---|---|
State | RFC, archived |
Headers | show |
Series | [RFC] clk: fractional-divider: Correct max_{m,n} handed over to rational_best_approximation() | expand |
On Wed, Jul 14, 2021 at 02:41:29PM +0800, Liu Ying wrote: > If a fractional divider clock has the flag > CLK_FRAC_DIVIDER_ZERO_BASED set, the maximum > numerator and denominator handed over to > rational_best_approximation(), in this case > max_m and max_n, should be increased by one > comparing to those have the flag unset. Without > this patch, a zero based fractional divider > with 1-bit mwidth and 3-bit nwidth would wrongly > generate 96MHz clock rate if the parent clock > rate is 288MHz, while the expected clock rate > is 115.2MHz with m = 2 and n = 5. Make sure that your editor is configured to allow you to have lines ~70-72 characters long. ... > The patch is RFC, because the rationale behind the below snippet in > clk_fd_general_approximation() is unclear to Jacky and me and we are > not sure if there is any room to improve this patch due to the snippet. > Maybe, Andy may help shed some light here. Thanks. > > -----------------------------------8<--------------------------------- > /* > * Get rate closer to *parent_rate to guarantee there is no overflow > * for m and n. In the result it will be the nearest rate left shifted > * by (scale - fd->nwidth) bits. > */ I don't know how to rephrase above comment better. > scale = fls_long(*parent_rate / rate - 1); > if (scale > fd->nwidth) > rate <<= scale - fd->nwidth; This takes an advantage of the numbers be in a form of n = k * 2^m, (1) where m will be scale in the snippet above. Thus, if n can be represented by (1), we opportunistically reduce amount of bits needed for it by shifting right by m bits. Does it make sense? The code looks good to me, btw, although I dunno if you need to call the newly introduced function before or after the above mentioned snippet.
On Wed, 2021-07-14 at 12:12 +0300, Andy Shevchenko wrote: > On Wed, Jul 14, 2021 at 02:41:29PM +0800, Liu Ying wrote: > > If a fractional divider clock has the flag > > CLK_FRAC_DIVIDER_ZERO_BASED set, the maximum > > numerator and denominator handed over to > > rational_best_approximation(), in this case > > max_m and max_n, should be increased by one > > comparing to those have the flag unset. Without > > this patch, a zero based fractional divider > > with 1-bit mwidth and 3-bit nwidth would wrongly > > generate 96MHz clock rate if the parent clock > > rate is 288MHz, while the expected clock rate > > is 115.2MHz with m = 2 and n = 5. > > Make sure that your editor is configured to allow you to have lines ~70-72 > characters long. Alright, I'll see if I can improve this in v2 if necessary. > > ... > > > The patch is RFC, because the rationale behind the below snippet in > > clk_fd_general_approximation() is unclear to Jacky and me and we are > > not sure if there is any room to improve this patch due to the snippet. > > Maybe, Andy may help shed some light here. Thanks. > > > > -----------------------------------8<--------------------------------- > > /* > > * Get rate closer to *parent_rate to guarantee there is no overflow > > * for m and n. In the result it will be the nearest rate left shifted > > * by (scale - fd->nwidth) bits. > > */ > > I don't know how to rephrase above comment better. > > > scale = fls_long(*parent_rate / rate - 1); > > if (scale > fd->nwidth) > > rate <<= scale - fd->nwidth; > > This takes an advantage of the numbers be in a form of > > n = k * 2^m, (1) > > where m will be scale in the snippet above. Thus, if n can be represented by > (1), we opportunistically reduce amount of bits needed for it by shifting right > by m bits. > > Does it make sense? Thanks for your explaination. But, sorry, Jacky and I still don't understand this. > > The code looks good to me, btw, although I dunno if you need to call the newly > introduced function before or after the above mentioned snippet. Assuming that snippet is fully orthogonal to this patch, then it doesn't matter if it's before or after. So, enlightenment/comments/ideas are welcome. Thanks, Liu Ying >
On Wed, Jul 14, 2021 at 06:10:46PM +0800, Liu Ying wrote: > On Wed, 2021-07-14 at 12:12 +0300, Andy Shevchenko wrote: > > On Wed, Jul 14, 2021 at 02:41:29PM +0800, Liu Ying wrote: ... > > > /* > > > * Get rate closer to *parent_rate to guarantee there is no overflow > > > * for m and n. In the result it will be the nearest rate left shifted > > > * by (scale - fd->nwidth) bits. > > > */ > > > > I don't know how to rephrase above comment better. > > > > > scale = fls_long(*parent_rate / rate - 1); > > > if (scale > fd->nwidth) > > > rate <<= scale - fd->nwidth; > > > > This takes an advantage of the numbers be in a form of > > > > n = k * 2^m, (1) > > > > where m will be scale in the snippet above. Thus, if n can be represented by > > (1), we opportunistically reduce amount of bits needed for it by shifting right > > by m bits. > > > > Does it make sense? > > Thanks for your explaination. > But, sorry, Jacky and I still don't understand this. Okay, We have two values in question: r_o (original rate of the parent clock) r_u (the rate user asked for) We have a pre-scaler block which asks for m (denominator) n (nominator) values to be provided to satisfy the (2) r_u ~= r_o * m / n, (2) where we try our best to make it "=" instead of "~=". Now, m and n have the limitation by a range, e.g. n >= 1, n < N_lim, where N_lim = 2^nlim. (3) Hence, from (2) and (3), assuming the worst case m = 1, ln2(r_o / r_u) <= nlim. (4) The above code tries to satisfy (4). Have you got it now? > > The code looks good to me, btw, although I dunno if you need to call the newly > > introduced function before or after the above mentioned snippet. > > Assuming that snippet is fully orthogonal to this patch, then it > doesn't matter if it's before or after. Please, double check this. Because you play with limits, while we expect them to satisfy (3).
On Wed, Jul 14, 2021 at 01:38:22PM +0300, Andy Shevchenko wrote: > On Wed, Jul 14, 2021 at 06:10:46PM +0800, Liu Ying wrote: > > On Wed, 2021-07-14 at 12:12 +0300, Andy Shevchenko wrote: > > > On Wed, Jul 14, 2021 at 02:41:29PM +0800, Liu Ying wrote: > > ... > > > > > /* > > > > * Get rate closer to *parent_rate to guarantee there is no overflow > > > > * for m and n. In the result it will be the nearest rate left shifted > > > > * by (scale - fd->nwidth) bits. > > > > */ > > > > > > I don't know how to rephrase above comment better. > > > > > > > scale = fls_long(*parent_rate / rate - 1); > > > > if (scale > fd->nwidth) > > > > rate <<= scale - fd->nwidth; > > > > > > This takes an advantage of the numbers be in a form of > > > > > > n = k * 2^m, (1) > > > > > > where m will be scale in the snippet above. Thus, if n can be represented by > > > (1), we opportunistically reduce amount of bits needed for it by shifting right > > > by m bits. > > > Does it make sense? > > > > Thanks for your explaination. > > But, sorry, Jacky and I still don't understand this. It seems I poorly chose the letters for (1). Let's rewrite above as This takes an advantage of the numbers be in a form of a = k * 2^b, (1) where b will be scale in the snippet above. Thus, if a can be represented by (1), we opportunistically reduce amount of bits needed for it by shifting right by b bits. Also note, "shifting right" here is about the result of n (see below). > Okay, We have two values in question: > r_o (original rate of the parent clock) > r_u (the rate user asked for) > > We have a pre-scaler block which asks for > m (denominator) > n (nominator) > values to be provided to satisfy the (2) > > r_u ~= r_o * m / n, (2) > > where we try our best to make it "=" instead of "~=". > > Now, m and n have the limitation by a range, e.g. > > n >= 1, n < N_lim, where N_lim = 2^nlim. (3) > > Hence, from (2) and (3), assuming the worst case m = 1, > > ln2(r_o / r_u) <= nlim. (4) > > The above code tries to satisfy (4). > > Have you got it now? > > > > The code looks good to me, btw, although I dunno if you need to call the newly > > > introduced function before or after the above mentioned snippet. > > > > Assuming that snippet is fully orthogonal to this patch, then it > > doesn't matter if it's before or after. > > Please, double check this. Because you play with limits, while we expect them > to satisfy (3). > > -- > With Best Regards, > Andy Shevchenko > >
On Wed, 2021-07-14 at 13:46 +0300, Andy Shevchenko wrote: > On Wed, Jul 14, 2021 at 01:38:22PM +0300, Andy Shevchenko wrote: > > On Wed, Jul 14, 2021 at 06:10:46PM +0800, Liu Ying wrote: > > > On Wed, 2021-07-14 at 12:12 +0300, Andy Shevchenko wrote: > > > > On Wed, Jul 14, 2021 at 02:41:29PM +0800, Liu Ying wrote: > > > > ... > > > > > > > /* > > > > > * Get rate closer to *parent_rate to guarantee there is no overflow > > > > > * for m and n. In the result it will be the nearest rate left shifted > > > > > * by (scale - fd->nwidth) bits. > > > > > */ > > > > > > > > I don't know how to rephrase above comment better. > > > > > > > > > scale = fls_long(*parent_rate / rate - 1); > > > > > if (scale > fd->nwidth) > > > > > rate <<= scale - fd->nwidth; > > > > > > > > This takes an advantage of the numbers be in a form of > > > > > > > > n = k * 2^m, (1) > > > > > > > > where m will be scale in the snippet above. Thus, if n can be represented by > > > > (1), we opportunistically reduce amount of bits needed for it by shifting right > > > > by m bits. > > > > Does it make sense? > > > > > > Thanks for your explaination. > > > But, sorry, Jacky and I still don't understand this. > > It seems I poorly chose the letters for (1). Let's rewrite above as > > This takes an advantage of the numbers be in a form of > > a = k * 2^b, (1) > > where b will be scale in the snippet above. Thus, if a can be represented by > (1), we opportunistically reduce amount of bits needed for it by shifting right > by b bits. > > Also note, "shifting right" here is about the result of n (see below). > > > Okay, We have two values in question: > > r_o (original rate of the parent clock) > > r_u (the rate user asked for) > > > > We have a pre-scaler block which asks for > > m (denominator) > > n (nominator) > > values to be provided to satisfy the (2) > > > > r_u ~= r_o * m / n, (2) > > > > where we try our best to make it "=" instead of "~=". > > > > Now, m and n have the limitation by a range, e.g. > > > > n >= 1, n < N_lim, where N_lim = 2^nlim. (3) > > > > Hence, from (2) and (3), assuming the worst case m = 1, > > > > ln2(r_o / r_u) <= nlim. (4) > > > > The above code tries to satisfy (4). > > > > Have you got it now? I'm afraid I haven't, sorry. Jacky, what about you? Is that snippet really needed? Without that snippet, it seems that rational_best_approximation() is able to offer best_numerator and best_denominator without the risk of overflow for m and n, since max_numerator and max_denominator are already handed over to rational_best_approximation()? Does rational_best_approximation() always offer best_numerator by the range of [1, max_numerator] and best_denominator [1, max_denominator]? Regards, Liu Ying > > > > > > The code looks good to me, btw, although I dunno if you need to call the newly > > > > introduced function before or after the above mentioned snippet. > > > > > > Assuming that snippet is fully orthogonal to this patch, then it > > > doesn't matter if it's before or after. > > > > Please, double check this. Because you play with limits, while we expect them > > to satisfy (3). > > > > -- > > With Best Regards, > > Andy Shevchenko > > > >
On Thu, Jul 15, 2021 at 8:51 AM Liu Ying <victor.liu@nxp.com> wrote: > On Wed, 2021-07-14 at 13:46 +0300, Andy Shevchenko wrote: > > On Wed, Jul 14, 2021 at 01:38:22PM +0300, Andy Shevchenko wrote: > > > On Wed, Jul 14, 2021 at 06:10:46PM +0800, Liu Ying wrote: > > > > On Wed, 2021-07-14 at 12:12 +0300, Andy Shevchenko wrote: > > > > > On Wed, Jul 14, 2021 at 02:41:29PM +0800, Liu Ying wrote: > > > > > > ... > > > > > > > > > /* > > > > > > * Get rate closer to *parent_rate to guarantee there is no overflow > > > > > > * for m and n. In the result it will be the nearest rate left shifted > > > > > > * by (scale - fd->nwidth) bits. > > > > > > */ > > > > > > > > > > I don't know how to rephrase above comment better. > > > > > > > > > > > scale = fls_long(*parent_rate / rate - 1); > > > > > > if (scale > fd->nwidth) > > > > > > rate <<= scale - fd->nwidth; > > > > > > > > > > This takes an advantage of the numbers be in a form of > > > > > > > > > > n = k * 2^m, (1) > > > > > > > > > > where m will be scale in the snippet above. Thus, if n can be represented by > > > > > (1), we opportunistically reduce amount of bits needed for it by shifting right > > > > > by m bits. > > > > > Does it make sense? > > > > > > > > Thanks for your explaination. > > > > But, sorry, Jacky and I still don't understand this. > > > > It seems I poorly chose the letters for (1). Let's rewrite above as > > > > This takes an advantage of the numbers be in a form of > > > > a = k * 2^b, (1) > > > > where b will be scale in the snippet above. Thus, if a can be represented by > > (1), we opportunistically reduce amount of bits needed for it by shifting right > > by b bits. > > > > Also note, "shifting right" here is about the result of n (see below). > > > > > Okay, We have two values in question: > > > r_o (original rate of the parent clock) > > > r_u (the rate user asked for) > > > > > > We have a pre-scaler block which asks for > > > m (denominator) > > > n (nominator) > > > values to be provided to satisfy the (2) > > > > > > r_u ~= r_o * m / n, (2) > > > > > > where we try our best to make it "=" instead of "~=". > > > > > > Now, m and n have the limitation by a range, e.g. > > > > > > n >= 1, n < N_lim, where N_lim = 2^nlim. (3) > > > > > > Hence, from (2) and (3), assuming the worst case m = 1, > > > > > > ln2(r_o / r_u) <= nlim. (4) > > > > > > The above code tries to satisfy (4). > > > > > > Have you got it now? > > I'm afraid I haven't, sorry. Jacky, what about you? You may take a pen and paper and model different cases. After all it's not rocket science, just arithmetics :-) > Is that snippet really needed? Yes. The (4) shows why. > Without that snippet, it seems that rational_best_approximation() is > able to offer best_numerator and best_denominator without the risk of > overflow for m and n, since max_numerator and max_denominator are > already handed over to rational_best_approximation()? No. > Does rational_best_approximation() always offer best_numerator by the > range of [1, max_numerator] and best_denominator [1, max_denominator]? Of course not, when it goes out of the range. > > > > > The code looks good to me, btw, although I dunno if you need to call the newly > > > > > introduced function before or after the above mentioned snippet. > > > > > > > > Assuming that snippet is fully orthogonal to this patch, then it > > > > doesn't matter if it's before or after. > > > > > > Please, double check this. Because you play with limits, while we expect them > > > to satisfy (3).
On Thu, Jul 15, 2021 at 11:02 AM Andy Shevchenko <andy.shevchenko@gmail.com> wrote: > On Thu, Jul 15, 2021 at 8:51 AM Liu Ying <victor.liu@nxp.com> wrote: > > On Wed, 2021-07-14 at 13:46 +0300, Andy Shevchenko wrote: > > > On Wed, Jul 14, 2021 at 01:38:22PM +0300, Andy Shevchenko wrote: > > > > On Wed, Jul 14, 2021 at 06:10:46PM +0800, Liu Ying wrote: > > > > > On Wed, 2021-07-14 at 12:12 +0300, Andy Shevchenko wrote: > > > > > > On Wed, Jul 14, 2021 at 02:41:29PM +0800, Liu Ying wrote: > > > > > > > > ... > > > > > > > > > > > /* > > > > > > > * Get rate closer to *parent_rate to guarantee there is no overflow > > > > > > > * for m and n. In the result it will be the nearest rate left shifted > > > > > > > * by (scale - fd->nwidth) bits. > > > > > > > */ > > > > > > > > > > > > I don't know how to rephrase above comment better. > > > > > > > > > > > > > scale = fls_long(*parent_rate / rate - 1); > > > > > > > if (scale > fd->nwidth) > > > > > > > rate <<= scale - fd->nwidth; > > > > > > > > > > > > This takes an advantage of the numbers be in a form of > > > > > > > > > > > > n = k * 2^m, (1) > > > > > > > > > > > > where m will be scale in the snippet above. Thus, if n can be represented by > > > > > > (1), we opportunistically reduce amount of bits needed for it by shifting right > > > > > > by m bits. > > > > > > Does it make sense? > > > > > > > > > > Thanks for your explaination. > > > > > But, sorry, Jacky and I still don't understand this. > > > > > > It seems I poorly chose the letters for (1). Let's rewrite above as > > > > > > This takes an advantage of the numbers be in a form of > > > > > > a = k * 2^b, (1) > > > > > > where b will be scale in the snippet above. Thus, if a can be represented by > > > (1), we opportunistically reduce amount of bits needed for it by shifting right > > > by b bits. > > > > > > Also note, "shifting right" here is about the result of n (see below). > > > > > > > Okay, We have two values in question: > > > > r_o (original rate of the parent clock) > > > > r_u (the rate user asked for) > > > > > > > > We have a pre-scaler block which asks for > > > > m (denominator) > > > > n (nominator) > > > > values to be provided to satisfy the (2) > > > > > > > > r_u ~= r_o * m / n, (2) > > > > > > > > where we try our best to make it "=" instead of "~=". > > > > > > > > Now, m and n have the limitation by a range, e.g. > > > > > > > > n >= 1, n < N_lim, where N_lim = 2^nlim. (3) > > > > > > > > Hence, from (2) and (3), assuming the worst case m = 1, > > > > > > > > ln2(r_o / r_u) <= nlim. (4) > > > > > > > > The above code tries to satisfy (4). > > > > > > > > Have you got it now? > > > > I'm afraid I haven't, sorry. Jacky, what about you? > > You may take a pen and paper and model different cases. After all it's > not rocket science, just arithmetics :-) > > > Is that snippet really needed? > > Yes. The (4) shows why. Now I realize one more item which is missed in the big picture. When we have overflowed the denominator (n) and shifted the values, we are expecting that the caller will check the rate and use another 2^scale (see (6) below) prescaler if needed to drop frequency to the needed values. The first few users of this were the drivers of the IPs which have an additional prescaler themselves. I dunno if there is any flag in CCF to show that the set frequency is 2^scale higher than asked. It means if r_o / r_u >> N_lim (5) we will get m and n *saturated* without this snipped, while with it they will be much much better with a nuance that resulting frequency is shifted left by scale = ln2(r_o/r_u) - nlim (6) scale bits. > > Without that snippet, it seems that rational_best_approximation() is > > able to offer best_numerator and best_denominator without the risk of > > overflow for m and n, since max_numerator and max_denominator are > > already handed over to rational_best_approximation()? > > No. > > > Does rational_best_approximation() always offer best_numerator by the > > range of [1, max_numerator] and best_denominator [1, max_denominator]? > > Of course not, when it goes out of the range. > > > > > > The code looks good to me, btw, although I dunno if you need to call the newly > > > > > > introduced function before or after the above mentioned snippet. > > > > > > > > > > Assuming that snippet is fully orthogonal to this patch, then it > > > > > doesn't matter if it's before or after. > > > > > > > > Please, double check this. Because you play with limits, while we expect them > > > > to satisfy (3).
+Cc: Mauro (below Q about sphinx and docs) On Thu, Jul 15, 2021 at 11:14 AM Andy Shevchenko <andy.shevchenko@gmail.com> wrote: > On Thu, Jul 15, 2021 at 11:02 AM Andy Shevchenko > <andy.shevchenko@gmail.com> wrote: ... > Now I realize one more item which is missed in the big picture. > When we have overflowed the denominator (n) and shifted the values, we > are expecting that the caller will check the rate and use another > 2^scale (see (6) below) prescaler if needed to drop frequency to the > needed values. The first few users of this were the drivers of the IPs > which have an additional prescaler themselves. I dunno if there is any > flag in CCF to show that the set frequency is 2^scale higher than > asked. > > It means if > > r_o / r_u >> N_lim (5) > > we will get m and n *saturated* without this snipped, while with it > they will be much much better with a nuance that resulting frequency > is shifted left by > > scale = ln2(r_o/r_u) - nlim (6) > > scale bits. I think I have at some point to introduce a documentation and explain all this from the thread there. Mauro, is sphinx capable of producing TeX formulas?
diff --git a/drivers/clk/clk-fractional-divider.c b/drivers/clk/clk-fractional-divider.c index b1e556f20911..86e985e14468 100644 --- a/drivers/clk/clk-fractional-divider.c +++ b/drivers/clk/clk-fractional-divider.c @@ -30,6 +30,19 @@ static inline void clk_fd_writel(struct clk_fractional_divider *fd, u32 val) writel(val, fd->reg); } +static inline void clk_fd_get_max_m_n(struct clk_fractional_divider *fd, + unsigned long *max_m, + unsigned long *max_n) +{ + *max_m = GENMASK(fd->mwidth - 1, 0); + *max_n = GENMASK(fd->nwidth - 1, 0); + + if (fd->flags & CLK_FRAC_DIVIDER_ZERO_BASED) { + (*max_m)++; + (*max_n)++; + } +} + static unsigned long clk_fd_recalc_rate(struct clk_hw *hw, unsigned long parent_rate) { @@ -73,7 +86,7 @@ static void clk_fd_general_approximation(struct clk_hw *hw, unsigned long rate, unsigned long *m, unsigned long *n) { struct clk_fractional_divider *fd = to_clk_fd(hw); - unsigned long scale; + unsigned long scale, max_m, max_n; /* * Get rate closer to *parent_rate to guarantee there is no overflow @@ -84,9 +97,9 @@ static void clk_fd_general_approximation(struct clk_hw *hw, unsigned long rate, if (scale > fd->nwidth) rate <<= scale - fd->nwidth; - rational_best_approximation(rate, *parent_rate, - GENMASK(fd->mwidth - 1, 0), GENMASK(fd->nwidth - 1, 0), - m, n); + clk_fd_get_max_m_n(fd, &max_m, &max_n); + + rational_best_approximation(rate, *parent_rate, max_m, max_n, m, n); } static long clk_fd_round_rate(struct clk_hw *hw, unsigned long rate, @@ -115,12 +128,13 @@ static int clk_fd_set_rate(struct clk_hw *hw, unsigned long rate, { struct clk_fractional_divider *fd = to_clk_fd(hw); unsigned long flags = 0; + unsigned long max_m, max_n; unsigned long m, n; u32 val; - rational_best_approximation(rate, parent_rate, - GENMASK(fd->mwidth - 1, 0), GENMASK(fd->nwidth - 1, 0), - &m, &n); + clk_fd_get_max_m_n(fd, &max_m, &max_n); + + rational_best_approximation(rate, parent_rate, max_m, max_n, &m, &n); if (fd->flags & CLK_FRAC_DIVIDER_ZERO_BASED) { m--;