Submitter | Kuninori Morimoto |
---|---|

Date | Dec. 6, 2017, 6:05 a.m. |

Message ID | <87374oz9f9.wl%kuninori.morimoto.gx@renesas.com> |

Download | mbox | patch |

Permalink | /patch/10095083/ |

State | New |

Headers | show |

## Comments

Hello Morimoto-san, Thank you for the patch. On Wednesday, 6 December 2017 08:05:38 EET Kuninori Morimoto wrote: > From: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com> > > In general, PLL has VCO (= Voltage controlled oscillator), > one of the very important electronic feature called as "jitter" > is related to this VCO. > In academic generalism, VCO should be maximum to be more small jitter. > In high frequency clock, jitter will be large impact. > Thus, selecting Hi VCO is general theory. > > fin fvco fout fclkout > in --> [1/M] --> |PD| -> [LPF] -> [VCO] -> [1/P] -+-> [1/FDPLL] -> out > +-> | | | > | | > +-----------------[1/N]<-------------+ > > fclkout = fvco / P / FDPLL -- (1) > > In PD, it will loop until fin/M = fvco/P/N > > fvco = fin * P * N / M -- (2) > > (1) + (2) indicates, fclkout = fin * N / M / FDPLL > In this device, N = (n + 1), M = (m + 1), P = 2, thus > > fclkout = fin * (n + 1) / (m + 1) / FDPLL > > This is the datasheet formula. > One note here is that it should be 2000 < fvco < 4096MHz > To be smaller jitter, fvco should be maximum, > in other words, N as large as possible, M as small as possible driver > should select. Here, basically M=1. > This patch do it. > > Reported-by: HIROSHI INOSE <hiroshi.inose.rb@renesas.com> > Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com> > --- > v1 -> v2 > > - tidyup typo on git-log "fout" -> "fclkout" > - tidyup for loop terminate condition 40 -> 38 for n > > drivers/gpu/drm/rcar-du/rcar_du_crtc.c | 36 ++++++++++++++++++++++++++++-- > 1 file changed, 34 insertions(+), 2 deletions(-) > > diff --git a/drivers/gpu/drm/rcar-du/rcar_du_crtc.c > b/drivers/gpu/drm/rcar-du/rcar_du_crtc.c index b492063..57479c9 100644 > --- a/drivers/gpu/drm/rcar-du/rcar_du_crtc.c > +++ b/drivers/gpu/drm/rcar-du/rcar_du_crtc.c > @@ -125,8 +125,40 @@ static void rcar_du_dpll_divider(struct rcar_du_crtc > *rcrtc, unsigned int m; > unsigned int n; > > - for (n = 39; n < 120; n++) { > - for (m = 0; m < 4; m++) { > + /* > + * fin fvco fout fclkout > + * in --> [1/M] --> |PD| -> [LPF] -> [VCO] -> [1/P] -+-> [1/FDPLL] -> out > + * +-> | | | > + * | | > + * +-----------------[1/N]<-------------+ > + * > + * fclkout = fvco / P / FDPLL -- (1) > + * > + * fin/M = fvco/P/N > + * > + * fvco = fin * P * N / M -- (2) > + * > + * (1) + (2) indicates > + * > + * fclkout = fin * N / M / FDPLL > + * > + * NOTES > + * N = (n + 1), M = (m + 1), P = 2 > + * 2000 < fvco < 4096Mhz Are you sure that the fvco constraint is really 2kHz, and not 2GHz ? 2kHz - 4GHz would be a surprisingly large range. > + * Basically M=1 Why is this ? > + * To be small jitter, > + * N : as large as possible > + * M : as small as possible > + */ > + for (m = 0; m < 4; m++) { > + for (n = 119; n > 38; n--) { > + unsigned long long fvco = input * 2 * (n + 1) / (m + 1); This code runs for Gen3 only, so unsigned long would be enough. The rest of the function already relies on native support for 64-bit calculation. If you wanted to run this on a 32-bit CPU, you would likely need to do_div() for the division, and convert input to u64 to avoid integer overflows, otherwise the calculation will be performed on 32-bit before a final conversion to 64-bit. > + if ((fvco < 2000) || > + (fvco > 4096000000ll)) No need for the inner parentheses, and you can write both conditions on a single line. Furthemore 4096 MHz will fit in a 32-bit number, so there's no need for the ll. > + continue; > + I think you can then drop the output >= 4000000000 check inside the inner fdpll loop, as the output frequency can't be higher than 4GHz if the VCO frequency isn't. > for (fdpll = 1; fdpll < 32; fdpll++) { > unsigned long output; The output frequency on the line below can be calculated with output = fvco / 2 / (fdpll + 1) to avoid the multiplication by (n + 1) and division by (m + 1). If we wanted to optimize even more we could compute and operatate on fout instead of fvco, that would remove the * 2 and / 2. This patch seems to be a good first step in case of multiple possible exact frequency matches. However, when the PLL can't achieve an exact match, we might still end up with a high M value when a lower value could produce an output frequency close enough to the desired value. I wonder if this function should also take a frequency tolerance as an input parameter, and compute the M, N and FDPLL values that will produce an output frequency within the tolerance with M as small as possible. This can be done as a separate patch. And while we're discussing PLL calculation, the three nested loops will run a total of 10044 iterations :-/ That's a lot, and should be optimized if possible. With the outer loop operating on N an easy optimization would have been to compute fin * N in a local variable to avoid redoing the multiplication for every value of M, but that's not possible anymore with the outer loop operating on M.

Hi Laurent Thank you for your feedback > > + * NOTES > > + * N = (n + 1), M = (m + 1), P = 2 > > + * 2000 < fvco < 4096Mhz > > Are you sure that the fvco constraint is really 2kHz, and not 2GHz ? 2kHz - > 4GHz would be a surprisingly large range. It is 2kHz. This is came from HW team, and indicated on HW design sheet (?) > > + * Basically M=1 > > Why is this ? This is came from HW team, too. They are assuming M=1, basically. But yes confusable, let's remove it from comment. m is started from 0 (= M=1), no need to explain. > > + for (m = 0; m < 4; m++) { > > + for (n = 119; n > 38; n--) { > > + unsigned long long fvco = input * 2 * (n + 1) / (m + 1); > > This code runs for Gen3 only, so unsigned long would be enough. The rest of > the function already relies on native support for 64-bit calculation. If you > wanted to run this on a 32-bit CPU, you would likely need to do_div() for the > division, and convert input to u64 to avoid integer overflows, otherwise the > calculation will be performed on 32-bit before a final conversion to 64-bit. (snip) > > + if ((fvco < 2000) || > > + (fvco > 4096000000ll)) > > No need for the inner parentheses, and you can write both conditions on a > single line. Furthemore 4096 MHz will fit in a 32-bit number, so there's no > need for the ll. Yes, but compiled by 32bit too, right ? Without this "ll", 32bit compiler say warning: this decimal constant is unsigned only in ISO C90 # anyway, I will add this assumption (= used only by 64bit CPU) # on comment to avoid future confusion > I think you can then drop the output >= 4000000000 check inside the inner > fdpll loop, as the output frequency can't be higher than 4GHz if the VCO > frequency isn't. I think code has if (output >= 400000000) This is 400MHz, not 4GHz > > for (fdpll = 1; fdpll < 32; fdpll++) { > > unsigned long output; > > The output frequency on the line below can be calculated with > > output = fvco / 2 / (fdpll + 1) > > to avoid the multiplication by (n + 1) and division by (m + 1). It is nice idea to avoid extra calculation. I will use this idea, and add extrate comment to avoid confusion Best regards --- Kuninori Morimoto

Hi Morimoto-san, On Thursday, 14 December 2017 04:10:27 EET Kuninori Morimoto wrote: > Hi Laurent > > Thank you for your feedback > > >> + * NOTES > >> + * N = (n + 1), M = (m + 1), P = 2 > >> + * 2000 < fvco < 4096Mhz > > > > Are you sure that the fvco constraint is really 2kHz, and not 2GHz ? 2kHz > > - 4GHz would be a surprisingly large range. > > It is 2kHz. This is came from HW team, and indicated > on HW design sheet (?) OK, it's a surprising VCO, no issue with that :-) > >> + * Basically M=1 > > > > Why is this ? > > This is came from HW team, too. > They are assuming M=1, basically. > But yes confusable, let's remove it from comment. > m is started from 0 (= M=1), no need to explain. Sounds good to me. > >> + for (m = 0; m < 4; m++) { > >> + for (n = 119; n > 38; n--) { > >> + unsigned long long fvco = input * 2 * (n + 1) / (m + 1); > > > > This code runs for Gen3 only, so unsigned long would be enough. The rest > > of the function already relies on native support for 64-bit calculation. > > If you wanted to run this on a 32-bit CPU, you would likely need to > > do_div() for the division, and convert input to u64 to avoid integer > > overflows, otherwise the calculation will be performed on 32-bit before a > > final conversion to 64-bit. > > (snip) > > >> + if ((fvco < 2000) || > >> + (fvco > 4096000000ll)) > > > > No need for the inner parentheses, and you can write both conditions on a > > single line. Furthemore 4096 MHz will fit in a 32-bit number, so there's > > no need for the ll. > > Yes, but compiled by 32bit too, right ? > Without this "ll", 32bit compiler say > > warning: this decimal constant is unsigned only in ISO C90 That's right. How about 4096000000UL then, to force unsigned integer types ? Or possibly even better, 4096 * 1000 * 1000UL to make it more readable ? > # anyway, I will add this assumption (= used only by 64bit CPU) > # on comment to avoid future confusion > > > I think you can then drop the output >= 4000000000 check inside the inner > > fdpll loop, as the output frequency can't be higher than 4GHz if the VCO > > frequency isn't. > > I think code has > > if (output >= 400000000) > > This is 400MHz, not 4GHz You're right, my bad. Maybe I should write it 400 * 1000 * 1000 :-) > >> for (fdpll = 1; fdpll < 32; fdpll++) { > >> unsigned long output; > > > > The output frequency on the line below can be calculated with > > > > output = fvco / 2 / (fdpll + 1) > > > > to avoid the multiplication by (n + 1) and division by (m + 1). > > It is nice idea to avoid extra calculation. > I will use this idea, and add extrate comment to avoid confusion Thank you.

Hi Laurent, On Thu, Dec 14, 2017 at 9:17 AM, Laurent Pinchart <laurent.pinchart@ideasonboard.com> wrote: > On Thursday, 14 December 2017 04:10:27 EET Kuninori Morimoto wrote: >> >> + if ((fvco < 2000) || >> >> + (fvco > 4096000000ll)) >> > >> > No need for the inner parentheses, and you can write both conditions on a >> > single line. Furthemore 4096 MHz will fit in a 32-bit number, so there's >> > no need for the ll. >> >> Yes, but compiled by 32bit too, right ? >> Without this "ll", 32bit compiler say >> >> warning: this decimal constant is unsigned only in ISO C90 > > That's right. How about 4096000000UL then, to force unsigned integer types ? > Or possibly even better, 4096 * 1000 * 1000UL to make it more readable ? If it's just about making the number unsigned, and not about 64-bit arithmetic, a "U" suffix should be sufficient. Gr{oetje,eeting}s, Geert -- Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org In personal conversations with technical people, I call myself a hacker. But when I'm talking to journalists I just say "programmer" or something like that. -- Linus Torvalds

Hi Geert, Laurent > >> Yes, but compiled by 32bit too, right ? > >> Without this "ll", 32bit compiler say > >> > >> warning: this decimal constant is unsigned only in ISO C90 > > > > That's right. How about 4096000000UL then, to force unsigned integer types ? > > Or possibly even better, 4096 * 1000 * 1000UL to make it more readable ? > > If it's just about making the number unsigned, and not about 64-bit arithmetic, > a "U" suffix should be sufficient. Thanks. Will try Best regards --- Kuninori Morimoto

## Patch

diff --git a/drivers/gpu/drm/rcar-du/rcar_du_crtc.c b/drivers/gpu/drm/rcar-du/rcar_du_crtc.c index b492063..57479c9 100644 --- a/drivers/gpu/drm/rcar-du/rcar_du_crtc.c +++ b/drivers/gpu/drm/rcar-du/rcar_du_crtc.c @@ -125,8 +125,40 @@ static void rcar_du_dpll_divider(struct rcar_du_crtc *rcrtc, unsigned int m; unsigned int n; - for (n = 39; n < 120; n++) { - for (m = 0; m < 4; m++) { + /* + * fin fvco fout fclkout + * in --> [1/M] --> |PD| -> [LPF] -> [VCO] -> [1/P] -+-> [1/FDPLL] -> out + * +-> | | | + * | | + * +-----------------[1/N]<-------------+ + * + * fclkout = fvco / P / FDPLL -- (1) + * + * fin/M = fvco/P/N + * + * fvco = fin * P * N / M -- (2) + * + * (1) + (2) indicates + * + * fclkout = fin * N / M / FDPLL + * + * NOTES + * N = (n + 1), M = (m + 1), P = 2 + * 2000 < fvco < 4096Mhz + * Basically M=1 + * + * To be small jitter, + * N : as large as possible + * M : as small as possible + */ + for (m = 0; m < 4; m++) { + for (n = 119; n > 38; n--) { + unsigned long long fvco = input * 2 * (n + 1) / (m + 1); + + if ((fvco < 2000) || + (fvco > 4096000000ll)) + continue; + for (fdpll = 1; fdpll < 32; fdpll++) { unsigned long output;