diff mbox series

[V6,4/5] phy: freescale: fsl-samsung-hdmi: Use closest divider

Message ID 20240904233100.114611-5-aford173@gmail.com
State Superseded
Headers show
Series phy: freescale: fsl-samsung-hdmi: Expand phy clock options | expand

Commit Message

Adam Ford Sept. 4, 2024, 11:30 p.m. UTC
Currently, if the clock values cannot be set to the exact rate,
the round_rate and set_rate functions use the closest value found in
the look-up-table.  In preparation of removing values from the LUT
that can be calculated evenly with the integer calculator, it's
necessary to ensure to check both the look-up-table and the integer
divider clock values to get the closest values to the requested
value.  It does this by measuring the difference between the
requested clock value and the closest value in both integer divider
calucator and the fractional clock look-up-table.

Which ever has the smallest difference between them is returned as
the cloesest rate.

Signed-off-by: Adam Ford <aford173@gmail.com>
Signed-off-by: Dominique Martinet <dominique.martinet@atmark-techno.com>
---
V6:  Simplify the calculation of the closest rate and fix
     a situation where the integer divider values may not be properly
     setup before they are used.
     Fixup some comments
V5:  No Change
V4:  New to series
---
 drivers/phy/freescale/phy-fsl-samsung-hdmi.c | 46 ++++++++++++++------
 1 file changed, 33 insertions(+), 13 deletions(-)

Comments

Frieder Schrempf Sept. 5, 2024, 7:30 a.m. UTC | #1
On 05.09.24 1:30 AM, Adam Ford wrote:
> Currently, if the clock values cannot be set to the exact rate,
> the round_rate and set_rate functions use the closest value found in
> the look-up-table.  In preparation of removing values from the LUT
> that can be calculated evenly with the integer calculator, it's
> necessary to ensure to check both the look-up-table and the integer
> divider clock values to get the closest values to the requested
> value.  It does this by measuring the difference between the
> requested clock value and the closest value in both integer divider
> calucator and the fractional clock look-up-table.
> 
> Which ever has the smallest difference between them is returned as
> the cloesest rate.
> 
> Signed-off-by: Adam Ford <aford173@gmail.com>
> Signed-off-by: Dominique Martinet <dominique.martinet@atmark-techno.com>

Reviewed-by: Frieder Schrempf <frieder.schrempf@kontron.de>
Tested-by: Frieder Schrempf <frieder.schrempf@kontron.de>
Dominique Martinet Sept. 6, 2024, 12:26 a.m. UTC | #2
(sorry I meant to send this yesterday but I'm being forced to adjust my
mail pipeline with work and gmail and it didn't go out -- trying
again. Sorry if it actually did go through. Hopefully I didn't misfire
anything else yesterday...)

Adam Ford wrote on Wed, Sep 04, 2024 at 06:30:32PM -0500:
> Currently, if the clock values cannot be set to the exact rate,
> the round_rate and set_rate functions use the closest value found in
> the look-up-table.  In preparation of removing values from the LUT
> that can be calculated evenly with the integer calculator, it's
> necessary to ensure to check both the look-up-table and the integer
> divider clock values to get the closest values to the requested
> value.  It does this by measuring the difference between the
> requested clock value and the closest value in both integer divider
> calucator and the fractional clock look-up-table.
> 
> Which ever has the smallest difference between them is returned as
> the cloesest rate.
> 
> Signed-off-by: Adam Ford <aford173@gmail.com>
> Signed-off-by: Dominique Martinet <dominique.martinet@atmark-techno.com>

b4 (or whatever you're using) probably picked that up from the patch I
included in my reply to this patch, this sob should go away.



> diff --git a/drivers/phy/freescale/phy-fsl-samsung-hdmi.c b/drivers/phy/freescale/phy-fsl-samsung-hdmi.c
> index 4b13e386e5ba..9a21dbbf1a82 100644
> --- a/drivers/phy/freescale/phy-fsl-samsung-hdmi.c
> +++ b/drivers/phy/freescale/phy-fsl-samsung-hdmi.c
> @@ -547,6 +547,16 @@ static unsigned long phy_clk_recalc_rate(struct clk_hw *hw,
>  	return phy->cur_cfg->pixclk;
>  }
>  
> +static u32 fsl_samsung_hdmi_phy_get_closest_rate(unsigned long rate,
> +						 u32 int_div_clk, u32 frac_div_clk)
> +{
> +	/* The int_div_clk may be greater than rate, so cast it and use ABS */
> +	if (abs((long)rate - (long)int_div_clk) < (rate - frac_div_clk))

I still think `rate - frac_div_clk` might not always hold in the future
(because there is no intrinsic reason we'd pick the smaller end in case
of inexact match and a future improvement might change this to the
closest value as well), so I'll argue again for having both use abs(),
but at least there's only one place to update if that changes in the
future now so hopefully whoever does this will notice...

> +		return int_div_clk;
> +
> +	return frac_div_clk;
> +}
> +
>  static long phy_clk_round_rate(struct clk_hw *hw,
>  			       unsigned long rate, unsigned long *parent_rate)
>  {
> @@ -563,6 +573,7 @@ static long phy_clk_round_rate(struct clk_hw *hw,
>  	for (i = ARRAY_SIZE(phy_pll_cfg) - 1; i >= 0; i--)
>  		if (phy_pll_cfg[i].pixclk <= rate)
>  			break;
> +

(unrelated)

>  	/* If the rate is an exact match, return it now */
>  	if (rate == phy_pll_cfg[i].pixclk)
>  		return phy_pll_cfg[i].pixclk;
> @@ -579,8 +590,7 @@ static long phy_clk_round_rate(struct clk_hw *hw,
>  	if (int_div_clk == rate)
>  		return int_div_clk;
>  
> -	/* Fall back to the closest value in the LUT */
> -	return phy_pll_cfg[i].pixclk;
> +	return fsl_samsung_hdmi_phy_get_closest_rate(rate, int_div_clk, phy_pll_cfg[i].pixclk);
>  }
>  
>  static int phy_clk_set_rate(struct clk_hw *hw,
> @@ -594,27 +604,37 @@ static int phy_clk_set_rate(struct clk_hw *hw,
>  
>  	/* If the integer divider works, just use it */

I found this comment a bit confusing given the current flow as of this
patch. Might make more sense immediately before the if?


>  	int_div_clk = fsl_samsung_hdmi_phy_find_pms(rate * 5, &p, &m, &s) / 5;
> +	calculated_phy_pll_cfg.pixclk = int_div_clk;
> +	calculated_phy_pll_cfg.pll_div_regs[0] = FIELD_PREP(REG01_PMS_P_MASK, p);
> +	calculated_phy_pll_cfg.pll_div_regs[1] = m;
> +	calculated_phy_pll_cfg.pll_div_regs[2] = FIELD_PREP(REG03_PMS_S_MASK, s-1);
> +	phy->cur_cfg = &calculated_phy_pll_cfg;
>  	if (int_div_clk == rate) {
>  		dev_dbg(phy->dev, "fsl_samsung_hdmi_phy: using integer divider\n");
> -		calculated_phy_pll_cfg.pixclk = int_div_clk;
> -		calculated_phy_pll_cfg.pll_div_regs[0] = FIELD_PREP(REG01_PMS_P_MASK, p);
> -		calculated_phy_pll_cfg.pll_div_regs[1] = m;
> -		calculated_phy_pll_cfg.pll_div_regs[2] = FIELD_PREP(REG03_PMS_S_MASK, s-1);
> -		/* pll_div_regs 3-6 are fixed and pre-defined already */

nitpick: might want to keep the above comment?

> -		phy->cur_cfg  = &calculated_phy_pll_cfg;
> +		goto done;
>  	} else {
>  		/* Otherwise, search the LUT */
> -		dev_dbg(phy->dev, "fsl_samsung_hdmi_phy: using fractional divider\n");
> -		for (i = ARRAY_SIZE(phy_pll_cfg) - 1; i >= 0; i--)
> -			if (phy_pll_cfg[i].pixclk <= rate)
> +		for (i = ARRAY_SIZE(phy_pll_cfg) - 1; i >= 0; i--) {
> +			if (phy_pll_cfg[i].pixclk == rate) {
> +				dev_dbg(phy->dev, "fsl_samsung_hdmi_phy: using fractional divider\n");

nitpick: might make sense to print what was picked in case of inexact
match as well, but these are dbg warning so probably fine either way.


overall I find the flow of this function hard to read; it's a bit ugly
flow-wise but jumping in the clock comparison 'if' might help trim this?
(and if we're going out of our way to factor out the diff, maybe the lut
lookup could be as well)

But I'm probably just being overcritical here, it's fine as is if you
pefer your version, just writing down this as an illustration of what I
meant with the above sentence as I'm not sure I was clear -- I'll be
just as happy to consider this series done so we can do more interesting
things :P

{
    u32 int_div_clk, frac_div_clk;
    int i;
    u16 m;
    u8 p, s;
    
    // (I haven't given up on that *5 to move inside this function...)
    int_div_clk = fsl_samsung_hdmi_phy_find_pms(rate, &p, &m, &s);
    if (int_div_clk == rate)
        goto use_int_clk;

    frac_div_clk = fsl_samsung_hdmi_phy_find_lut(rate, &i);
    // (not convinced that check actually brings much, but it's not like
    // it hurts either)
    if (frac_div_clk == rate)
        goto use_frac_clk;
    
    if (fsl_samsung_hdmi_phy_get_closest_rate(rate, int_div_clk,
                                              frac_div_clk) == int_div_clk) {
use_int_clk:
        dev_dbg(phy->dev, "fsl_samsung_hdmi_phy: using integer divider\n");
        calculated_phy_pll_cfg.pixclk = int_div_clk;
        calculated_phy_pll_cfg.pll_div_regs[0] = FIELD_PREP(REG01_PMS_P_MASK, p);
        calculated_phy_pll_cfg.pll_div_regs[1] = m;
        calculated_phy_pll_cfg.pll_div_regs[2] = FIELD_PREP(REG03_PMS_S_MASK, s-1);
        /* pll_div_regs 3-6 are fixed and pre-defined already */
        phy->cur_cfg  = &calculated_phy_pll_cfg;
    } else {
use_frac_clk:
        dev_dbg(phy->dev, "fsl_samsung_hdmi_phy: using fractional divider\n");
        phy->cur_cfg = &phy_pll_cfg[i];
    }
    return fsl_samsung_hdmi_phy_configure(phy, phy->cur_cfg);
}
Adam Ford Sept. 6, 2024, 12:57 a.m. UTC | #3
On Thu, Sep 5, 2024 at 7:26 PM Dominique Martinet
<dominique.martinet@atmark-techno.com> wrote:
>
>
> (sorry I meant to send this yesterday but I'm being forced to adjust my
> mail pipeline with work and gmail and it didn't go out -- trying
> again. Sorry if it actually did go through. Hopefully I didn't misfire
> anything else yesterday...)
>
> Adam Ford wrote on Wed, Sep 04, 2024 at 06:30:32PM -0500:
> > Currently, if the clock values cannot be set to the exact rate,
> > the round_rate and set_rate functions use the closest value found in
> > the look-up-table.  In preparation of removing values from the LUT
> > that can be calculated evenly with the integer calculator, it's
> > necessary to ensure to check both the look-up-table and the integer
> > divider clock values to get the closest values to the requested
> > value.  It does this by measuring the difference between the
> > requested clock value and the closest value in both integer divider
> > calucator and the fractional clock look-up-table.
> >
> > Which ever has the smallest difference between them is returned as
> > the cloesest rate.
> >
> > Signed-off-by: Adam Ford <aford173@gmail.com>
> > Signed-off-by: Dominique Martinet <dominique.martinet@atmark-techno.com>
>
> b4 (or whatever you're using) probably picked that up from the patch I
> included in my reply to this patch, this sob should go away.
>
For each iteration, I grabbed the patches from patchwork which
contained any s-o-b messages, if present.  I didn't add anything
manually.
>
>
> > diff --git a/drivers/phy/freescale/phy-fsl-samsung-hdmi.c b/drivers/phy/freescale/phy-fsl-samsung-hdmi.c
> > index 4b13e386e5ba..9a21dbbf1a82 100644
> > --- a/drivers/phy/freescale/phy-fsl-samsung-hdmi.c
> > +++ b/drivers/phy/freescale/phy-fsl-samsung-hdmi.c
> > @@ -547,6 +547,16 @@ static unsigned long phy_clk_recalc_rate(struct clk_hw *hw,
> >       return phy->cur_cfg->pixclk;
> >  }
> >
> > +static u32 fsl_samsung_hdmi_phy_get_closest_rate(unsigned long rate,
> > +                                              u32 int_div_clk, u32 frac_div_clk)
> > +{
> > +     /* The int_div_clk may be greater than rate, so cast it and use ABS */
> > +     if (abs((long)rate - (long)int_div_clk) < (rate - frac_div_clk))
>
> I still think `rate - frac_div_clk` might not always hold in the future
> (because there is no intrinsic reason we'd pick the smaller end in case
> of inexact match and a future improvement might change this to the
> closest value as well), so I'll argue again for having both use abs(),
> but at least there's only one place to update if that changes in the
> future now so hopefully whoever does this will notice...

I can add the ABS on the fractional divider.  I left it out on purpose
since the LUT table always return a value equal or less, so the extra
ABS seemed like busy work.  However, I can see the argument for being
consistent.

>
> > +             return int_div_clk;
> > +
> > +     return frac_div_clk;
> > +}
> > +
> >  static long phy_clk_round_rate(struct clk_hw *hw,
> >                              unsigned long rate, unsigned long *parent_rate)
> >  {
> > @@ -563,6 +573,7 @@ static long phy_clk_round_rate(struct clk_hw *hw,
> >       for (i = ARRAY_SIZE(phy_pll_cfg) - 1; i >= 0; i--)
> >               if (phy_pll_cfg[i].pixclk <= rate)
> >                       break;
> > +
>
> (unrelated)

I don't understand what you're asking here.

>
> >       /* If the rate is an exact match, return it now */
> >       if (rate == phy_pll_cfg[i].pixclk)
> >               return phy_pll_cfg[i].pixclk;
> > @@ -579,8 +590,7 @@ static long phy_clk_round_rate(struct clk_hw *hw,
> >       if (int_div_clk == rate)
> >               return int_div_clk;
> >
> > -     /* Fall back to the closest value in the LUT */
> > -     return phy_pll_cfg[i].pixclk;
> > +     return fsl_samsung_hdmi_phy_get_closest_rate(rate, int_div_clk, phy_pll_cfg[i].pixclk);
> >  }
> >
> >  static int phy_clk_set_rate(struct clk_hw *hw,
> > @@ -594,27 +604,37 @@ static int phy_clk_set_rate(struct clk_hw *hw,
> >
> >       /* If the integer divider works, just use it */
>
> I found this comment a bit confusing given the current flow as of this
> patch. Might make more sense immediately before the if?
>

This code evolved with each iteration, but I didn't necessarily
reorganize the comments.  I can rearrange them.
>
> >       int_div_clk = fsl_samsung_hdmi_phy_find_pms(rate * 5, &p, &m, &s) / 5;
> > +     calculated_phy_pll_cfg.pixclk = int_div_clk;
> > +     calculated_phy_pll_cfg.pll_div_regs[0] = FIELD_PREP(REG01_PMS_P_MASK, p);
> > +     calculated_phy_pll_cfg.pll_div_regs[1] = m;
> > +     calculated_phy_pll_cfg.pll_div_regs[2] = FIELD_PREP(REG03_PMS_S_MASK, s-1);
> > +     phy->cur_cfg = &calculated_phy_pll_cfg;
> >       if (int_div_clk == rate) {
> >               dev_dbg(phy->dev, "fsl_samsung_hdmi_phy: using integer divider\n");
> > -             calculated_phy_pll_cfg.pixclk = int_div_clk;
> > -             calculated_phy_pll_cfg.pll_div_regs[0] = FIELD_PREP(REG01_PMS_P_MASK, p);
> > -             calculated_phy_pll_cfg.pll_div_regs[1] = m;
> > -             calculated_phy_pll_cfg.pll_div_regs[2] = FIELD_PREP(REG03_PMS_S_MASK, s-1);
> > -             /* pll_div_regs 3-6 are fixed and pre-defined already */
>
> nitpick: might want to keep the above comment?

ok.
>
> > -             phy->cur_cfg  = &calculated_phy_pll_cfg;
> > +             goto done;
> >       } else {
> >               /* Otherwise, search the LUT */
> > -             dev_dbg(phy->dev, "fsl_samsung_hdmi_phy: using fractional divider\n");
> > -             for (i = ARRAY_SIZE(phy_pll_cfg) - 1; i >= 0; i--)
> > -                     if (phy_pll_cfg[i].pixclk <= rate)
> > +             for (i = ARRAY_SIZE(phy_pll_cfg) - 1; i >= 0; i--) {
> > +                     if (phy_pll_cfg[i].pixclk == rate) {
> > +                             dev_dbg(phy->dev, "fsl_samsung_hdmi_phy: using fractional divider\n");
>
> nitpick: might make sense to print what was picked in case of inexact
> match as well, but these are dbg warning so probably fine either way.

I can add the actual values returned.

>
>
> overall I find the flow of this function hard to read; it's a bit ugly
> flow-wise but jumping in the clock comparison 'if' might help trim this?
> (and if we're going out of our way to factor out the diff, maybe the lut
> lookup could be as well)
>
> But I'm probably just being overcritical here, it's fine as is if you
> pefer your version, just writing down this as an illustration of what I
> meant with the above sentence as I'm not sure I was clear -- I'll be
> just as happy to consider this series done so we can do more interesting
> things :P

Now I am a bit more confused, because above I got the impression you
were withdrawing your s-o-b, but now it sounds like you want to move
it forward.

It sounded like Frieder was making some progress on understanding a
little more about the fractional divider.

>
> {
>     u32 int_div_clk, frac_div_clk;
>     int i;
>     u16 m;
>     u8 p, s;
>
>     // (I haven't given up on that *5 to move inside this function...)

I wanted to keep the PMS calculator returning the real clock value
since the calculations are based on equation in the ref manual, Fpll =
Fref * M / (P*S)
This way, the calling function can determine if it needs to be
multiplied by 5.  I haven't fully determined how the fractional
calculator determines what frequency it wants for a target frequency,
and using the values for P, M and S from the fractional divider
doesn't seem to always yield 5x like they did for the table entries
using the integer divider.

I am hoping someone from NXP can elaborate, or give us some clues on
how to get better fractional divider values.

>     int_div_clk = fsl_samsung_hdmi_phy_find_pms(rate, &p, &m, &s);
>     if (int_div_clk == rate)
>         goto use_int_clk;
>
>     frac_div_clk = fsl_samsung_hdmi_phy_find_lut(rate, &i);
>     // (not convinced that check actually brings much, but it's not like
>     // it hurts either)
>     if (frac_div_clk == rate)
>         goto use_frac_clk;
>
>     if (fsl_samsung_hdmi_phy_get_closest_rate(rate, int_div_clk,
>                                               frac_div_clk) == int_div_clk) {
> use_int_clk:
>         dev_dbg(phy->dev, "fsl_samsung_hdmi_phy: using integer divider\n");
>         calculated_phy_pll_cfg.pixclk = int_div_clk;
>         calculated_phy_pll_cfg.pll_div_regs[0] = FIELD_PREP(REG01_PMS_P_MASK, p);
>         calculated_phy_pll_cfg.pll_div_regs[1] = m;
>         calculated_phy_pll_cfg.pll_div_regs[2] = FIELD_PREP(REG03_PMS_S_MASK, s-1);
>         /* pll_div_regs 3-6 are fixed and pre-defined already */
>         phy->cur_cfg  = &calculated_phy_pll_cfg;
>     } else {
> use_frac_clk:
>         dev_dbg(phy->dev, "fsl_samsung_hdmi_phy: using fractional divider\n");
>         phy->cur_cfg = &phy_pll_cfg[i];
>     }
>     return fsl_samsung_hdmi_phy_configure(phy, phy->cur_cfg);
> }
>
> --
> Dominique
Dominique Martinet Sept. 6, 2024, 1:54 a.m. UTC | #4
Adam Ford wrote on Thu, Sep 05, 2024 at 07:57:36PM -0500:
> > > Signed-off-by: Dominique Martinet <dominique.martinet@atmark-techno.com>
> >
> > b4 (or whatever you're using) probably picked that up from the patch I
> > included in my reply to this patch, this sob should go away.
> >
> For each iteration, I grabbed the patches from patchwork which
> contained any s-o-b messages, if present.  I didn't add anything
> manually.

Yes, I'm just saying the tool got confused - I replied to an earlier
iteration of this patch with a quote of the 0.5% tolerance patch I
properly sent afterwards here:
https://lore.kernel.org/all/ZtaawD3vb8gmnVmO@atmark-techno.com/

And it picked the sob from the patch, that I hadn't intended to be added
as a review -- from my understanding a sob is something you'd add if you
pick the patch through your tree (e.g. I add sob for patches I apply
through the 9p tree I maintain), but not something to give on reviews.
I'll reply to individual patches with reviewed-by/tested-by once this
thread has settled down; I don't have more comments than what I sent
just now, so I think this is almost ready.


> > > +static u32 fsl_samsung_hdmi_phy_get_closest_rate(unsigned long rate,
> > > +                                              u32 int_div_clk, u32 frac_div_clk)
> > > +{
> > > +     /* The int_div_clk may be greater than rate, so cast it and use ABS */
> > > +     if (abs((long)rate - (long)int_div_clk) < (rate - frac_div_clk))
> >
> > I still think `rate - frac_div_clk` might not always hold in the future
> > (because there is no intrinsic reason we'd pick the smaller end in case
> > of inexact match and a future improvement might change this to the
> > closest value as well), so I'll argue again for having both use abs(),
> > but at least there's only one place to update if that changes in the
> > future now so hopefully whoever does this will notice...
> 
> I can add the ABS on the fractional divider.  I left it out on purpose
> since the LUT table always return a value equal or less, so the extra
> ABS seemed like busy work.  However, I can see the argument for being
> consistent.

Yeah, I agree it's not needed right now; I won't insist on this, just
saying I prefer consistency/future-proofing here.

> > > +             return int_div_clk;
> > > +
> > > +     return frac_div_clk;
> > > +}
> > > +
> > >  static long phy_clk_round_rate(struct clk_hw *hw,
> > >                              unsigned long rate, unsigned long *parent_rate)
> > >  {
> > > @@ -563,6 +573,7 @@ static long phy_clk_round_rate(struct clk_hw *hw,
> > >       for (i = ARRAY_SIZE(phy_pll_cfg) - 1; i >= 0; i--)
> > >               if (phy_pll_cfg[i].pixclk <= rate)
> > >                       break;
> > > +
> >
> > (unrelated)
> 
> I don't understand what you're asking here.

This chunk is just adding a blank line that's not related to the rest of
the patch; it's fine, just pointed it out if it wasn't intended.
(these are usually leftovers from tests or something like that)

> > > -             phy->cur_cfg  = &calculated_phy_pll_cfg;
> > > +             goto done;
> > >       } else {
> > >               /* Otherwise, search the LUT */
> > > -             dev_dbg(phy->dev, "fsl_samsung_hdmi_phy: using fractional divider\n");
> > > -             for (i = ARRAY_SIZE(phy_pll_cfg) - 1; i >= 0; i--)
> > > -                     if (phy_pll_cfg[i].pixclk <= rate)
> > > +             for (i = ARRAY_SIZE(phy_pll_cfg) - 1; i >= 0; i--) {
> > > +                     if (phy_pll_cfg[i].pixclk == rate) {
> > > +                             dev_dbg(phy->dev, "fsl_samsung_hdmi_phy: using fractional divider\n");
> >
> > nitpick: might make sense to print what was picked in case of inexact
> > match as well, but these are dbg warning so probably fine either way.
> 
> I can add the actual values returned.

Note my argument here wasn't about having the values on this line, it's
that the inexact case (when comparing with
fsl_samsung_hdmi_phy_get_closest_rate) doesn't print either line

(having the values can't hurt for dbg though, I think it could be useful
if you rework this)

> > overall I find the flow of this function hard to read; it's a bit ugly
> > flow-wise but jumping in the clock comparison 'if' might help trim this?
> > (and if we're going out of our way to factor out the diff, maybe the lut
> > lookup could be as well)
> >
> > But I'm probably just being overcritical here, it's fine as is if you
> > pefer your version, just writing down this as an illustration of what I
> > meant with the above sentence as I'm not sure I was clear -- I'll be
> > just as happy to consider this series done so we can do more interesting
> > things :P
> 
> Now I am a bit more confused, because above I got the impression you
> were withdrawing your s-o-b, but now it sounds like you want to move
> it forward.

I don't have any strong opinion here: I generally dislike nitpicking
about form, and value the progress you've done more than fixing the flow
of this function (e.g. this function already is good/progress from what
we have right now, we don't need perfect)

withdrawing my s-o-b doesn't meant I disagree with the patch, I just
hadn't willingly given it. If you don't mind reworking the patch a bit
I'll retest the new version, otherwise I'm fine approving this as is.

> It sounded like Frieder was making some progress on understanding a
> little more about the fractional divider.

That's awesome!
I think we can/want to get this merged first, and we can improve
with fractional divider calculations in a later iteration to get rid of
the LUT altogether; it'll be easier for other users of the chip to go a
step at a time as well if they notice a breakage.

> >     // (I haven't given up on that *5 to move inside this function...)
> 
> I wanted to keep the PMS calculator returning the real clock value
> since the calculations are based on equation in the ref manual, Fpll =
> Fref * M / (P*S)
> This way, the calling function can determine if it needs to be
> multiplied by 5.  I haven't fully determined how the fractional
> calculator determines what frequency it wants for a target frequency,
> and using the values for P, M and S from the fractional divider
> doesn't seem to always yield 5x like they did for the table entries
> using the integer divider.

The way I see it is that 5x is an artifact of the PMS calculation:
the caller doesn't want '5x rate', it wants 'rate', and actually setting
the p/m/s values provided by the function gets us 'rate', so it feels
like that's not the caller should have to worry about.. When we add
fractional divider support then it'll be the calculator's job to
determine if/when it needs that 5x.

But if I understand what you're saying, fsl_samsung_hdmi_phy_find_pms()
is not a pms specific to the NXP chip, but something more general to any
integer divisor that is better kept more generic if it weren't for the
few references to our hardware (e.g. limits on m)?

If so then I guess I can understand your position, I wouldn't go as far
as saying it warrants another level of indirection so I guess it's fine
as is.

(In a similar line of thought, had I implemented this I would have had
the function not return generic p/m/s but return the pll registers
directly -- it took me a moment to check why we were setting regs[2] to
`s-1` and confirming we had s>1...)



Anyway, once again I don't feel like I am in any position to be preachy
about all of this, so I'm fine with this patch as is - please take of it
what you agree with/want to rework and we can leave the rest here as far
as I'm concerned.
Frieder Schrempf Sept. 6, 2024, 8:28 p.m. UTC | #5
On 06.09.24 02:57, Adam Ford wrote:
> On Thu, Sep 5, 2024 at 7:26 PM Dominique Martinet
> <dominique.martinet@atmark-techno.com> wrote:
>>
>>
>> (sorry I meant to send this yesterday but I'm being forced to adjust my
>> mail pipeline with work and gmail and it didn't go out -- trying
>> again. Sorry if it actually did go through. Hopefully I didn't misfire
>> anything else yesterday...)
>>
>> Adam Ford wrote on Wed, Sep 04, 2024 at 06:30:32PM -0500:
>>> Currently, if the clock values cannot be set to the exact rate,
>>> the round_rate and set_rate functions use the closest value found in
>>> the look-up-table.  In preparation of removing values from the LUT
>>> that can be calculated evenly with the integer calculator, it's
>>> necessary to ensure to check both the look-up-table and the integer
>>> divider clock values to get the closest values to the requested
>>> value.  It does this by measuring the difference between the
>>> requested clock value and the closest value in both integer divider
>>> calucator and the fractional clock look-up-table.
>>>
>>> Which ever has the smallest difference between them is returned as
>>> the cloesest rate.
>>>
>>> Signed-off-by: Adam Ford <aford173@gmail.com>
>>> Signed-off-by: Dominique Martinet <dominique.martinet@atmark-techno.com>
>>
>> b4 (or whatever you're using) probably picked that up from the patch I
>> included in my reply to this patch, this sob should go away.
>>
> For each iteration, I grabbed the patches from patchwork which
> contained any s-o-b messages, if present.  I didn't add anything
> manually.
>>
>>
>>> diff --git a/drivers/phy/freescale/phy-fsl-samsung-hdmi.c b/drivers/phy/freescale/phy-fsl-samsung-hdmi.c
>>> index 4b13e386e5ba..9a21dbbf1a82 100644
>>> --- a/drivers/phy/freescale/phy-fsl-samsung-hdmi.c
>>> +++ b/drivers/phy/freescale/phy-fsl-samsung-hdmi.c
>>> @@ -547,6 +547,16 @@ static unsigned long phy_clk_recalc_rate(struct clk_hw *hw,
>>>        return phy->cur_cfg->pixclk;
>>>   }
>>>
>>> +static u32 fsl_samsung_hdmi_phy_get_closest_rate(unsigned long rate,
>>> +                                              u32 int_div_clk, u32 frac_div_clk)
>>> +{
>>> +     /* The int_div_clk may be greater than rate, so cast it and use ABS */
>>> +     if (abs((long)rate - (long)int_div_clk) < (rate - frac_div_clk))
>>
>> I still think `rate - frac_div_clk` might not always hold in the future
>> (because there is no intrinsic reason we'd pick the smaller end in case
>> of inexact match and a future improvement might change this to the
>> closest value as well), so I'll argue again for having both use abs(),
>> but at least there's only one place to update if that changes in the
>> future now so hopefully whoever does this will notice...
> 
> I can add the ABS on the fractional divider.  I left it out on purpose
> since the LUT table always return a value equal or less, so the extra
> ABS seemed like busy work.  However, I can see the argument for being
> consistent.
> 
>>
>>> +             return int_div_clk;
>>> +
>>> +     return frac_div_clk;
>>> +}
>>> +
>>>   static long phy_clk_round_rate(struct clk_hw *hw,
>>>                               unsigned long rate, unsigned long *parent_rate)
>>>   {
>>> @@ -563,6 +573,7 @@ static long phy_clk_round_rate(struct clk_hw *hw,
>>>        for (i = ARRAY_SIZE(phy_pll_cfg) - 1; i >= 0; i--)
>>>                if (phy_pll_cfg[i].pixclk <= rate)
>>>                        break;
>>> +
>>
>> (unrelated)
> 
> I don't understand what you're asking here.
> 
>>
>>>        /* If the rate is an exact match, return it now */
>>>        if (rate == phy_pll_cfg[i].pixclk)
>>>                return phy_pll_cfg[i].pixclk;
>>> @@ -579,8 +590,7 @@ static long phy_clk_round_rate(struct clk_hw *hw,
>>>        if (int_div_clk == rate)
>>>                return int_div_clk;
>>>
>>> -     /* Fall back to the closest value in the LUT */
>>> -     return phy_pll_cfg[i].pixclk;
>>> +     return fsl_samsung_hdmi_phy_get_closest_rate(rate, int_div_clk, phy_pll_cfg[i].pixclk);
>>>   }
>>>
>>>   static int phy_clk_set_rate(struct clk_hw *hw,
>>> @@ -594,27 +604,37 @@ static int phy_clk_set_rate(struct clk_hw *hw,
>>>
>>>        /* If the integer divider works, just use it */
>>
>> I found this comment a bit confusing given the current flow as of this
>> patch. Might make more sense immediately before the if?
>>
> 
> This code evolved with each iteration, but I didn't necessarily
> reorganize the comments.  I can rearrange them.
>>
>>>        int_div_clk = fsl_samsung_hdmi_phy_find_pms(rate * 5, &p, &m, &s) / 5;
>>> +     calculated_phy_pll_cfg.pixclk = int_div_clk;
>>> +     calculated_phy_pll_cfg.pll_div_regs[0] = FIELD_PREP(REG01_PMS_P_MASK, p);
>>> +     calculated_phy_pll_cfg.pll_div_regs[1] = m;
>>> +     calculated_phy_pll_cfg.pll_div_regs[2] = FIELD_PREP(REG03_PMS_S_MASK, s-1);
>>> +     phy->cur_cfg = &calculated_phy_pll_cfg;
>>>        if (int_div_clk == rate) {
>>>                dev_dbg(phy->dev, "fsl_samsung_hdmi_phy: using integer divider\n");
>>> -             calculated_phy_pll_cfg.pixclk = int_div_clk;
>>> -             calculated_phy_pll_cfg.pll_div_regs[0] = FIELD_PREP(REG01_PMS_P_MASK, p);
>>> -             calculated_phy_pll_cfg.pll_div_regs[1] = m;
>>> -             calculated_phy_pll_cfg.pll_div_regs[2] = FIELD_PREP(REG03_PMS_S_MASK, s-1);
>>> -             /* pll_div_regs 3-6 are fixed and pre-defined already */
>>
>> nitpick: might want to keep the above comment?
> 
> ok.
>>
>>> -             phy->cur_cfg  = &calculated_phy_pll_cfg;
>>> +             goto done;
>>>        } else {
>>>                /* Otherwise, search the LUT */
>>> -             dev_dbg(phy->dev, "fsl_samsung_hdmi_phy: using fractional divider\n");
>>> -             for (i = ARRAY_SIZE(phy_pll_cfg) - 1; i >= 0; i--)
>>> -                     if (phy_pll_cfg[i].pixclk <= rate)
>>> +             for (i = ARRAY_SIZE(phy_pll_cfg) - 1; i >= 0; i--) {
>>> +                     if (phy_pll_cfg[i].pixclk == rate) {
>>> +                             dev_dbg(phy->dev, "fsl_samsung_hdmi_phy: using fractional divider\n");
>>
>> nitpick: might make sense to print what was picked in case of inexact
>> match as well, but these are dbg warning so probably fine either way.
> 
> I can add the actual values returned.
> 
>>
>>
>> overall I find the flow of this function hard to read; it's a bit ugly
>> flow-wise but jumping in the clock comparison 'if' might help trim this?
>> (and if we're going out of our way to factor out the diff, maybe the lut
>> lookup could be as well)
>>
>> But I'm probably just being overcritical here, it's fine as is if you
>> pefer your version, just writing down this as an illustration of what I
>> meant with the above sentence as I'm not sure I was clear -- I'll be
>> just as happy to consider this series done so we can do more interesting
>> things :P
> 
> Now I am a bit more confused, because above I got the impression you
> were withdrawing your s-o-b, but now it sounds like you want to move
> it forward.
> 
> It sounded like Frieder was making some progress on understanding a
> little more about the fractional divider.

I think I managed to get behind the calculation of the fractional-n 
divider parameters. I came up with a spreadsheet to calculate the output 
frequency from existing register values and I have a crude Python script 
that can be used to search for parameters for a given pixel clock.

I tested this with three different non-CEA-861 pixel clock values 
(supported by my HDMI USB grabber) for which the integer PLL yielded 
deviations >0.5%. With the new LUT entries those modes work now.

I will clean things up a bit and then share what I have. I hope that 
this allows anyone to calculate parameters for their non-standard 
displays if required.

If someone feels extra motivated they could try to calculate the 
fractional parameters at runtime. However I'm not sure that this is 
feasible. The numerical computation of a large number of parameters is 
quite heavy and it's probably not easy to strip the algorithm down to 
something that can be run on the target without too much overhead.

> 
>>
>> {
>>      u32 int_div_clk, frac_div_clk;
>>      int i;
>>      u16 m;
>>      u8 p, s;
>>
>>      // (I haven't given up on that *5 to move inside this function...)
> 
> I wanted to keep the PMS calculator returning the real clock value
> since the calculations are based on equation in the ref manual, Fpll =
> Fref * M / (P*S)
> This way, the calling function can determine if it needs to be
> multiplied by 5.  I haven't fully determined how the fractional
> calculator determines what frequency it wants for a target frequency,
> and using the values for P, M and S from the fractional divider
> doesn't seem to always yield 5x like they did for the table entries
> using the integer divider.

For what I found out the factor of 5 always applies. For the integer 
part and also for the fractional part.

> 
> I am hoping someone from NXP can elaborate, or give us some clues on
> how to get better fractional divider values.
Dominique Martinet Sept. 7, 2024, 4:49 a.m. UTC | #6
Frieder Schrempf wrote on Fri, Sep 06, 2024 at 10:28:59PM +0200:
> I think I managed to get behind the calculation of the fractional-n divider
> parameters. I came up with a spreadsheet to calculate the output frequency
> from existing register values and I have a crude Python script that can be
> used to search for parameters for a given pixel clock.
> 
> I tested this with three different non-CEA-861 pixel clock values (supported
> by my HDMI USB grabber) for which the integer PLL yielded deviations >0.5%.
> With the new LUT entries those modes work now.
> 
> I will clean things up a bit and then share what I have. I hope that this
> allows anyone to calculate parameters for their non-standard displays if
> required.
> 
> If someone feels extra motivated they could try to calculate the fractional
> parameters at runtime. However I'm not sure that this is feasible. The
> numerical computation of a large number of parameters is quite heavy and
> it's probably not easy to strip the algorithm down to something that can be
> run on the target without too much overhead.

I think keeping the LUT is perfectly fine if we know where the values
come from - perhaps having your python program in a comment above the
LUT so anyone can check the values match?

My main problem with the LUT is just that -- there's no way of
checking. If the values come from somewhere sensible and can be verified
I think it makes sense to keep it to fill the holes where the integer
divider isn't enough.

> > This way, the calling function can determine if it needs to be
> > multiplied by 5.  I haven't fully determined how the fractional
> > calculator determines what frequency it wants for a target frequency,
> > and using the values for P, M and S from the fractional divider
> > doesn't seem to always yield 5x like they did for the table entries
> > using the integer divider.
> 
> For what I found out the factor of 5 always applies. For the integer part
> and also for the fractional part.

In that case I'm definitely in favor of moving it inside the function as
part of the calculation

Thank you both!
Adam Ford Sept. 9, 2024, 12:46 p.m. UTC | #7
On Fri, Sep 6, 2024 at 11:50 PM Dominique Martinet
<dominique.martinet@atmark-techno.com> wrote:
>
> Frieder Schrempf wrote on Fri, Sep 06, 2024 at 10:28:59PM +0200:
> > I think I managed to get behind the calculation of the fractional-n divider
> > parameters. I came up with a spreadsheet to calculate the output frequency
> > from existing register values and I have a crude Python script that can be
> > used to search for parameters for a given pixel clock.
> >
> > I tested this with three different non-CEA-861 pixel clock values (supported
> > by my HDMI USB grabber) for which the integer PLL yielded deviations >0.5%.
> > With the new LUT entries those modes work now.
> >
> > I will clean things up a bit and then share what I have. I hope that this
> > allows anyone to calculate parameters for their non-standard displays if
> > required.
> >
> > If someone feels extra motivated they could try to calculate the fractional
> > parameters at runtime. However I'm not sure that this is feasible. The
> > numerical computation of a large number of parameters is quite heavy and
> > it's probably not easy to strip the algorithm down to something that can be
> > run on the target without too much overhead.
>
> I think keeping the LUT is perfectly fine if we know where the values
> come from - perhaps having your python program in a comment above the
> LUT so anyone can check the values match?
>
> My main problem with the LUT is just that -- there's no way of
> checking. If the values come from somewhere sensible and can be verified
> I think it makes sense to keep it to fill the holes where the integer
> divider isn't enough.
>
> > > This way, the calling function can determine if it needs to be
> > > multiplied by 5.  I haven't fully determined how the fractional
> > > calculator determines what frequency it wants for a target frequency,
> > > and using the values for P, M and S from the fractional divider
> > > doesn't seem to always yield 5x like they did for the table entries
> > > using the integer divider.
> >
> > For what I found out the factor of 5 always applies. For the integer part
> > and also for the fractional part.
>
> In that case I'm definitely in favor of moving it inside the function as
> part of the calculation

I'll do a V7 with the 5x factor moved into the integer calculator.
I'll also try to rearrange the flow a bit as you requested.  Depending
on how much changes, I may strip off any s-o-b notes.
I was traveling over the weekend and I'm likely traveling the next
weekends, so I will try to get to it this week before I go, but I
can't promise anything quickly.

adam

>
> Thank you both!
> --
> Dominique
diff mbox series

Patch

diff --git a/drivers/phy/freescale/phy-fsl-samsung-hdmi.c b/drivers/phy/freescale/phy-fsl-samsung-hdmi.c
index 4b13e386e5ba..9a21dbbf1a82 100644
--- a/drivers/phy/freescale/phy-fsl-samsung-hdmi.c
+++ b/drivers/phy/freescale/phy-fsl-samsung-hdmi.c
@@ -547,6 +547,16 @@  static unsigned long phy_clk_recalc_rate(struct clk_hw *hw,
 	return phy->cur_cfg->pixclk;
 }
 
+static u32 fsl_samsung_hdmi_phy_get_closest_rate(unsigned long rate,
+						 u32 int_div_clk, u32 frac_div_clk)
+{
+	/* The int_div_clk may be greater than rate, so cast it and use ABS */
+	if (abs((long)rate - (long)int_div_clk) < (rate - frac_div_clk))
+		return int_div_clk;
+
+	return frac_div_clk;
+}
+
 static long phy_clk_round_rate(struct clk_hw *hw,
 			       unsigned long rate, unsigned long *parent_rate)
 {
@@ -563,6 +573,7 @@  static long phy_clk_round_rate(struct clk_hw *hw,
 	for (i = ARRAY_SIZE(phy_pll_cfg) - 1; i >= 0; i--)
 		if (phy_pll_cfg[i].pixclk <= rate)
 			break;
+
 	/* If the rate is an exact match, return it now */
 	if (rate == phy_pll_cfg[i].pixclk)
 		return phy_pll_cfg[i].pixclk;
@@ -579,8 +590,7 @@  static long phy_clk_round_rate(struct clk_hw *hw,
 	if (int_div_clk == rate)
 		return int_div_clk;
 
-	/* Fall back to the closest value in the LUT */
-	return phy_pll_cfg[i].pixclk;
+	return fsl_samsung_hdmi_phy_get_closest_rate(rate, int_div_clk, phy_pll_cfg[i].pixclk);
 }
 
 static int phy_clk_set_rate(struct clk_hw *hw,
@@ -594,27 +604,37 @@  static int phy_clk_set_rate(struct clk_hw *hw,
 
 	/* If the integer divider works, just use it */
 	int_div_clk = fsl_samsung_hdmi_phy_find_pms(rate * 5, &p, &m, &s) / 5;
+	calculated_phy_pll_cfg.pixclk = int_div_clk;
+	calculated_phy_pll_cfg.pll_div_regs[0] = FIELD_PREP(REG01_PMS_P_MASK, p);
+	calculated_phy_pll_cfg.pll_div_regs[1] = m;
+	calculated_phy_pll_cfg.pll_div_regs[2] = FIELD_PREP(REG03_PMS_S_MASK, s-1);
+	phy->cur_cfg = &calculated_phy_pll_cfg;
 	if (int_div_clk == rate) {
 		dev_dbg(phy->dev, "fsl_samsung_hdmi_phy: using integer divider\n");
-		calculated_phy_pll_cfg.pixclk = int_div_clk;
-		calculated_phy_pll_cfg.pll_div_regs[0] = FIELD_PREP(REG01_PMS_P_MASK, p);
-		calculated_phy_pll_cfg.pll_div_regs[1] = m;
-		calculated_phy_pll_cfg.pll_div_regs[2] = FIELD_PREP(REG03_PMS_S_MASK, s-1);
-		/* pll_div_regs 3-6 are fixed and pre-defined already */
-		phy->cur_cfg  = &calculated_phy_pll_cfg;
+		goto done;
 	} else {
 		/* Otherwise, search the LUT */
-		dev_dbg(phy->dev, "fsl_samsung_hdmi_phy: using fractional divider\n");
-		for (i = ARRAY_SIZE(phy_pll_cfg) - 1; i >= 0; i--)
-			if (phy_pll_cfg[i].pixclk <= rate)
+		for (i = ARRAY_SIZE(phy_pll_cfg) - 1; i >= 0; i--) {
+			if (phy_pll_cfg[i].pixclk == rate) {
+				dev_dbg(phy->dev, "fsl_samsung_hdmi_phy: using fractional divider\n");
+				phy->cur_cfg = &phy_pll_cfg[i];
+				goto done;
+			}
+
+			if (phy_pll_cfg[i].pixclk < rate)
 				break;
+		}
 
 		if (i < 0)
 			return -EINVAL;
-
-		phy->cur_cfg = &phy_pll_cfg[i];
 	}
 
+	if (fsl_samsung_hdmi_phy_get_closest_rate(rate, int_div_clk,
+						  phy_pll_cfg[i].pixclk) == int_div_clk)
+		phy->cur_cfg = &calculated_phy_pll_cfg;
+	else
+		phy->cur_cfg = &phy_pll_cfg[i];
+done:
 	return fsl_samsung_hdmi_phy_configure(phy, phy->cur_cfg);
 }