[v4,1/4] clk: fractional-divider: Export approximation algorithm to the CCF users

Message ID 20210812170025.67074-1-andriy.shevchenko@linux.intel.com (mailing list archive)
State New, archived

Commit Message

Andy Shevchenko Aug. 12, 2021, 5 p.m. UTC
At least one user currently duplicates some functions that are provided
by the fractional divider module. Let's export the approximation algorithm
and replace the open-coded variant.

As a bonus, the exported function will get better documentation in place.

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Tested-by: Heiko Stuebner <heiko@sntech.de>
Acked-by: Heiko Stuebner <heiko@sntech.de>
---
v4: rebased on top of latest CLK codebase
 drivers/clk/clk-fractional-divider.c | 11 +++++++----
 drivers/clk/clk-fractional-divider.h |  9 +++++++++
 drivers/clk/rockchip/clk.c           | 17 +++--------------
 3 files changed, 19 insertions(+), 18 deletions(-)
 create mode 100644 drivers/clk/clk-fractional-divider.h

Comments

Stephen Boyd Aug. 12, 2021, 7:56 p.m. UTC | #1
Quoting Andy Shevchenko (2021-08-12 10:00:22)
> At least one user currently duplicates some functions that are provided
> by fractional divider module. Let's export approximation algorithm and
> replace the open-coded variant.
> 
> As a bonus the exported function will get better documentation in place.
> 
> Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
> Tested-by: Heiko Stuebner <heiko@sntech.de>
> Acked-by: Heiko Stuebner <heiko@sntech.de>
> ---

Applied to clk-next
Andy Shevchenko Aug. 13, 2021, 9:43 a.m. UTC | #2
On Thu, Aug 12, 2021 at 12:56:35PM -0700, Stephen Boyd wrote:
> Quoting Andy Shevchenko (2021-08-12 10:00:22)
> > At least one user currently duplicates some functions that are provided
> > by fractional divider module. Let's export approximation algorithm and
> > replace the open-coded variant.
> > 
> > As a bonus the exported function will get better documentation in place.
> Applied to clk-next

Thank you, Stephen!
Andy Shevchenko Aug. 17, 2021, 12:45 p.m. UTC | #3
On Fri, Aug 13, 2021 at 12:43:22PM +0300, Andy Shevchenko wrote:
> On Thu, Aug 12, 2021 at 12:56:35PM -0700, Stephen Boyd wrote:
> > Quoting Andy Shevchenko (2021-08-12 10:00:22)
> > > At least one user currently duplicates some functions that are provided
> > > by fractional divider module. Let's export approximation algorithm and
> > > replace the open-coded variant.
> > > 
> > > As a bonus the exported function will get better documentation in place.
> > Applied to clk-next
> 
> Thank you, Stephen!

When are they expected to be visible in Linux Next?
Chris Morgan Sept. 7, 2021, 3:44 p.m. UTC | #4
From: Chris Morgan <macromorgan@hotmail.com>

Unfortunately, I can confirm this breaks the DSI panel on the Rockchip
PX30 (and possibly other SoCs). Tested on my Odroid Go Advance. When
I revert 4e7cf74fa3b2 "clk: fractional-divider: Export approximation
algorithm to the CCF users" and 928f9e268611 "clk: fractional-divider:
Hide clk_fractional_divider_ops from wide audience" the panel begins
working again as expected on the master branch.

It looks like an assumption is made in the vop_crtc_mode_fixup()
function in the rockchip_drm_vop.c that gets broken with this change.
Specifically, the function says in the comments "When DRM gives us a
mode, we should add 999 Hz to it.". I believe this is no longer true
after this clk change, and when I remove the + 999 from the function
the DSI panel works again. Note that I do not know the implications
of removing this 999 aside from that it fixes the DSI panel on my
PX30 after this change, so I don't know if it's a positive change
or not.

Thank you.
Andy Shevchenko Sept. 7, 2021, 5:54 p.m. UTC | #5
On Tue, Sep 07, 2021 at 10:44:00AM -0500, Chris Morgan wrote:
> From: Chris Morgan <macromorgan@hotmail.com>
> 
> Unfortunately, I can confirm this breaks the DSI panel on the Rockchip
> PX30 (and possibly other SoCs). Tested on my Odroid Go Advance. When
> I revert 4e7cf74fa3b2 "clk: fractional-divider: Export approximation
> algorithm to the CCF users" and 928f9e268611 "clk: fractional-divider:
> Hide clk_fractional_divider_ops from wide audience" the panel begins
> working again as expected on the master branch.
> 
> It looks like an assumption is made in the vop_crtc_mode_fixup()
> function in the rockchip_drm_vop.c that gets broken with this change.
> Specifically, the function says in the comments "When DRM gives us a
> mode, we should add 999 Hz to it.". I believe this is no longer true
> after this clk change, and when I remove the + 999 from the function
> the DSI panel works again. Note that I do not know the implications
> of removing this 999 aside from that it fixes the DSI panel on my
> PX30 after this change, so I don't know if it's a positive change
> or not.
> 
> Thank you.

Thank you for the report!

I'll check this. Perhaps Heiko can help with testing as well on his side.
Andy Shevchenko Sept. 7, 2021, 6:06 p.m. UTC | #6
On Tue, Sep 07, 2021 at 08:54:04PM +0300, Andy Shevchenko wrote:
> On Tue, Sep 07, 2021 at 10:44:00AM -0500, Chris Morgan wrote:
> > From: Chris Morgan <macromorgan@hotmail.com>
> > 
> > Unfortunately, I can confirm this breaks the DSI panel on the Rockchip
> > PX30 (and possibly other SoCs). Tested on my Odroid Go Advance. When
> > I revert 4e7cf74fa3b2 "clk: fractional-divider: Export approximation
> > algorithm to the CCF users" and 928f9e268611 "clk: fractional-divider:
> > Hide clk_fractional_divider_ops from wide audience" the panel begins
> > working again as expected on the master branch.
> > 
> > It looks like an assumption is made in the vop_crtc_mode_fixup()
> > function in the rockchip_drm_vop.c that gets broken with this change.
> > Specifically, the function says in the comments "When DRM gives us a
> > mode, we should add 999 Hz to it.". I believe this is no longer true
> > after this clk change, and when I remove the + 999 from the function
> > the DSI panel works again. Note that I do not know the implications
> > of removing this 999 aside from that it fixes the DSI panel on my
> > PX30 after this change, so I don't know if it's a positive change
> > or not.
> > 
> > Thank you.
> 
> Thank you for the report!
> 
> I'll check this. Perhaps Heiko can help with testing as well on his side.

At first glance the mentioned patch may not be the culprit because it does
not change the functional behaviour (if I'm not mistaken). What really changes
it is the additional flag that removes the left-shift of the rate in the
calculations.

To me it sounds like you found a proper solution to the issue and that +999 is
a hack against the (blindly?) copied part of the algorithm used in fractional
divider. Have you read the top comment in clk-fractional-divider.c? It should
explain how it works after my series.

In any case I'm not going to come to any conclusions right now and also want
to hear from people who have better understanding of this hardware.
Chris Morgan Sept. 8, 2021, 2:17 a.m. UTC | #7
On Tue, Sep 07, 2021 at 09:06:10PM +0300, Andy Shevchenko wrote:
> On Tue, Sep 07, 2021 at 08:54:04PM +0300, Andy Shevchenko wrote:
> > On Tue, Sep 07, 2021 at 10:44:00AM -0500, Chris Morgan wrote:
> > > From: Chris Morgan <macromorgan@hotmail.com>
> > > 
> > > Unfortunately, I can confirm this breaks the DSI panel on the Rockchip
> > > PX30 (and possibly other SoCs). Tested on my Odroid Go Advance. When
> > > I revert 4e7cf74fa3b2 "clk: fractional-divider: Export approximation
> > > algorithm to the CCF users" and 928f9e268611 "clk: fractional-divider:
> > > Hide clk_fractional_divider_ops from wide audience" the panel begins
> > > working again as expected on the master branch.
> > > 
> > > It looks like an assumption is made in the vop_crtc_mode_fixup()
> > > function in the rockchip_drm_vop.c that gets broken with this change.
> > > Specifically, the function says in the comments "When DRM gives us a
> > > mode, we should add 999 Hz to it.". I believe this is no longer true
> > > after this clk change, and when I remove the + 999 from the function
> > > the DSI panel works again. Note that I do not know the implications
> > > of removing this 999 aside from that it fixes the DSI panel on my
> > > PX30 after this change, so I don't know if it's a positive change
> > > or not.
> > > 
> > > Thank you.
> > 
> > Thank you for the report!
> > 
> > I'll check this. Perhaps Heiko can help with testing as well on his side.
> 
> At first glance the mentioned patch may not be the culprit because it does
> not change the functional behaviour (if I'm not mistaken). What really changes
> it is the additional flag that removes the left-shift of the rate in the
> calculations.

I noticed the behavior on the 5.14 kernel was to set the numerator at
an ungodly 7649082492112076800 and the denominator at 1 (no, that's not
a typo). I think it tried to write 65535 to the register though, but it
would go through this a few times and eventually settle on 1:1 as the
fractional ratio (which I assume is all good, because that would work).

Contrast this to the 5.15 behavior where it would try to set the ratio
to 17001:17000, which would cause the DSI screen to fail to initialize.

After tracing through the code I figured out that the VOP was trying to
add 999 to the clock and set it at 17000999. 17000000/17000999 gives us
0, and subtracting 1 from that gives us a -1. The fls_long function
would then return 64, and if we subtract 16 (the value of fd->nwidth
for my board) it would tell us to shift the 17000999 48 bits to the
left, which matches the ungodly large number.
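
The arithmetic above can be replayed in userspace. The following is a
minimal sketch, assuming a 64-bit unsigned long and a 16-bit divider
field. fls64_sketch() and scaled_rate_sketch() are hypothetical stand-ins
for the kernel's fls_long() and for the scaling step that the patch removes
from rockchip_fractional_approximation(); they are not kernel code. Note
that the removed code compares the scale against fd->nwidth.

```c
#include <stdint.h>

/*
 * Hypothetical stand-in for fls_long() on a 64-bit build: returns the
 * 1-based position of the most significant set bit, or 0 when x is 0.
 */
static unsigned int fls64_sketch(uint64_t x)
{
	unsigned int bits = 0;

	while (x) {
		bits++;
		x >>= 1;
	}
	return bits;
}

/*
 * Sketch of the pre-5.15 scaling step. "width" plays the role of the
 * divider field width (assumed to be 16 here). rate must be non-zero.
 */
static uint64_t scaled_rate_sketch(uint64_t rate, uint64_t parent_rate,
				   unsigned int width)
{
	/*
	 * When rate > parent_rate the integer division yields 0, so the
	 * "- 1" wraps around to all ones and fls64_sketch() returns 64.
	 */
	unsigned int scale = fls64_sketch(parent_rate / rate - 1);

	if (scale > width)
		rate <<= scale - width;	/* upper bits are shifted out */

	return rate;
}
```

Under those assumptions, scaled_rate_sketch(17000999, 17000000, 16)
returns 7649082492112076800: the 48-bit shift pushes everything except the
low 16 bits of the rate (0x6A27) out of the 64-bit word, which would
explain the value reported above. With rate equal to the parent rate the
scale is 0 and the rate is left untouched.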

With the changes in 5.15 if I remove the + 999 from the VOP driver the
clock then gets set at 17000000, since the parent is at 17000000 that
gives us a 1:1 where everything works and everything is fine.

Long story short I think this is a bug that's existed all along, and
this change simply exposed it in a manner where it stopped working
despite the bug being present. Unfortunately I neither know enough
about the hardware to be confident in this fix beyond my specific
board, nor do I have enough hardware to test it on anything except
a Rockchip rk3326 with a DSI panel.

> 
> To me it sounds like you found a proper solution to the issue and that +999 is
> a hack against the (blindly?) copied part of the algorithm used in fractional
> divider. Have you read the top comment in clk-fractional-divider.c? It should
> explain how it works after my series.

No, but I probably should read the docs more. I just stumbled on this
series doing a bisect when the DSI panel stopped working.

> 
> In any case I'm not going to come to any conclusions right now and also want
> to hear from people who have better understanding of this hardware.

Yeah, I want to see what Heiko says after some more research, or anyone
who has more familiarity with clocks/DRM than I do or who has more
hardware to test on than I do.

I intended to send a message informing you that "hey, this breaks
upstream", but I think it turns out it's more a matter of "hey,
this makes a broken upstream break instead of limp along".

Thank you.

Andy Shevchenko Sept. 8, 2021, 10:52 a.m. UTC | #8
On Tue, Sep 07, 2021 at 09:17:47PM -0500, Chris Morgan wrote:
> On Tue, Sep 07, 2021 at 09:06:10PM +0300, Andy Shevchenko wrote:
> > On Tue, Sep 07, 2021 at 08:54:04PM +0300, Andy Shevchenko wrote:
> > > On Tue, Sep 07, 2021 at 10:44:00AM -0500, Chris Morgan wrote:

> > > > Unfortunately, I can confirm this breaks the DSI panel on the Rockchip
> > > > PX30 (and possibly other SoCs). Tested on my Odroid Go Advance. When
> > > > I revert 4e7cf74fa3b2 "clk: fractional-divider: Export approximation
> > > > algorithm to the CCF users" and 928f9e268611 "clk: fractional-divider:
> > > > Hide clk_fractional_divider_ops from wide audience" the panel begins
> > > > working again as expected on the master branch.
> > > > 
> > > > It looks like an assumption is made in the vop_crtc_mode_fixup()
> > > > function in the rockchip_drm_vop.c that gets broken with this change.
> > > > Specifically, the function says in the comments "When DRM gives us a
> > > > mode, we should add 999 Hz to it.". I believe this is no longer true
> > > > after this clk change, and when I remove the + 999 from the function
> > > > the DSI panel works again. Note that I do not know the implications
> > > > of removing this 999 aside from that it fixes the DSI panel on my
> > > > PX30 after this change, so I don't know if it's a positive change
> > > > or not.
> > > 
> > > Thank you for the report!
> > > 
> > > I'll check this. Perhaps Heiko can help with testing as well on his side.
> > 
> > At first glance the mentioned patch may not be the culprit because it does
> > not change the functional behaviour (if I'm not mistaken). What really changes
> > it is the additional flag that removes the left-shift of the rate in the
> > calculations.
> 
> I noticed the behavior on the 5.14 kernel was to set the numerator at
> an ungodly 7649082492112076800 and the denominator at 1 (no, that's not
> a typo). I think it tried to write 65535 to the register though, but it
> would go through this a few times and eventually settle on 1:1 as the
> fractional ratio (which I assume is all good, because that would work).
> 
> Contrast this to the 5.15 behavior where it would try to set the ratio
> to 17001:17000, which would cause the DSI screen to fail to initialize.
> 
> After tracing through the code I figured out that the VOP was trying to
> add 999 to the clock and set it at 17000999. 17000000/17000999 gives us
> 0, and subtracting 1 from that gives us a -1. The fls_long function
> would then return 64, and if we subtract 16 (the value of fd->nwidth
> for my board) it would tell us to shift the 17000999 48 bits to the
> left, which matches the ungodly large number.
> 
> With the changes in 5.15 if I remove the + 999 from the VOP driver the
> clock then gets set at 17000000, since the parent is at 17000000 that
> gives us a 1:1 where everything works and everything is fine.
> 
> Long story short I think this is a bug that's existed all along, and
> this change simply exposed it in a manner where it stopped working
> despite the bug being present. Unfortunately I neither know enough
> about the hardware to be confident in this fix beyond my specific
> board, nor do I have enough hardware to test it on anything except
> a Rockchip rk3326 with a DSI panel.

This is a very good analysis!

> > To me it sounds like you found a proper solution to the issue and that +999 is
> > a hack against the (blindly?) copied part of the algorithm used in fractional
> > divider. Have you read the top comment in clk-fractional-divider.c? It should
> > explain how it works after my series.
> 
> No, but I probably should read the docs more. I just stumbled on this
> series doing a bisect when the DSI panel stopped working.
> 
> > In any case I'm not going to come to any conclusions right now and also want
> > to hear from people who have better understanding of this hardware.
> 
> Yeah, I want to see what Heiko says after some more research, or anyone
> who has more familiarity with clocks/DRM than I do or who has more
> hardware to test on than I do.

After what I read above I can't add anything. I think the best course of
action is to submit a patch that removes the +999 part, with the above
explanation. It would be nice to find the real commit ID that may be used
for a Fixes tag.

Then we will at least have a patch ready in case it's considered correct
by people from the Rockchip side.

> I intended to send a message informing you that "hey, this breaks
> upstream", but I think it turns out it's more a matter of "hey,
> this makes a broken upstream break instead of limp along".

Understood.

Patch

diff --git a/drivers/clk/clk-fractional-divider.c b/drivers/clk/clk-fractional-divider.c
index b1e556f20911..535d299af646 100644
--- a/drivers/clk/clk-fractional-divider.c
+++ b/drivers/clk/clk-fractional-divider.c
@@ -14,6 +14,8 @@ 
 #include <linux/slab.h>
 #include <linux/rational.h>
 
+#include "clk-fractional-divider.h"
+
 static inline u32 clk_fd_readl(struct clk_fractional_divider *fd)
 {
 	if (fd->flags & CLK_FRAC_DIVIDER_BIG_ENDIAN)
@@ -68,9 +70,10 @@  static unsigned long clk_fd_recalc_rate(struct clk_hw *hw,
 	return ret;
 }
 
-static void clk_fd_general_approximation(struct clk_hw *hw, unsigned long rate,
-					 unsigned long *parent_rate,
-					 unsigned long *m, unsigned long *n)
+void clk_fractional_divider_general_approximation(struct clk_hw *hw,
+						  unsigned long rate,
+						  unsigned long *parent_rate,
+						  unsigned long *m, unsigned long *n)
 {
 	struct clk_fractional_divider *fd = to_clk_fd(hw);
 	unsigned long scale;
@@ -102,7 +105,7 @@  static long clk_fd_round_rate(struct clk_hw *hw, unsigned long rate,
 	if (fd->approximation)
 		fd->approximation(hw, rate, parent_rate, &m, &n);
 	else
-		clk_fd_general_approximation(hw, rate, parent_rate, &m, &n);
+		clk_fractional_divider_general_approximation(hw, rate, parent_rate, &m, &n);
 
 	ret = (u64)*parent_rate * m;
 	do_div(ret, n);
diff --git a/drivers/clk/clk-fractional-divider.h b/drivers/clk/clk-fractional-divider.h
new file mode 100644
index 000000000000..4fa359a12ef4
--- /dev/null
+++ b/drivers/clk/clk-fractional-divider.h
@@ -0,0 +1,9 @@ 
+/* SPDX-License-Identifier: GPL-2.0 */
+
+struct clk_hw;
+
+void clk_fractional_divider_general_approximation(struct clk_hw *hw,
+						  unsigned long rate,
+						  unsigned long *parent_rate,
+						  unsigned long *m,
+						  unsigned long *n);
diff --git a/drivers/clk/rockchip/clk.c b/drivers/clk/rockchip/clk.c
index 049e5e0b64f6..b7be7e11b0df 100644
--- a/drivers/clk/rockchip/clk.c
+++ b/drivers/clk/rockchip/clk.c
@@ -22,6 +22,8 @@ 
 #include <linux/regmap.h>
 #include <linux/reboot.h>
 #include <linux/rational.h>
+
+#include "../clk-fractional-divider.h"
 #include "clk.h"
 
 /*
@@ -178,10 +180,8 @@  static void rockchip_fractional_approximation(struct clk_hw *hw,
 		unsigned long rate, unsigned long *parent_rate,
 		unsigned long *m, unsigned long *n)
 {
-	struct clk_fractional_divider *fd = to_clk_fd(hw);
 	unsigned long p_rate, p_parent_rate;
 	struct clk_hw *p_parent;
-	unsigned long scale;
 
 	p_rate = clk_hw_get_rate(clk_hw_get_parent(hw));
 	if ((rate * 20 > p_rate) && (p_rate % rate != 0)) {
@@ -190,18 +190,7 @@  static void rockchip_fractional_approximation(struct clk_hw *hw,
 		*parent_rate = p_parent_rate;
 	}
 
-	/*
-	 * Get rate closer to *parent_rate to guarantee there is no overflow
-	 * for m and n. In the result it will be the nearest rate left shifted
-	 * by (scale - fd->nwidth) bits.
-	 */
-	scale = fls_long(*parent_rate / rate - 1);
-	if (scale > fd->nwidth)
-		rate <<= scale - fd->nwidth;
-
-	rational_best_approximation(rate, *parent_rate,
-			GENMASK(fd->mwidth - 1, 0), GENMASK(fd->nwidth - 1, 0),
-			m, n);
+	clk_fractional_divider_general_approximation(hw, rate, parent_rate, m, n);
 }
 
 static struct clk *rockchip_clk_register_frac_branch(