diff mbox series

Revert "ARM: dts: exynos: Remove 'opp-shared' from Exynos4412 bus OPP-tables"

Message ID 20210222095405.3109-1-m.reichl@fivetechno.de (mailing list archive)
State New, archived
Headers show
Series Revert "ARM: dts: exynos: Remove 'opp-shared' from Exynos4412 bus OPP-tables" | expand

Commit Message

Markus Reichl Feb. 22, 2021, 9:54 a.m. UTC
This reverts commit a23beead41a18c3be3ca409cb52f35bc02e601b9.

I'm running an Odroid-X2 as headless 24/7 server.
With plain stable 5.10.1 I had 54 up days without problems.
With opp-shared removed on kernels before and now on 5.11
my system freezes after some days on disk activity to eMMC
(rsync, apt upgrade).

The spontaneous hangs are not easy to reproduce but testing this
for several months now I am quite confident that there is something
wrong with this patch.

Signed-off-by: Markus Reichl <m.reichl@fivetechno.de>
---
 arch/arm/boot/dts/exynos4412.dtsi | 6 ++++++
 1 file changed, 6 insertions(+)

Comments

Marek Szyprowski Feb. 23, 2021, 9:24 a.m. UTC | #1
Hi Markus,

On 22.02.2021 10:54, Markus Reichl wrote:
> This reverts commit a23beead41a18c3be3ca409cb52f35bc02e601b9.
>
> I'm running an Odroid-X2 as headless 24/7 server.
> With plain stable 5.10.1 I had 54 up days without problems.
> With opp-shared removed on kernels before and now on 5.11
> my system freezes after some days on disk activity to eMMC
> (rsync, apt upgrade).
>
> The spontaneous hangs are not easy to reproduce but testing this
> for several months now I am quite confident that there is something
> wrong with this patch.
>
> Signed-off-by: Markus Reichl <m.reichl@fivetechno.de>

Thanks for the report.

IMHO a straight revert is a bad idea. I would prefer to keep current opp 
definitions and disable the affected devfreq devices (probably right bus 
would be enough) or try to identify which transitions are responsible 
for that issue. I know that it would take some time to identify them, 
but that would be the best solution. Reverting leads to incorrect 
hardware description, what in turn confuses the driver and framework, 
what in turn hides a real problem.

Another problem related to devfreq on Exynos4412 has been introduced 
recently by the commit 86ad9a24f21e ("PM / devfreq: Add required OPPs 
support to passive governor"). You can see lots of the messages like 
this one:

devfreq soc:bus-acp: failed to update devfreq using passive governor

I didn't have time to check what's wrong there, but I consider devfreq 
on Exynos a little bit broken, so another solution would be just to 
disable it in the exynos_defconfig.

> ---
>   arch/arm/boot/dts/exynos4412.dtsi | 6 ++++++
>   1 file changed, 6 insertions(+)
>
> diff --git a/arch/arm/boot/dts/exynos4412.dtsi b/arch/arm/boot/dts/exynos4412.dtsi
> index a142fe84010b..6246df278431 100644
> --- a/arch/arm/boot/dts/exynos4412.dtsi
> +++ b/arch/arm/boot/dts/exynos4412.dtsi
> @@ -405,6 +405,7 @@
>   
>   		bus_dmc_opp_table: opp-table1 {
>   			compatible = "operating-points-v2";
> +			opp-shared;
>   
>   			opp-100000000 {
>   				opp-hz = /bits/ 64 <100000000>;
> @@ -431,6 +432,7 @@
>   
>   		bus_acp_opp_table: opp-table2 {
>   			compatible = "operating-points-v2";
> +			opp-shared;
>   
>   			opp-100000000 {
>   				opp-hz = /bits/ 64 <100000000>;
> @@ -500,6 +502,7 @@
>   
>   		bus_leftbus_opp_table: opp-table3 {
>   			compatible = "operating-points-v2";
> +			opp-shared;
>   
>   			opp-100000000 {
>   				opp-hz = /bits/ 64 <100000000>;
> @@ -522,6 +525,7 @@
>   
>   		bus_display_opp_table: opp-table4 {
>   			compatible = "operating-points-v2";
> +			opp-shared;
>   
>   			opp-160000000 {
>   				opp-hz = /bits/ 64 <160000000>;
> @@ -533,6 +537,7 @@
>   
>   		bus_fsys_opp_table: opp-table5 {
>   			compatible = "operating-points-v2";
> +			opp-shared;
>   
>   			opp-100000000 {
>   				opp-hz = /bits/ 64 <100000000>;
> @@ -544,6 +549,7 @@
>   
>   		bus_peri_opp_table: opp-table6 {
>   			compatible = "operating-points-v2";
> +			opp-shared;
>   
>   			opp-50000000 {
>   				opp-hz = /bits/ 64 <50000000>;

Best regards
Krzysztof Kozlowski Feb. 23, 2021, 8:01 p.m. UTC | #2
On Tue, Feb 23, 2021 at 10:24:41AM +0100, Marek Szyprowski wrote:
> Hi Markus,
> 
> On 22.02.2021 10:54, Markus Reichl wrote:
> > This reverts commit a23beead41a18c3be3ca409cb52f35bc02e601b9.
> >
> > I'm running an Odroid-X2 as headless 24/7 server.
> > With plain stable 5.10.1 I had 54 up days without problems.
> > With opp-shared removed on kernels before and now on 5.11
> > my system freezes after some days on disk activity to eMMC
> > (rsync, apt upgrade).
> >
> > The spontaneous hangs are not easy to reproduce but testing this
> > for several months now I am quite confident that there is something
> > wrong with this patch.
> >
> > Signed-off-by: Markus Reichl <m.reichl@fivetechno.de>
> 
> Thanks for the report.
> 
> IMHO a straight revert is a bad idea. I would prefer to keep current opp 
> definitions and disable the affected devfreq devices (probably right bus 
> would be enough) or try to identify which transitions are responsible 
> for that issue. I know that it would take some time to identify them, 
> but that would be the best solution. Reverting leads to incorrect 
> hardware description, what in turn confuses the driver and framework, 
> what in turn hides a real problem.

I agree with this approach. If devfreq is unusable on that platform,
let's try disabling the exynos-bus nodes. It could be enough to help.
The opp-shared does not look like proper fix for this problem, but
rather a incorrect solution which achieves the same result - disabling
frequency/voltage scaling on some buses.

> 
> Another problem related to devfreq on Exynos4412 has been introduced 
> recently by the commit 86ad9a24f21e ("PM / devfreq: Add required OPPs 
> support to passive governor"). You can see lots of the messages like 
> this one:
> 
> devfreq soc:bus-acp: failed to update devfreq using passive governor
> 
> I didn't have time to check what's wrong there, but I consider devfreq 
> on Exynos a little bit broken, so another solution would be just to 
> disable it in the exynos_defconfig.

Yes, I saw it as well. However defconfig is only defconfig, so customers
still would be affected and still might report bugs for it. Maybe better
to disable all exynos-bus nodes?

Best regards,
Krzysztof
diff mbox series

Patch

diff --git a/arch/arm/boot/dts/exynos4412.dtsi b/arch/arm/boot/dts/exynos4412.dtsi
index a142fe84010b..6246df278431 100644
--- a/arch/arm/boot/dts/exynos4412.dtsi
+++ b/arch/arm/boot/dts/exynos4412.dtsi
@@ -405,6 +405,7 @@ 
 
 		bus_dmc_opp_table: opp-table1 {
 			compatible = "operating-points-v2";
+			opp-shared;
 
 			opp-100000000 {
 				opp-hz = /bits/ 64 <100000000>;
@@ -431,6 +432,7 @@ 
 
 		bus_acp_opp_table: opp-table2 {
 			compatible = "operating-points-v2";
+			opp-shared;
 
 			opp-100000000 {
 				opp-hz = /bits/ 64 <100000000>;
@@ -500,6 +502,7 @@ 
 
 		bus_leftbus_opp_table: opp-table3 {
 			compatible = "operating-points-v2";
+			opp-shared;
 
 			opp-100000000 {
 				opp-hz = /bits/ 64 <100000000>;
@@ -522,6 +525,7 @@ 
 
 		bus_display_opp_table: opp-table4 {
 			compatible = "operating-points-v2";
+			opp-shared;
 
 			opp-160000000 {
 				opp-hz = /bits/ 64 <160000000>;
@@ -533,6 +537,7 @@ 
 
 		bus_fsys_opp_table: opp-table5 {
 			compatible = "operating-points-v2";
+			opp-shared;
 
 			opp-100000000 {
 				opp-hz = /bits/ 64 <100000000>;
@@ -544,6 +549,7 @@ 
 
 		bus_peri_opp_table: opp-table6 {
 			compatible = "operating-points-v2";
+			opp-shared;
 
 			opp-50000000 {
 				opp-hz = /bits/ 64 <50000000>;