[v2] arm64: dts: rockchip: Remove overdrive-mode OPPs from RK3588J SoC dtsi

Message ID: eeec0d30d79b019d111b3f0aa2456e69896b2caa.1742813866.git.dsimic@manjaro.org
State: New

Commit Message

Dragan Simic March 24, 2025, 11 a.m. UTC
The differences in the vendor-approved CPU and GPU OPPs for the standard
Rockchip RK3588 variant [1] and the industrial Rockchip RK3588J variant [2]
come from the latter, presumably, supporting an extended temperature range
that's usually associated with industrial applications, despite the two SoC
variant datasheets specifying the same upper limit for the allowed ambient
temperature for both variants.  However, the lower temperature limit is
specified much lower for the RK3588J variant. [1][2]

To be on the safe side and to ensure maximum longevity of the RK3588J SoCs,
only the CPU and GPU OPPs that are declared by the vendor to be always safe
for this SoC variant may be provided.  As explained by the vendor [3] and
according to the RK3588J datasheet, [2] higher-frequency/higher-voltage
CPU and GPU OPPs can be used as well, but at the risk of reducing the SoC
lifetime expectancy.  Presumably, using the higher OPPs may be safe only
when not enjoying the assumed extended temperature range that the RK3588J,
as an SoC variant targeted specifically at higher-temperature, industrial
applications, is made (or binned) for.

Anyone able to keep their RK3588J-based board outside the above-presumed
extended temperature range at all times, and willing to take the associated
risk of possibly reducing the SoC lifetime expectancy, is free to apply
a DT overlay that adds the higher CPU and GPU OPPs.
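
For illustration only, and not part of this patch: a minimal sketch of what
such an overlay might look like for the two Cortex-A76 clusters.  It assumes
that the base DTB was built with symbols (dtc -@), so that the
cluster1_opp_table and cluster2_opp_table labels can be resolved, and it
simply reuses the OPP values that this patch removes.  Treat it as a starting
point, not as a vendor-approved configuration.

```dts
/dts-v1/;
/plugin/;

/*
 * Hypothetical overlay: re-adds the overdrive-mode Cortex-A76 OPPs.
 * Frequencies and voltages are copied from the lines removed by this
 * patch; using them may reduce the SoC lifetime expectancy.
 */
&cluster1_opp_table {
	opp-1800000000 {
		opp-hz = /bits/ 64 <1800000000>;
		opp-microvolt = <875000 875000 950000>;
		clock-latency-ns = <40000>;
	};
	opp-2016000000 {
		opp-hz = /bits/ 64 <2016000000>;
		opp-microvolt = <950000 950000 950000>;
		clock-latency-ns = <40000>;
	};
};

&cluster2_opp_table {
	opp-1800000000 {
		opp-hz = /bits/ 64 <1800000000>;
		opp-microvolt = <875000 875000 950000>;
		clock-latency-ns = <40000>;
	};
	opp-2016000000 {
		opp-hz = /bits/ 64 <2016000000>;
		opp-microvolt = <950000 950000 950000>;
		clock-latency-ns = <40000>;
	};
};
```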

With all this and the downstream RK3588(J) DT definitions [4][5] in mind,
let's delete the RK3588J CPU and GPU OPPs that are not considered belonging
to the normal operation mode for this SoC variant.  To quote the RK3588J
datasheet [2], "normal mode means the chipset works under safety voltage
and frequency;  for the industrial environment, highly recommend to keep in
normal mode, the lifetime is reasonably guaranteed", while "overdrive mode
brings higher frequency, and the voltage will increase accordingly;  under
the overdrive mode for a long time, the chipset may shorten the lifetime,
especially in high-temperature condition".

To sum up the RK3588J datasheet [2] and the vendor-provided DTs, [4][5]
the maximum allowed CPU core, GPU and NPU frequencies are as follows:

   IP core    | Normal mode | Overdrive mode
  ------------+-------------+----------------
   Cortex-A55 |   1,296 MHz |      1,704 MHz
   Cortex-A76 |   1,608 MHz |      2,016 MHz
   GPU        |     700 MHz |        850 MHz
   NPU        |     800 MHz |        950 MHz

Unfortunately, when it comes to the actual voltages for the RK3588J CPU and
GPU OPPs, there's a discrepancy between the RK3588J datasheet [2] and the
downstream kernel code. [4][5]  The RK3588J datasheet states that "the max.
working voltage of CPU/GPU/NPU is 0.75 V under the normal mode", while the
downstream kernel code actually allows voltage ranges that go up to 0.95 V,
which is still within the voltage range allowed by the datasheet.  However,
the RK3588J datasheet also tells us to "strictly refer to the software
configuration of SDK and the hardware reference design", so let's embrace
the voltage ranges provided by the downstream kernel code, which also
prevents the undesirable theoretical outcome of ending up with no usable
OPPs on a particular board, as a result of the board's voltage regulator(s)
being unable to deliver the exact voltages, for whatever reason.
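
As an illustration of how such a voltage range is expressed, here is one of
the OPP nodes from this patch with an explanatory comment added.  The comment
is an annotation for the reader and is not part of the patch itself; it
relies on the operating-points-v2 convention that a three-cell opp-microvolt
value means <target min max>.

```dts
opp-1296000000 {
	opp-hz = /bits/ 64 <1296000000>;
	/*
	 * <target min max> in microvolts: the regulator is asked for
	 * 0.775 V, and anything from 0.775 V up to 0.95 V is accepted
	 * if the exact target voltage cannot be delivered.
	 */
	opp-microvolt = <775000 775000 950000>;
	clock-latency-ns = <40000>;
};
```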

The above-described voltage ranges for the RK3588J CPU OPPs are still taken
from the downstream kernel code [4][5] by picking the highest, worst-bin
values, which ensure that all RK3588J bins will work reliably.  Yes, with
some power inevitably wasted as unnecessarily generated heat, but the
reliability is paramount, together with the longevity.  This deficiency
may be revisited separately at some point in the future.

The provided RK3588J CPU OPPs follow the slightly debatable "provide only
the highest-frequency OPP from the same-voltage group" approach that's been
established earlier, [6] as a result of the "same-voltage, lower-frequency"
OPPs being considered inefficient from the IPA governor's standpoint, which
may also be revisited separately at some point in the future.

[1] https://wiki.friendlyelec.com/wiki/images/e/ee/Rockchip_RK3588_Datasheet_V1.6-20231016.pdf
[2] https://wmsc.lcsc.com/wmsc/upload/file/pdf/v2/lcsc/2403201054_Rockchip-RK3588J_C22364189.pdf
[3] https://lore.kernel.org/linux-rockchip/e55125ed-64fb-455e-b1e4-cebe2cf006e4@cherry.de/T/#u
[4] https://raw.githubusercontent.com/rockchip-linux/kernel/604cec4004abe5a96c734f2fab7b74809d2d742f/arch/arm64/boot/dts/rockchip/rk3588s.dtsi
[5] https://raw.githubusercontent.com/rockchip-linux/kernel/604cec4004abe5a96c734f2fab7b74809d2d742f/arch/arm64/boot/dts/rockchip/rk3588j.dtsi
[6] https://lore.kernel.org/all/20240229-rk-dts-additions-v3-5-6afe8473a631@gmail.com/

Fixes: 667885a68658 ("arm64: dts: rockchip: Add OPP data for CPU cores on RK3588j")
Fixes: a7b2070505a2 ("arm64: dts: rockchip: Split GPU OPPs of RK3588 and RK3588j")
Cc: stable@vger.kernel.org
Cc: Heiko Stuebner <heiko@sntech.de>
Cc: Alexey Charkov <alchark@gmail.com>
Helped-by: Quentin Schulz <quentin.schulz@cherry.de>
Reviewed-by: Quentin Schulz <quentin.schulz@cherry.de>
Signed-off-by: Dragan Simic <dsimic@manjaro.org>
---

Notes:
    Changes in v2:
      - Reworded and expanded the patch description a bit, to include some
        more information and to make it more clear what are the implied
        speculations and assumptions, and what are the available official
        statements from Rockchip, as suggested by Quentin [7]
      - Collected Reviewed-by tag from Quentin [7]
    
    Link to v1: https://lore.kernel.org/linux-rockchip/f929da061de35925ea591c969f985430e23c4a7e.1742526811.git.dsimic@manjaro.org/T/#u
    
    [7] https://lore.kernel.org/linux-rockchip/71b7c81b-6a4e-442b-a661-04d63639962a@cherry.de/

 arch/arm64/boot/dts/rockchip/rk3588j.dtsi | 53 ++++++++---------------
 1 file changed, 17 insertions(+), 36 deletions(-)

Comments

Quentin Schulz March 26, 2025, 10:07 a.m. UTC | #1
Hi Dragan,

On 3/24/25 12:00 PM, Dragan Simic wrote:
> The differences in the vendor-approved CPU and GPU OPPs for the standard
> Rockchip RK3588 variant [1] and the industrial Rockchip RK3588J variant [2]
> come from the latter, presumably, supporting an extended temperature range
> that's usually associated with industrial applications, despite the two SoC
> variant datasheets specifying the same upper limit for the allowed ambient
> temperature for both variants.  However, the lower temperature limit is
> specified much lower for the RK3588J variant. [1][2]
> 
> To be on the safe side and to ensure maximum longevity of the RK3588J SoCs,
> only the CPU and GPU OPPs that are declared by the vendor to be always safe
> for this SoC variant may be provided.  As explained by the vendor [3] and
> according to the RK3588J datasheet, [2] higher-frequency/higher-voltage
> CPU and GPU OPPs can be used as well, but at the risk of reducing the SoC
> lifetime expectancy.  Presumably, using the higher OPPs may be safe only
> when not enjoying the assumed extended temperature range that the RK3588J,
> as an SoC variant targeted specifically at higher-temperature, industrial
> applications, is made (or binned) for.
> 
> Anyone able to keep their RK3588J-based board outside the above-presumed
> extended temperature range at all times, and willing to take the associated
> risk of possibly reducing the SoC lifetime expectancy, is free to apply
> a DT overlay that adds the higher CPU and GPU OPPs.
> 
> With all this and the downstream RK3588(J) DT definitions [4][5] in mind,
> let's delete the RK3588J CPU and GPU OPPs that are not considered belonging
> to the normal operation mode for this SoC variant.  To quote the RK3588J
> datasheet [2], "normal mode means the chipset works under safety voltage
> and frequency;  for the industrial environment, highly recommend to keep in

FYI, the answer from Rockchip support about what "industrial 
environment" means is:

"""
Industrial environments encompass a wide range of settings, from
manufacturing plants to chemical processing facilities. These
environments are characterized by the use of complex machinery,
stringent safety protocols, and the need for continuous operations.
"""

which is not really helping me understand when we should be able to use 
the overdrive mode.

Why would you buy an RK3588J variant if you don't plan on using them on 
the -40 - -20°C range that isn't supported by the RK3588 variant, which 
seems to me to be the only advertised difference?

It also seems like the RK3588M supports the same operating range as the 
RK3588J but at faster speeds? c.f. 
https://en.t-firefly.com/product/industry/aio3588mq#spec and 
https://download.t-firefly.com/%E4%BA%A7%E5%93%81%E8%A7%84%E6%A0%BC%E6%96%87%E6%A1%A3/%E6%A0%B8%E5%BF%83%E6%9D%BF/iCore-3588MQ%20-%20Automotive-Grade%20AI%20Core%20Board.pdf

Couldn't find a datasheet though.

Talk about confusing specs...

I'll stop caring from now about this very topic :)

Cheers,
Quentin

Dragan Simic March 27, 2025, 8:05 a.m. UTC | #2
Hello Quentin,

On 2025-03-26 11:07, Quentin Schulz wrote:
> On 3/24/25 12:00 PM, Dragan Simic wrote:
>> The differences in the vendor-approved CPU and GPU OPPs for the 
>> standard
>> Rockchip RK3588 variant [1] and the industrial Rockchip RK3588J 
>> variant [2]
>> come from the latter, presumably, supporting an extended temperature 
>> range
>> that's usually associated with industrial applications, despite the 
>> two SoC
>> variant datasheets specifying the same upper limit for the allowed 
>> ambient
>> temperature for both variants.  However, the lower temperature limit 
>> is
>> specified much lower for the RK3588J variant. [1][2]
>> 
>> To be on the safe side and to ensure maximum longevity of the RK3588J 
>> SoCs,
>> only the CPU and GPU OPPs that are declared by the vendor to be always 
>> safe
>> for this SoC variant may be provided.  As explained by the vendor [3] 
>> and
>> according to the RK3588J datasheet, [2] 
>> higher-frequency/higher-voltage
>> CPU and GPU OPPs can be used as well, but at the risk of reducing the 
>> SoC
>> lifetime expectancy.  Presumably, using the higher OPPs may be safe 
>> only
>> when not enjoying the assumed extended temperature range that the 
>> RK3588J,
>> as an SoC variant targeted specifically at higher-temperature, 
>> industrial
>> applications, is made (or binned) for.
>> 
>> Anyone able to keep their RK3588J-based board outside the 
>> above-presumed
>> extended temperature range at all times, and willing to take the 
>> associated
>> risk of possibly reducing the SoC lifetime expectancy, is free to 
>> apply
>> a DT overlay that adds the higher CPU and GPU OPPs.
>> 
>> With all this and the downstream RK3588(J) DT definitions [4][5] in 
>> mind,
>> let's delete the RK3588J CPU and GPU OPPs that are not considered 
>> belonging
>> to the normal operation mode for this SoC variant.  To quote the 
>> RK3588J
>> datasheet [2], "normal mode means the chipset works under safety 
>> voltage
>> and frequency;  for the industrial environment, highly recommend to 
>> keep in
> 
> FYI, the answer from Rockchip support about what "industrial
> environment" means is:
> 
> """
> Industrial environments encompass a wide range of settings, from
> manufacturing plants to chemical processing facilities. These
> environments are characterized by the use of complex machinery,
> stringent safety protocols, and the need for continuous operations.
> """
> 
> which is not really helping me understand when we should be able to
> use the overdrive mode.

Thanks for forwarding this!  I really can't escape comparing the
response from Rockchip support to the old funny story in which
a passenger on a plane asks a flight attendant where they are,
and the attendant responds that they're on a plane. :D

In other words, that's perfectly valid information that describes
what an industrial environment looks like, but it has nothing to
do with describing the specifics of the applications of RK3588J
in such environments.

> Why would you buy an RK3588J variant if you don't plan on using them
> on the -40 - -20°C range that isn't supported by the RK3588 variant,
> which seems to me to be the only advertised difference?

Yes, AFAICT that's the only directly related difference in the
hard numbers provided by the RK3588 and RK3588J datasheets.

> It also seems like the RK3588M supports the same operating range as
> the RK3588J but at faster speeds? c.f.
> https://en.t-firefly.com/product/industry/aio3588mq#spec and
> https://download.t-firefly.com/%E4%BA%A7%E5%93%81%E8%A7%84%E6%A0%BC%E6%96%87%E6%A1%A3/%E6%A0%B8%E5%BF%83%E6%9D%BF/iCore-3588MQ%20-%20Automotive-Grade%20AI%20Core%20Board.pdf
> 
> Couldn't find a datasheet though.

There's also the following document:
https://download.t-firefly.com/Spec/CoreBorads/iCore-3588Q_Specification_EN.pdf?v=1743061914

I've also been unable to find the RK3588M datasheet.  Regarding
the Firefly SoMs with different RK3588 variants, it does seem
that the RK3588M, i.e. the automotive variant, is capable of
reaching 2.0 GHz throughout its entire operating range.

Maybe the RK3588M datasheet will become publicly available at
some point, allowing us to learn a bit more about it.

> Talk about confusing specs...
> 
> I'll stop caring from now about this very topic :)

We've exhausted all the available resources, so there actually
isn't much more to do anyway.

Patch

diff --git a/arch/arm64/boot/dts/rockchip/rk3588j.dtsi b/arch/arm64/boot/dts/rockchip/rk3588j.dtsi
index bce72bac4503..3045cb3bd68c 100644
--- a/arch/arm64/boot/dts/rockchip/rk3588j.dtsi
+++ b/arch/arm64/boot/dts/rockchip/rk3588j.dtsi
@@ -11,74 +11,59 @@  cluster0_opp_table: opp-table-cluster0 {
 		compatible = "operating-points-v2";
 		opp-shared;
 
-		opp-1416000000 {
-			opp-hz = /bits/ 64 <1416000000>;
+		opp-1200000000 {
+			opp-hz = /bits/ 64 <1200000000>;
 			opp-microvolt = <750000 750000 950000>;
 			clock-latency-ns = <40000>;
 			opp-suspend;
 		};
-		opp-1608000000 {
-			opp-hz = /bits/ 64 <1608000000>;
-			opp-microvolt = <887500 887500 950000>;
-			clock-latency-ns = <40000>;
-		};
-		opp-1704000000 {
-			opp-hz = /bits/ 64 <1704000000>;
-			opp-microvolt = <937500 937500 950000>;
+		opp-1296000000 {
+			opp-hz = /bits/ 64 <1296000000>;
+			opp-microvolt = <775000 775000 950000>;
 			clock-latency-ns = <40000>;
 		};
 	};
 
 	cluster1_opp_table: opp-table-cluster1 {
 		compatible = "operating-points-v2";
 		opp-shared;
 
+		opp-1200000000{
+			opp-hz = /bits/ 64 <1200000000>;
+			opp-microvolt = <750000 750000 950000>;
+			clock-latency-ns = <40000>;
+		};
 		opp-1416000000 {
 			opp-hz = /bits/ 64 <1416000000>;
-			opp-microvolt = <750000 750000 950000>;
+			opp-microvolt = <762500 762500 950000>;
 			clock-latency-ns = <40000>;
 		};
 		opp-1608000000 {
 			opp-hz = /bits/ 64 <1608000000>;
 			opp-microvolt = <787500 787500 950000>;
 			clock-latency-ns = <40000>;
 		};
-		opp-1800000000 {
-			opp-hz = /bits/ 64 <1800000000>;
-			opp-microvolt = <875000 875000 950000>;
-			clock-latency-ns = <40000>;
-		};
-		opp-2016000000 {
-			opp-hz = /bits/ 64 <2016000000>;
-			opp-microvolt = <950000 950000 950000>;
-			clock-latency-ns = <40000>;
-		};
 	};
 
 	cluster2_opp_table: opp-table-cluster2 {
 		compatible = "operating-points-v2";
 		opp-shared;
 
+		opp-1200000000{
+			opp-hz = /bits/ 64 <1200000000>;
+			opp-microvolt = <750000 750000 950000>;
+			clock-latency-ns = <40000>;
+		};
 		opp-1416000000 {
 			opp-hz = /bits/ 64 <1416000000>;
-			opp-microvolt = <750000 750000 950000>;
+			opp-microvolt = <762500 762500 950000>;
 			clock-latency-ns = <40000>;
 		};
 		opp-1608000000 {
 			opp-hz = /bits/ 64 <1608000000>;
 			opp-microvolt = <787500 787500 950000>;
 			clock-latency-ns = <40000>;
 		};
-		opp-1800000000 {
-			opp-hz = /bits/ 64 <1800000000>;
-			opp-microvolt = <875000 875000 950000>;
-			clock-latency-ns = <40000>;
-		};
-		opp-2016000000 {
-			opp-hz = /bits/ 64 <2016000000>;
-			opp-microvolt = <950000 950000 950000>;
-			clock-latency-ns = <40000>;
-		};
 	};
 
 	gpu_opp_table: opp-table {
@@ -104,10 +89,6 @@  opp-700000000 {
 			opp-hz = /bits/ 64 <700000000>;
 			opp-microvolt = <750000 750000 850000>;
 		};
-		opp-850000000 {
-			opp-hz = /bits/ 64 <800000000>;
-			opp-microvolt = <787500 787500 850000>;
-		};
 	};
 };