From patchwork Wed Nov 13 08:40:40 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Daniel Lezcano
X-Patchwork-Id: 11241371
X-Patchwork-Delegate: eduardo.valentin@ti.com
From: Daniel Lezcano
To: rjw@rjwysocki.net, edubezval@gmail.com, rui.zhang@intel.com
Cc: linux-pm@vger.kernel.org, viresh.kumar@linaro.org,
 amit.kucheria@linaro.org, linux-kernel@vger.kernel.org
Subject: [PATCH 1/3] thermal/drivers/Kconfig: Convert the CPU cooling device to a choice
Date: Wed, 13 Nov 2019 09:40:40 +0100
Message-Id: <20191113084042.5707-1-daniel.lezcano@linaro.org>
X-Mailer: git-send-email 2.17.1
Sender: linux-pm-owner@vger.kernel.org
Precedence: bulk
X-Mailing-List: linux-pm@vger.kernel.org

The next changes will add a new way to cool down a CPU by injecting
idle cycles. With the current configuration, the only CPU cooling
device is the cpufreq cooling device.

As we want to add a new CPU cooling device, let's convert the CPU
cooling to a choice giving a list of CPU cooling devices. At this
point, there is obviously only one CPU cooling device.

There are no functional changes.

Signed-off-by: Daniel Lezcano
---
 drivers/thermal/Kconfig     | 13 +++++++++++--
 drivers/thermal/Makefile    |  2 +-
 include/linux/cpu_cooling.h |  6 +++---
 3 files changed, 15 insertions(+), 6 deletions(-)

diff --git a/drivers/thermal/Kconfig b/drivers/thermal/Kconfig
index 001a21abcc28..2b82c4861091 100644
--- a/drivers/thermal/Kconfig
+++ b/drivers/thermal/Kconfig
@@ -150,8 +150,17 @@ config THERMAL_GOV_POWER_ALLOCATOR
 
 config CPU_THERMAL
 	bool "Generic cpu cooling support"
-	depends on CPU_FREQ
 	depends on THERMAL_OF
+	help
+	  Enable the CPU cooling features. If the system has no active
+	  cooling device available, this option allows the CPU to be
+	  used as a cooling device.
+
+if CPU_THERMAL
+
+config CPU_FREQ_THERMAL
+	bool "CPU frequency cooling device"
+	depends on CPU_FREQ
 	help
 	  This implements the generic cpu cooling mechanism through frequency
 	  reduction. An ACPI version of this already exists
@@ -159,7 +168,7 @@ config CPU_THERMAL
 	  This will be useful for platforms using the generic thermal interface
 	  and not the ACPI interface.
 
-	  If you want this support, you should say Y here.
+endif
 
 config CLOCK_THERMAL
 	bool "Generic clock cooling support"
diff --git a/drivers/thermal/Makefile b/drivers/thermal/Makefile
index 74a37c7f847a..d3b01cc96981 100644
--- a/drivers/thermal/Makefile
+++ b/drivers/thermal/Makefile
@@ -19,7 +19,7 @@ thermal_sys-$(CONFIG_THERMAL_GOV_USER_SPACE)	+= user_space.o
 thermal_sys-$(CONFIG_THERMAL_GOV_POWER_ALLOCATOR)	+= power_allocator.o
 
 # cpufreq cooling
-thermal_sys-$(CONFIG_CPU_THERMAL)	+= cpu_cooling.o
+thermal_sys-$(CONFIG_CPU_FREQ_THERMAL)	+= cpu_cooling.o
 
 # clock cooling
 thermal_sys-$(CONFIG_CLOCK_THERMAL)	+= clock_cooling.o
diff --git a/include/linux/cpu_cooling.h b/include/linux/cpu_cooling.h
index b74732535e4b..3cdd85f987d7 100644
--- a/include/linux/cpu_cooling.h
+++ b/include/linux/cpu_cooling.h
@@ -19,7 +19,7 @@
 
 struct cpufreq_policy;
 
-#ifdef CONFIG_CPU_THERMAL
+#ifdef CONFIG_CPU_FREQ_THERMAL
 /**
  * cpufreq_cooling_register - function to create cpufreq cooling device.
  * @policy: cpufreq policy.
@@ -40,7 +40,7 @@ void cpufreq_cooling_unregister(struct thermal_cooling_device *cdev);
 struct thermal_cooling_device *
 of_cpufreq_cooling_register(struct cpufreq_policy *policy);
 
-#else /* !CONFIG_CPU_THERMAL */
+#else /* !CONFIG_CPU_FREQ_THERMAL */
 static inline struct thermal_cooling_device *
 cpufreq_cooling_register(struct cpufreq_policy *policy)
 {
@@ -58,6 +58,6 @@ of_cpufreq_cooling_register(struct cpufreq_policy *policy)
 {
 	return NULL;
 }
-#endif /* CONFIG_CPU_THERMAL */
+#endif /* CONFIG_CPU_FREQ_THERMAL */
 
 #endif /* __CPU_COOLING_H__ */

From patchwork Wed Nov 13 08:40:41 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Daniel Lezcano
X-Patchwork-Id: 11241373
X-Patchwork-Delegate: eduardo.valentin@ti.com
From: Daniel Lezcano
To: rjw@rjwysocki.net, edubezval@gmail.com, rui.zhang@intel.com
Cc: linux-pm@vger.kernel.org, viresh.kumar@linaro.org,
 amit.kucheria@linaro.org, linux-kernel@vger.kernel.org
Subject: [PATCH 2/3] thermal/drivers/cpu_cooling: Add idle cooling device documentation
Date: Wed, 13 Nov 2019 09:40:41 +0100
Message-Id: <20191113084042.5707-2-daniel.lezcano@linaro.org>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20191113084042.5707-1-daniel.lezcano@linaro.org>
References: <20191113084042.5707-1-daniel.lezcano@linaro.org>
Sender: linux-pm-owner@vger.kernel.org
Precedence: bulk
X-Mailing-List: linux-pm@vger.kernel.org

Provide some documentation for the idle injection cooling effect so
that people can understand the rationale behind the idle injection
CPU cooling device.

Signed-off-by: Daniel Lezcano
---
 .../driver-api/thermal/cpu-idle-cooling.rst   | 166 ++++++++++++++++++
 1 file changed, 166 insertions(+)
 create mode 100644 Documentation/driver-api/thermal/cpu-idle-cooling.rst

diff --git a/Documentation/driver-api/thermal/cpu-idle-cooling.rst b/Documentation/driver-api/thermal/cpu-idle-cooling.rst
new file mode 100644
index 000000000000..457cd9979ddb
--- /dev/null
+++ b/Documentation/driver-api/thermal/cpu-idle-cooling.rst
@@ -0,0 +1,166 @@
+
+Situation:
+----------
+
+Under certain circumstances a SoC can reach the maximum temperature
+limit or be unable to stabilize the temperature around a temperature
+control. When the SoC has to stabilize the temperature, the kernel can
+act on a cooling device to mitigate the dissipated power. When the
+maximum temperature is reached, a decision must be taken to bring the
+temperature back under the critical threshold and prevent a reboot or
+a shutdown, which impacts the performance.
+
+Another situation is when the silicon reaches a certain temperature
+which continues to increase even if the dynamic leakage is reduced to
+its minimum by clock gating the component. The runaway phenomenon will
+continue with the static leakage, and only powering down the component,
+thus dropping both the dynamic and the static leakage, will allow the
+component to cool down.
+
+Last but not least, the system can ask for a specific power budget but
+because of the OPP density, we can only choose an OPP with a power
+budget lower than the requested one and underuse the CPU, thus losing
+performance. In other words, one OPP underuses the CPU with a power
+lower than the power budget while the next OPP exceeds the power budget;
+an intermediate OPP could have been used if it were present.
+
+Solutions:
+----------
+
+If we can remove the static and the dynamic leakage for a specific
+duration in a controlled period, the SoC temperature will
+decrease. By acting on the idle state duration or the idle cycle
+injection period, we can mitigate the temperature by modulating the
+power budget.
+
+The Operating Performance Point (OPP) density has a great influence on
+the control precision of cpufreq. However, OPP density varies widely
+between vendors, and some have a large power gap between OPPs, which
+results in a loss of performance during thermal control and a loss of
+power in other scenarios.
+
+At a specific OPP, we can assume that by injecting idle cycles on all
+CPUs belonging to the same cluster, with a duration greater than the
+cluster idle state target residency, we drop the static and the
+dynamic leakage for this period (modulo the energy needed to enter
+this state). So the sustainable power with idle cycles has a linear
+relation with the OPP’s sustainable power and can be computed with a
+coefficient similar to:
+
+ Power(IdleCycle) = Coef x Power(OPP)
+
+Idle Injection:
+---------------
+
+The base concept of the idle injection is to force the CPU to go to an
+idle state for a specified time each control cycle. It provides
+another way to control CPU power and heat in addition to
+cpufreq. Ideally, if all CPUs belonging to the same cluster inject
+their idle cycles synchronously, the cluster can reach its power down
+state with a minimum power consumption and a drop of the static
+leakage. However, this idle cycle injection will add extra latencies
+as the CPUs will have to wake up from a deep sleep state.
+
+     ^
+     |
+     |
+     |-------       -------       -------
+     |_______|_____|_______|_____|_______|___________
+
+      <----->
+       idle  <---->
+             running
+
+With the fixed idle injection duration, we can give a value which is
+an acceptable performance drop off or latency when we reach a specific
+temperature and we begin to mitigate by varying the idle injection
+period.
+
+The mitigation begins with a maximum period value which decreases when
+more cooling effect is requested. When the period duration is equal to
+the idle duration, we are in a situation where the platform can’t
+dissipate the heat fast enough and the mitigation fails. In this case
+the situation is considered critical and there is nothing more to do.
+The idle injection duration must be changed by configuration until we
+reach the cooling effect, otherwise an additional cooling device must
+be used or, ultimately, the SoC performance must be decreased by
+dropping the highest OPP of the SoC.
+
+The idle injection duration value must comply with the following constraints:
+
+- It is less than or equal to the latency we tolerate when the
+  mitigation begins. It is platform dependent and will depend on the
+  user experience, reactivity vs performance trade off we want. This
+  value should be specified.
+
+- It is greater than the target residency of the idle state we want to
+  enter for thermal mitigation, otherwise we end up consuming more energy.
+
+Minimum period
+--------------
+
+The idle injection duration being fixed, it is obvious the minimum
+period can’t be less than that, otherwise we will be scheduling the
+idle injection task right before the idle injection duration is
+complete, so waking up the CPU to put it asleep again.
+
+Maximum period
+--------------
+
+The maximum period is the initial period when the mitigation
+begins. Theoretically when we reach the thermal trip point, we have to
+sustain a specified power for a specific temperature but at this time
+we consume:
+
+ Power = Capacitance x Voltage^2 x Frequency x Utilisation
+
+... which is more than the sustainable power (or there is something
+wrong in the system setup). The ‘Capacitance’ and ‘Utilisation’ are
+fixed values, ‘Voltage’ and the ‘Frequency’ are fixed artificially
+because we don’t want to change the OPP. We can group the
+‘Capacitance’ and the ‘Utilisation’ into a single term which is the
+‘Dynamic Power Coefficient (Cdyn)’. Simplifying the above, we have:
+
+ Pdyn = Cdyn x Voltage^2 x Frequency
+
+The Intelligent Power Allocation (IPA) governor will ask us to reduce
+our power in order to target the sustainable power defined in the
+device tree. So with the idle injection mechanism, we want an average
+power (Ptarget) resulting from running at full power on a specific OPP
+for some time and idling for some time. As an equation:
+
+ P(opp)target = ((trunning x P(opp)running) + (tidle x P(opp)idle)) /
+			(trunning + tidle)
+  ...
+
+ tidle = trunning x ((P(opp)running / P(opp)target) - 1)
+
+At this point if we know the running period for the CPU, that gives us
+the idle injection we need. Alternatively if we have the idle
+injection duration, we can compute the running duration with:
+
+ trunning = tidle / ((P(opp)running / P(opp)target) - 1)
+
+Practically, if the running power is less than the targeted power,
+we end up with a negative time value, so obviously the equation usage
+is bound to a power reduction, hence a higher OPP is needed to have
+the running power greater than the targeted power.
+
+However, in this demonstration we ignore three aspects:
+
+ * The static leakage is not defined here, we can introduce it in the
+   equation but assuming it will be zero most of the time as it is
+   difficult to get the values from the SoC vendors
+
+ * The idle state wake up latency (or entry + exit latency) is not
+   taken into account, it must be added in the equation in order to
+   rigorously compute the idle injection
+
+ * The injected idle duration must be greater than the idle state
+   target residency, otherwise we end up consuming more energy and
+   potentially invert the mitigation effect
+
+So the final equation is:
+
+ trunning = (tidle - twakeup) /
+		(((P(opp)dyn + P(opp)static) - P(opp)target) / P(opp)target)

From patchwork Wed Nov 13 08:40:42 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Daniel Lezcano
X-Patchwork-Id: 11241375
X-Patchwork-Delegate: eduardo.valentin@ti.com
From: Daniel Lezcano
To: rjw@rjwysocki.net, edubezval@gmail.com, rui.zhang@intel.com
Cc: linux-pm@vger.kernel.org, viresh.kumar@linaro.org,
 amit.kucheria@linaro.org, linux-kernel@vger.kernel.org
Subject: [PATCH 3/3] thermal/drivers/cpu_cooling: Introduce the cpu idle cooling driver
Date: Wed, 13 Nov 2019 09:40:42 +0100
Message-Id: <20191113084042.5707-3-daniel.lezcano@linaro.org>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20191113084042.5707-1-daniel.lezcano@linaro.org>
References: <20191113084042.5707-1-daniel.lezcano@linaro.org>
Sender: linux-pm-owner@vger.kernel.org
Precedence: bulk
X-Mailing-List: linux-pm@vger.kernel.org

The cpu idle cooling device offers a new method to cool down a CPU by
injecting idle cycles at runtime.

It has some similarities with the intel power clamp driver but it is
actually designed to be more generic and relies on the idle injection
powercap framework.

The idle injection cycle is fixed while the running cycle is variable.
That gives control over the device reactivity for the user experience.

An idle state powering down the CPU or the cluster allows the static
leakage to drop, thus restoring the heat capacity of the SoC. It can be
set with a trip point between the hot and the critical points, giving
the opportunity to prevent a hard reset of the system when the cpufreq
cooling fails to cool down the CPU.

With more sophisticated boards having a per core sensor, the idle
cooling device allows a single core to be cooled down without
throttling the compute capacity of several cpus belonging to the same
clock line, so it could be used in collaboration with the cpufreq
cooling device.

Signed-off-by: Daniel Lezcano
---
 drivers/thermal/Kconfig           |   7 +
 drivers/thermal/Makefile          |   1 +
 drivers/thermal/cpuidle_cooling.c | 233 ++++++++++++++++++++++++++++++
 include/linux/cpu_cooling.h       |  22 +++
 4 files changed, 263 insertions(+)
 create mode 100644 drivers/thermal/cpuidle_cooling.c

diff --git a/drivers/thermal/Kconfig b/drivers/thermal/Kconfig
index 2b82c4861091..00d69906c508 100644
--- a/drivers/thermal/Kconfig
+++ b/drivers/thermal/Kconfig
@@ -168,6 +168,13 @@ config CPU_FREQ_THERMAL
 	  This will be useful for platforms using the generic thermal interface
 	  and not the ACPI interface.
 
+config CPU_IDLE_THERMAL
+	bool "CPU idle cooling device"
+	depends on IDLE_INJECT
+	help
+	  This implements the CPU cooling mechanism through
+	  idle injection. This will throttle the CPU by injecting
+	  idle cycles.
 endif
 
 config CLOCK_THERMAL
diff --git a/drivers/thermal/Makefile b/drivers/thermal/Makefile
index d3b01cc96981..9c8aa2d4bd28 100644
--- a/drivers/thermal/Makefile
+++ b/drivers/thermal/Makefile
@@ -20,6 +20,7 @@ thermal_sys-$(CONFIG_THERMAL_GOV_POWER_ALLOCATOR)	+= power_allocator.o
 
 # cpufreq cooling
 thermal_sys-$(CONFIG_CPU_FREQ_THERMAL)	+= cpu_cooling.o
+thermal_sys-$(CONFIG_CPU_IDLE_THERMAL)	+= cpuidle_cooling.o
 
 # clock cooling
 thermal_sys-$(CONFIG_CLOCK_THERMAL)	+= clock_cooling.o
diff --git a/drivers/thermal/cpuidle_cooling.c b/drivers/thermal/cpuidle_cooling.c
new file mode 100644
index 000000000000..6e911fa87c47
--- /dev/null
+++ b/drivers/thermal/cpuidle_cooling.c
@@ -0,0 +1,233 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (C) 2019 Linaro Limited.
+ *
+ * Author: Daniel Lezcano
+ *
+ */
+#include <linux/cpu_cooling.h>
+#include <linux/cpuidle.h>
+#include <linux/err.h>
+#include <linux/idle_inject.h>
+#include <linux/idr.h>
+#include <linux/slab.h>
+#include <linux/thermal.h>
+
+/**
+ * struct cpuidle_cooling_device - data for the idle cooling device
+ * @ii_dev: a pointer to the idle injection device doing the injection
+ * @idle_duration_us: an integer defining the duration of the idle injection
+ * @state: a normalized integer giving the state of the cooling device
+ */
+struct cpuidle_cooling_device {
+	struct idle_inject_device *ii_dev;
+	unsigned int idle_duration_us;
+	unsigned long state;
+};
+
+static DEFINE_IDA(cpuidle_ida);
+
+/**
+ * cpuidle_cooling_runtime - Running time computation
+ * @idle_duration_us: the idle injection duration in microseconds
+ * @state: the idle injection ratio, as a percentage
+ *
+ * The running duration is computed from the idle injection duration
+ * which is fixed. If we reach 100% of idle injection ratio, that
+ * means the running duration is zero. If we have a 50% injection
+ * ratio, that means we have equal durations for idle and for
+ * running.
+ *
+ * The formula is deduced as follows:
+ *
+ *  running = idle x ((100 / ratio) - 1)
+ *
+ * For precision with integer math, we use the following:
+ *
+ *  running = (idle x 100) / ratio - idle
+ *
+ * For example, with an injection ratio of 50% and 10ms of idle
+ * injection, we end up with 10ms of running duration.
+ *
+ * Returns the running duration in microseconds.
+ */
+static unsigned int cpuidle_cooling_runtime(unsigned int idle_duration_us,
+					    unsigned long state)
+{
+	if (!state)
+		return 0;
+
+	return ((idle_duration_us * 100) / state) - idle_duration_us;
+}
+
+/**
+ * cpuidle_cooling_get_max_state - Get the maximum state
+ * @cdev: the thermal cooling device
+ * @state: a pointer to the state variable to be filled
+ *
+ * The function always gives 100 as the injection ratio is percentage
+ * based for consistency across different platforms.
+ *
+ * The function cannot fail, it always returns zero.
+ */
+static int cpuidle_cooling_get_max_state(struct thermal_cooling_device *cdev,
+					 unsigned long *state)
+{
+	/*
+	 * Depending on the configuration or the hardware, the running
+	 * cycle and the idle cycle could be different. We want to unify
+	 * that to a 0..100 interval, so the set state interface will
+	 * be the same whatever the platform is.
+	 *
+	 * The state 100% will make the cluster 100% ... idle. A 0%
+	 * injection ratio means no idle injection at all and 50%
+	 * means for 10ms of idle injection, we have 10ms of running
+	 * time.
+	 */
+	*state = 100;
+
+	return 0;
+}
+
+/**
+ * cpuidle_cooling_get_cur_state - Get the current cooling state
+ * @cdev: the thermal cooling device
+ * @state: a pointer to the state
+ *
+ * The function just copies the state value from the private thermal
+ * cooling device structure, the mapping is 1 <-> 1.
+ *
+ * The function cannot fail, it always returns zero.
+ */
+static int cpuidle_cooling_get_cur_state(struct thermal_cooling_device *cdev,
+					 unsigned long *state)
+{
+	struct cpuidle_cooling_device *idle_cdev = cdev->devdata;
+
+	*state = idle_cdev->state;
+
+	return 0;
+}
+
+/**
+ * cpuidle_cooling_set_cur_state - Set the current cooling state
+ * @cdev: the thermal cooling device
+ * @state: the target state
+ *
+ * The function updates the internal state and the injection durations
+ * of the cooling device. If we are initiating the mitigation, it wakes
+ * up all the idle injection tasks belonging to the idle cooling device;
+ * if the mitigation ends, it stops them.
+ *
+ * The function cannot fail, it always returns zero.
+ */
+static int cpuidle_cooling_set_cur_state(struct thermal_cooling_device *cdev,
+					 unsigned long state)
+{
+	struct cpuidle_cooling_device *idle_cdev = cdev->devdata;
+	struct idle_inject_device *ii_dev = idle_cdev->ii_dev;
+	unsigned long current_state = idle_cdev->state;
+	unsigned int runtime_us;
+
+	idle_cdev->state = state;
+
+	runtime_us = cpuidle_cooling_runtime(idle_cdev->idle_duration_us,
+					     state);
+
+	idle_inject_set_duration(ii_dev, runtime_us,
+				 idle_cdev->idle_duration_us);
+
+	if (current_state == 0 && state > 0) {
+		idle_inject_start(ii_dev);
+	} else if (current_state > 0 && !state) {
+		idle_inject_stop(ii_dev);
+	}
+
+	return 0;
+}
+
+/**
+ * cpuidle_cooling_ops - thermal cooling device ops
+ */
+static struct thermal_cooling_device_ops cpuidle_cooling_ops = {
+	.get_max_state = cpuidle_cooling_get_max_state,
+	.get_cur_state = cpuidle_cooling_get_cur_state,
+	.set_cur_state = cpuidle_cooling_set_cur_state,
+};
+
+/**
+ * cpuidle_of_cooling_register - Idle cooling device initialization function
+ * @np: a device node pointer for the cooling device, may be NULL
+ * @drv: a cpuidle driver structure pointer
+ *
+ * Create a cooling device per cpuidle driver and register it with the
+ * thermal framework.
+ *
+ * Returns a valid thermal cooling device pointer or an ERR_PTR() on error.
+ */
+struct thermal_cooling_device *
+__init cpuidle_of_cooling_register(struct device_node *np,
+				   struct cpuidle_driver *drv)
+{
+	struct idle_inject_device *ii_dev;
+	struct cpuidle_cooling_device *idle_cdev;
+	struct thermal_cooling_device *cdev;
+	char dev_name[THERMAL_NAME_LENGTH];
+	int id, ret;
+
+	ii_dev = idle_inject_register(drv->cpumask);
+	if (!ii_dev) {
+		ret = -EINVAL;
+		goto out;
+	}
+
+	idle_cdev = kzalloc(sizeof(*idle_cdev), GFP_KERNEL);
+	if (!idle_cdev) {
+		ret = -ENOMEM;
+		goto out_unregister;
+	}
+
+	id = ida_simple_get(&cpuidle_ida, 0, 0, GFP_KERNEL);
+	if (id < 0) {
+		ret = id;
+		goto out_kfree;
+	}
+
+	idle_cdev->ii_dev = ii_dev;
+	idle_cdev->idle_duration_us = TICK_USEC;
+
+	snprintf(dev_name, sizeof(dev_name), "thermal-idle-%d", id);
+
+	cdev = thermal_of_cooling_device_register(np, dev_name, idle_cdev,
+						  &cpuidle_cooling_ops);
+	if (IS_ERR(cdev)) {
+		ret = PTR_ERR(cdev);
+		goto out_id;
+	}
+
+	return cdev;
+out_id:
+	ida_simple_remove(&cpuidle_ida, id);
+out_kfree:
+	kfree(idle_cdev);
+out_unregister:
+	idle_inject_unregister(ii_dev);
+out:
+	return ERR_PTR(ret);
+}
+
+/**
+ * cpuidle_cooling_register - Idle cooling device initialization function
+ * @drv: a cpuidle driver structure pointer
+ *
+ * This function is in charge of creating a cooling device per cpuidle
+ * driver and registering it with the thermal framework.
+ *
+ * Returns a valid pointer to a thermal cooling device, or an ERR_PTR()
+ * corresponding to the error detected in the underlying subsystems.
+ */
+struct thermal_cooling_device *
+__init cpuidle_cooling_register(struct cpuidle_driver *drv)
+{
+	return cpuidle_of_cooling_register(NULL, drv);
+}
diff --git a/include/linux/cpu_cooling.h b/include/linux/cpu_cooling.h
index 3cdd85f987d7..7873ac2f740b 100644
--- a/include/linux/cpu_cooling.h
+++ b/include/linux/cpu_cooling.h
@@ -60,4 +60,26 @@ of_cpufreq_cooling_register(struct cpufreq_policy *policy)
 }
 #endif /* CONFIG_CPU_FREQ_THERMAL */
 
+struct cpuidle_driver;
+
+#ifdef CONFIG_CPU_IDLE_THERMAL
+extern struct thermal_cooling_device *
+__init cpuidle_cooling_register(struct cpuidle_driver *drv);
+extern struct thermal_cooling_device *
+__init cpuidle_of_cooling_register(struct device_node *np,
+				   struct cpuidle_driver *drv);
+#else /* !CONFIG_CPU_IDLE_THERMAL */
+static inline struct thermal_cooling_device *
+__init cpuidle_cooling_register(struct cpuidle_driver *drv)
+{
+	return ERR_PTR(-EINVAL);
+}
+static inline struct thermal_cooling_device *
+__init cpuidle_of_cooling_register(struct device_node *np,
+				   struct cpuidle_driver *drv)
+{
+	return ERR_PTR(-EINVAL);
+}
+#endif /* CONFIG_CPU_IDLE_THERMAL */
+
 #endif /* __CPU_COOLING_H__ */
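
For anyone who wants to check the integer math of cpuidle_cooling_runtime() outside the kernel, the computation can be sketched in plain userspace C. This is only an illustration and not part of the series: sketch_runtime_us() mirrors the driver formula running = (idle x 100) / ratio - idle, while sketch_idle_percent(), the 64-bit intermediates and the assert() on the state range are assumptions added here.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Userspace sketch only, not kernel code. Mirrors the formula used by
 * cpuidle_cooling_runtime() in the patch:
 *
 *   running = (idle x 100) / ratio - idle
 *
 * state is the idle injection ratio in percent (0..100). As in the
 * patch, a state of 0 (no injection) yields 0. The 64-bit arithmetic
 * and the range check are assumptions added for this illustration.
 */
uint64_t sketch_runtime_us(unsigned int idle_duration_us, unsigned long state)
{
	assert(state <= 100);

	if (!state)
		return 0;

	return ((uint64_t)idle_duration_us * 100) / state - idle_duration_us;
}

/*
 * Hypothetical helper: share of one injection period spent idle, in
 * percent, rounded down. Returns 0 for an empty period.
 */
unsigned int sketch_idle_percent(unsigned int idle_us, uint64_t running_us)
{
	uint64_t period_us = (uint64_t)idle_us + running_us;

	if (!period_us)
		return 0;

	return (unsigned int)(((uint64_t)idle_us * 100) / period_us);
}
```

With a 10000us idle duration, a state of 50 gives 10000us of running time and a state of 25 gives 30000us, so the CPU idles for a quarter of each period. A state of 100 gives a running time of zero, which matches the "100% ... idle" comment in cpuidle_cooling_get_max_state().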