From patchwork Mon Aug 22 23:11:33 2016
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Doug Smythies
X-Patchwork-Id: 9294613
From: Doug Smythies
To: srinivas.pandruvada@linux.intel.com
Cc: rjw@rjwysocki.net, linux-pm@vger.kernel.org, Doug Smythies
Subject: [RFC][PATCH 8 of 7] cpufreq: intel_pstate: add iir filter to pstate.
Date: Mon, 22 Aug 2016 16:11:33 -0700
Message-Id: <1471907493-26639-1-git-send-email-dsmythies@telus.net>
X-Mailer: git-send-email 2.7.4
X-Mailing-List: linux-pm@vger.kernel.org

Note: This is not a formal version of this patch, but rather an interim
version.

As a function of load / sleep frequency and how it beats against this
driver's sampling times, the driver has a tendency to be underdamped and
to oscillate, requiring a bandwidth limiting filter on the target PState.

Add a simple IIR (Infinite Impulse Response) type filter to the target
PState. The purpose is to dampen the inherent oscillations caused by a
sampled system that can have measured load extremes in any given sample.

The /sys/kernel/debug/pstate_snb/p_gain_pct parameter has been temporarily
re-tasked to be the gain for this filter.

Optimal nominal gain setting is a tradeoff between response time and
adequate damping. Since the time between runs of this driver can vary so
widely, the gain is adjusted as a function of the time since the last pass
so as to reduce, or even eliminate, the influence of what might be a very
stale old value.

The default gain is 10 percent.

Signed-off-by: Doug Smythies
---
 drivers/cpufreq/intel_pstate.c | 92 +++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 90 insertions(+), 2 deletions(-)

diff --git a/drivers/cpufreq/intel_pstate.c b/drivers/cpufreq/intel_pstate.c
index c43ef55..ab5c004 100644
--- a/drivers/cpufreq/intel_pstate.c
+++ b/drivers/cpufreq/intel_pstate.c
@@ -98,6 +98,7 @@ static inline u64 div_ext_fp(u64 x, u64 y)
  * @tsc:		Difference of time stamp counter between last and
  *			current sample
  * @time:		Current time from scheduler
+ * @target:		target pstate filtered.
  *
  * This structure is used in the cpudata structure to store performance sample
  * data for choosing next P State.
@@ -108,6 +109,7 @@ struct sample {
 	u64 aperf;
 	u64 mperf;
 	u64 tsc;
+	u64 target;
 	u64 time;
 };
 
@@ -1021,7 +1023,7 @@ static struct cpu_defaults core_params = {
 		.sample_rate_ms = 10,
 		.deadband = 0,
 		.setpoint = 97,
-		.p_gain_pct = 20,
+		.p_gain_pct = 10,
 		.d_gain_pct = 0,
 		.i_gain_pct = 0,
 		.boost_iowait = true,
@@ -1168,6 +1170,7 @@ static void intel_pstate_get_cpu_pstates(struct cpudata *cpu)
 		pstate_funcs.get_vid(cpu);
 
 	intel_pstate_set_min_pstate(cpu);
+	cpu->sample.target = int_tofp(cpu->pstate.min_pstate);
 }
 
 static inline void intel_pstate_calc_avg_perf(struct cpudata *cpu)
@@ -1301,8 +1304,10 @@ static inline int32_t get_target_pstate_use_performance(struct cpudata *cpu)
 static inline int32_t get_target_pstate_default(struct cpudata *cpu)
 {
 	struct sample *sample = &cpu->sample;
+	int64_t scaled_gain, unfiltered_target;
 	int32_t busy_frac;
 	int pstate;
+	u64 duration_ns;
 
 	busy_frac = div_fp(sample->mperf, sample->tsc);
 	sample->busy_scaled = busy_frac * 100;
@@ -1313,7 +1318,89 @@ static inline int32_t get_target_pstate_default(struct cpudata *cpu)
 	cpu->iowait_boost >>= 1;
 
 	pstate = cpu->pstate.turbo_pstate;
-	return fp_toint((pstate + (pstate >> 2)) * busy_frac);
+	/* To Do: I think the above should be:
+	 *
+	 *	if (limits.no_turbo || limits.turbo_disabled)
+	 *		pstate = cpu->pstate.max_pstate;
+	 *	else
+	 *		pstate = cpu->pstate.turbo_pstate;
+	 *
+	 * figure it out.
+	 *
+	 * no clamps. Pre-filter clamping was needed in past implementations.
+	 * To Do: Is any pre-filter clamping needed here? */
+
+	unfiltered_target = (pstate + (pstate >> 2)) * busy_frac;
+
+	/*
+	 * Idle check.
+	 * We have a deferrable timer. Very long durations can be
+	 * either due to long idle (C0 time near 0),
+	 * or due to short idle times that spanned jiffy boundaries
+	 * (C0 time not near zero).
+	 *
+	 * To Do: As of the utilization stuff, I do not think the
+	 * spanning jiffy boundaries thing is true anymore.
+	 * Check, and fix the comment.
+	 *
+	 * The very long durations are 0.4 seconds or more.
+	 * Either way, a very long duration will effectively flush
+	 * the IIR filter, otherwise falling edge load response times
+	 * can be on the order of tens of seconds, because this driver
+	 * runs very rarely. Furthermore, for higher periodic loads that
+	 * just so happen to not be in the C0 state on jiffy boundaries,
+	 * the long ago history should be forgotten.
+	 * For cases of durations that are a few times the set sample
+	 * period, increase the IIR filter gain so as to weight
+	 * the current sample more appropriately.
+	 *
+	 * To Do: sample_time should be forced to be accurate. For
+	 * example if the kernel is a 250 Hz kernel, then a
+	 * sample_rate_ms of 10 should result in a sample_time of 12.
+	 *
+	 * To Do: Check that the IO Boost case is not filtered too much.
+	 * It might be that a filter by-pass is needed for the boost case.
+	 * However, the existing gain = f(duration) might be good enough.
+	 *
+	 * Bandwidth limit the output. For now, re-task p_gain_pct for this purpose.
+	 * Use a simple IIR (Infinite Impulse Response) filter.
+	 *
+	 * scale the gain as a function of the time since the last run of this driver.
+	 * For example, if the time since the last run is 5 times nominal, then the
+	 * scaled gain is 5 times nominal.
+	 * scaled_gain = gain * duration / nominal
+	 */
+
+	duration_ns = cpu->sample.time - cpu->last_sample_time;
+
+	scaled_gain = div_u64(int_tofp(duration_ns) *
+		(pid_params.p_gain_pct), (pid_params.sample_rate_ns));
+	if (scaled_gain > int_tofp(100))
+		scaled_gain = int_tofp(100);
+	/*
+	 * This code should not be required,
+	 * but short duration times have been observed
+	 * To Do: Check if this code is actually still needed. I don't think so.
+	 */
+	if (scaled_gain < int_tofp(pid_params.p_gain_pct))
+		scaled_gain = int_tofp(pid_params.p_gain_pct);
+
+	/*
+	 * Actual IIR filter:
+	 * new output = old output * (1 - gain) + input * gain
+	 *
+	 * To Do: Often the actual pstate the system ran at over the last
+	 * interval is not what was asked for, due to influence from
+	 * other CPUs. It might make sense to use the average pstate
+	 * (get_avg_pstate) as the old_output here (as per previous
+	 * work by Philippe Longepe and Stephane Gasparini on the
+	 * get_target_pstate_use_cpu_load method). Test it.
+	 */
+	cpu->sample.target = div_u64((int_tofp(100) - scaled_gain) *
+				cpu->sample.target + scaled_gain *
+				unfiltered_target, int_tofp(100));
+
+	return fp_toint(cpu->sample.target + (1 << (FRAC_BITS-1)));
 }
 
 static inline void intel_pstate_update_pstate(struct cpudata *cpu, int pstate)
@@ -1579,6 +1666,7 @@ static void intel_pstate_stop_cpu(struct cpufreq_policy *policy)
 		return;
 
 	intel_pstate_set_min_pstate(cpu);
+	cpu->sample.target = int_tofp(cpu->pstate.min_pstate);
 }
 
 static int intel_pstate_cpu_init(struct cpufreq_policy *policy)
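
For anyone who wants to experiment with the gain scaling and the filter
update outside the kernel, the following is a minimal user-space sketch of
the arithmetic in get_target_pstate_default(). It is not the driver code.
FRAC_BITS = 8 and the int_tofp() / fp_toint() macros are assumed to match
the driver's fixed-point helpers, and scale_gain(), iir_step() and
round_fp() are hypothetical names introduced here only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed to mirror the driver's fixed-point format (8 fractional bits). */
#define FRAC_BITS 8
#define int_tofp(x) ((int64_t)(x) << FRAC_BITS)
#define fp_toint(x) ((x) >> FRAC_BITS)

/*
 * Hypothetical helper: scaled_gain = gain * duration / nominal,
 * returned as a fixed-point percentage and clamped to the range
 * [gain_pct, 100], the same two clamps the patch applies.
 */
static int64_t scale_gain(uint64_t duration_ns, uint64_t nominal_ns,
			  int gain_pct)
{
	int64_t gain;

	assert(nominal_ns != 0);	/* avoid dividing by zero */

	gain = (int64_t)((uint64_t)int_tofp(duration_ns) *
			 (uint64_t)gain_pct / nominal_ns);
	if (gain > int_tofp(100))
		gain = int_tofp(100);
	if (gain < int_tofp(gain_pct))
		gain = int_tofp(gain_pct);
	return gain;
}

/*
 * Hypothetical helper: one step of the first-order IIR filter,
 * new output = old output * (1 - gain) + input * gain,
 * with the gain expressed as a fixed-point percentage.
 */
static int64_t iir_step(int64_t old_fp, int64_t input_fp, int64_t gain_fp)
{
	return ((int_tofp(100) - gain_fp) * old_fp + gain_fp * input_fp) /
	       int_tofp(100);
}

/* Hypothetical helper: fixed point to integer, rounding to nearest. */
static int round_fp(int64_t fp)
{
	return (int)fp_toint(fp + (1 << (FRAC_BITS - 1)));
}
```

With a nominal gain of 10 percent and a nominal sample period of 10 ms, a
50 ms gap between runs gives a scaled gain of 50 percent, and any gap of
100 ms or more hits the 100 percent clamp, at which point iir_step()
returns the new input and the old output no longer contributes.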