From patchwork Tue Aug 9 17:16:56 2016
X-Patchwork-Submitter: Doug Smythies
X-Patchwork-Id: 9271925
From: "Doug Smythies"
To: "'Rafael J. Wysocki'"
Wysocki'" Cc: "'Peter Zijlstra'" , "'Srinivas Pandruvada'" , "'Viresh Kumar'" , "'Linux Kernel Mailing List'" , "'Steve Muckle'" , "'Juri Lelli'" , "'Ingo Molnar'" , "'Linux PM list'" References: <3752826.3sXAQIvcIA@vostro.rjw.lan> <1572483.RZjvRFdxPx@vostro.rjw.lan> <001b01d1ee1c$e6a6a170$b3f3e450$@net> <10156236.RI1knH5Wfs@vostro.rjw.lan> In-Reply-To: <10156236.RI1knH5Wfs@vostro.rjw.lan> Subject: RE: [RFC][PATCH 7/7] cpufreq: intel_pstate: Change P-state selection algorithm for Core Date: Tue, 9 Aug 2016 10:16:56 -0700 Message-ID: <000801d1f261$d69fc740$83df55c0$@net> MIME-Version: 1.0 X-Mailer: Microsoft Office Outlook 12.0 Thread-Index: AdHwK5hOBzUn4K4SS1O5PfjvcvMAewBUW/nA Content-Language: en-ca X-CMAE-Envelope: MS4wfIC8zPKYUAP1y+84HuWR7h/9Ya11hLy+QuXoUxmJHIdCyHzOnM5HmaAxHCaXbx16ZC6r0k8DvAKI0iIutxD46FV/ZHMujZ+EY/Ug8njcJPa/AB0mvVrg YtCkgJPaONuYtNomhZyvLUDp2uZ/sxeVMHfT8IGud9O1Ulj0Lzx0J4e4C421AM1IHZHO0L7ix7ejlNypCY01MdtJyy9DE+Hu5vFh8I2FQLBNrfwrLwnt2ZGz cqogP/EOg/pcB1zSjCMDbinBvRD5JcyY/eueNAjiX+BsDo6EmyvaOuJNvLZgb3++NYhYg81IItgRHXtf4to7IL1BhjAR1PST8i5yNzSMhI5Mx5CApn/DkalW d0uWZFqXFKN5oA8e6FzdmWs9QVsqy2ao2Y0Fk9Vj93EF7RqdlZZlR592ou8AaThx+uXXbfZZUMz6DYhaEzYoDqR6FhCpAQ== Sender: linux-pm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-pm@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP On 2016.08.05 17:02 Rafael J. Wysocki wrote: > On 2016.08.03 21:19 Doug Smythies wrote: >> On 2016.07.31 16:49 Rafael J. Wysocki wrote: >> >> >>> +static inline int32_t get_target_pstate_default(struct cpudata *cpu) >>> +{ >>> + struct sample *sample = &cpu->sample; >>> + int32_t busy_frac; >>> + int pstate; >>> + >>> + busy_frac = div_fp(sample->mperf, sample->tsc); >>> + sample->busy_scaled = busy_frac * 100; >>> + >>> + if (busy_frac < cpu->iowait_boost) >>> + busy_frac = cpu->iowait_boost; >>> + >>> + cpu->iowait_boost >>= 1; >>> + >>> + pstate = cpu->pstate.turbo_pstate; >>> + return fp_toint((pstate + (pstate >> 2)) * busy_frac); >>> +} >>> + >> >> The response curve is not normalized on the lower end to the minimum >> pstate for the processor, meaning the overall response will vary >> between processors as a function of minimum pstate. > But that's OK IMO. > > Mapping busy_frac = 0 to the minimum P-state would over-provision workloads > with small values of busy_frac. Agreed, mapping busy_frac = 0 to the minimum Pstate would be a bad thing to do. However, that is not what I meant. I meant that the mapping of busy-frac = N to the minimum pstate for the processor should give the same "N" (within granularity), regardless of the processor. Example, my processor, i7-2600K: max pstate = 38; min pstate = 16. Load before going to pstate, 17: 17 = (38 + 38/4) * load Load = N = 35.8 % Example, something like, i5-3337U (I think, I don't actually have one): max pstate = 27; min pstate = 8. Load before going to pstate, 9: 9 = (27 + 27/4) * load Load = N = 26.7 % It was a couple of years ago, so I should re-do the sensitivity analysis/testing, but I concluded that the performance / energy tradeoff was somewhat sensitive to "N". I am suggesting that the response curve, or transfer function, needs to be normalized, for any processor, to: Max pstate | __________ | / | / | / | / | / Min pstate| _______/ |__________________________ | | | | 0% N M 100% CPU load Currently M ~= 80% One time (not re-based since kernel 4.3) I did have a proposed solution [1], but it was expensive in terms of extra multiplies and divides. 
>> The clamping at maximum pstate at about 80% load seems a little high
>> to me. Work I have done in various attempts to bring back the use of
>> actual load has always ended up achieving maximum pstate before 80%
>> load for best results. Even the get_target_pstate_cpu_load people
>> reach the max pstate faster, and they are more about energy than
>> performance. What was the criteria for the decision here? Are test
>> results available for review and/or duplication by others?

> This follows the coefficient used by the schedutil governor, but then the
> metric is different, so quite possibly a different value may work better here.
>
> We'll test other values before applying this for sure. :-)

I am now testing this change to the code (for M ~= 67%; N ~= 30% (my CPU);
N ~= 22% (i5-3337U)):

> Several tests were done with this patch set.

...[cut]...

>> Questions:
>> Is there a migration plan?
>
> Not yet. We have quite a lot of testing to do first.
>
>> i.e. will there be an attempt to merge the current cpu_load method
>> and this method into one method?
>
> Quite possibly if the results are good enough.
>
>> Then possibly the PID controller could be eliminated.
>
> Right.

I think this change is important, and I'll help with it as best as I can.

... Doug

A related CPU frequency Vs. Load graph will be sent to Rafael and Srinivas
off-list.

diff --git a/drivers/cpufreq/intel_pstate.c b/drivers/cpufreq/intel_pstate.c
index 8b2bdb7..909d441 100644
--- a/drivers/cpufreq/intel_pstate.c
+++ b/drivers/cpufreq/intel_pstate.c
@@ -1313,7 +1313,7 @@ static inline int32_t get_target_pstate_default(struct cpudata *cpu)
 	cpu->iowait_boost >>= 1;
 
 	pstate = cpu->pstate.turbo_pstate;
-	return fp_toint((pstate + (pstate >> 2)) * busy_frac);
+	return fp_toint((pstate + (pstate >> 1)) * busy_frac);
 }
 
 static inline void intel_pstate_update_pstate(struct cpudata *cpu, int pstate)
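For reference, the M and N percentages quoted above for the two coefficients
(pstate >> 2, i.e. scale 1.25, versus pstate >> 1, i.e. scale 1.5) can be
reproduced with a small stand-alone check like the one below. Illustration
only: plain floating point rather than the driver's fixed-point types, and
the structure and field names are just for the example.

#include <stdio.h>

/* With scale = 1.25 the target reaches the turbo pstate at load
 * M = 1/1.25 = 80%; with scale = 1.5 it does so at M = 1/1.5 ~= 67%.
 * N is the load at which the target first exceeds the minimum pstate. */
int main(void)
{
	const double scales[] = { 1.25, 1.5 };
	/* i7-2600K: max 38, first pstate above min is 17;
	 * i5-3337U: max 27, first pstate above min is 9. */
	const struct { const char *name; int max, first_above_min; } cpus[] = {
		{ "i7-2600K", 38, 17 },
		{ "i5-3337U", 27,  9 },
	};

	for (int s = 0; s < 2; s++) {
		printf("scale %.2f: M = %.1f%%\n", scales[s], 100.0 / scales[s]);
		for (int c = 0; c < 2; c++)
			printf("  %s: N = %.1f%%\n", cpus[c].name,
			       100.0 * cpus[c].first_above_min /
			       (scales[s] * cpus[c].max));
	}
	return 0;
}

This prints M = 80% / 67% and N = 35.8% / 29.8% (i7-2600K) and
26.7% / 22.2% (i5-3337U), matching the figures quoted in the message.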