From patchwork Mon Dec 7 09:15:13 2020
From: Mel Gorman <mgorman@techsingularity.net>
To: LKML
Subject: [PATCH 1/4] sched/fair: Remove SIS_AVG_CPU
Date: Mon, 7 Dec 2020 09:15:13 +0000
Message-Id: <20201207091516.24683-2-mgorman@techsingularity.net>
In-Reply-To: <20201207091516.24683-1-mgorman@techsingularity.net>
References: <20201207091516.24683-1-mgorman@techsingularity.net>
Cc: Barry Song, Juri Lelli, Vincent Guittot, Peter Zijlstra, Aubrey Li,
    Ingo Molnar, Mel Gorman, Valentin Schneider, Linux-ARM

SIS_AVG_CPU was introduced as a means of avoiding a search when the
average search cost indicated that the search would likely fail. It was
a blunt instrument and disabled by 4c77b18cf8b7 ("sched/fair: Make
select_idle_cpu() more aggressive") and later replaced with a proportional
search depth by 1ad3aaf3fcd2 ("sched/core: Implement new approach to scale
select_idle_cpu()").

While there are corner cases where SIS_AVG_CPU is better, it has now been
disabled for almost three years. As the intent of SIS_PROP is to reduce
the time complexity of select_idle_cpu(), let's drop SIS_AVG_CPU and focus
on SIS_PROP as a throttling mechanism.

Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
---
 kernel/sched/fair.c     | 3 ---
 kernel/sched/features.h | 1 -
 2 files changed, 4 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 98075f9ea9a8..23934dbac635 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -6161,9 +6161,6 @@ static int select_idle_cpu(struct task_struct *p, struct sched_domain *sd, int t
 	avg_idle = this_rq()->avg_idle / 512;
 	avg_cost = this_sd->avg_scan_cost + 1;
 
-	if (sched_feat(SIS_AVG_CPU) && avg_idle < avg_cost)
-		return -1;
-
 	if (sched_feat(SIS_PROP)) {
 		u64 span_avg = sd->span_weight * avg_idle;
 		if (span_avg > 4*avg_cost)

diff --git a/kernel/sched/features.h b/kernel/sched/features.h
index 68d369cba9e4..e875eabb6600 100644
--- a/kernel/sched/features.h
+++ b/kernel/sched/features.h
@@ -54,7 +54,6 @@ SCHED_FEAT(TTWU_QUEUE, true)
 /*
  * When doing wakeups, attempt to limit superfluous scans of the LLC domain.
  */
-SCHED_FEAT(SIS_AVG_CPU, false)
 SCHED_FEAT(SIS_PROP, true)
 
 /*
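The SIS_PROP block preserved by the first hunk is what now throttles
select_idle_cpu(): the more average idle time there is relative to the
average scan cost, the more runqueues the search may visit. As a rough
standalone illustration of that arithmetic, assuming the divide-with-a-
floor-of-four behaviour of the mainline SIS_PROP code of this era (the
helper name and plain C types below are invented for illustration, not
kernel code):

#include <stdint.h>
#include <stdio.h>

/* Illustrative only: approximate SIS_PROP scan-depth calculation. */
static uint64_t sis_prop_depth(uint64_t rq_avg_idle, uint64_t sd_avg_scan_cost,
			       unsigned int span_weight)
{
	uint64_t avg_idle = rq_avg_idle / 512;		/* same scaling as the hunk */
	uint64_t avg_cost = sd_avg_scan_cost + 1;	/* avoid a zero divisor */
	uint64_t span_avg = (uint64_t)span_weight * avg_idle;

	if (span_avg > 4 * avg_cost)
		return span_avg / avg_cost;	/* proportional number of CPUs to scan */

	return 4;				/* otherwise scan a small fixed number */
}

int main(void)
{
	/* A mostly idle 80-CPU LLC is scanned deeply, a busy one barely at all. */
	printf("idle: scan up to %llu CPUs\n",
	       (unsigned long long)sis_prop_depth(500000, 600, 80));
	printf("busy: scan up to %llu CPUs\n",
	       (unsigned long long)sis_prop_depth(2000, 600, 80));
	return 0;
}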
From patchwork Mon Dec 7 09:15:14 2020
From: Mel Gorman <mgorman@techsingularity.net>
To: LKML
Subject: [PATCH 2/4] sched/fair: Do not replace recent_used_cpu with the new target
Date: Mon, 7 Dec 2020 09:15:14 +0000
Message-Id: <20201207091516.24683-3-mgorman@techsingularity.net>
In-Reply-To: <20201207091516.24683-1-mgorman@techsingularity.net>
References: <20201207091516.24683-1-mgorman@techsingularity.net>
Cc: Barry Song, Juri Lelli, Vincent Guittot, Peter Zijlstra, Aubrey Li,
    Ingo Molnar, Mel Gorman, Valentin Schneider, Linux-ARM

After select_idle_sibling, p->recent_used_cpu is set to the new target.
However, on the next wakeup, prev will be the same as recent_used_cpu
unless the load balancer has moved the task since the last wakeup. It
still works, but it is less effective than it could be after all the
changes that have gone in since then to reduce unnecessary migrations,
alter load balancing and so on. This patch preserves recent_used_cpu
for longer.

With tbench on a 2-socket CascadeLake machine, 80 logical CPUs, HT enabled:

                          5.10.0-rc6             5.10.0-rc6
                         baseline-v2           altrecent-v2
Hmean     1      508.39 (   0.00%)      502.05 *  -1.25%*
Hmean     2      986.70 (   0.00%)      983.65 *  -0.31%*
Hmean     4     1914.55 (   0.00%)     1920.24 *   0.30%*
Hmean     8     3702.37 (   0.00%)     3663.96 *  -1.04%*
Hmean     16    6573.11 (   0.00%)     6545.58 *  -0.42%*
Hmean     32   10142.57 (   0.00%)    10253.73 *   1.10%*
Hmean     64   14348.40 (   0.00%)    12506.31 * -12.84%*
Hmean     128  21842.59 (   0.00%)    21967.13 *   0.57%*
Hmean     256  20813.75 (   0.00%)    21534.52 *   3.46%*
Hmean     320  20684.33 (   0.00%)    21070.14 *   1.87%*

The difference was marginal except for 64 threads, where the baseline
result was very unstable whereas the patched kernel was much more stable.
This is somewhat machine-specific, as on a separate 80-cpu Broadwell
machine the same test reported:

                          5.10.0-rc6             5.10.0-rc6
                         baseline-v2           altrecent-v2
Hmean     1      310.36 (   0.00%)      291.81 *  -5.98%*
Hmean     2      340.86 (   0.00%)      547.22 *  60.54%*
Hmean     4      912.29 (   0.00%)     1063.21 *  16.54%*
Hmean     8     2116.40 (   0.00%)     2103.60 *  -0.60%*
Hmean     16    4232.90 (   0.00%)     4362.92 *   3.07%*
Hmean     32    8442.03 (   0.00%)     8642.10 *   2.37%*
Hmean     64   11733.91 (   0.00%)    11473.66 *  -2.22%*
Hmean     128  17727.24 (   0.00%)    16784.23 *  -5.32%*
Hmean     256  16089.23 (   0.00%)    16110.79 *   0.13%*
Hmean     320  15992.60 (   0.00%)    16071.64 *   0.49%*

schedstats were not used in this series, but from an earlier debugging
effort, the schedstats after the test run were as follows:

Ops SIS Search                5653107942.00    5726545742.00
Ops SIS Domain Search         3365067916.00    3319768543.00
Ops SIS Scanned             112173512543.00   99194352541.00
Ops SIS Domain Scanned      109885472517.00   96787575342.00
Ops SIS Failures              2923185114.00    2950166441.00
Ops SIS Recent Used Hit            56547.00     118064916.00
Ops SIS Recent Used Miss      1590899250.00     354942791.00
Ops SIS Recent Attempts       1590955797.00     473007707.00
Ops SIS Search Efficiency              5.04             5.77
Ops SIS Domain Search Eff              3.06             3.43
Ops SIS Fast Success Rate             40.47            42.03
Ops SIS Success Rate                  48.29            48.48
Ops SIS Recent Success Rate            0.00            24.96

The first interesting point is the ridiculous number of times runqueues
are scanned -- almost 97 billion times over the course of 40 minutes.
With the patch, "Recent Used Hit" is over 2000 times more likely to
succeed. The failure rate also increases by quite a lot, but the cost is
marginal even if the "Fast Success Rate" only increases by 2% overall.
What cannot be observed from these stats is where the biggest impact is,
as they cover everything from low utilisation to over-saturation. If
graphed over time, the graphs show that the sched domain is only scanned
at negligible rates until the machine is fully busy. With low utilisation,
the "Fast Success Rate" is almost 100% until the machine is fully busy.
For 320 clients, the success rate is close to 0%, which is unsurprising.
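The derived "Efficiency" and "Rate" rows above appear to be simple
percentage ratios of the raw counters; the short program below reproduces
the baseline column to two decimal places. The row names are copied from
the table, but the program itself is an illustration, not mmtests or
kernel code.

#include <stdio.h>

/* Raw SIS counters from the baseline column of the table above. */
static const double search         = 5653107942.0;
static const double domain_search  = 3365067916.0;
static const double scanned        = 112173512543.0;
static const double domain_scanned = 109885472517.0;
static const double failures       = 2923185114.0;
static const double recent_hit     = 56547.0;
static const double recent_attempt = 1590955797.0;

int main(void)
{
	/* Each derived row is a percentage built from the raw counters. */
	printf("Search Efficiency      %.2f\n", search * 100.0 / scanned);
	printf("Domain Search Eff      %.2f\n", domain_search * 100.0 / domain_scanned);
	printf("Fast Success Rate      %.2f\n", (search - domain_search) * 100.0 / search);
	printf("Success Rate           %.2f\n", (search - failures) * 100.0 / search);
	printf("Recent Success Rate    %.2f\n", recent_hit * 100.0 / recent_attempt);
	return 0;
}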
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
---
 kernel/sched/fair.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 23934dbac635..01b38fc17bca 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -6274,6 +6274,7 @@ static int select_idle_sibling(struct task_struct *p, int prev, int target)
 
 	/* Check a recently used CPU as a potential idle candidate: */
 	recent_used_cpu = p->recent_used_cpu;
+	p->recent_used_cpu = prev;
 	if (recent_used_cpu != prev &&
 	    recent_used_cpu != target &&
 	    cpus_share_cache(recent_used_cpu, target) &&
@@ -6765,9 +6766,6 @@ select_task_rq_fair(struct task_struct *p, int prev_cpu, int wake_flags)
 	} else if (wake_flags & WF_TTWU) { /* XXX always ? */
 		/* Fast path */
 		new_cpu = select_idle_sibling(p, prev_cpu, new_cpu);
-
-		if (want_affine)
-			current->recent_used_cpu = cpu;
 	}
 	rcu_read_unlock();
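To see why the "Recent Used Hit" count rises so sharply, it helps to model
the bookkeeping in isolation. The toy program below is not kernel code; it
simply follows the changelog's description that the old scheme left
recent_used_cpu equal to the new target (and therefore usually equal to
prev on the next wakeup), while the patched scheme records prev instead.

#include <stdio.h>

/*
 * Toy model of the recent_used_cpu bookkeeping only. A task alternates
 * between two CPUs; before each placement the kernel compares
 * recent_used_cpu with prev ("recent_used_cpu != prev" in the hunk above),
 * and only a distinct value is a useful extra candidate.
 */
int main(void)
{
	int placements[] = { 4, 12, 4, 12, 4, 12 };	/* CPU chosen at each wakeup */
	int n = sizeof(placements) / sizeof(placements[0]);
	int prev = 4;
	int recent_old = 4;	/* old scheme: overwritten with the new target */
	int recent_new = 4;	/* new scheme: overwritten with prev */

	for (int i = 0; i < n; i++) {
		printf("wakeup %d: prev=%2d  old recent=%2d (%s)  new recent=%2d (%s)\n",
		       i, prev,
		       recent_old, recent_old != prev ? "usable" : "== prev",
		       recent_new, recent_new != prev ? "usable" : "== prev");

		recent_new = prev;		/* patched kernel */
		recent_old = placements[i];	/* unpatched kernel */
		prev = placements[i];		/* the task now runs here */
	}
	return 0;
}

In this model the old scheme never produces a candidate distinct from prev,
while the patched scheme does from the second wakeup onwards, which is
consistent with the hit count jumping from 56547 to 118064916.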
From patchwork Mon Dec 7 09:15:15 2020
From: Mel Gorman <mgorman@techsingularity.net>
To: LKML
Cc: Barry Song, Juri Lelli, Vincent Guittot, Peter Zijlstra, Aubrey Li,
    Ingo Molnar, Mel Gorman, Valentin Schneider, Linux-ARM
Subject: [PATCH 3/4] sched/fair: Return an idle cpu if one is found after a failed search for an idle core
Date: Mon, 7 Dec 2020 09:15:15 +0000
Message-Id: <20201207091516.24683-4-mgorman@techsingularity.net>
In-Reply-To: <20201207091516.24683-1-mgorman@techsingularity.net>
References: <20201207091516.24683-1-mgorman@techsingularity.net>

select_idle_core() is called when SMT is active and there is likely a
free core available. It may find idle CPUs, but this information is
simply discarded and the scan starts over again with select_idle_cpu().

This patch caches information on idle CPUs found during the search for
a core and uses one if no core is found. This is a tradeoff. There may
be a slight impact when utilisation is low and an idle core can be found
quickly. It provides improvements as the number of busy CPUs approaches
50% of the domain size when SMT is enabled.

With tbench on a 2-socket CascadeLake machine, 80 logical CPUs, HT enabled:

                          5.10.0-rc6             5.10.0-rc6
                           schedstat          idlecandidate
Hmean     1      500.06 (   0.00%)      505.67 *   1.12%*
Hmean     2      975.90 (   0.00%)      974.06 *  -0.19%*
Hmean     4     1902.95 (   0.00%)     1904.43 *   0.08%*
Hmean     8     3761.73 (   0.00%)     3721.02 *  -1.08%*
Hmean     16    6713.93 (   0.00%)     6769.17 *   0.82%*
Hmean     32   10435.31 (   0.00%)    10312.58 *  -1.18%*
Hmean     64   12325.51 (   0.00%)    13792.01 *  11.90%*
Hmean     128  21225.21 (   0.00%)    20963.44 *  -1.23%*
Hmean     256  20532.83 (   0.00%)    20335.62 *  -0.96%*
Hmean     320  20334.81 (   0.00%)    20147.25 *  -0.92%*

Note that there is a significant corner case. As the SMT scan may be
terminated early, not all CPUs have been visited and select_idle_cpu()
is still called for a full scan. This case is handled in the next patch.
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
Reviewed-by: Vincent Guittot
---
 kernel/sched/fair.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 01b38fc17bca..00c3b526a5bd 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -6066,6 +6066,7 @@ void __update_idle_core(struct rq *rq)
  */
 static int select_idle_core(struct task_struct *p, struct sched_domain *sd, int target)
 {
+	int idle_candidate = -1;
 	struct cpumask *cpus = this_cpu_cpumask_var_ptr(select_idle_mask);
 	int core, cpu;
 
@@ -6085,6 +6086,11 @@ static int select_idle_core(struct task_struct *p, struct sched_domain *sd, int
 				idle = false;
 				break;
 			}
+
+			if (idle_candidate == -1 &&
+			    cpumask_test_cpu(cpu, p->cpus_ptr)) {
+				idle_candidate = cpu;
+			}
 		}
 
 		if (idle)
@@ -6098,7 +6104,7 @@ static int select_idle_core(struct task_struct *p, struct sched_domain *sd, int
 	 */
 	set_idle_cores(target, 0);
 
-	return -1;
+	return idle_candidate;
 }
 
 /*
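The hunks above are small, but the control flow is easier to see in
isolation. The following is a rough standalone approximation of the core
scan with the idle_candidate fallback applied; the fixed topology, the
cpu_idle[] array and the function name are toy stand-ins rather than
kernel data structures, and affinity masks are omitted.

#include <stdbool.h>
#include <stdio.h>

#define NR_CPUS		8
#define SMT_SIBLINGS	2	/* toy topology: CPUs 2*c and 2*c+1 share core c */

static bool cpu_idle[NR_CPUS] = {
	false, false,	/* core 0: fully busy */
	true,  false,	/* core 1: one idle sibling */
	false, true,	/* core 2: one idle sibling, busy CPU first */
	false, false,	/* core 3: fully busy */
};

/* Rough approximation of select_idle_core() with the idle_candidate fallback. */
static int select_idle_core_approx(void)
{
	int idle_candidate = -1;

	for (int core = 0; core < NR_CPUS / SMT_SIBLINGS; core++) {
		bool idle = true;

		for (int s = 0; s < SMT_SIBLINGS; s++) {
			int cpu = core * SMT_SIBLINGS + s;

			if (!cpu_idle[cpu]) {
				idle = false;
				break;	/* later siblings of this core go unseen */
			}

			/* Previously discarded: remember the first idle CPU seen. */
			if (idle_candidate == -1)
				idle_candidate = cpu;
		}

		if (idle)
			return core * SMT_SIBLINGS;	/* a fully idle core wins */
	}

	/* No idle core, but an idle CPU may have been found along the way. */
	return idle_candidate;
}

int main(void)
{
	printf("no idle core; falling back to CPU %d\n", select_idle_core_approx());
	return 0;
}

Note how the early break on a busy sibling means the idle CPU 5 is never
recorded; that is the corner case the changelog mentions and the next
patch addresses.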
From patchwork Mon Dec 7 09:15:16 2020
From: Mel Gorman <mgorman@techsingularity.net>
To: LKML
Cc: Barry Song, Juri Lelli, Vincent Guittot, Peter Zijlstra, Aubrey Li,
    Ingo Molnar, Mel Gorman, Valentin Schneider, Linux-ARM
Subject: [PATCH 4/4] sched/fair: Avoid revisiting CPUs multiple times during select_idle_sibling
Date: Mon, 7 Dec 2020 09:15:16 +0000
Message-Id: <20201207091516.24683-5-mgorman@techsingularity.net>
In-Reply-To: <20201207091516.24683-1-mgorman@techsingularity.net>
References: <20201207091516.24683-1-mgorman@techsingularity.net>

select_idle_core() potentially searches a number of CPUs for idle
candidates before select_idle_cpu() clears the mask and revisits the
same CPUs. This patch moves the initialisation of select_idle_mask to
the top level and reuses the same mask across both select_idle_core()
and select_idle_cpu(). select_idle_smt() is left alone as the cost of
checking one SMT sibling is marginal relative to calling
__cpumask_clear_cpu() for every CPU visited by select_idle_core().

With tbench on a 2-socket CascadeLake machine, 80 logical CPUs, HT enabled:

                          5.10.0-rc6             5.10.0-rc6
                        altrecent-v2          singlepass-v2
Hmean     1      502.05 (   0.00%)      498.53 *  -0.70%*
Hmean     2      983.65 (   0.00%)      982.33 *  -0.13%*
Hmean     4     1920.24 (   0.00%)     1929.34 *   0.47%*
Hmean     8     3663.96 (   0.00%)     3536.94 *  -3.47%*
Hmean     16    6545.58 (   0.00%)     6560.21 *   0.22%*
Hmean     32   10253.73 (   0.00%)    10374.30 *   1.18%*
Hmean     64   12506.31 (   0.00%)    11692.26 *  -6.51%*
Hmean     128  21967.13 (   0.00%)    21705.80 *  -1.19%*
Hmean     256  21534.52 (   0.00%)    21223.50 *  -1.44%*
Hmean     320  21070.14 (   0.00%)    21023.31 *  -0.22%*

As before, results are somewhat workload- and machine-specific with a mix
of good and bad. For example, netperf UDP_STREAM on an 80-cpu Broadwell
machine shows:

                              5.10.0-rc6             5.10.0-rc6
                            altrecent-v2          singlepass-v2
Hmean     send-64      232.78 (   0.00%)      258.02 *  10.85%*
Hmean     send-128     464.69 (   0.00%)      511.53 *  10.08%*
Hmean     send-256     904.72 (   0.00%)      999.40 *  10.46%*
Hmean     send-1024   3512.08 (   0.00%)     3868.86 *  10.16%*
Hmean     send-2048   6777.61 (   0.00%)     7435.05 *   9.70%*
Hmean     send-3312  10352.45 (   0.00%)    11321.43 *   9.36%*
Hmean     send-4096  12660.58 (   0.00%)    13577.08 *   7.24%*
Hmean     send-8192  19240.36 (   0.00%)    21321.34 *  10.82%*
Hmean     send-16384 29691.27 (   0.00%)    31296.47 *   5.41%*

The real question is: is it better to scan the CPU runqueues just once
where possible, or to scan them multiple times? This patch scans once,
but it must do additional work to track that scanning, so for shallow
scans it is a loss and for deep scans it is a win. The main downside to
this patch is that the cpumask manipulations are more complex because
two cpumasks are involved.
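A compact way to see the single-pass idea is to model the two masks as
plain bitmaps. In the sketch below, scan plays the role of cpus_scan
(initialised once at the top level) and walk plays the role of the
repurposed load_balance_mask; both are toy stand-ins for the kernel's
cpumask API, and an early-terminated core scan is simulated by simply
stopping before the last core.

#include <stdio.h>

#define NR_CPUS		8
#define SMT_SIBLINGS	2

int main(void)
{
	unsigned int scan = 0xff;	/* all 8 CPUs allowed and not yet visited */
	unsigned int walk = scan;	/* cpumask_copy() at the start of the core scan */
	int visits = 0;

	/* Core scan: visit siblings, clearing each from the shared scan mask. */
	for (int core = 0; core < NR_CPUS / SMT_SIBLINGS - 1; core++) {
		/* stopping one core short mimics an early-terminated core scan */
		for (int s = 0; s < SMT_SIBLINGS; s++) {
			int cpu = core * SMT_SIBLINGS + s;

			if (walk & (1u << cpu)) {
				scan &= ~(1u << cpu);	/* __cpumask_clear_cpu() */
				visits++;
			}
		}
	}

	/* CPU scan: only the CPUs the core scan never reached are still set. */
	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		if (scan & (1u << cpu))
			visits++;
	}

	printf("runqueues visited: %d of %d (previously up to %d with two passes)\n",
	       visits, NR_CPUS, 2 * NR_CPUS);
	return 0;
}

Every runqueue is visited at most once because the core scan consumes bits
from the shared mask as it goes; the cost is the extra clear per visited
CPU that the changelog describes.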
The alternative is that select_idle_core() would scan the full domain and
always return a CPU which was previously considered. Alternatively, the
depth of select_idle_core() could be limited. Similarly, select_idle_core()
often scans when no idle core is available as test_idle_cores is very
race-prone and maybe there is a better approach to determining if
select_idle_core() is likely to succeed or not.

Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
---
 kernel/sched/fair.c | 40 +++++++++++++++++++++++++++++-----------
 1 file changed, 29 insertions(+), 11 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 00c3b526a5bd..3b1736dc5bde 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -6064,10 +6064,11 @@ void __update_idle_core(struct rq *rq)
  * there are no idle cores left in the system; tracked through
  * sd_llc->shared->has_idle_cores and enabled through update_idle_core() above.
  */
-static int select_idle_core(struct task_struct *p, struct sched_domain *sd, int target)
+static int select_idle_core(struct task_struct *p, struct sched_domain *sd,
+			    int target, struct cpumask *cpus_scan)
 {
 	int idle_candidate = -1;
-	struct cpumask *cpus = this_cpu_cpumask_var_ptr(select_idle_mask);
+	struct cpumask *cpus;
 	int core, cpu;
 
@@ -6076,12 +6077,21 @@ static int select_idle_core(struct task_struct *p, struct sched_domain *sd, int
 	if (!test_idle_cores(target, false))
 		return -1;
 
-	cpumask_and(cpus, sched_domain_span(sd), p->cpus_ptr);
+	/*
+	 * Repurpose load_balance_mask to avoid rescanning cores while
+	 * cpus_scan tracks the cpu if select_idle_cpu() is necessary.
+	 * In this context, it should be impossible to enter LB and
+	 * clobber load_balance_mask.
+	 */
+	cpus = this_cpu_cpumask_var_ptr(load_balance_mask);
+	cpumask_copy(cpus, cpus_scan);
 
 	for_each_cpu_wrap(core, cpus, target) {
 		bool idle = true;
 
 		for_each_cpu(cpu, cpu_smt_mask(core)) {
+			__cpumask_clear_cpu(cpu, cpus_scan);
+
 			if (!available_idle_cpu(cpu)) {
 				idle = false;
 				break;
@@ -6130,7 +6140,8 @@ static int select_idle_smt(struct task_struct *p, struct sched_domain *sd, int t
 
 #else /* CONFIG_SCHED_SMT */
 
-static inline int select_idle_core(struct task_struct *p, struct sched_domain *sd, int target)
+static inline int select_idle_core(struct task_struct *p, struct sched_domain *sd,
+				   int target, struct cpumask *cpus)
 {
 	return -1;
 }
@@ -6147,9 +6158,9 @@ static inline int select_idle_smt(struct task_struct *p, struct sched_domain *sd
  * comparing the average scan cost (tracked in sd->avg_scan_cost) against the
  * average idle time for this rq (as found in rq->avg_idle).
  */
-static int select_idle_cpu(struct task_struct *p, struct sched_domain *sd, int target)
+static int select_idle_cpu(struct task_struct *p, struct sched_domain *sd,
+			   int target, struct cpumask *cpus)
 {
-	struct cpumask *cpus = this_cpu_cpumask_var_ptr(select_idle_mask);
 	struct sched_domain *this_sd;
 	u64 avg_cost, avg_idle;
 	u64 time;
@@ -6176,9 +6187,6 @@ static int select_idle_cpu(struct task_struct *p, struct sched_domain *sd, int t
 	}
 
 	time = cpu_clock(this);
-
-	cpumask_and(cpus, sched_domain_span(sd), p->cpus_ptr);
-
 	for_each_cpu_wrap(cpu, cpus, target) {
 		if (!--nr)
 			return -1;
@@ -6240,6 +6248,7 @@ static inline bool asym_fits_capacity(int task_util, int cpu)
 
 static int select_idle_sibling(struct task_struct *p, int prev, int target)
 {
 	struct sched_domain *sd;
+	struct cpumask *cpus_scan;
 	unsigned long task_util;
 	int i, recent_used_cpu;
 
@@ -6319,11 +6328,20 @@ static int select_idle_sibling(struct task_struct *p, int prev, int target)
 	if (!sd)
 		return target;
 
-	i = select_idle_core(p, sd, target);
+	/*
+	 * Init the select_idle_mask. select_idle_core() will mask
+	 * out the CPUs that have already been limited to limit the
+	 * search in select_idle_cpu(). Further clearing is not
+	 * done as select_idle_smt checks only one CPU.
+	 */
+	cpus_scan = this_cpu_cpumask_var_ptr(select_idle_mask);
+	cpumask_and(cpus_scan, sched_domain_span(sd), p->cpus_ptr);
+
+	i = select_idle_core(p, sd, target, cpus_scan);
 	if ((unsigned)i < nr_cpumask_bits)
 		return i;
 
-	i = select_idle_cpu(p, sd, target);
+	i = select_idle_cpu(p, sd, target, cpus_scan);
 	if ((unsigned)i < nr_cpumask_bits)
 		return i;