From patchwork Thu Dec 3 14:11:19 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Mel Gorman
X-Patchwork-Id: 11948943
From: Mel Gorman
To: LKML
Subject: [PATCH 05/10] sched/fair: Do not replace recent_used_cpu with the new target
Date: Thu, 3 Dec 2020 14:11:19 +0000
Message-Id: <20201203141124.7391-6-mgorman@techsingularity.net>
X-Mailer: git-send-email 2.26.2
In-Reply-To: <20201203141124.7391-1-mgorman@techsingularity.net>
References: <20201203141124.7391-1-mgorman@techsingularity.net>
X-BeenThere: linux-arm-kernel@lists.infradead.org
Cc: Barry Song, Juri Lelli, Vincent Guittot, Peter Ziljstra, Aubrey Li,
 Ingo Molnar, Mel Gorman, Valentin Schneider, Linux-ARM
Sender: "linux-arm-kernel"
Errors-To:
 linux-arm-kernel-bounces+linux-arm-kernel=archiver.kernel.org@lists.infradead.org

After select_idle_sibling, p->recent_used_cpu is set to the new target.
However, on the next wakeup, prev will be the same as recent_used_cpu unless
the load balancer has moved the task since the last wakeup. It still works,
but is less efficient than it could be given all the changes that have gone
in since: fewer unnecessary migrations, load balancer changes, etc.
This patch preserves recent_used_cpu for longer.

With tbench on a 2-socket CascadeLake machine, 80 logical CPUs, HT enabled

                          5.10.0-rc6             5.10.0-rc6
                 idlecandidate-v1r10        altrecent-v1r10
Hmean     1        505.67 (   0.00%)      501.34 *  -0.86%*
Hmean     2        974.06 (   0.00%)      981.39 *   0.75%*
Hmean     4       1904.43 (   0.00%)     1926.13 *   1.14%*
Hmean     8       3721.02 (   0.00%)     3799.86 *   2.12%*
Hmean     16      6769.17 (   0.00%)     6938.40 *   2.50%*
Hmean     32     10312.58 (   0.00%)    10632.11 *   3.10%*
Hmean     64     13792.01 (   0.00%)    13670.17 *  -0.88%*
Hmean     128    20963.44 (   0.00%)    21456.33 *   2.35%*
Hmean     256    20335.62 (   0.00%)    21070.24 *   3.61%*
Hmean     320    20147.25 (   0.00%)    20624.92 *   2.37%*

The benefit is marginal; the main impact is on how it affects
p->recent_used_cpu and whether a domain search happens.
With the schedstat patches applied and schedstat enabled

                                 5.10.0-rc6         5.10.0-rc6
                        idlecandidate-v1r10    altrecent-v1r10
Ops SIS Search                5653107942.00      5726545742.00
Ops SIS Domain Search         3365067916.00      3319768543.00
Ops SIS Scanned             112173512543.00     99194352541.00
Ops SIS Domain Scanned      109885472517.00     96787575342.00
Ops SIS Failures              2923185114.00      2950166441.00
Ops SIS Recent Used Hit            56547.00       118064916.00
Ops SIS Recent Used Miss      1590899250.00       354942791.00
Ops SIS Recent Attempts       1590955797.00       473007707.00
Ops SIS Search Efficiency              5.04               5.77
Ops SIS Domain Search Eff              3.06               3.43
Ops SIS Fast Success Rate             40.47              42.03
Ops SIS Success Rate                  48.29              48.48
Ops SIS Recent Success Rate            0.00              24.96

(The first interesting point is the ridiculous number of times runqueues
are scanned: almost 97 billion times over the course of 40 minutes.)

Note "Recent Used Hit" is over 2000 times more likely to succeed. The
failure rate also increases by quite a lot but the cost is marginal
even if the "Fast Success Rate" only increases by 2% overall. What
cannot be observed from these stats is where the biggest impact is, as
these stats cover low utilisation to over saturation.

If graphed over time, the graphs show that the sched domain is only
scanned at negligible rates until the machine is fully busy. With
low utilisation, the "Fast Success Rate" is almost 100% until the
machine is fully busy. For 320 clients, the success rate is close to
0%, which is unsurprising.
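
The derived rows in the schedstat table can be recomputed from the raw
counters. The sketch below is a user-space illustration only: the struct
and function names are hypothetical, and the formulas are inferred from the
numbers above (they reproduce the reported values to two decimal places),
not taken from the schedstat patches themselves.

```c
#include <assert.h>
#include <stdint.h>

/* Raw counters, named after the rows of the table (hypothetical struct). */
struct sis_stats {
	uint64_t search;          /* SIS Search */
	uint64_t domain_search;   /* SIS Domain Search */
	uint64_t scanned;         /* SIS Scanned */
	uint64_t domain_scanned;  /* SIS Domain Scanned */
	uint64_t failures;        /* SIS Failures */
	uint64_t recent_hit;      /* SIS Recent Used Hit */
	uint64_t recent_attempts; /* SIS Recent Attempts (hit + miss) */
};

/* Percentage helper; the denominator must be non-zero. */
static double pct(uint64_t num, uint64_t den)
{
	assert(den != 0);
	return 100.0 * (double)num / (double)den;
}

/* Assumed formula: searches per runqueue scanned. */
double sis_search_efficiency(const struct sis_stats *s)
{
	return pct(s->search, s->scanned);
}

/* Assumed formula: domain searches per runqueue scanned in the domain. */
double sis_domain_search_eff(const struct sis_stats *s)
{
	return pct(s->domain_search, s->domain_scanned);
}

/* Assumed formula: searches that finished without a domain search. */
double sis_fast_success_rate(const struct sis_stats *s)
{
	return pct(s->search - s->domain_search, s->search);
}

/* Assumed formula: searches that did not end in failure. */
double sis_success_rate(const struct sis_stats *s)
{
	return pct(s->search - s->failures, s->search);
}

/* Assumed formula: recent_used_cpu hits per attempt. */
double sis_recent_success_rate(const struct sis_stats *s)
{
	return pct(s->recent_hit, s->recent_attempts);
}
```

Feeding in the altrecent-v1r10 column, sis_recent_success_rate() gives
118064916 / 473007707, or roughly 24.96%, which matches the table.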

Signed-off-by: Mel Gorman
---
 kernel/sched/fair.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 845bc0cd9158..68dd9cd62fbd 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -6293,6 +6293,7 @@ static int select_idle_sibling(struct task_struct *p, int prev, int target)
 
 	/* Check a recently used CPU as a potential idle candidate: */
 	recent_used_cpu = p->recent_used_cpu;
+	p->recent_used_cpu = prev;
 	if (recent_used_cpu != prev &&
 	    recent_used_cpu != target &&
 	    cpus_share_cache(recent_used_cpu, target)) {
@@ -6789,9 +6790,6 @@ select_task_rq_fair(struct task_struct *p, int prev_cpu, int wake_flags)
 	} else if (wake_flags & WF_TTWU) { /* XXX always ? */
 		/* Fast path */
 
 		new_cpu = select_idle_sibling(p, prev_cpu, new_cpu);
-
-		if (want_affine)
-			current->recent_used_cpu = cpu;
 	}
 	rcu_read_unlock();
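
A toy user-space model may help show why recording prev keeps
recent_used_cpu useful for longer. It follows the changelog's description
rather than the kernel code: struct toy_task and both toy_wakeup_*()
helpers are hypothetical, and cache sharing, idle checks and CPU affinity
are all ignored. The only question modelled is whether recent_used_cpu
differs from both prev and target, so that it is worth checking at all.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the one task_struct field of interest. */
struct toy_task {
	int recent_used_cpu;
};

/* recent_used_cpu is only worth checking if it adds a new candidate. */
static bool recent_is_candidate(int recent, int prev, int target)
{
	return recent != prev && recent != target;
}

/*
 * Behaviour described as "before": the chosen CPU is recorded after the
 * wakeup, so on the next wakeup it normally equals prev.
 */
bool toy_wakeup_old(struct toy_task *p, int prev, int target, int chosen)
{
	bool candidate;

	assert(p);
	candidate = recent_is_candidate(p->recent_used_cpu, prev, target);
	p->recent_used_cpu = chosen;
	return candidate;
}

/*
 * Behaviour after the patch: prev is recorded at lookup time, so the
 * value read on the next wakeup is the CPU used before prev.
 */
bool toy_wakeup_new(struct toy_task *p, int prev, int target)
{
	int recent;

	assert(p);
	recent = p->recent_used_cpu;
	p->recent_used_cpu = prev;
	return recent_is_candidate(recent, prev, target);
}
```

For a task that starts on CPU 0 and is woken with (prev, target) pairs
(0, 1), (1, 2) and (2, 0), always running on the target, the old variant
never yields a candidate because recent_used_cpu always equals prev. The
new variant yields one on the second and third wakeups.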