[RFC,00/10] Reduce time complexity of select_idle_sibling

Message ID	20201203141124.7391-1-mgorman@techsingularity.net (mailing list archive)
Headers	show Return-Path: <SRS0=glJB=FH=lists.infradead.org=linux-arm-kernel-bounces+linux-arm-kernel=archiver.kernel.org@kernel.org> DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org C444820705 From: Mel Gorman <mgorman@techsingularity.net> To: LKML <linux-kernel@vger.kernel.org> Subject: [RFC PATCH 00/10] Reduce time complexity of select_idle_sibling Date: Thu, 3 Dec 2020 14:11:14 +0000 Message-Id: <20201203141124.7391-1-mgorman@techsingularity.net> MIME-Version: 1.0 Precedence: list Cc: Barry Song <song.bao.hua@hisilicon.com>, Juri Lelli <juri.lelli@redhat.com>, Vincent Guittot <vincent.guittot@linaro.org>, Peter Ziljstra <peterz@infradead.org>, Aubrey Li <aubrey.li@linux.intel.com>, Ingo Molnar <mingo@redhat.com>, Mel Gorman <mgorman@techsingularity.net>, Valentin Schneider <valentin.schneider@arm.com>, Linux-ARM <linux-arm-kernel@lists.infradead.org> Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Sender: "linux-arm-kernel" <linux-arm-kernel-bounces@lists.infradead.org> Errors-To: linux-arm-kernel-bounces+linux-arm-kernel=archiver.kernel.org@lists.infradead.org
Series	Reduce time complexity of select_idle_sibling \| expand [RFC,00/10] Reduce time complexity of select_idle_sibling [01/10] sched/fair: Track efficiency of select_idle_sibling [02/10] sched/fair: Track efficiency of task recent_used_cpu [03/10] sched/fair: Remove SIS_AVG_CPU [04/10] sched/fair: Return an idle cpu if one is found after a failed search for an idle core [05/10] sched/fair: Do not replace recent_used_cpu with the new target [06/10] sched/fair: Clear the target CPU from the cpumask of CPUs searched [07/10] sched/fair: Account for the idle cpu/smt search cost [08/10] sched/fair: Reintroduce SIS_AVG_CPU but in the context of SIS_PROP to reduce search depth [09/10] sched/fair: Limit the search for an idle core [10/10] sched/fair: Avoid revisiting CPUs multiple times during select_idle_sibling

Message ID

20201203141124.7391-1-mgorman@techsingularity.net (mailing list archive)

Headers

DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org C444820705
From: Mel Gorman <mgorman@techsingularity.net>
To: LKML <linux-kernel@vger.kernel.org>
Subject: [RFC PATCH 00/10] Reduce time complexity of select_idle_sibling
Date: Thu,  3 Dec 2020 14:11:14 +0000
Message-Id: <20201203141124.7391-1-mgorman@techsingularity.net>
MIME-Version: 1.0
Precedence: list
Cc: Barry Song <song.bao.hua@hisilicon.com>,
 Juri Lelli <juri.lelli@redhat.com>,
 Vincent Guittot <vincent.guittot@linaro.org>,
 Peter Ziljstra <peterz@infradead.org>, Aubrey Li <aubrey.li@linux.intel.com>,
 Ingo Molnar <mingo@redhat.com>, Mel Gorman <mgorman@techsingularity.net>,
 Valentin Schneider <valentin.schneider@arm.com>,
 Linux-ARM <linux-arm-kernel@lists.infradead.org>
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Sender: "linux-arm-kernel" <linux-arm-kernel-bounces@lists.infradead.org>
Errors-To: 
 linux-arm-kernel-bounces+linux-arm-kernel=archiver.kernel.org@lists.infradead.org

Series

Reduce time complexity of select_idle_sibling | expand

Message

Mel Gorman Dec. 3, 2020, 2:11 p.m. UTC

This is an early prototype that has not been tested heavily. While parts
of it may stand on its own, the motivation to release early is Aubrey
Li's series on using an idle cpumask to optimise the search and Barry
Song's series on representing clusters on die. The series is based on
tip/sched/core rebased to 5.10-rc6.

Patches 1-2 add schedstats to track the search efficiency of
	select_idle_sibling. They can be dropped from the final version but
	are useful when looking at select_idle_sibling in general. MMTests
	can already parse the stats and generate useful data including
	graphs over time.

Patch 3 kills SIS_AVG_CPU but is partially reintroduced later in the
	context of SIS_PROP.

Patch 4 notes that select_idle_core() can find an idle CPU that is
	not a free core yet it is ignored and a second search is conducted
	in select_idle_cpu() which is wasteful. Note that this patch
	will definitely change in the final version.

Patch 5 adjusts p->recent_used_cpu so that it has a higher success rate
	and avoids searching the domain in some cases.

Patch 6 notes that select_idle_* always starts with a CPU that is
	definitely not idle and fixes that.

Patch 7 notes that SIS_PROP is only partially accounting for search
	costs. While this might be accidentally beneficial, it makes it
	much harder to reason about the effectiveness of SIS_PROP.

Patch 8 uses similar logic to SIS_AVG_CPU but in the context of
	SIS_PROP to throttle the search depth.

Patches 9 and 10 are stupid in the context of this series. They
	are included even though it makes no sense to use SIS_PROP logic in
	select_idle_core() as it already has throttling logic. The point
	is to illustrate that the select_idle_mask can be initialised
	at the start of a domain search used to mask out CPUs that have
	already been visited.

In the context of Aubrey's and Barry's work, select_idle_mask would
be initialised *after* select_idle_core as select_idle_core uses
select_idle_mask for its own purposes. In Aubrey's case, the next
step would be to scan idle_cpus_span as those CPUs may still be idle
and bias the search towards likely idle candidates. If that fails,
select_idle_mask clears all the bits set in idle_cpus_span and then
scans the remainder. Similar observations apply to Barry's work, scan the
local domain first, mask out those bits then scan the remaining CPUs in
the cluster.

The final version of this series will drop patches 1-2 unless there is
demand and definitely drop patches 9-10. However, all 4 patches may be
useful in the context of Aubrey's and Barry's work. Patches 1-2 would
give more precise results on exactly how much they are improving "SIS
Domain Search Efficiency" which may be more illustrative than just the
headline performance figures of a given workload. The final version of
this series will also adjust patch 4. If select_idle_core() runs at all
then it definitely should return a CPU -- either an idle CPU or the target
as it has already searched the entire domain and no further searching
should be conducted. Barry might change that back so that a cluster can
be scanned but it would be done in the context of the cluster series.