diff mbox

[RFCv5,36/46] sched: Prevent unnecessary active balance of single task in sched group

Message ID 1436293469-25707-37-git-send-email-morten.rasmussen@arm.com (mailing list archive)
State RFC
Headers show

Commit Message

Morten Rasmussen July 7, 2015, 6:24 p.m. UTC
Scenarios with the busiest group having just one task and the local
being idle on topologies with sched groups with different numbers of
cpus manage to dodge all load-balance bailout conditions resulting the
nr_balance_failed counter to be incremented. This eventually causes an
pointless active migration of the task. This patch prevents this by not
incrementing the counter when the busiest group only has one task.
ASYM_PACKING migrations and migrations due to reduced capacity should
still take place as these are explicitly captured by
need_active_balance().

A better solution would be to not attempt the load-balance in the first
place, but that requires significant changes to the order of bailout
conditions and statistics gathering.

cc: Ingo Molnar <mingo@redhat.com>
cc: Peter Zijlstra <peterz@infradead.org>

Signed-off-by: Morten Rasmussen <morten.rasmussen@arm.com>
---
 kernel/sched/fair.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

Comments

Peter Zijlstra Aug. 15, 2015, 9:46 a.m. UTC | #1
On Tue, Jul 07, 2015 at 07:24:19PM +0100, Morten Rasmussen wrote:
> Scenarios with the busiest group having just one task and the local
> being idle on topologies with sched groups with different numbers of
> cpus manage to dodge all load-balance bailout conditions resulting the
> nr_balance_failed counter to be incremented. This eventually causes an
> pointless active migration of the task. This patch prevents this by not
> incrementing the counter when the busiest group only has one task.
> ASYM_PACKING migrations and migrations due to reduced capacity should
> still take place as these are explicitly captured by
> need_active_balance().
> 
> A better solution would be to not attempt the load-balance in the first
> place, but that requires significant changes to the order of bailout
> conditions and statistics gathering.

*groan*, and this is of course triggered by your 2+3 core TC2 thingy.

Yes, asymmetric groups like that are a pain.
--
To unsubscribe from this list: send the line "unsubscribe linux-pm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 8e0cbd4..f7cb6c9 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -6127,6 +6127,7 @@  struct lb_env {
 	int			new_dst_cpu;
 	enum cpu_idle_type	idle;
 	long			imbalance;
+	unsigned int		src_grp_nr_running;
 	/* The set of CPUs under consideration for load-balancing */
 	struct cpumask		*cpus;
 
@@ -7150,6 +7151,8 @@  static inline void update_sd_lb_stats(struct lb_env *env, struct sd_lb_stats *sd
 	if (env->sd->flags & SD_NUMA)
 		env->fbq_type = fbq_classify_group(&sds->busiest_stat);
 
+	env->src_grp_nr_running = sds->busiest_stat.sum_nr_running;
+
 	if (!env->sd->parent) {
 		/* update overload indicator if we are at root domain */
 		if (env->dst_rq->rd->overload != overload)
@@ -7788,7 +7791,8 @@  static int load_balance(int this_cpu, struct rq *this_rq,
 		 * excessive cache_hot migrations and active balances.
 		 */
 		if (idle != CPU_NEWLY_IDLE)
-			sd->nr_balance_failed++;
+			if (env.src_grp_nr_running > 1)
+				sd->nr_balance_failed++;
 
 		if (need_active_balance(&env)) {
 			raw_spin_lock_irqsave(&busiest->lock, flags);