
[v6,1/9] sched: Extend scheduler's asym packing

Message ID c3bf1c59ee56dad1dc20ad3b3c06c166abb7a57e.1477000078.git.tim.c.chen@linux.intel.com (mailing list archive)
State Not Applicable, archived

Commit Message

Tim Chen Oct. 20, 2016, 9:59 p.m. UTC
We generalize the scheduler's asym packing to provide an ordering
of the CPUs beyond just the cpu number.  This allows the use of the
ASYM_PACKING scheduler machinery to move loads to the preferred CPU in a
sched domain.  The preference is defined by the cpu priority
given by arch_asym_cpu_priority(cpu).

We also record the most preferred cpu in a sched group when
we build the sched group's capacity, for fast lookup of the preferred
cpu during load balancing.

Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
---
 include/linux/sched.h |  2 ++
 kernel/sched/core.c   | 15 +++++++++++++++
 kernel/sched/fair.c   | 35 ++++++++++++++++++++++++-----------
 kernel/sched/sched.h  |  6 ++++++
 4 files changed, 47 insertions(+), 11 deletions(-)
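
As an illustration of the intended override mechanism (a minimal sketch
only, not part of this series; the array name and ranking values are made
up), an architecture can supply a strong definition of the weak default
added in kernel/sched/fair.c:

	/* arch code: a larger value means more preferred by ASYM_PACKING */
	static int my_arch_cpu_prio[NR_CPUS];	/* hypothetical per-cpu ranking */

	int arch_asym_cpu_priority(int cpu)
	{
		return my_arch_cpu_prio[cpu];
	}

With such an override in place, sched_asym_prefer(a, b) is true when cpu a
has the larger arch_asym_cpu_priority() value, so load balancing packs work
toward the higher priority CPUs.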

Comments

Thomas Gleixner Oct. 26, 2016, 10:27 a.m. UTC | #1
On Thu, 20 Oct 2016, Tim Chen wrote:

> We generalize the scheduler's asym packing to provide an ordering
> of the cpu beyond just the cpu number.  This allows the use of the
> ASYM_PACKING scheduler machinery to move loads to preferred CPU in a
> sched domain. The preference is defined with the cpu priority
> given by arch_asym_cpu_priority(cpu).
> 
> We also record the most preferred cpu in a sched group when
> we build the cpu's capacity for fast lookup of preferred cpu
> during load balancing.
> 
> Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>

This SOB-chain is bogus. Same for all other patches.

Thanks,

	tglx
Tim Chen Oct. 26, 2016, 6:10 p.m. UTC | #2
On Wed, 2016-10-26 at 12:27 +0200, Thomas Gleixner wrote:
> On Thu, 20 Oct 2016, Tim Chen wrote:
> 
> > 
> > We generalize the scheduler's asym packing to provide an ordering
> > of the cpu beyond just the cpu number.  This allows the use of the
> > ASYM_PACKING scheduler machinery to move loads to preferred CPU in a
> > sched domain. The preference is defined with the cpu priority
> > given by arch_asym_cpu_priority(cpu).
> > 
> > We also record the most preferred cpu in a sched group when
> > we build the cpu's capacity for fast lookup of preferred cpu
> > during load balancing.
> > 
> > Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
> > Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> > Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
> This SOB-chain is bogus. Same for all other patches.
> 

I am the primary author of the patch, so I have my sign-off on top.  There
were also many internal discussions and reviews between myself, Peter and
Srinivas before we posted the first version of this patch.
I incorporated their input into the patch and added their sign-offs.

Can you be more explicit on why you think the sign-offs here are bogus?

Thanks.

Tim

  
Thomas Gleixner Oct. 26, 2016, 6:23 p.m. UTC | #3
On Wed, 26 Oct 2016, Tim Chen wrote:
> On Wed, 2016-10-26 at 12:27 +0200, Thomas Gleixner wrote:
> > On Thu, 20 Oct 2016, Tim Chen wrote:
> > 
> > > 
> > > We generalize the scheduler's asym packing to provide an ordering
> > > of the cpu beyond just the cpu number.  This allows the use of the
> > > ASYM_PACKING scheduler machinery to move loads to preferred CPU in a
> > > sched domain. The preference is defined with the cpu priority
> > > given by arch_asym_cpu_priority(cpu).
> > > 
> > > We also record the most preferred cpu in a sched group when
> > > we build the cpu's capacity for fast lookup of preferred cpu
> > > during load balancing.
> > > 
> > > Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
> > > Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> > > Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
> > This SOB-chain is bogus. Same for all other patches.
> > 
> 
> I am the primary author of the patch so I have my sign-off on top.  There
> were also much internal discussions/reviews between myself, Peter and Srinivas,
> before we post the first version of this patch.
> I incorporated their inputs into the patch and added their sign-offs.  
> 
> Can you be more explicit on why you think the sign-offs here are bogus?

Because SOB chains document the path a patch takes from the author to the
kernel. The above says:

You authored the patch and sent it to Peter, who sent it to Srinivas. So how
does it end up in my inbox, sent from you?

We have no formal tag for co-developed patches, but it's common practice to
either acknowledge contributions from others in free-text form or to use a
non-documented tag like 'Co-developed-by:' or 'Co-authored-by:'.
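
For example (a purely illustrative ordering, not a prescribed form), a tag
block along these lines would name the co-developers without implying a
hand-off chain:

	Co-developed-by: Peter Zijlstra (Intel) <peterz@infradead.org>
	Co-developed-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
	Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>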

Thanks,

	tglx

Patch

diff --git a/include/linux/sched.h b/include/linux/sched.h
index 348f51b..75d96d6 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1057,6 +1057,8 @@  static inline int cpu_numa_flags(void)
 }
 #endif
 
+int arch_asym_cpu_priority(int cpu);
+
 struct sched_domain_attr {
 	int relax_domain_level;
 };
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 94732d1..4465410 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -6307,7 +6307,22 @@  static void init_sched_groups_capacity(int cpu, struct sched_domain *sd)
 	WARN_ON(!sg);
 
 	do {
+		int cpu, max_cpu = -1;
+
 		sg->group_weight = cpumask_weight(sched_group_cpus(sg));
+
+		if (!(sd->flags & SD_ASYM_PACKING))
+			goto next;
+
+		for_each_cpu(cpu, sched_group_cpus(sg)) {
+			if (max_cpu < 0)
+				max_cpu = cpu;
+			else if (sched_asym_prefer(cpu, max_cpu))
+				max_cpu = cpu;
+		}
+		sg->asym_prefer_cpu = max_cpu;
+
+next:
 		sg = sg->next;
 	} while (sg != sd->groups);
 
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 2d4ad72..f50e4d7 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -100,6 +100,16 @@  const_debug unsigned int sysctl_sched_migration_cost = 500000UL;
  */
 unsigned int __read_mostly sysctl_sched_shares_window = 10000000UL;
 
+#ifdef CONFIG_SMP
+/*
+ * For asym packing, by default the lower numbered cpu has higher priority.
+ */
+int __weak arch_asym_cpu_priority(int cpu)
+{
+	return -cpu;
+}
+#endif
+
 #ifdef CONFIG_CFS_BANDWIDTH
 /*
  * Amount of runtime to allocate from global (tg) to local (per-cfs_rq) pool
@@ -7101,16 +7111,18 @@  static bool update_sd_pick_busiest(struct lb_env *env,
 	if (env->idle == CPU_NOT_IDLE)
 		return true;
 	/*
-	 * ASYM_PACKING needs to move all the work to the lowest
-	 * numbered CPUs in the group, therefore mark all groups
-	 * higher than ourself as busy.
+	 * ASYM_PACKING needs to move all the work to the highest
+	 * priority CPUs in the group, therefore mark all groups
+	 * of lower priority than ourself as busy.
 	 */
-	if (sgs->sum_nr_running && env->dst_cpu < group_first_cpu(sg)) {
+	if (sgs->sum_nr_running &&
+	    sched_asym_prefer(env->dst_cpu, sg->asym_prefer_cpu)) {
 		if (!sds->busiest)
 			return true;
 
-		/* Prefer to move from highest possible cpu's work */
-		if (group_first_cpu(sds->busiest) < group_first_cpu(sg))
+		/* Prefer to move from lowest priority cpu's work */
+		if (sched_asym_prefer(sds->busiest->asym_prefer_cpu,
+				      sg->asym_prefer_cpu))
 			return true;
 	}
 
@@ -7262,8 +7274,8 @@  static int check_asym_packing(struct lb_env *env, struct sd_lb_stats *sds)
 	if (!sds->busiest)
 		return 0;
 
-	busiest_cpu = group_first_cpu(sds->busiest);
-	if (env->dst_cpu > busiest_cpu)
+	busiest_cpu = sds->busiest->asym_prefer_cpu;
+	if (sched_asym_prefer(busiest_cpu, env->dst_cpu))
 		return 0;
 
 	env->imbalance = DIV_ROUND_CLOSEST(
@@ -7601,10 +7613,11 @@  static int need_active_balance(struct lb_env *env)
 
 		/*
 		 * ASYM_PACKING needs to force migrate tasks from busy but
-		 * higher numbered CPUs in order to pack all tasks in the
-		 * lowest numbered CPUs.
+		 * lower priority CPUs in order to pack all tasks in the
+		 * highest priority CPUs.
 		 */
-		if ((sd->flags & SD_ASYM_PACKING) && env->src_cpu > env->dst_cpu)
+		if ((sd->flags & SD_ASYM_PACKING) &&
+		    sched_asym_prefer(env->dst_cpu, env->src_cpu))
 			return 1;
 	}
 
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index 055f935..cd3d413 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -539,6 +539,11 @@  struct dl_rq {
 
 #ifdef CONFIG_SMP
 
+static inline bool sched_asym_prefer(int a, int b)
+{
+	return arch_asym_cpu_priority(a) > arch_asym_cpu_priority(b);
+}
+
 /*
  * We add the notion of a root-domain which will be used to define per-domain
  * variables. Each exclusive cpuset essentially defines an island domain by
@@ -905,6 +910,7 @@  struct sched_group {
 
 	unsigned int group_weight;
 	struct sched_group_capacity *sgc;
+	int asym_prefer_cpu;		/* cpu of highest priority in group */
 
 	/*
 	 * The CPUs this group covers.