From patchwork Wed Apr  6 17:24:07 2016
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Dario Faggioli <dario.faggioli@citrix.com>
X-Patchwork-Id: 8764041
Return-Path: <xen-devel-bounces@lists.xen.org>
X-Original-To: patchwork-xen-devel@patchwork.kernel.org
Delivered-To: patchwork-parsemail@patchwork2.web.kernel.org
Received: from mail.kernel.org (mail.kernel.org [198.145.29.136])
	by patchwork2.web.kernel.org (Postfix) with ESMTP id AED3AC0553
	for <patchwork-xen-devel@patchwork.kernel.org>;
	Wed,  6 Apr 2016 17:26:19 +0000 (UTC)
Received: from mail.kernel.org (localhost [127.0.0.1])
	by mail.kernel.org (Postfix) with ESMTP id 7A3ED201ED
	for <patchwork-xen-devel@patchwork.kernel.org>;
	Wed,  6 Apr 2016 17:26:18 +0000 (UTC)
Received: from lists.xenproject.org (lists.xenproject.org [192.237.175.120])
	(using TLSv1.2 with cipher AES128-GCM-SHA256 (128/128 bits))
	(No client certificate requested)
	by mail.kernel.org (Postfix) with ESMTPS id 3275C201E4
	for <patchwork-xen-devel@patchwork.kernel.org>;
	Wed,  6 Apr 2016 17:26:17 +0000 (UTC)
Received: from localhost ([127.0.0.1] helo=lists.xenproject.org)
	by lists.xenproject.org with esmtp (Exim 4.84_2)
	(envelope-from <xen-devel-bounces@lists.xen.org>)
	id 1anrBe-0001XC-NE; Wed, 06 Apr 2016 17:24:14 +0000
Received: from mail6.bemta6.messagelabs.com ([85.158.143.247])
	by lists.xenproject.org with esmtp (Exim 4.84_2)
	(envelope-from <raistlin.df@gmail.com>) id 1anrBd-0001W9-6T
	for xen-devel@lists.xenproject.org; Wed, 06 Apr 2016 17:24:13 +0000
Received: from [85.158.143.35] by server-2.bemta-6.messagelabs.com id
	FA/40-09532-C3645075; Wed, 06 Apr 2016 17:24:12 +0000
X-Brightmail-Tracker: H4sIAAAAAAAAA+NgFvrIIsWRWlGSWpSXmKPExsXiVRvkqGvtxhp
	u8Gu/nsX3LZOZHBg9Dn+4whLAGMWamZeUX5HAmvHy8m6WgnP+FXOmHGJsYDxp28XIxSEkMJNR
	4sbkWUwgDovAGlaJv1MugjkSApdYJeZdXMfexcgJ5MRI/F31jw3CrpXY+nQnC4gtJKAicXP7K
	iaIUYuZJE4sXMQMkhAW0JM4cvQHO4SdJHF0+VpGEJtNwEDizY69rCC2iICSxL1Vk5lAbGaBaI
	mVD5vB6lkEVCV2Pf4EZvMK2Et8XnQQaBkHB6eAg8Seye4Qe+0lrn45AbZKVEBOYuXlFlaIckG
	JkzOfgJUzC2hKrN+lDzFdXmL72znMExhFZiGpmoVQNQtJ1QJG5lWM6sWpRWWpRbqGeklFmekZ
	JbmJmTm6hgZmermpxcWJ6ak5iUnFesn5uZsYgcHPAAQ7GHc+dzrEKMnBpCTK6ynBGi7El5SfU
	pmRWJwRX1Sak1p8iFGGg0NJgpfDFSgnWJSanlqRlpkDjEOYtAQHj5IIrytImre4IDG3ODMdIn
	WKUZdjy9R7a5mEWPLy81KlxHlfugAVCYAUZZTmwY2ApYRLjLJSwryMQEcJ8RSkFuVmlqDKv2I
	U52BUEuY1BVnFk5lXArfpFdARTEBH1AszgRxRkoiQkmpgZFb6qTNJYl19vaXnzDuHOf833Pwk
	G31St+lFkkty54Z7Pt9Nzliv+ZJz7+aZtGjLiCeLb651kZmVGlYq/D1vetzdlyEbH4v5P+eo/
	N5f7X95rfupfx9Wn+Fbq2KRenSCmJGyzoKNN3O/eeeFnQ678a9xIcudWDZb9+8b+leLcU3e4u
	H5vPClnRJLcUaioRZzUXEiAIw6AMQEAwAA
X-Env-Sender: raistlin.df@gmail.com
X-Msg-Ref: server-7.tower-21.messagelabs.com!1459963451!7988592!1
X-Originating-IP: [74.125.82.65]
X-SpamReason: No, hits=0.5 required=7.0 tests=BODY_RANDOM_LONG
X-StarScan-Received: 
X-StarScan-Version: 8.28; banners=-,-,-
X-VirusChecked: Checked
Received: (qmail 16470 invoked from network); 6 Apr 2016 17:24:11 -0000
Received: from mail-wm0-f65.google.com (HELO mail-wm0-f65.google.com)
	(74.125.82.65)
	by server-7.tower-21.messagelabs.com with AES128-GCM-SHA256 encrypted
	SMTP; 6 Apr 2016 17:24:11 -0000
Received: by mail-wm0-f65.google.com with SMTP id i204so15164975wmd.0
	for <xen-devel@lists.xenproject.org>;
	Wed, 06 Apr 2016 10:24:11 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113;
	h=sender:subject:from:to:cc:date:message-id:in-reply-to:references
	:user-agent:mime-version:content-transfer-encoding;
	bh=1ppbyKiYMy/7f9xSs7OY+NLhSpbdaf+zgQknGS42k4A=;
	b=O8FC4RzN2UAgdBX2iBaPiw9TOjrI+zyyX4hvLD59Dm4WxH1Quo3o0tGwqRo066kuHY
	Sd/VJvqy1mNsmAcnA6+2gmkta+NX4Rrs7BWO31fGWbu6TetRKTLjHUaZ3THb9Qm7Kkvc
	0R2lIw384Le94lzlABNx3LXYikdB7bs/VcmvPZd8XTzdgsfT4/0Dfkgo9mVfixUM5YHp
	1gOW6veoRIP4LTiare1LA7WpPhlN8pNZUhfVzwUpv67Bted51FX3VTJkQCSFXaeON0dF
	a/qePuOj9SgD6OL+3/1kR36djHHM5Wz4GD0Qab9pKU4WW3BkFtaehrIENyiCiZ3kVSYh
	iiEw==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
	d=1e100.net; s=20130820;
	h=x-gm-message-state:sender:subject:from:to:cc:date:message-id
	:in-reply-to:references:user-agent:mime-version
	:content-transfer-encoding;
	bh=1ppbyKiYMy/7f9xSs7OY+NLhSpbdaf+zgQknGS42k4A=;
	b=BXtFAfBzN7X3jIJc6x7TPghpxMogg8++mV9FriztPGc3DbE44WPLfMAK7Om2qht6ES
	JtVsx8OP3ZydBooPpWSnWw9OwR21lxcSoEU1lLz2yYbqcaNatXSsRYBMToWV+Cy5/ex6
	T09Q3sZ6NGkx2DOXpbhu8+vofQhofvQc+RSPCC0Serkg56Y3gsadoRHAl9KxI6rdu79x
	gJZs0gIKW61Ad60VERE/lY2guHXGb03cpO9RPvT3g4YNfPJK8IVSnDYh3JSyBagCpLny
	9Ik0iVjl/2XuOb2vDTc7Lybj0Psx+niOSoXHKgHtCXG/ob5GGmHhht0/KTKJ15MzuDFO
	oeqQ==
X-Gm-Message-State: 
 AD7BkJL7MVGodfZbuHgVHv93NlwEqSgCxIWFdmAmvFpRg7x+7biUC0TlIV614GjipJla/A==
X-Received: by 10.194.222.234 with SMTP id
	qp10mr26285089wjc.138.1459963451438;
	Wed, 06 Apr 2016 10:24:11 -0700 (PDT)
Received: from Solace.fritz.box (net-37-116-155-252.cust.vodafonedsl.it.
	[37.116.155.252]) by smtp.gmail.com with ESMTPSA id
	ux5sm4262761wjc.17.2016.04.06.10.24.09
	(version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128);
	Wed, 06 Apr 2016 10:24:10 -0700 (PDT)
From: Dario Faggioli <dario.faggioli@citrix.com>
To: xen-devel@lists.xenproject.org
Date: Wed, 06 Apr 2016 19:24:07 +0200
Message-ID: <20160406172407.25877.89123.stgit@Solace.fritz.box>
In-Reply-To: <20160406170023.25877.15622.stgit@Solace.fritz.box>
References: <20160406170023.25877.15622.stgit@Solace.fritz.box>
User-Agent: StGit/0.17.1-dirty
MIME-Version: 1.0
Cc: George Dunlap <george.dunlap@eu.citrix.com>,
	Andrew Cooper <andrew.cooper3@citrix.com>
Subject: [Xen-devel] [PATCH v2 10/11] xen: sched: provide some scratch space
	for not putting cpumasks on stack
X-BeenThere: xen-devel@lists.xen.org
X-Mailman-Version: 2.1.18
Precedence: list
List-Id: Xen developer discussion <xen-devel.lists.xen.org>
List-Unsubscribe: <http://lists.xen.org/cgi-bin/mailman/options/xen-devel>,
	<mailto:xen-devel-request@lists.xen.org?subject=unsubscribe>
List-Post: <mailto:xen-devel@lists.xen.org>
List-Help: <mailto:xen-devel-request@lists.xen.org?subject=help>
List-Subscribe: <http://lists.xen.org/cgi-bin/mailman/listinfo/xen-devel>,
	<mailto:xen-devel-request@lists.xen.org?subject=subscribe>
Errors-To: xen-devel-bounces@lists.xen.org
Sender: "Xen-devel" <xen-devel-bounces@lists.xen.org>
X-Spam-Status: No, score=-4.1 required=5.0 tests=BAYES_00,DKIM_SIGNED,
	RCVD_IN_DNSWL_MED, T_DKIM_INVALID,
	UNPARSEABLE_RELAY autolearn=unavailable version=3.3.1
X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on mail.kernel.org
X-Virus-Scanned: ClamAV using ClamSMTP

Make the scratch space available directly, from schedule.c,
to any scheduler that needs it.

In fact, Credit1 and RTDS need this already. Credit2 is
also going to need it, for supporting hard affinity
(which is, typically, what requires a lot of cpumask
manipulations, inside various functions).

Therefore, let's define the scratch space at a broader
scope, to limit code duplication in handling it.

Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
Reviewed-by: George Dunlap <george.dunlap@citrix.com>
---
Cc: George Dunlap <george.dunlap@eu.citrix.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
---
Changes from v1:
* scratch space for cpumask is now "global", and defined
  in schedule.c, as suggested during review.
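
For reviewers, the usage pattern this enables can be sketched roughly as
below. This is only a userspace model, not Xen code: NR_CPUS, the one-word
cpumask_t, the scratch[] array (standing in for the per-CPU variable) and
has_suitable_idler() are all made up for illustration. Only the names
cpumask_scratch_cpu(), cpumask_and() and cpumask_empty() are borrowed from
the patch, and they are reimplemented here as toy functions.

```c
/* Userspace model of per-CPU scratch masks; NOT Xen code. */
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NR_CPUS 8   /* hypothetical CPU count, for the model only */

/* Toy mask: one bit per CPU (the real cpumask_t is a bitmap). */
typedef struct { uint64_t bits; } cpumask_t;

/* One scratch mask per CPU; stands in for the per-CPU variable. */
static cpumask_t scratch[NR_CPUS];

/* Toy counterpart of cpumask_scratch_cpu(c): CPU c's scratch mask. */
static cpumask_t *cpumask_scratch_cpu(unsigned int cpu)
{
    assert(cpu < NR_CPUS);
    return &scratch[cpu];
}

/* Toy counterpart of cpumask_and(): dst = a & b. */
static void cpumask_and(cpumask_t *dst, const cpumask_t *a,
                        const cpumask_t *b)
{
    dst->bits = a->bits & b->bits;
}

/* Toy counterpart of cpumask_empty(). */
static bool cpumask_empty(const cpumask_t *m)
{
    return m->bits == 0;
}

/*
 * Hypothetical helper: is any CPU in 'affinity' also idle?
 * The intermediate result is stored in 'cpu's scratch mask, so no
 * cpumask_t has to live on the stack. The caller is assumed to hold
 * whatever lock serializes access to 'cpu's scratch space.
 */
static bool has_suitable_idler(unsigned int cpu, const cpumask_t *affinity,
                               const cpumask_t *idle)
{
    cpumask_t *mask = cpumask_scratch_cpu(cpu);

    cpumask_and(mask, affinity, idle);
    return !cpumask_empty(mask);
}
```

The point of the sketch is that the temporary mask comes from storage
owned by the pCPU rather than from the caller's stack frame, which is
why serialization is left to each scheduler's own locking.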
---
 xen/common/sched_credit.c  |   34 ++++++-------------------
 xen/common/sched_credit2.c |    1 +
 xen/common/sched_rt.c      |   59 +++-----------------------------------------
 xen/common/schedule.c      |    8 ++++++
 xen/include/xen/sched-if.h |    4 +++
 5 files changed, 25 insertions(+), 81 deletions(-)

diff --git a/xen/common/sched_credit.c b/xen/common/sched_credit.c
index 540d515..eac3f5e 100644
--- a/xen/common/sched_credit.c
+++ b/xen/common/sched_credit.c
@@ -171,20 +171,9 @@ struct csched_pcpu {
     struct timer ticker;
     unsigned int tick;
     unsigned int idle_bias;
-    /* Store this here to avoid having too many cpumask_var_t-s on stack */
-    cpumask_var_t balance_mask;
 };
 
 /*
- * Convenience macro for accessing the per-PCPU cpumask we need for
- * implementing the two steps (soft and hard affinity) balancing logic.
- * It is stored in csched_pcpu so that serialization is not an issue,
- * as there is a csched_pcpu for each PCPU, and we always hold the
- * runqueue lock for the proper PCPU when using this.
- */
-#define csched_balance_mask(c) (CSCHED_PCPU(c)->balance_mask)
-
-/*
  * Virtual CPU
  */
 struct csched_vcpu {
@@ -416,10 +405,10 @@ static inline void __runq_tickle(struct csched_vcpu *new)
 
             /* Are there idlers suitable for new (for this balance step)? */
             csched_balance_cpumask(new->vcpu, balance_step,
-                                   csched_balance_mask(cpu));
-            cpumask_and(csched_balance_mask(cpu),
-                        csched_balance_mask(cpu), &idle_mask);
-            new_idlers_empty = cpumask_empty(csched_balance_mask(cpu));
+                                   cpumask_scratch_cpu(cpu));
+            cpumask_and(cpumask_scratch_cpu(cpu),
+                        cpumask_scratch_cpu(cpu), &idle_mask);
+            new_idlers_empty = cpumask_empty(cpumask_scratch_cpu(cpu));
 
             /*
              * Let's not be too harsh! If there aren't idlers suitable
@@ -445,8 +434,8 @@ static inline void __runq_tickle(struct csched_vcpu *new)
             if ( new_idlers_empty && new->pri > cur->pri )
             {
                 csched_balance_cpumask(cur->vcpu, balance_step,
-                                       csched_balance_mask(cpu));
-                if ( cpumask_intersects(csched_balance_mask(cpu),
+                                       cpumask_scratch_cpu(cpu));
+                if ( cpumask_intersects(cpumask_scratch_cpu(cpu),
                                         &idle_mask) )
                 {
                     SCHED_VCPU_STAT_CRANK(cur, kicked_away);
@@ -519,7 +508,6 @@ csched_free_pdata(const struct scheduler *ops, void *pcpu, int cpu)
 
     spin_unlock_irqrestore(&prv->lock, flags);
 
-    free_cpumask_var(spc->balance_mask);
     xfree(spc);
 }
 
@@ -533,12 +521,6 @@ csched_alloc_pdata(const struct scheduler *ops, int cpu)
     if ( spc == NULL )
         return ERR_PTR(-ENOMEM);
 
-    if ( !alloc_cpumask_var(&spc->balance_mask) )
-    {
-        xfree(spc);
-        return ERR_PTR(-ENOMEM);
-    }
-
     return spc;
 }
 
@@ -1592,9 +1574,9 @@ csched_runq_steal(int peer_cpu, int cpu, int pri, int balance_step)
                  && !__vcpu_has_soft_affinity(vc, vc->cpu_hard_affinity) )
                 continue;
 
-            csched_balance_cpumask(vc, balance_step, csched_balance_mask(cpu));
+            csched_balance_cpumask(vc, balance_step, cpumask_scratch_cpu(cpu));
             if ( __csched_vcpu_is_migrateable(vc, cpu,
-                                              csched_balance_mask(cpu)) )
+                                              cpumask_scratch_cpu(cpu)) )
             {
                 /* We got a candidate. Grab it! */
                 TRACE_3D(TRC_CSCHED_STOLEN_VCPU, peer_cpu,
diff --git a/xen/common/sched_credit2.c b/xen/common/sched_credit2.c
index 3b45816..084963a 100644
--- a/xen/common/sched_credit2.c
+++ b/xen/common/sched_credit2.c
@@ -2252,6 +2252,7 @@ csched2_init(struct scheduler *ops)
     if ( prv == NULL )
         return -ENOMEM;
     ops->sched_data = prv;
+
     spin_lock_init(&prv->lock);
     INIT_LIST_HEAD(&prv->sdom);
 
diff --git a/xen/common/sched_rt.c b/xen/common/sched_rt.c
index 3bb8c71..673fc92 100644
--- a/xen/common/sched_rt.c
+++ b/xen/common/sched_rt.c
@@ -155,24 +155,6 @@
 #define TRC_RTDS_BUDGET_REPLENISH TRC_SCHED_CLASS_EVT(RTDS, 4)
 #define TRC_RTDS_SCHED_TASKLET    TRC_SCHED_CLASS_EVT(RTDS, 5)
 
- /*
-  * Useful to avoid too many cpumask_var_t on the stack.
-  */
-static cpumask_var_t *_cpumask_scratch;
-#define cpumask_scratch _cpumask_scratch[smp_processor_id()]
-
-/*
- * We want to only allocate the _cpumask_scratch array the first time an
- * instance of this scheduler is used, and avoid reallocating and leaking
- * the old one when more instance are activated inside new cpupools. We
- * also want to get rid of it when the last instance is de-inited.
- *
- * So we (sort of) reference count the number of initialized instances. This
- * does not need to happen via atomic_t refcounters, as it only happens either
- * during boot, or under the protection of the cpupool_lock spinlock.
- */
-static unsigned int nr_rt_ops;
-
 static void repl_timer_handler(void *data);
 
 /*
@@ -301,12 +283,11 @@ rt_dump_vcpu(const struct scheduler *ops, const struct rt_vcpu *svc)
     /*
      * We can't just use 'cpumask_scratch' because the dumping can
      * happen from a pCPU outside of this scheduler's cpupool, and
-     * hence it's not right to use the pCPU's scratch mask (which
-     * may even not exist!). On the other hand, it is safe to use
-     * svc->vcpu->processor's own scratch space, since we hold the
-     * runqueue lock.
+     * hence it's not right to use its pCPU's scratch mask.
+     * On the other hand, it is safe to use svc->vcpu->processor's
+     * own scratch space, since we hold the runqueue lock.
      */
-    mask = _cpumask_scratch[svc->vcpu->processor];
+    mask = cpumask_scratch_cpu(svc->vcpu->processor);
 
     cpupool_mask = cpupool_domain_cpumask(svc->vcpu->domain);
     cpumask_and(mask, cpupool_mask, svc->vcpu->cpu_hard_affinity);
@@ -609,16 +590,6 @@ rt_init(struct scheduler *ops)
     if ( prv == NULL )
         return -ENOMEM;
 
-    ASSERT( _cpumask_scratch == NULL || nr_rt_ops > 0 );
-
-    if ( !_cpumask_scratch )
-    {
-        _cpumask_scratch = xmalloc_array(cpumask_var_t, nr_cpu_ids);
-        if ( !_cpumask_scratch )
-            goto no_mem;
-    }
-    nr_rt_ops++;
-
     spin_lock_init(&prv->lock);
     INIT_LIST_HEAD(&prv->sdom);
     INIT_LIST_HEAD(&prv->runq);
@@ -636,10 +607,6 @@ rt_init(struct scheduler *ops)
     prv->repl_timer = NULL;
 
     return 0;
-
- no_mem:
-    xfree(prv);
-    return -ENOMEM;
 }
 
 static void
@@ -647,14 +614,6 @@ rt_deinit(struct scheduler *ops)
 {
     struct rt_private *prv = rt_priv(ops);
 
-    ASSERT( _cpumask_scratch && nr_rt_ops > 0 );
-
-    if ( (--nr_rt_ops) == 0 )
-    {
-        xfree(_cpumask_scratch);
-        _cpumask_scratch = NULL;
-    }
-
     kill_timer(prv->repl_timer);
     xfree(prv->repl_timer);
 
@@ -718,9 +677,6 @@ rt_alloc_pdata(const struct scheduler *ops, int cpu)
 {
     struct rt_private *prv = rt_priv(ops);
 
-    if ( !alloc_cpumask_var(&_cpumask_scratch[cpu]) )
-        return ERR_PTR(-ENOMEM);
-
     if ( prv->repl_timer == NULL )
     {
         /* Allocate the timer on the first cpu of this pool. */
@@ -735,12 +691,6 @@ rt_alloc_pdata(const struct scheduler *ops, int cpu)
     return NULL;
 }
 
-static void
-rt_free_pdata(const struct scheduler *ops, void *pcpu, int cpu)
-{
-    free_cpumask_var(_cpumask_scratch[cpu]);
-}
-
 static void *
 rt_alloc_domdata(const struct scheduler *ops, struct domain *dom)
 {
@@ -1484,7 +1434,6 @@ static const struct scheduler sched_rtds_def = {
     .init           = rt_init,
     .deinit         = rt_deinit,
     .alloc_pdata    = rt_alloc_pdata,
-    .free_pdata     = rt_free_pdata,
     .init_pdata     = rt_init_pdata,
     .switch_sched   = rt_switch_sched,
     .alloc_domdata  = rt_alloc_domdata,
diff --git a/xen/common/schedule.c b/xen/common/schedule.c
index 5559aa1..922b035 100644
--- a/xen/common/schedule.c
+++ b/xen/common/schedule.c
@@ -65,6 +65,14 @@ static void poll_timer_fn(void *data);
 DEFINE_PER_CPU(struct schedule_data, schedule_data);
 DEFINE_PER_CPU(struct scheduler *, scheduler);
 
+/*
+ * Scratch space, for avoiding having too many cpumask_var_t on the stack.
+ * Properly serializing access, if necessary, is the responsibility of each
+ * scheduler (typically, one can expect this to be protected by the per pCPU
+ * or per runqueue lock).
+ */
+DEFINE_PER_CPU(cpumask_t, cpumask_scratch);
+
 extern const struct scheduler *__start_schedulers_array[], *__end_schedulers_array[];
 #define NUM_SCHEDULERS (__end_schedulers_array - __start_schedulers_array)
 #define schedulers __start_schedulers_array
diff --git a/xen/include/xen/sched-if.h b/xen/include/xen/sched-if.h
index 9cebe41..1db7c8d 100644
--- a/xen/include/xen/sched-if.h
+++ b/xen/include/xen/sched-if.h
@@ -47,6 +47,10 @@ DECLARE_PER_CPU(struct schedule_data, schedule_data);
 DECLARE_PER_CPU(struct scheduler *, scheduler);
 DECLARE_PER_CPU(struct cpupool *, cpupool);
 
+DECLARE_PER_CPU(cpumask_t, cpumask_scratch);
+#define cpumask_scratch        (&this_cpu(cpumask_scratch))
+#define cpumask_scratch_cpu(c) (&per_cpu(cpumask_scratch, c))
+
 #define sched_lock(kind, param, cpu, irq, arg...) \
 static inline spinlock_t *kind##_schedule_lock##irq(param EXTRA_TYPE(arg)) \
 { \