From patchwork Tue May 3 21:46:42 2016 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Dario Faggioli X-Patchwork-Id: 9008641 Return-Path: X-Original-To: patchwork-xen-devel@patchwork.kernel.org Delivered-To: patchwork-parsemail@patchwork2.web.kernel.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.136]) by patchwork2.web.kernel.org (Postfix) with ESMTP id 466A0BF29F for ; Tue, 3 May 2016 21:49:10 +0000 (UTC) Received: from mail.kernel.org (localhost [127.0.0.1]) by mail.kernel.org (Postfix) with ESMTP id 15A5520373 for ; Tue, 3 May 2016 21:49:09 +0000 (UTC) Received: from lists.xenproject.org (lists.xenproject.org [192.237.175.120]) (using TLSv1.2 with cipher AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id BA8172026F for ; Tue, 3 May 2016 21:49:07 +0000 (UTC) Received: from localhost ([127.0.0.1] helo=lists.xenproject.org) by lists.xenproject.org with esmtp (Exim 4.84_2) (envelope-from ) id 1axi9Y-0005Nb-R4; Tue, 03 May 2016 21:46:48 +0000 Received: from mail6.bemta14.messagelabs.com ([193.109.254.103]) by lists.xenproject.org with esmtp (Exim 4.84_2) (envelope-from ) id 1axi9X-0005NV-VG for xen-devel@lists.xenproject.org; Tue, 03 May 2016 21:46:48 +0000 Received: from [193.109.254.147] by server-2.bemta-14.messagelabs.com id 91/9C-03279-74C19275; Tue, 03 May 2016 21:46:47 +0000 X-Brightmail-Tracker: H4sIAAAAAAAAA+NgFvrEIsWRWlGSWpSXmKPExsXiVRvkpOsmoxl u8HY+v8X3LZOZHBg9Dn+4whLAGMWamZeUX5HAmvHoi2BBi3/F57kTWBoY19h3MXJxCAnMYJTY e38LI4jDIrCGVeL91otgjoTAJVaJvz8aWLsYOYGcGIkF2z6yQNjlEi07H4HFhQRUJG5uX8UEM WoBk8SJa/vYQBLCAnoSR47+YIewoyVmX9rMCGKzCRhIvNmxF6xZREBJ4t6qyWDNzAJ3GSXa5v cygyRYBFQlzvX8BGvgFfCRWLFoA9hQTiD7ZsdjqM3eEmtajoHFRQXkJFZebmGFqBeUODnzCdC lHEBDNSXW79IHCTMLyEtsfzuHeQKjyCwkVbMQqmYhqVrAyLyKUb04tagstUjXUC+pKDM9oyQ3 MTNH19DQRC83tbg4MT01JzGpWC85P3cTIzD8GYBgB+PRTudDjJIcTEqivJ4SmuFCfEn5KZUZi cUZ8UWlOanFhxhlODiUJHj/SwHlBItS01Mr0jJzgJEIk5bg4FES4b0BkuYtLkjMLc5Mh0idYt Tl2DL13lomIZa8/LxUKXFefmmgIgGQoozSPLgRsKRwiVFWSpiXEegoIZ6C1KLczBJU+VeM4hy MSsK8+0BW8WTmlcBtegV0BBPQEdnrVUGOKElESEk1MCboXwyfmplZupmfJ1L4uvu5/dXTWU5l cr4UdOy9r9/4uumnhu0FYWaBr7FNobrs5UXLxCdz/at3aI+9teADW8u956dKt4aeUE1VP6x+T uEgj8P2g21cwmctXbb+6G25lLsx8P48UWftW88NA3+8e/fYySR2nQSXWurJlUINT01PvOs+t4 vRVImlOCPRUIu5qDgRACDKmkcFAwAA X-Env-Sender: raistlin.df@gmail.com X-Msg-Ref: server-12.tower-27.messagelabs.com!1462312005!39379237!1 X-Originating-IP: [74.125.82.66] X-SpamReason: No, hits=0.5 required=7.0 tests=BODY_RANDOM_LONG X-StarScan-Received: X-StarScan-Version: 8.34; banners=-,-,- X-VirusChecked: Checked Received: (qmail 32470 invoked from network); 3 May 2016 21:46:46 -0000 Received: from mail-wm0-f66.google.com (HELO mail-wm0-f66.google.com) (74.125.82.66) by server-12.tower-27.messagelabs.com with AES128-GCM-SHA256 encrypted SMTP; 3 May 2016 21:46:46 -0000 Received: by mail-wm0-f66.google.com with SMTP id n129so6256423wmn.1 for ; Tue, 03 May 2016 14:46:45 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=sender:subject:from:to:cc:date:message-id:in-reply-to:references :user-agent:mime-version:content-transfer-encoding; bh=8KOtiyxgHU05tu64iF1xAOWRfLV5feFEAhiOS1rSlTM=; b=Nq314dKW7VAANCGQ7XhYrDy2GG9yNSM681vNkexjaAHnmSCdNsn/fGbP1Ps3ldxzOk U4Ft1qf2PPegFmpv0JeRZ/D8ETL8zFErVW1BJKYQ17wZirXjabqLBOoWL3s1mDqe0qaY gLnod/lfjXFed/dFJDVV9xqDK82CMfDHLFxgTU02D8dnXrgLB/28+O8cPTOyL+s/z8eO JCy32uYeQIQG8SI3on1yl0pX0zYDZEWenECeLXwYPGoejbScM/CcOjxiR+BGtf5926sC IrQTE35DOL6Bjb+MCf3LqX5/nP+oS5psuf62206iyGIpfcZY//FPoojM+IdtDSVct3B0 O52w== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:sender:subject:from:to:cc:date:message-id :in-reply-to:references:user-agent:mime-version :content-transfer-encoding; bh=8KOtiyxgHU05tu64iF1xAOWRfLV5feFEAhiOS1rSlTM=; b=AJJE5KwSu0VaNnETCGKhMv+L+G7QsLSk90qXBgSDsrOCYFrFDNhEDNuvCg4upmnEc4 WyPXEz64jzUZvhCtnsjdFb7oqa1EWd0NvhUEbTL3r7A7/tH1+uKCL1NY1n24cRwXtV2j njHiXnzMkGyK7bA6pRJcj770edLYzn7eoUhm5zzwa9OiB2C8XfqvhL9kyoeTHwJgLTWs ZBBxlq43rC+Tye7d1sQJpMxeNx+JX84OJViIUJBsGHg01j7SnfInAHsS/WLUOQ6EWZ7N UdDvaWrjSQtEykws39P/vjc3KA0c+yKvSlNB5lYB9Jdf2Q+6DS09MstLYJJhybpbNZX4 S9NQ== X-Gm-Message-State: AOPr4FVuPdln2cqwrNXNnkdNWO736MUlyvcxqgzCVLBkL+ayI91Wra2/22nXjCxKoujWqg== X-Received: by 10.28.103.2 with SMTP id b2mr27704250wmc.66.1462312005612; Tue, 03 May 2016 14:46:45 -0700 (PDT) Received: from Solace.fritz.box (net-93-65-132-113.cust.vodafonedsl.it. [93.65.132.113]) by smtp.gmail.com with ESMTPSA id gg7sm525633wjd.10.2016.05.03.14.46.43 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Tue, 03 May 2016 14:46:44 -0700 (PDT) From: Dario Faggioli To: xen-devel@lists.xenproject.org Date: Tue, 03 May 2016 23:46:42 +0200 Message-ID: <146231200265.25631.11092489140094592368.stgit@Solace.fritz.box> In-Reply-To: <146231184906.25631.6550047090421454264.stgit@Solace.fritz.box> References: <146231184906.25631.6550047090421454264.stgit@Solace.fritz.box> User-Agent: StGit/0.17.1-dirty MIME-Version: 1.0 Cc: Wei Liu , Steve Capper , George Dunlap , Julien Grall , Varun.Swara@arm.com Subject: [Xen-devel] [PATCH for 4.7 2/4] xen: sched: fix killing an uninitialized timer in free_pdata. X-BeenThere: xen-devel@lists.xen.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: Xen developer discussion List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Errors-To: xen-devel-bounces@lists.xen.org Sender: "Xen-devel" X-Spam-Status: No, score=-4.1 required=5.0 tests=BAYES_00,DKIM_SIGNED, RCVD_IN_DNSWL_MED, T_DKIM_INVALID, UNPARSEABLE_RELAY autolearn=unavailable version=3.3.1 X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on mail.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP commit 64269d9365 "sched: implement .init_pdata in Credit, Credit2 and RTDS" helped fixing Credit2 runqueues, and the races we had in switching scheduler for pCPUs, but introduced another issue. In fact, if CPU bringup fails during __cpu_up() (and, more precisely, after CPU_UP_PREPARE, but before CPU_STARTING) the CPU_UP_CANCELED notifier would be executed, which calls the free_pdata hook. Such hook does, right now, two things: (1) undo the initialization done inside the init_pdata hook and (2) free the memory allocated by the alloc_pdata hook. However, in the failure path just described, it is possible that only alloc_pdata were called, and this is potentially an issue (depending on how actually free_pdata does). In fact, for Credit1 (the only scheduler that actually allocates per-pCPU data), this result in calling kill_timer() on a timer that had not yet been initialized, which causes the following: (XEN) Xen call trace: (XEN) [<000000000022e304>] timer.c#active_timer+0x8/0x24 (PC) (XEN) [<000000000022f624>] kill_timer+0x108/0x2e0 (LR) (XEN) [<00000000002208c0>] sched_credit.c#csched_free_pdata+0xd8/0x114 (XEN) [<0000000000227a18>] schedule.c#cpu_schedule_callback+0xc0/0x12c (XEN) [<0000000000219944>] notifier_call_chain+0x78/0x9c (XEN) [<00000000002015fc>] cpu_up+0x104/0x130 (XEN) [<000000000028f7c0>] start_xen+0xaf8/0xce0 (XEN) [<00000000810021d8>] 00000000810021d8 (XEN) (XEN) (XEN) **************************************** (XEN) Panic on CPU 0: (XEN) Assertion 'timer->status >= TIMER_STATUS_inactive' failed at timer.c:279 (XEN) **************************************** Solve this by making the scheduler hooks API symmetric again, i.e., by adding a deinit_pdata hook and making it responsible of undoing what init_pdata did, rather than asking to free_pdata to do everything. This is cleaner and, in the case at hand, makes it possible to only call free_pdata (which is the right thing to do) as only allocation and no initialization was performed. Reported-by: Julien Grall Signed-off-by: Dario Faggioli Reviewed-by: George Dunlap --- Cc: George Dunlap Cc: Konrad Rzeszutek Wilk Cc: Varun.Swara@arm.com Cc: Steve Capper Cc: Wei Liu --- xen/common/sched_credit.c | 31 +++++++++++++++++++++++++++---- xen/common/sched_credit2.c | 16 +++++++++++++--- xen/common/schedule.c | 38 +++++++++++++++++++++++++++++++++++++- xen/include/xen/sched-if.h | 1 + 4 files changed, 78 insertions(+), 8 deletions(-) diff --git a/xen/common/sched_credit.c b/xen/common/sched_credit.c index bc36837..db4d42a 100644 --- a/xen/common/sched_credit.c +++ b/xen/common/sched_credit.c @@ -485,11 +485,35 @@ static void csched_free_pdata(const struct scheduler *ops, void *pcpu, int cpu) { struct csched_private *prv = CSCHED_PRIV(ops); + + /* + * pcpu either points to a valid struct csched_pcpu, or is NULL, if we're + * beeing called from CPU_UP_CANCELLED, because bringing up a pCPU failed + * very early. xfree() does not really mind, but we want to be sure that, + * when we get here, either init_pdata has never been called, or + * deinit_pdata has been called already. + */ + ASSERT(!cpumask_test_cpu(cpu, prv->cpus)); + + xfree(pcpu); +} + +static void +csched_deinit_pdata(const struct scheduler *ops, void *pcpu, int cpu) +{ + struct csched_private *prv = CSCHED_PRIV(ops); struct csched_pcpu *spc = pcpu; unsigned long flags; - if ( spc == NULL ) - return; + /* + * Scheduler specific data for this pCPU must still be there and and be + * valid. In fact, if we are here: + * 1. alloc_pdata must have been called for this cpu, and free_pdata + * must not have been called on it before us, + * 2. init_pdata must have been called on this cpu, and deinit_pdata + * (us!) must not have been called on it already. + */ + ASSERT(spc && cpumask_test_cpu(cpu, prv->cpus)); spin_lock_irqsave(&prv->lock, flags); @@ -507,8 +531,6 @@ csched_free_pdata(const struct scheduler *ops, void *pcpu, int cpu) kill_timer(&prv->master_ticker); spin_unlock_irqrestore(&prv->lock, flags); - - xfree(spc); } static void * @@ -2091,6 +2113,7 @@ static const struct scheduler sched_credit_def = { .free_vdata = csched_free_vdata, .alloc_pdata = csched_alloc_pdata, .init_pdata = csched_init_pdata, + .deinit_pdata = csched_deinit_pdata, .free_pdata = csched_free_pdata, .switch_sched = csched_switch_sched, .alloc_domdata = csched_alloc_domdata, diff --git a/xen/common/sched_credit2.c b/xen/common/sched_credit2.c index 50c1b9d..803cc44 100644 --- a/xen/common/sched_credit2.c +++ b/xen/common/sched_credit2.c @@ -2201,6 +2201,12 @@ csched2_init_pdata(const struct scheduler *ops, void *pdata, int cpu) unsigned long flags; unsigned rqi; + /* + * pdata contains what alloc_pdata returned. But since we don't (need to) + * implement alloc_pdata, either that's NULL, or something is very wrong! + */ + ASSERT(!pdata); + spin_lock_irqsave(&prv->lock, flags); old_lock = pcpu_schedule_lock(cpu); @@ -2261,7 +2267,7 @@ csched2_switch_sched(struct scheduler *new_ops, unsigned int cpu, } static void -csched2_free_pdata(const struct scheduler *ops, void *pcpu, int cpu) +csched2_deinit_pdata(const struct scheduler *ops, void *pcpu, int cpu) { unsigned long flags; struct csched2_private *prv = CSCHED2_PRIV(ops); @@ -2270,7 +2276,11 @@ csched2_free_pdata(const struct scheduler *ops, void *pcpu, int cpu) spin_lock_irqsave(&prv->lock, flags); - ASSERT(cpumask_test_cpu(cpu, &prv->initialized)); + /* + * alloc_pdata is not implemented, so pcpu must be NULL. On the other + * hand, init_pdata must have been called for this pCPU. + */ + ASSERT(!pcpu && cpumask_test_cpu(cpu, &prv->initialized)); /* Find the old runqueue and remove this cpu from it */ rqi = prv->runq_map[cpu]; @@ -2387,7 +2397,7 @@ static const struct scheduler sched_credit2_def = { .alloc_vdata = csched2_alloc_vdata, .free_vdata = csched2_free_vdata, .init_pdata = csched2_init_pdata, - .free_pdata = csched2_free_pdata, + .deinit_pdata = csched2_deinit_pdata, .switch_sched = csched2_switch_sched, .alloc_domdata = csched2_alloc_domdata, .free_domdata = csched2_free_domdata, diff --git a/xen/common/schedule.c b/xen/common/schedule.c index 5546999..80fea39 100644 --- a/xen/common/schedule.c +++ b/xen/common/schedule.c @@ -1546,6 +1546,38 @@ static int cpu_schedule_callback( struct schedule_data *sd = &per_cpu(schedule_data, cpu); int rc = 0; + /* + * From the scheduler perspective, bringing up a pCPU requires + * allocating and initializing the per-pCPU scheduler specific data, + * as well as "registering" this pCPU to the scheduler (which may + * involve modifying some scheduler wide data structures). + * This happens by calling the alloc_pdata and init_pdata hooks, in + * this order. A scheduler that does not need to allocate any per-pCPU + * data can avoid implementing alloc_pdata. init_pdata may, however, be + * necessary/useful in this case too (e.g., it can contain the "register + * the pCPU to the scheduler" part). alloc_pdata (if present) is called + * during CPU_UP_PREPARE. init_pdata (if present) is called during + * CPU_STARTING. + * + * On the other hand, at teardown, we need to reverse what has been done + * during initialization, and then free the per-pCPU specific data. This + * happens by calling the deinit_pdata and free_pdata hooks, in this + * order. If no per-pCPU memory was allocated, there is no need to + * provide an implementation of free_pdata. deinit_pdata may, however, + * be necessary/useful in this case too (e.g., it can undo something done + * on scheduler wide data structure during init_pdata). Both deinit_pdata + * and free_pdata are called during CPU_DEAD. + * + * If someting goes wrong during bringup, we go to CPU_UP_CANCELLED + * *before* having called init_pdata. In this case, as there is no + * initialization needing undoing, only free_pdata should be called. + * This means it is possible to call free_pdata just after alloc_pdata, + * without a init_pdata/deinit_pdata "cycle" in between the two. + * + * So, in summary, the usage pattern should look either + * - alloc_pdata-->init_pdata-->deinit_pdata-->free_pdata, or + * - alloc_pdata-->free_pdata. + */ switch ( action ) { case CPU_STARTING: @@ -1554,8 +1586,10 @@ static int cpu_schedule_callback( case CPU_UP_PREPARE: rc = cpu_schedule_up(cpu); break; - case CPU_UP_CANCELED: case CPU_DEAD: + SCHED_OP(sched, deinit_pdata, sd->sched_priv, cpu); + /* Fallthrough */ + case CPU_UP_CANCELED: cpu_schedule_down(cpu); break; default: @@ -1713,6 +1747,8 @@ int schedule_cpu_switch(unsigned int cpu, struct cpupool *c) SCHED_OP(new_ops, tick_resume, cpu); + SCHED_OP(old_ops, deinit_pdata, ppriv_old, cpu); + SCHED_OP(old_ops, free_vdata, vpriv_old); SCHED_OP(old_ops, free_pdata, ppriv_old, cpu); diff --git a/xen/include/xen/sched-if.h b/xen/include/xen/sched-if.h index 1db7c8d..bc0e794 100644 --- a/xen/include/xen/sched-if.h +++ b/xen/include/xen/sched-if.h @@ -138,6 +138,7 @@ struct scheduler { void (*free_pdata) (const struct scheduler *, void *, int); void * (*alloc_pdata) (const struct scheduler *, int); void (*init_pdata) (const struct scheduler *, void *, int); + void (*deinit_pdata) (const struct scheduler *, void *, int); void (*free_domdata) (const struct scheduler *, void *); void * (*alloc_domdata) (const struct scheduler *, struct domain *);