From patchwork Thu Jul 5 23:45:58 2012 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Stephen Boyd X-Patchwork-Id: 1162491 Return-Path: X-Original-To: patchwork-linux-arm@patchwork.kernel.org Delivered-To: patchwork-process-083081@patchwork1.kernel.org Received: from merlin.infradead.org (merlin.infradead.org [205.233.59.134]) by patchwork1.kernel.org (Postfix) with ESMTP id 21AEB3FE4F for ; Thu, 5 Jul 2012 23:49:37 +0000 (UTC) Received: from localhost ([::1] helo=merlin.infradead.org) by merlin.infradead.org with esmtp (Exim 4.76 #1 (Red Hat Linux)) id 1Smvko-0006eS-Qo; Thu, 05 Jul 2012 23:46:34 +0000 Received: from wolverine01.qualcomm.com ([199.106.114.254]) by merlin.infradead.org with esmtps (Exim 4.76 #1 (Red Hat Linux)) id 1Smvkg-0006eE-6d for linux-arm-kernel@lists.infradead.org; Thu, 05 Jul 2012 23:46:30 +0000 X-IronPort-AV: E=McAfee;i="5400,1158,6763"; a="207822673" Received: from pdmz-ns-mip.qualcomm.com (HELO mostmsg01.qualcomm.com) ([199.106.114.10]) by wolverine01.qualcomm.com with ESMTP/TLS/DHE-RSA-AES256-SHA; 05 Jul 2012 16:46:03 -0700 Received: from sboyd-linux.qualcomm.com (pdmz-ns-snip_218_1.qualcomm.com [192.168.218.1]) by mostmsg01.qualcomm.com (Postfix) with ESMTPA id 9BAEF10004AC; Thu, 5 Jul 2012 16:46:02 -0700 (PDT) From: Stephen Boyd To: linux-arm-kernel@lists.infradead.org Subject: [PATCH] ARM: smp: Fix suspicious RCU originating from cpu_die() Date: Thu, 5 Jul 2012 16:45:58 -0700 Message-Id: <1341531958-31721-1-git-send-email-sboyd@codeaurora.org> X-Mailer: git-send-email 1.7.11.1.107.g7260167 X-Spam-Note: CRM114 invocation failed X-Spam-Score: -4.2 (----) X-Spam-Report: SpamAssassin version 3.3.2 on merlin.infradead.org summary: Content analysis details: (-4.2 points) pts rule name description ---- ---------------------- -------------------------------------------------- -2.3 RCVD_IN_DNSWL_MED RBL: Sender listed at http://www.dnswl.org/, medium trust [199.106.114.254 listed in list.dnswl.org] -1.9 BAYES_00 BODY: Bayes spam probability is 0 to 1% [score: 0.0000] Cc: "Paul E. McKenney" , linux-kernel@vger.kernel.org X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , MIME-Version: 1.0 Sender: linux-arm-kernel-bounces@lists.infradead.org Errors-To: linux-arm-kernel-bounces+patchwork-linux-arm=patchwork.kernel.org@lists.infradead.org While running hotplug tests I ran into this RCU splat =============================== [ INFO: suspicious RCU usage. ] 3.4.0 #3275 Tainted: G W ------------------------------- include/linux/rcupdate.h:729 rcu_read_lock() used illegally while idle! other info that might help us debug this: RCU used illegally from idle CPU! rcu_scheduler_active = 1, debug_locks = 0 RCU used illegally from extended quiescent state! 4 locks held by swapper/2/0: #0: ((cpu_died).wait.lock){......}, at: [] complete+0x1c/0x5c #1: (&p->pi_lock){-.-.-.}, at: [] try_to_wake_up+0x2c/0x388 #2: (&rq->lock){-.-.-.}, at: [] try_to_wake_up+0x130/0x388 #3: (rcu_read_lock){.+.+..}, at: [] cpuacct_charge+0x28/0x1f4 stack backtrace: [] (unwind_backtrace+0x0/0x12c) from [] (cpuacct_charge+0x94/0x1f4) [] (cpuacct_charge+0x94/0x1f4) from [] (update_curr+0x24c/0x2c8) [] (update_curr+0x24c/0x2c8) from [] (enqueue_task_fair+0x50/0x194) [] (enqueue_task_fair+0x50/0x194) from [] (enqueue_task+0x30/0x34) [] (enqueue_task+0x30/0x34) from [] (ttwu_activate+0x14/0x38) [] (ttwu_activate+0x14/0x38) from [] (try_to_wake_up+0x178/0x388) [] (try_to_wake_up+0x178/0x388) from [] (__wake_up_common+0x34/0x78) [] (__wake_up_common+0x34/0x78) from [] (complete+0x48/0x5c) [] (complete+0x48/0x5c) from [] (cpu_die+0x2c/0x58) [] (cpu_die+0x2c/0x58) from [] (cpu_idle+0x64/0xfc) [] (cpu_idle+0x64/0xfc) from [<80208160>] (0x80208160) When a cpu is marked offline during its idle thread it calls cpu_die() during an RCU idle period. cpu_die() calls complete() to notify the killing process that the cpu has died. complete() calls into the scheduler code which sometimes grabs an RCU read lock in cpuacct_charge(). To avoid this problem, copy what x86 is doing and have a per_cpu variable to track the cpu state and have the killing process poll that variable. Cc: "Paul E. McKenney" Signed-off-by: Stephen Boyd --- arch/arm/kernel/smp.c | 26 ++++++++++++++++---------- 1 file changed, 16 insertions(+), 10 deletions(-) diff --git a/arch/arm/kernel/smp.c b/arch/arm/kernel/smp.c index 2c7217d..5430ea4 100644 --- a/arch/arm/kernel/smp.c +++ b/arch/arm/kernel/smp.c @@ -59,6 +59,7 @@ enum ipi_msg_type { }; static DECLARE_COMPLETION(cpu_running); +static DEFINE_PER_CPU(int, cpu_state) = { 0 }; int __cpuinit __cpu_up(unsigned int cpu, struct task_struct *idle) { @@ -143,22 +144,26 @@ int __cpu_disable(void) return 0; } -static DECLARE_COMPLETION(cpu_died); - /* * called on the thread which is asking for a CPU to be shutdown - * waits until shutdown has completed, or it is timed out. */ void __cpu_die(unsigned int cpu) { - if (!wait_for_completion_timeout(&cpu_died, msecs_to_jiffies(5000))) { - pr_err("CPU%u: cpu didn't die\n", cpu); - return; + unsigned int i; + + for (i = 0; i < 50; i++) { + if (per_cpu(cpu_state, cpu) == CPU_DEAD) { + if (platform_cpu_kill(cpu)) { + pr_notice("CPU%u: shutdown\n", cpu); + return; + } else { + break; + } + } + msleep(100); } - printk(KERN_NOTICE "CPU%u: shutdown\n", cpu); - - if (!platform_cpu_kill(cpu)) - printk("CPU%u: unable to kill\n", cpu); + pr_err("CPU%u: unable to kill\n", cpu); } /* @@ -179,7 +184,7 @@ void __ref cpu_die(void) mb(); /* Tell __cpu_die() that this CPU is now safe to dispose of */ - complete(&cpu_died); + __this_cpu_write(cpu_state, CPU_DEAD); /* * actual CPU shutdown procedure is at least platform (if not @@ -258,6 +263,7 @@ asmlinkage void __cpuinit secondary_start_kernel(void) * before we continue - which happens after __cpu_up returns. */ set_cpu_online(cpu, true); + per_cpu(cpu_state, smp_processor_id()) = CPU_ONLINE; complete(&cpu_running); /*