From patchwork Wed Aug 22 04:03:49 2012
X-Patchwork-Submitter: Stephen Boyd
X-Patchwork-Id: 1359111
From: Stephen Boyd
To: linux-arm-kernel@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Subject: [RFC/PATCH] ARM: smp: Fix cpu_up() racing with sys_reboot
Date: Tue, 21 Aug 2012 21:03:49 -0700
Message-Id: <1345608229-5707-1-git-send-email-sboyd@codeaurora.org>
X-Mailer: git-send-email 1.7.12

Nothing stops a process from hotplugging in a CPU concurrently with a
sys_reboot() call. In such a situation we could have ipi_cpu_stop()
mark a cpu as 'offline' and _cpu_up() ignore the fact that the CPU is
not really offline and call the CPU_UP_PREPARE notifier. When this
happens stop_machine code will complain that the cpu thread already
exists and BUG_ON().

CPU0                                    CPU1

sys_reboot()
 kernel_restart()
  machine_restart()
   machine_shutdown()
    smp_send_stop()
     ...                                ipi_cpu_stop()
                                         set_cpu_online(1, false)
                                         local_irq_disable()
                                         while(1)
cpu_up()
 _cpu_up()
  if (!cpu_online(1))
   __cpu_notify(CPU_UP_PREPARE...)
    cpu_stop_cpu_callback()
     BUG_ON(stopper->thread)

This is easily reproducible by hotplugging in and out in a tight loop
while also rebooting.

Since the CPU is not really offline and hasn't gone through the proper
steps to be marked as such, let's mark the CPU as inactive. This is
just as easily testable as online and avoids any possibility of
_cpu_up() trying to bring the CPU back online when it never was
offline to begin with.

Signed-off-by: Stephen Boyd
---

Perhaps we can take the hotplug lock in the sys_reboot() case but I
don't think that actually fixes everything. For example, in cases where
machine_shutdown() is called from emergency_restart() we would have to
take the hotplug lock which doesn't really seem feasible.

 arch/arm/kernel/smp.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/arm/kernel/smp.c b/arch/arm/kernel/smp.c
index ebd8ad2..836b771 100644
--- a/arch/arm/kernel/smp.c
+++ b/arch/arm/kernel/smp.c
@@ -478,7 +478,7 @@ static void ipi_cpu_stop(unsigned int cpu)
 		raw_spin_unlock(&stop_lock);
 	}
 
-	set_cpu_online(cpu, false);
+	set_cpu_active(cpu, false);
 
 	local_fiq_disable();
 	local_irq_disable();
@@ -568,10 +568,10 @@ void smp_send_stop(void)
 
 	/* Wait up to one second for other CPUs to stop */
 	timeout = USEC_PER_SEC;
-	while (num_online_cpus() > 1 && timeout--)
+	while (num_active_cpus() > 1 && timeout--)
 		udelay(1);
 
-	if (num_online_cpus() > 1)
+	if (num_active_cpus() > 1)
 		pr_warning("SMP: failed to stop secondary CPUs\n");
 
 	smp_kill_cpus(&mask);
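
For readers who want to see the effect of the change in isolation, here
is a rough userspace sketch of the race, not kernel code. It models the
online and active masks as plain bitmasks and assumes, as the trace in
the commit message suggests, that _cpu_up() only looks at the online
bit. Every name in it (struct cpu_masks, model_ipi_cpu_stop(),
model_cpu_up_proceeds(), model_running_cpus(), race_hits_bug()) is
hypothetical and exists only for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model: one bit per CPU in each mask. */
struct cpu_masks {
	unsigned int online;
	unsigned int active;
};

bool mask_test(unsigned int mask, unsigned int cpu)
{
	assert(cpu < 32);
	return (mask >> cpu) & 1u;
}

void mask_set(unsigned int *mask, unsigned int cpu, bool val)
{
	assert(cpu < 32);
	if (val)
		*mask |= 1u << cpu;
	else
		*mask &= ~(1u << cpu);
}

/*
 * Model of ipi_cpu_stop(): the unpatched code clears the online bit,
 * the patched code clears the active bit instead.
 */
void model_ipi_cpu_stop(struct cpu_masks *m, unsigned int cpu, bool patched)
{
	if (patched)
		mask_set(&m->active, cpu, false);
	else
		mask_set(&m->online, cpu, false);
}

/*
 * Model of the check in _cpu_up() (assumption: only the online bit is
 * consulted). Returns true if hotplug would go on to CPU_UP_PREPARE.
 */
bool model_cpu_up_proceeds(const struct cpu_masks *m, unsigned int cpu)
{
	return !mask_test(m->online, cpu);
}

/* Model of what the wait loop in smp_send_stop() counts. */
unsigned int model_running_cpus(const struct cpu_masks *m, bool patched)
{
	unsigned int mask = patched ? m->active : m->online;
	unsigned int n = 0;

	for (; mask; mask >>= 1)
		n += mask & 1u;
	return n;
}

/* Two CPUs up, CPU1 receives the stop IPI, then CPU0 calls cpu_up(1). */
bool race_hits_bug(bool patched)
{
	struct cpu_masks m = { .online = 0x3, .active = 0x3 };

	model_ipi_cpu_stop(&m, 1, patched);
	return model_cpu_up_proceeds(&m, 1);
}
```

In this model race_hits_bug(false) returns true, because the stopped CPU
looks offline and hotplug proceeds, while race_hits_bug(true) returns
false, because the online bit was never cleared. In both variants the
count seen by the wait loop drops to 1, so smp_send_stop() still
notices that the secondary CPU has stopped.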