From patchwork Wed Feb 4 16:53:55 2015
X-Patchwork-Submitter: Krzysztof Kozlowski
X-Patchwork-Id: 5778231
From: Krzysztof Kozlowski
To: Russell King, linux-arm-kernel@lists.infradead.org,
 linux-kernel@vger.kernel.org
Cc: Mark Rutland, Krzysztof Kozlowski, paulmck@linux.vnet.ibm.com,
 Arnd Bergmann, Bartlomiej Zolnierkiewicz, Fengguang Wu, Marek Szyprowski
Subject: [PATCH] ARM: Don't use complete() during __cpu_die
Date: Wed, 04 Feb 2015 17:53:55 +0100
Message-id: <1423068835-25470-1-git-send-email-k.kozlowski@samsung.com>
X-Mailer: git-send-email 1.9.1

complete() should not be called on an offlined CPU. Rewrite the
wait-complete mechanism with wait_on_bit_timeout().

The CPU triggering hot unplug (e.g. CPU0) will loop until a dedicated
bit is cleared. In each iteration schedule_timeout() is used with an
initial sleep time of 1 ms. After about 1000 ms it is increased to 10 ms.

The dying CPU will clear the bit, which is safe in that context.

This fixes the following RCU warning on ARMv7 (Exynos 4412, Trats2)
during suspend to RAM:

[   31.113925] ===============================
[   31.113928] [ INFO: suspicious RCU usage. ]
[   31.113935] 3.19.0-rc7-next-20150203 #1914 Not tainted
[   31.113938] -------------------------------
[   31.113943] kernel/sched/fair.c:4740 suspicious rcu_dereference_check() usage!
[   31.113946]
[   31.113946] other info that might help us debug this:
[   31.113946]
[   31.113952]
[   31.113952] RCU used illegally from offline CPU!
[   31.113952] rcu_scheduler_active = 1, debug_locks = 0
[   31.113957] 3 locks held by swapper/1/0:
[   31.113988]  #0:  ((cpu_died).wait.lock){......}, at: [] complete+0x14/0x44
[   31.114012]  #1:  (&p->pi_lock){-.-.-.}, at: [] try_to_wake_up+0x28/0x300
[   31.114035]  #2:  (rcu_read_lock){......}, at: [] select_task_rq_fair+0x5c/0xa04
[   31.114038]
[   31.114038] stack backtrace:
[   31.114046] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 3.19.0-rc7-next-20150203 #1914
[   31.114050] Hardware name: SAMSUNG EXYNOS (Flattened Device Tree)
[   31.114076] [] (unwind_backtrace) from [] (show_stack+0x10/0x14)
[   31.114091] [] (show_stack) from [] (dump_stack+0x70/0xbc)
[   31.114105] [] (dump_stack) from [] (select_task_rq_fair+0x6e0/0xa04)
[   31.114118] [] (select_task_rq_fair) from [] (try_to_wake_up+0xd4/0x300)
[   31.114129] [] (try_to_wake_up) from [] (__wake_up_common+0x4c/0x80)
[   31.114140] [] (__wake_up_common) from [] (__wake_up_locked+0x14/0x1c)
[   31.114150] [] (__wake_up_locked) from [] (complete+0x34/0x44)
[   31.114167] [] (complete) from [] (cpu_die+0x24/0x84)
[   31.114179] [] (cpu_die) from [] (cpu_startup_entry+0x328/0x358)
[   31.114189] [] (cpu_startup_entry) from [<40008784>] (0x40008784)
[   31.114226] CPU1: shutdown

Signed-off-by: Krzysztof Kozlowski
Acked-by: Paul E. McKenney
---
 arch/arm/kernel/smp.c | 52 ++++++++++++++++++++++++++++++++++++++++++++++++---
 1 file changed, 49 insertions(+), 3 deletions(-)

diff --git a/arch/arm/kernel/smp.c b/arch/arm/kernel/smp.c
index 86ef244c5a24..bb8ff465975f 100644
--- a/arch/arm/kernel/smp.c
+++ b/arch/arm/kernel/smp.c
@@ -26,6 +26,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
@@ -76,6 +77,9 @@ enum ipi_msg_type {

 static DECLARE_COMPLETION(cpu_running);

+#define CPU_DIE_WAIT_BIT		0
+static unsigned long wait_cpu_die;
+
 static struct smp_operations smp_ops;

 void __init smp_set_ops(struct smp_operations *ops)
@@ -133,6 +137,8 @@ int __cpu_up(unsigned int cpu, struct task_struct *idle)
 		pr_err("CPU%u: failed to boot: %d\n", cpu, ret);
 	}

+	set_bit(CPU_DIE_WAIT_BIT, &wait_cpu_die);
+	smp_mb__after_atomic();

 	memset(&secondary_data, 0, sizeof(secondary_data));
 	return ret;
@@ -213,7 +219,40 @@ int __cpu_disable(void)
 	return 0;
 }

-static DECLARE_COMPLETION(cpu_died);
+/*
+ * Wait for 5000*1 ms for 'wait_cpu_die' bit to be cleared.
+ * Actually the real wait time will be longer because of schedule()
+ * called by bit_wait_timeout().
+ *
+ * Returns 0 if bit was cleared (CPU died) or non-zero
+ * otherwise (1 or negative ERRNO).
+ */
+static int wait_for_cpu_die(void)
+{
+	int retries = 5000, sleep_ms = 1, ret = 0;
+
+	might_sleep();
+
+	smp_mb__before_atomic();
+	while (test_bit(CPU_DIE_WAIT_BIT, &wait_cpu_die)) {
+		ret = out_of_line_wait_on_bit_timeout(&wait_cpu_die,
+				CPU_DIE_WAIT_BIT, bit_wait_timeout,
+				TASK_UNINTERRUPTIBLE,
+				msecs_to_jiffies(sleep_ms));
+		if (!ret || (--retries <= 0))
+			break;
+
+		if (sleep_ms == 1 && retries < 4000) {
+			/* After ~1000 ms increase sleeping time to 10 ms */
+			retries = 400;
+			sleep_ms = 10;
+		}
+
+		smp_mb__before_atomic(); /* For next test_bit() in loop */
+	}
+
+	return ret;
+}

 /*
  * called on the thread which is asking for a CPU to be shutdown -
@@ -221,7 +260,7 @@ static DECLARE_COMPLETION(cpu_died);
  */
 void __cpu_die(unsigned int cpu)
 {
-	if (!wait_for_completion_timeout(&cpu_died, msecs_to_jiffies(5000))) {
+	if (wait_for_cpu_die()) {
 		pr_err("CPU%u: cpu didn't die\n", cpu);
 		return;
 	}
@@ -236,6 +275,10 @@ void __cpu_die(unsigned int cpu)
 	 */
 	if (!platform_cpu_kill(cpu))
 		pr_err("CPU%u: unable to kill\n", cpu);
+
+	/* Prepare the bit for some next CPU die */
+	set_bit(CPU_DIE_WAIT_BIT, &wait_cpu_die);
+	smp_mb__after_atomic();
 }

 /*
@@ -250,6 +293,8 @@ void __ref cpu_die(void)
 {
 	unsigned int cpu = smp_processor_id();

+	WARN_ON(!test_bit(CPU_DIE_WAIT_BIT, &wait_cpu_die));
+
 	idle_task_exit();

 	local_irq_disable();
@@ -267,7 +312,8 @@ void __ref cpu_die(void)
 	 * this returns, power and/or clocks can be removed at any point
 	 * from this CPU and its cache by platform_cpu_kill().
 	 */
-	complete(&cpu_died);
+	clear_bit(CPU_DIE_WAIT_BIT, &wait_cpu_die);
+	smp_mb__after_atomic();

 	/*
 	 * Ensure that the cache lines associated with that completion are