From patchwork Tue Jan 22 11:45:56 2013 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: tangchen X-Patchwork-Id: 2018081 Return-Path: X-Original-To: patchwork-linux-acpi@patchwork.kernel.org Delivered-To: patchwork-process-083081@patchwork2.kernel.org Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by patchwork2.kernel.org (Postfix) with ESMTP id 1AE4FDF2EB for ; Tue, 22 Jan 2013 11:48:46 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1753831Ab3AVLrA (ORCPT ); Tue, 22 Jan 2013 06:47:00 -0500 Received: from cn.fujitsu.com ([222.73.24.84]:39050 "EHLO song.cn.fujitsu.com" rhost-flags-OK-FAIL-OK-OK) by vger.kernel.org with ESMTP id S1752264Ab3AVLqw (ORCPT ); Tue, 22 Jan 2013 06:46:52 -0500 X-IronPort-AV: E=Sophos;i="4.84,514,1355068800"; d="scan'208";a="6628487" Received: from unknown (HELO tang.cn.fujitsu.com) ([10.167.250.3]) by song.cn.fujitsu.com with ESMTP; 22 Jan 2013 19:44:41 +0800 Received: from fnstmail02.fnst.cn.fujitsu.com (tang.cn.fujitsu.com [127.0.0.1]) by tang.cn.fujitsu.com (8.14.3/8.13.1) with ESMTP id r0MBkj2Q020816; Tue, 22 Jan 2013 19:46:47 +0800 Received: from tangchen.fnst.cn.fujitsu.com ([10.167.225.117]) by fnstmail02.fnst.cn.fujitsu.com (Lotus Domino Release 8.5.3) with ESMTP id 2013012219455170-1091878 ; Tue, 22 Jan 2013 19:45:51 +0800 From: Tang Chen To: akpm@linux-foundation.org, rjw@sisk.pl, len.brown@intel.com, mingo@redhat.com, tglx@linutronix.de, minchan.kim@gmail.com, rientjes@google.com, benh@kernel.crashing.org, paulus@samba.org, cl@linux.com, kosaki.motohiro@jp.fujitsu.com, isimatu.yasuaki@jp.fujitsu.com, wujianguo@huawei.com, wency@cn.fujitsu.com, hpa@zytor.com, linfeng@cn.fujitsu.com, laijs@cn.fujitsu.com, mgorman@suse.de, yinghai@kernel.org, glommer@parallels.com, jiang.liu@huawei.com, julian.calaby@gmail.com, sfr@canb.auug.org.au Cc: x86@kernel.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org, linuxppc-dev@lists.ozlabs.org, linux-acpi@vger.kernel.org, Jiang Liu , Mel Gorman , Peter Zijlstra Subject: [PATCH Bug fix 5/5] Do not use cpu_to_node() to find an offlined cpu's node. Date: Tue, 22 Jan 2013 19:45:56 +0800 Message-Id: <1358855156-6126-6-git-send-email-tangchen@cn.fujitsu.com> X-Mailer: git-send-email 1.7.10.1 In-Reply-To: <1358855156-6126-1-git-send-email-tangchen@cn.fujitsu.com> References: <1358855156-6126-1-git-send-email-tangchen@cn.fujitsu.com> X-MIMETrack: Itemize by SMTP Server on mailserver/fnst(Release 8.5.3|September 15, 2011) at 2013/01/22 19:45:51, Serialize by Router on mailserver/fnst(Release 8.5.3|September 15, 2011) at 2013/01/22 19:45:53, Serialize complete at 2013/01/22 19:45:53 Sender: linux-acpi-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-acpi@vger.kernel.org If a cpu is offline, its nid will be set to -1, and cpu_to_node(cpu) will return -1. As a result, cpumask_of_node(nid) will return NULL. In this case, find_next_bit() in for_each_cpu will get a NULL pointer and cause panic. Here is a call trace: [ 609.824017] Call Trace: [ 609.824017] [ 609.824017] [] select_fallback_rq+0x71/0x190 [ 609.824017] [] ? try_to_wake_up+0x2e/0x2f0 [ 609.824017] [] try_to_wake_up+0x2cb/0x2f0 [ 609.824017] [] ? __run_hrtimer+0x78/0x320 [ 609.824017] [] wake_up_process+0x15/0x20 [ 609.824017] [] hrtimer_wakeup+0x22/0x30 [ 609.824017] [] __run_hrtimer+0x83/0x320 [ 609.824017] [] ? update_rmtp+0x80/0x80 [ 609.824017] [] hrtimer_interrupt+0x106/0x280 [ 609.824017] [] ? sd_free_ctl_entry+0x68/0x70 [ 609.824017] [] smp_apic_timer_interrupt+0x69/0x99 [ 609.824017] [] apic_timer_interrupt+0x6f/0x80 There is a hrtimer process sleeping, whose cpu has already been offlined. When it is waken up, it tries to find another cpu to run, and get a -1 nid. As a result, cpumask_of_node(-1) returns NULL, and causes ernel panic. This patch fixes this problem by judging if the nid is -1. If nid is not -1, a cpu on the same node will be picked. Else, a online cpu on another node will be picked. Cc: Yasuaki Ishimatsu Cc: David Rientjes Cc: Jiang Liu Cc: Minchan Kim Cc: KOSAKI Motohiro Cc: Andrew Morton Cc: Mel Gorman Cc: Thomas Gleixner Cc: Ingo Molnar Cc: "H. Peter Anvin" Cc: Peter Zijlstra Signed-off-by: Tang Chen Signed-off-by: Wen Congyang --- kernel/sched/core.c | 28 +++++++++++++++++++--------- 1 files changed, 19 insertions(+), 9 deletions(-) diff --git a/kernel/sched/core.c b/kernel/sched/core.c index 257002c..035ee9f 100644 --- a/kernel/sched/core.c +++ b/kernel/sched/core.c @@ -1132,18 +1132,28 @@ EXPORT_SYMBOL_GPL(kick_process); */ static int select_fallback_rq(int cpu, struct task_struct *p) { - const struct cpumask *nodemask = cpumask_of_node(cpu_to_node(cpu)); + int nid = cpu_to_node(cpu); + const struct cpumask *nodemask = NULL; enum { cpuset, possible, fail } state = cpuset; int dest_cpu; - /* Look for allowed, online CPU in same node. */ - for_each_cpu(dest_cpu, nodemask) { - if (!cpu_online(dest_cpu)) - continue; - if (!cpu_active(dest_cpu)) - continue; - if (cpumask_test_cpu(dest_cpu, tsk_cpus_allowed(p))) - return dest_cpu; + /* + * If the node that the cpu is on has been offlined, cpu_to_node() + * will return -1. There is no cpu on the node, and we should + * select the cpu on the other node. + */ + if (nid != -1) { + nodemask = cpumask_of_node(nid); + + /* Look for allowed, online CPU in same node. */ + for_each_cpu(dest_cpu, nodemask) { + if (!cpu_online(dest_cpu)) + continue; + if (!cpu_active(dest_cpu)) + continue; + if (cpumask_test_cpu(dest_cpu, tsk_cpus_allowed(p))) + return dest_cpu; + } } for (;;) {