From patchwork Mon Apr 14 15:11:13 2014
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Igor Mammedov
X-Patchwork-Id: 3982531
Return-Path:
X-Original-To: patchwork-linux-acpi@patchwork.kernel.org
Delivered-To: patchwork-parsemail@patchwork1.web.kernel.org
Received: from mail.kernel.org (mail.kernel.org [198.145.19.201])
	by patchwork1.web.kernel.org (Postfix) with ESMTP id 546489F336
	for ; Mon, 14 Apr 2014 15:12:42 +0000 (UTC)
Received: from mail.kernel.org (localhost [127.0.0.1])
	by mail.kernel.org (Postfix) with ESMTP id 768262016C
	for ; Mon, 14 Apr 2014 15:12:41 +0000 (UTC)
Received: from vger.kernel.org (vger.kernel.org [209.132.180.67])
	by mail.kernel.org (Postfix) with ESMTP id 86FE520170
	for ; Mon, 14 Apr 2014 15:12:40 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1754562AbaDNPMX (ORCPT );
	Mon, 14 Apr 2014 11:12:23 -0400
Received: from mx1.redhat.com ([209.132.183.28]:57964 "EHLO mx1.redhat.com"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1753910AbaDNPMR (ORCPT );
	Mon, 14 Apr 2014 11:12:17 -0400
Received: from int-mx14.intmail.prod.int.phx2.redhat.com
	(int-mx14.intmail.prod.int.phx2.redhat.com [10.5.11.27])
	by mx1.redhat.com (8.14.4/8.14.4) with ESMTP id s3EFBrrn012129
	(version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK);
	Mon, 14 Apr 2014 11:11:54 -0400
Received: from dell-pet610-01.lab.eng.brq.redhat.com
	(dell-pet610-01.lab.eng.brq.redhat.com [10.34.42.20])
	by int-mx14.intmail.prod.int.phx2.redhat.com (8.14.4/8.14.4) with ESMTP
	id s3EFBYAX032171; Mon, 14 Apr 2014 11:11:44 -0400
From: Igor Mammedov
To: linux-kernel@vger.kernel.org
Cc: tglx@linutronix.de, mingo@redhat.com, hpa@zytor.com, x86@kernel.org,
	imammedo@redhat.com, bp@suse.de, paul.gortmaker@windriver.com,
	JBeulich@suse.com, prarit@redhat.com, drjones@redhat.com,
	toshi.kani@hp.com, riel@redhat.com, gong.chen@linux.intel.com,
	andi@firstfloor.org,
	lenb@kernel.org, rjw@rjwysocki.net, linux-acpi@vger.kernel.org
Subject: [PATCH v4 1/5] x86: fix list corruption on CPU hotplug
Date: Mon, 14 Apr 2014 17:11:13 +0200
Message-Id: <1397488277-14865-2-git-send-email-imammedo@redhat.com>
In-Reply-To: <1397488277-14865-1-git-send-email-imammedo@redhat.com>
References: <1397488277-14865-1-git-send-email-imammedo@redhat.com>
X-Scanned-By: MIMEDefang 2.68 on 10.5.11.27
Sender: linux-acpi-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: linux-acpi@vger.kernel.org
X-Spam-Status: No, score=-7.9 required=5.0 tests=BAYES_00, RCVD_IN_DNSWL_HI,
	RP_MATCHES_RCVD, UNPARSEABLE_RELAY autolearn=unavailable version=3.3.1
X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on mail.kernel.org
X-Virus-Scanned: ClamAV using ClamSMTP

Currently, if AP wake-up fails, the master CPU marks the AP as not
present in do_boot_cpu() by calling set_cpu_present(cpu, false).
That leads to the following list corruption on the next physical
CPU hotplug:

[ 418.107336] WARNING: CPU: 1 PID: 45 at lib/list_debug.c:33 __list_add+0xbe/0xd0()
[ 418.115268] list_add corruption. prev->next should be next (ffff88003dc57600), but was ffff88003e20c3a0. (prev=ffff88003e20c3a0).
[ 418.123693] Modules linked in: nf_conntrack_netbios_ns nf_conntrack_broadcast ipt_MASQUERADE ip6t_REJECT ipt_REJECT cfg80211 xt_conntrack rfkill ee
[ 418.138979] CPU: 1 PID: 45 Comm: kworker/u10:1 Not tainted 3.14.0-rc6+ #387
[ 418.149989] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2007
[ 418.165750] Workqueue: kacpi_hotplug acpi_hotplug_work_fn
[ 418.166433]  0000000000000021 ffff880038ca7988 ffffffff8159b22d 0000000000000021
[ 418.176460]  ffff880038ca79d8 ffff880038ca79c8 ffffffff8106942c ffff880038ca79e8
[ 418.177453]  ffff88003e20c3a0 ffff88003dc57600 ffff88003e20c3a0 00000000ffffffea
[ 418.178445] Call Trace:
[ 418.185811]  [] dump_stack+0x49/0x5c
[ 418.186440]  [] warn_slowpath_common+0x8c/0xc0
[ 418.187192]  [] warn_slowpath_fmt+0x46/0x50
[ 418.191231]  [] ? acpi_ns_get_node+0xb7/0xc7
[ 418.193889]  [] __list_add+0xbe/0xd0
[ 418.196649]  [] kobject_add_internal+0x79/0x200
[ 418.208610]  [] kobject_add_varg+0x38/0x60
[ 418.213831]  [] kobject_add+0x44/0x70
[ 418.229961]  [] device_add+0xd0/0x550
[ 418.234991]  [] ? pm_runtime_init+0xe5/0xf0
[ 418.250226]  [] device_register+0x1e/0x30
[ 418.255296]  [] register_cpu+0xe3/0x130
[ 418.266539]  [] arch_register_cpu+0x65/0x150
[ 418.285845]  [] acpi_processor_hotadd_init+0x5a/0x9b
...

This happens because generic_processor_info() allocates the logical
CPU id by calling:

 cpu = cpumask_next_zero(-1, cpu_present_mask);

which returns the id of the CPU that previously failed to wake up,
since its bit was cleared by do_boot_cpu(). As a result,
register_cpu() tries to register another CPU with the same id as
the existing CPU that failed to come online.

Taking into account that the AP will not do anything if the master
CPU failed to wake it up, there is no reason to mark that AP as not
present and break subsequent CPU hotplug attempts.

As a side effect of not marking the AP as not present, the user is
allowed to online it again later.

Signed-off-by: Igor Mammedov
Acked-by: Toshi Kani
---
 arch/x86/kernel/smpboot.c |    1 -
 1 files changed, 0 insertions(+), 1 deletions(-)

diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index 3482693..6124f15 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -860,7 +860,6 @@ static int do_boot_cpu(int apicid, int cpu, struct task_struct *idle)
 
 		/* was set by cpu_init() */
 		cpumask_clear_cpu(cpu, cpu_initialized_mask);
 
-		set_cpu_present(cpu, false);
 		per_cpu(x86_cpu_to_apicid, cpu) = BAD_APICID;
 	}