From patchwork Tue Sep 20 18:46:35 2016
X-Patchwork-Submitter: David Daney
X-Patchwork-Id: 9342269
From: David Daney
To: linux-kernel@vger.kernel.org, Marc Zyngier, Hanjun Guo, Will Deacon, Ganapatrao Kulkarni, Catalin Marinas, Mark Rutland, Suzuki K Poulose
Cc: Robert Richter, linux-arm-kernel@lists.infradead.org, David Daney
Subject: [PATCH] arm64: Call numa_store_cpu_info() earlier.
Date: Tue, 20 Sep 2016 11:46:35 -0700
Message-Id: <1474397195-16520-1-git-send-email-ddaney.cavm@gmail.com>

From: David Daney

The wq_numa_init() function makes a private CPU to node map by calling
cpu_to_node() early in the boot process, before the non-boot CPUs are
brought online.  Since the default implementation of cpu_to_node()
returns zero for CPUs that have never been brought online, the
workqueue system's view is that *all* CPUs are on node zero.

When the unbound workqueue for a non-zero node is created, the
tsk_cpus_allowed() for the worker threads is the empty set because
there are, in the view of the workqueue system, no CPUs on non-zero
nodes.  The code in try_to_wake_up() using this empty cpumask ends up
using the cpumask empty set value of NR_CPUS as an index into the
per-CPU area pointer array, and gets garbage as it is one past the end
of the array.

This results in:

[ 0.881970] Unable to handle kernel paging request at virtual address fffffb1008b926a4
[ 1.970095] pgd = fffffc00094b0000
[ 1.973530] [fffffb1008b926a4] *pgd=0000000000000000, *pud=0000000000000000, *pmd=0000000000000000
[ 1.982610] Internal error: Oops: 96000004 [#1] SMP
[ 1.987541] Modules linked in:
[ 1.990631] CPU: 48 PID: 295 Comm: cpuhp/48 Tainted: G W 4.8.0-rc6-preempt-vol+ #9
[ 1.999435] Hardware name: Cavium ThunderX CN88XX board (DT)
[ 2.005159] task: fffffe0fe89cc300 task.stack: fffffe0fe8b8c000
[ 2.011158] PC is at try_to_wake_up+0x194/0x34c
[ 2.015737] LR is at try_to_wake_up+0x150/0x34c
[ 2.020318] pc : [] lr : [] pstate: 600000c5
[ 2.027803] sp : fffffe0fe8b8fb10
[ 2.031149] x29: fffffe0fe8b8fb10 x28: 0000000000000000
[ 2.036522] x27: fffffc0008c63bc8 x26: 0000000000001000
[ 2.041896] x25: fffffc0008c63c80 x24: fffffc0008bfb200
[ 2.047270] x23: 00000000000000c0 x22: 0000000000000004
[ 2.052642] x21: fffffe0fe89d25bc x20: 0000000000001000
[ 2.058014] x19: fffffe0fe89d1d00 x18: 0000000000000000
[ 2.063386] x17: 0000000000000000 x16: 0000000000000000
[ 2.068760] x15: 0000000000000018 x14: 0000000000000000
[ 2.074133] x13: 0000000000000000 x12: 0000000000000000
[ 2.079505] x11: 0000000000000000 x10: 0000000000000000
[ 2.084879] x9 : 0000000000000000 x8 : 0000000000000000
[ 2.090251] x7 : 0000000000000040 x6 : 0000000000000000
[ 2.095621] x5 : ffffffffffffffff x4 : 0000000000000000
[ 2.100991] x3 : 0000000000000000 x2 : 0000000000000000
[ 2.106364] x1 : fffffc0008be4c24 x0 : ffffff0ffffada80
[ 2.111737]
[ 2.113236] Process cpuhp/48 (pid: 295, stack limit = 0xfffffe0fe8b8c020)
[ 2.120102] Stack: (0xfffffe0fe8b8fb10 to 0xfffffe0fe8b90000)
[ 2.125914] fb00: fffffe0fe8b8fb80 fffffc00080e7648
.
.
.
[ 2.442859] Call trace:
[ 2.445327] Exception stack(0xfffffe0fe8b8f940 to 0xfffffe0fe8b8fa70)
[ 2.451843] f940: fffffe0fe89d1d00 0000040000000000 fffffe0fe8b8fb10 fffffc00080e7468
[ 2.459767] f960: fffffe0fe8b8f980 fffffc00080e4958 ffffff0ff91ab200 fffffc00080e4b64
[ 2.467690] f980: fffffe0fe8b8f9d0 fffffc00080e515c fffffe0fe8b8fa80 0000000000000000
[ 2.475614] f9a0: fffffe0fe8b8f9d0 fffffc00080e58e4 fffffe0fe8b8fa80 0000000000000000
[ 2.483540] f9c0: fffffe0fe8d10000 0000000000000040 fffffe0fe8b8fa50 fffffc00080e5ac4
[ 2.491465] f9e0: ffffff0ffffada80 fffffc0008be4c24 0000000000000000 0000000000000000
[ 2.499387] fa00: 0000000000000000 ffffffffffffffff 0000000000000000 0000000000000040
[ 2.507309] fa20: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
[ 2.515233] fa40: 0000000000000000 0000000000000000 0000000000000000 0000000000000018
[ 2.523156] fa60: 0000000000000000 0000000000000000
[ 2.528089] [] try_to_wake_up+0x194/0x34c
[ 2.533723] [] wake_up_process+0x28/0x34
[ 2.539275] [] create_worker+0x110/0x19c
[ 2.544824] [] alloc_unbound_pwq+0x3cc/0x4b0
[ 2.550724] [] wq_update_unbound_numa+0x10c/0x1e4
[ 2.557066] [] workqueue_online_cpu+0x220/0x28c
[ 2.563234] [] cpuhp_invoke_callback+0x6c/0x168
[ 2.569398] [] cpuhp_up_callbacks+0x44/0xe4
[ 2.575210] [] cpuhp_thread_fun+0x13c/0x148
[ 2.581027] [] smpboot_thread_fn+0x19c/0x1a8
[ 2.586929] [] kthread+0xdc/0xf0
[ 2.591776] [] ret_from_fork+0x10/0x50
[ 2.597147] Code: b00057e1 91304021 91005021 b8626822 (b8606821)
[ 2.603464] ---[ end trace 58c0cd36b88802bc ]---
[ 2.608138] Kernel panic - not syncing: Fatal exception

Fix by moving the call to numa_store_cpu_info() for all CPUs into
smp_prepare_cpus(), which happens before wq_numa_init().  Since
smp_store_cpu_info() now contains only a single function call,
simplify by removing the function and open-coding its contents at the
call sites.

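
The ordering problem described above can be sketched in user space. This is only a toy model under stated assumptions, not kernel code: the 4-CPU, 2-node topology and every name prefixed with model_ are hypothetical stand-ins for numa_store_cpu_info(), wq_numa_init() and a cpumask search that yields NR_CPUS for an empty mask, as the description above says.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical toy topology; not the kernel's real data structures. */
#define NR_CPUS  4
#define NR_NODES 2

/* Assumed firmware placement: CPUs 0-1 on node 0, CPUs 2-3 on node 1. */
static const int firmware_node[NR_CPUS] = { 0, 0, 1, 1 };

/* Per-CPU node value; stays zero until stored, like the default cpu_to_node(). */
static int cpu_node_map[NR_CPUS];

/* Stand-in for the workqueue code's private per-node CPU masks (one bit per CPU). */
static unsigned int wq_node_mask[NR_NODES];

/* Stand-in for numa_store_cpu_info(): record the node of one CPU. */
static void model_store_cpu_info(int cpu)
{
	cpu_node_map[cpu] = firmware_node[cpu];
}

/* Stand-in for wq_numa_init(): snapshot the CPU-to-node map as it is right now. */
static void model_wq_numa_init(void)
{
	memset(wq_node_mask, 0, sizeof(wq_node_mask));
	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		assert(cpu_node_map[cpu] >= 0 && cpu_node_map[cpu] < NR_NODES);
		wq_node_mask[cpu_node_map[cpu]] |= 1u << cpu;
	}
}

/* First CPU set in the mask, or NR_CPUS when the mask is empty. */
static int model_mask_first(unsigned int mask)
{
	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		if (mask & (1u << cpu))
			return cpu;
	return NR_CPUS;
}

/* Old ordering: only the boot CPU (CPU 0) is stored before the snapshot. */
static int first_cpu_on_node1_old_order(void)
{
	memset(cpu_node_map, 0, sizeof(cpu_node_map));
	model_store_cpu_info(0);
	model_wq_numa_init();
	return model_mask_first(wq_node_mask[1]);
}

/* New ordering: every CPU is stored before the snapshot is taken. */
static int first_cpu_on_node1_new_order(void)
{
	memset(cpu_node_map, 0, sizeof(cpu_node_map));
	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		model_store_cpu_info(cpu);
	model_wq_numa_init();
	return model_mask_first(wq_node_mask[1]);
}
```

In this model first_cpu_on_node1_old_order() returns NR_CPUS, which is one past the last valid index of any NR_CPUS-sized array, while first_cpu_on_node1_new_order() returns 2.  The real kernel paths involve per-CPU offsets and the scheduler, which the sketch deliberately leaves out.
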
Suggested-by: Robert Richter
Fixes: 1a2db300348b ("arm64, numa: Add NUMA support for arm64 platforms.")
Cc: # 4.7.x-
Signed-off-by: David Daney
Reviewed-by: Robert Richter
Tested-by: Yisheng Xie
---
 arch/arm64/kernel/smp.c | 14 ++++++--------
 1 file changed, 6 insertions(+), 8 deletions(-)

diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
index d93d433..3ff173e 100644
--- a/arch/arm64/kernel/smp.c
+++ b/arch/arm64/kernel/smp.c
@@ -201,12 +201,6 @@ int __cpu_up(unsigned int cpu, struct task_struct *idle)
 	return ret;
 }
 
-static void smp_store_cpu_info(unsigned int cpuid)
-{
-	store_cpu_topology(cpuid);
-	numa_store_cpu_info(cpuid);
-}
-
 /*
  * This is the secondary CPU boot entry.  We're using this CPUs
  * idle thread stack, but a set of temporary page tables.
@@ -254,7 +248,7 @@ asmlinkage void secondary_start_kernel(void)
 	 */
 	notify_cpu_starting(cpu);
 
-	smp_store_cpu_info(cpu);
+	store_cpu_topology(cpu);
 
 	/*
 	 * OK, now it's safe to let the boot CPU continue.  Wait for
@@ -689,10 +683,13 @@ void __init smp_prepare_cpus(unsigned int max_cpus)
 {
 	int err;
 	unsigned int cpu;
+	unsigned int this_cpu;
 
 	init_cpu_topology();
 
-	smp_store_cpu_info(smp_processor_id());
+	this_cpu = smp_processor_id();
+	store_cpu_topology(this_cpu);
+	numa_store_cpu_info(this_cpu);
 
 	/*
 	 * If UP is mandated by "nosmp" (which implies "maxcpus=0"), don't set
@@ -719,6 +716,7 @@ void __init smp_prepare_cpus(unsigned int max_cpus)
 			continue;
 
 		set_cpu_present(cpu, true);
+		numa_store_cpu_info(cpu);
 	}
 }