From patchwork Thu Apr 18 14:02:50 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Wei Li X-Patchwork-Id: 10907425 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 204E117E0 for ; Thu, 18 Apr 2019 13:57:23 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id EBC92271FD for ; Thu, 18 Apr 2019 13:57:22 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id E000C28A3B; Thu, 18 Apr 2019 13:57:22 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.2 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,MAILING_LIST_MULTI,RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id 45365288E0 for ; Thu, 18 Apr 2019 13:57:22 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20170209; h=Sender: Content-Transfer-Encoding:Content-Type:Cc:List-Subscribe:List-Help:List-Post: List-Archive:List-Unsubscribe:List-Id:MIME-Version:Message-ID:Date:Subject:To :From:Reply-To:Content-ID:Content-Description:Resent-Date:Resent-From: Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID:In-Reply-To:References: List-Owner; bh=c09nqGvudYePRgC9JJJOJUxFyx6QxR2UeN+ic03atGs=; b=S1a5B4YKreyZgB PWZX57DkD3n1qbgAwL+ZMRtfVumbFwy0b3Kr7fkBN3NJtWkyRl+3Tu9y2XYx8++hnXwDCEA77lgxH KVaUTKuMf/8rFS0KLei/USHdJT7G3vt7b3XT48KWCMQRY3z3sxQbcsX1ySPdbQ42l/Qll8vtfqHKw vXcBgVinzrqd97gYQDdioEYSKTB9svjJYSA8KhvHQini38z6tdHqeoaiX6GVhPk2i+N5wdDYIfNAa kDc8AZ/2jaoRzXu1p8CjJur9JnI3ZWF17egQZkbq1JsdFOajC8iwTHrbseTA6apZLN4jkJVEshXuV 1ynIjKY496K4Qt4RCuVQ==; Received: from localhost ([127.0.0.1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.90_1 #2 (Red Hat Linux)) id 1hH7XR-00006r-Sf; Thu, 18 Apr 2019 13:57:17 +0000 Received: from szxga07-in.huawei.com ([45.249.212.35] helo=huawei.com) by bombadil.infradead.org with esmtps (Exim 4.90_1 #2 (Red Hat Linux)) id 1hH7XN-00005n-Uj for linux-arm-kernel@lists.infradead.org; Thu, 18 Apr 2019 13:57:15 +0000 Received: from DGGEMS409-HUB.china.huawei.com (unknown [172.30.72.58]) by Forcepoint Email with ESMTP id F28B755DD13D2D4D9E9B; Thu, 18 Apr 2019 21:57:01 +0800 (CST) Received: from euler.huawei.com (10.175.104.193) by DGGEMS409-HUB.china.huawei.com (10.3.19.209) with Microsoft SMTP Server id 14.3.408.0; Thu, 18 Apr 2019 21:56:57 +0800 From: Wei Li To: , , Subject: [PATCH] arm64: fix kernel stack overflow in kdump capture kernel Date: Thu, 18 Apr 2019 22:02:50 +0800 Message-ID: <20190418140251.31218-1-liwei391@huawei.com> X-Mailer: git-send-email 2.17.1 MIME-Version: 1.0 X-Originating-IP: [10.175.104.193] X-CFilter-Loop: Reflected X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20190418_065714_174033_F3B9CBC7 X-CRM114-Status: GOOD ( 13.24 ) X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: mark.rutland@arm.com, daniel.thompson@linaro.org, marc.zyngier@arm.com, christoffer.dall@arm.com, james.morse@arm.com, joel@joelfernandes.org, linux-arm-kernel@lists.infradead.org Sender: "linux-arm-kernel" Errors-To: linux-arm-kernel-bounces+patchwork-linux-arm=patchwork.kernel.org@lists.infradead.org X-Virus-Scanned: ClamAV using ClamSMTP When enabling ARM64_PSEUDO_NMI feature in kdump capture kernel, it will report a kernel stack overflow exception: [ 0.000000] CPU features: detected: IRQ priority masking [ 0.000000] alternatives: patching kernel code [ 0.000000] Insufficient stack space to handle exception! [ 0.000000] ESR: 0x96000044 -- DABT (current EL) [ 0.000000] FAR: 0x0000000000000040 [ 0.000000] Task stack: [0xffff0000097f0000..0xffff0000097f4000] [ 0.000000] IRQ stack: [0x0000000000000000..0x0000000000004000] [ 0.000000] Overflow stack: [0xffff80002b7cf290..0xffff80002b7d0290] [ 0.000000] CPU: 0 PID: 0 Comm: swapper Not tainted 4.19.34-lw+ #3 [ 0.000000] pstate: 400003c5 (nZcv DAIF -PAN -UAO) [ 0.000000] pc : el1_sync+0x0/0xb8 [ 0.000000] lr : el1_irq+0xb8/0x140 [ 0.000000] sp : 0000000000000040 [ 0.000000] pmr_save: 00000070 [ 0.000000] x29: ffff0000097f3f60 x28: ffff000009806240 [ 0.000000] x27: 0000000080000000 x26: 0000000000004000 [ 0.000000] x25: 0000000000000000 x24: ffff000009329028 [ 0.000000] x23: 0000000040000005 x22: ffff000008095c6c [ 0.000000] x21: ffff0000097f3f70 x20: 0000000000000070 [ 0.000000] x19: ffff0000097f3e30 x18: ffffffffffffffff [ 0.000000] x17: 0000000000000000 x16: 0000000000000000 [ 0.000000] x15: ffff0000097f9708 x14: ffff000089a382ef [ 0.000000] x13: ffff000009a382fd x12: ffff000009824000 [ 0.000000] x11: ffff0000097fb7b0 x10: ffff000008730028 [ 0.000000] x9 : ffff000009440018 x8 : 000000000000000d [ 0.000000] x7 : 6b20676e69686374 x6 : 000000000000003b [ 0.000000] x5 : 0000000000000000 x4 : ffff000008093600 [ 0.000000] x3 : 0000000400000008 x2 : 7db2e689fc2b8e00 [ 0.000000] x1 : 0000000000000000 x0 : ffff0000097f3e30 [ 0.000000] Kernel panic - not syncing: kernel stack overflow [ 0.000000] CPU: 0 PID: 0 Comm: swapper Not tainted 4.19.34-lw+ #3 [ 0.000000] Call trace: [ 0.000000] dump_backtrace+0x0/0x1b8 [ 0.000000] show_stack+0x24/0x30 [ 0.000000] dump_stack+0xa8/0xcc [ 0.000000] panic+0x134/0x30c [ 0.000000] __stack_chk_fail+0x0/0x28 [ 0.000000] handle_bad_stack+0xfc/0x108 [ 0.000000] __bad_stack+0x90/0x94 [ 0.000000] el1_sync+0x0/0xb8 [ 0.000000] init_gic_priority_masking+0x4c/0x70 [ 0.000000] smp_prepare_boot_cpu+0x60/0x68 [ 0.000000] start_kernel+0x1e8/0x53c [ 0.000000] ---[ end Kernel panic - not syncing: kernel stack overflow ]--- The reason is init_gic_priority_masking() may unmask PSR.I while the irq stacks are not inited yet. Some "NMI" could be raised unfortunately and it will just go into this exception. In this patch, we just write the PMR in smp_prepare_boot_cpu(), and delay unmasking PSR.I after irq stacks inited in init_IRQ(). Fixes: e79321883842 ("arm64: Switch to PMR masking when starting CPUs") Signed-off-by: Wei Li --- arch/arm64/include/asm/smp.h | 2 ++ arch/arm64/kernel/irq.c | 2 ++ arch/arm64/kernel/smp.c | 8 ++++---- 3 files changed, 8 insertions(+), 4 deletions(-) diff --git a/arch/arm64/include/asm/smp.h b/arch/arm64/include/asm/smp.h index 18553f399e08..4d8eb35f4cdd 100644 --- a/arch/arm64/include/asm/smp.h +++ b/arch/arm64/include/asm/smp.h @@ -158,6 +158,8 @@ bool cpus_are_stuck_in_kernel(void); extern void crash_smp_send_stop(void); extern bool smp_crash_stop_failed(void); +extern void init_gic_priority_masking(bool enable); + #endif /* ifndef __ASSEMBLY__ */ #endif /* ifndef __ASM_SMP_H */ diff --git a/arch/arm64/kernel/irq.c b/arch/arm64/kernel/irq.c index 92fa81798fb9..7e65280f2ebd 100644 --- a/arch/arm64/kernel/irq.c +++ b/arch/arm64/kernel/irq.c @@ -75,4 +75,6 @@ void __init init_IRQ(void) irqchip_init(); if (!handle_arch_irq) panic("No interrupt controller found."); + if (system_uses_irq_prio_masking()) + init_gic_priority_masking(true); } diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c index 824de7038967..07b6e9df151a 100644 --- a/arch/arm64/kernel/smp.c +++ b/arch/arm64/kernel/smp.c @@ -181,7 +181,7 @@ int __cpu_up(unsigned int cpu, struct task_struct *idle) return ret; } -static void init_gic_priority_masking(void) +void init_gic_priority_masking(bool enable) { u32 cpuflags; @@ -195,7 +195,7 @@ static void init_gic_priority_masking(void) gic_write_pmr(GIC_PRIO_IRQOFF); /* We can only unmask PSR.I if we can take aborts */ - if (!(cpuflags & PSR_A_BIT)) + if (enable && !(cpuflags & PSR_A_BIT)) write_sysreg(cpuflags & ~PSR_I_BIT, daif); } @@ -226,7 +226,7 @@ asmlinkage notrace void secondary_start_kernel(void) cpu_uninstall_idmap(); if (system_uses_irq_prio_masking()) - init_gic_priority_masking(); + init_gic_priority_masking(true); preempt_disable(); trace_hardirqs_off(); @@ -451,7 +451,7 @@ void __init smp_prepare_boot_cpu(void) /* Conditionally switch to GIC PMR for interrupt masking */ if (system_uses_irq_prio_masking()) - init_gic_priority_masking(); + init_gic_priority_masking(false); } static u64 __init of_get_cpu_mpidr(struct device_node *dn)