From patchwork Sat Jan 2 07:02:31 2016
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Dave Young
X-Patchwork-Id: 7940471
X-Original-To: patchwork-linux-arm@patchwork.kernel.org
Delivered-To: patchwork-parsemail@patchwork1.web.kernel.org
Received: from mail.kernel.org (mail.kernel.org [198.145.29.136])
	by patchwork1.web.kernel.org (Postfix) with ESMTP id D13729F387
	for ; Sat, 2 Jan 2016 07:05:28 +0000 (UTC)
Received: from mail.kernel.org (localhost [127.0.0.1])
	by mail.kernel.org (Postfix) with ESMTP id C187B20490
	for ; Sat, 2 Jan 2016 07:05:27 +0000 (UTC)
Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.9])
	(using TLSv1.2 with cipher AES128-GCM-SHA256 (128/128 bits))
	(No client certificate requested)
	by mail.kernel.org (Postfix) with ESMTPS id A7CC620481
	for ; Sat, 2 Jan 2016 07:05:26 +0000 (UTC)
Received: from localhost ([127.0.0.1] helo=bombadil.infradead.org)
	by bombadil.infradead.org with esmtp (Exim 4.80.1 #2 (Red Hat Linux))
	id 1aFGDY-0003fs-8D; Sat, 02 Jan 2016 07:03:12 +0000
Received: from mx1.redhat.com ([209.132.183.28])
	by bombadil.infradead.org with esmtps (Exim 4.80.1 #2 (Red Hat Linux))
	id 1aFGDV-0003ef-IT
	for linux-arm-kernel@lists.infradead.org; Sat, 02 Jan 2016 07:03:10 +0000
Received: from int-mx13.intmail.prod.int.phx2.redhat.com
	(int-mx13.intmail.prod.int.phx2.redhat.com [10.5.11.26])
	by mx1.redhat.com (Postfix) with ESMTPS id 01856537D;
	Sat, 2 Jan 2016 07:02:46 +0000 (UTC)
Received: from dhcp-128-65.nay.redhat.com (vpn1-7-167.pek2.redhat.com [10.72.7.167])
	by int-mx13.intmail.prod.int.phx2.redhat.com (8.14.4/8.14.4) with ESMTP
	id u0272Yjj009943
	(version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=NO);
	Sat, 2 Jan 2016 02:02:40 -0500
Date: Sat, 2 Jan 2016 15:02:31 +0800
From: Dave Young
To: Grygorii Strashko
Subject: Re: bbb kexec bug: Unhandled fault external abort on non-linefetch
	(0x1028) at 0xfa1ac140
Message-ID: <20160102070231.GA23855@dhcp-128-65.nay.redhat.com>
References: <20151227073840.GC23608@dhcp-128-65.nay.redhat.com>
	<20151228071809.GA20621@dhcp-128-65.nay.redhat.com>
	<568127FE.3010001@ti.com>
Content-Disposition: inline
In-Reply-To: <568127FE.3010001@ti.com>
User-Agent: Mutt/1.5.23 (2015-06-09)
X-Scanned-By: MIMEDefang 2.68 on 10.5.11.26
X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3
X-CRM114-CacheID: sfid-20160101_230309_683817_BB7C406A
X-CRM114-Status: GOOD ( 23.77 )
X-Spam-Score: -6.9 (------)
X-BeenThere: linux-arm-kernel@lists.infradead.org
X-Mailman-Version: 2.1.20
Precedence: list
Cc: linux-omap@vger.kernel.org, will.deacon@arm.com,
	linux-arm-kernel@lists.infradead.org
Sender: "linux-arm-kernel"
Errors-To: linux-arm-kernel-bounces+patchwork-linux-arm=patchwork.kernel.org@lists.infradead.org
X-Spam-Status: No, score=-4.2 required=5.0 tests=BAYES_00, RCVD_IN_DNSWL_MED,
	RP_MATCHES_RCVD, UNPARSEABLE_RELAY autolearn=unavailable version=3.3.1
X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on mail.kernel.org
X-Virus-Scanned: ClamAV using ClamSMTP

Hi, Grygorii

Thanks for your reply.

On 12/28/15 at 02:15pm, Grygorii Strashko wrote:
> On 12/28/2015 09:18 AM, Dave Young wrote:
> > On 12/27/15 at 03:38pm, Dave Young wrote:
> >> Here is what I get when I test kdump on Beagle bone black:
> >>
> >> Added a printk line at the begin of function omap_gpio_rmw:
> >> printk("########## %lx, %x, %x\n", base, reg, mask);
> >>
> >> Any hints how to fix it? I tried call the machine_kexec_mask_interrupts
> >> at runtime kernel also panics so it may not limit to kdump case.
> >>
> >> [ 66.340168] ########## fa1ac000, 140, 1
> >> [ 66.344456] Unhandled fault: external abort on non-linefetch (0x1028) at 0xfa1ac140
> >> [ 66.352142] pgd = dd9f0000
> > [...]
>
> >> [ 66.727278] [] (omap_set_gpio_triggering) from [] (omap_gpio_mask_irq+0x29/0x34)
>
> Usually such back-trace means that you are trying to access HW
> which is disabled (powered off) already. Or this HW IP has never been enabled.

That is possible, but how can such a disabled GPIO be detected in this
for_each_irq_desc loop?

I tried the change below. It works for me, but I'm not sure it is the right fix.

---
 arch/arm/kernel/machine_kexec.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

>
> >> [ 66.736457] [] (omap_gpio_mask_irq) from [] (machine_crash_shutdown+0xb9/0x104)
> >> [ 66.745551] [] (machine_crash_shutdown) from [] (crash_kexec+0x35/0x68)
> >> [ 66.753942] [] (crash_kexec) from [] (die+0x1b9/0x390)
> >> [ 66.760859] [] (die) from [] (__do_kernel_fault.part.0+0x4f/0x1cc)
> >> [ 66.768824] [] (__do_kernel_fault.part.0) from [] (do_page_fault+0x155/0x29c)
> >> [ 66.777740] [] (do_page_fault) from [] (do_DataAbort+0x2f/0x88)
> >> [ 66.785432] [] (do_DataAbort) from [] (__dabt_svc+0x3b/0x80)
> >> [ 66.792858] Exception stack(0xddc39e58 to 0xddc39ea0)
> >> [ 66.797929] 9e40: 00000063 df93647c
> >> [ 66.806144] 9e60: 1f26a000 00000000 00000001 00000063 00000007 c0702e3c 00000000 ddc38000
> >> [ 66.814359] 9e80: 00000000 7f70d614 00000030 ddc39ea8 c021e54b c021e54c 600e0033 ffffffff
> >> [ 66.822575] [] (__dabt_svc) from [] (sysrq_handle_crash+0x18/0x1c)
> >> [ 66.830530] [] (sysrq_handle_crash) from [] (__handle_sysrq+0x79/0x10c)
> >> [ 66.838919] [] (__handle_sysrq) from [] (write_sysrq_trigger+0x45/0x50)
> >> [ 66.847310] [] (write_sysrq_trigger) from [] (proc_reg_write+0x43/0x68)
> >> [ 66.855700] [] (proc_reg_write) from [] (__vfs_write+0xf/0x8c)
> >> [ 66.863304] [] (__vfs_write) from [] (vfs_write+0x5f/0x128)
> >> [ 66.870646] [] (vfs_write) from [] (SyS_write+0x2b/0x68)
> >> [ 66.877729] [] (SyS_write) from [] (ret_fast_syscall+0x1/0x4c)
> >> [ 66.885332] Code: 443c 4643 f6a9 f9a1 (6823) 0732
> >> [ 66.890145] ---[ end trace 5a39094ece4dc200 ]---
> >> [ 66.894782] Kernel panic - not syncing: Fatal exception
> >> [ 66.900033] ---[ end Kernel panic - not syncing: Fatal exception
> >>
> >
> --
> regards,
> -grygorii

Thanks
Dave

--- linux.orig/arch/arm/kernel/machine_kexec.c
+++ linux/arch/arm/kernel/machine_kexec.c
@@ -106,7 +106,7 @@ static void machine_kexec_mask_interrupt
 		if (chip->irq_eoi && irqd_irq_inprogress(&desc->irq_data))
 			chip->irq_eoi(&desc->irq_data);
 
-		if (chip->irq_mask)
+		if ((chip->irq_mask) && !irqd_irq_masked(&desc->irq_data))
 			chip->irq_mask(&desc->irq_data);
 
 		if (chip->irq_disable && !irqd_irq_disabled(&desc->irq_data))