From patchwork Sun Oct 14 22:24:49 2012
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Russell King - ARM Linux
X-Patchwork-Id: 1591801
Date: Sun, 14 Oct 2012 23:24:49 +0100
From: Russell King - ARM Linux
To: Daniel Mack
Subject: Re: [git pull] signals pile 3
Message-ID: <20121014222449.GN21164@n2100.arm.linux.org.uk>
References: <20121013005334.GM2616@ZenIV.linux.org.uk>
 <507ADBBB.9090209@gmail.com>
 <20121014202431.GL21164@n2100.arm.linux.org.uk>
In-Reply-To: <20121014202431.GL21164@n2100.arm.linux.org.uk>
Cc: linux-arch@vger.kernel.org, Linus Torvalds, Al Viro,
 linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org

Okay, here's the post-mortem diagnosis.  What's happening is as follows
(I'm very certain of this).

We come through the usual init, and issue (see init/main.c):

	kernel_thread(kernel_init, NULL, CLONE_FS | CLONE_SIGHAND);

This creates a new thread, which falls through to the ret_from_fork
assembly, with r4 set to NULL and r5 set to kernel_init.  You can see
this in your oops dump register set - r5 is 0xc0344555, which is the
address of kernel_init plus 1, which marks the function as Thumb code.

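
As an illustration of the bit 0 convention mentioned above, the following
C sketch splits an interworking function pointer into its Thumb flag and
its instruction address.  The helper names are hypothetical and this is
not kernel code; it only reproduces the arithmetic on the r5 value from
the oops dump.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helpers for illustration only; not part of the kernel. */

/* Bit 0 of an interworking branch target selects the instruction set:
 * 1 means Thumb, 0 means ARM. */
bool target_is_thumb(uint32_t target)
{
	return (target & 1u) != 0;
}

/* The address instructions are actually fetched from has bit 0 cleared. */
uint32_t target_address(uint32_t target)
{
	return target & ~UINT32_C(1);
}
```

With r5 = 0xc0344555, target_is_thumb() reports Thumb and
target_address() gives 0xc0344554, the address of kernel_init.
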
Now, let's look at this code a little closer - this is what the
disassembly looks like:

c000d180 <ret_from_fork>:
c000d180:       f03a fe08       bl      c0047d94 <schedule_tail>
c000d184:       2d00            cmp     r5, #0
c000d186:       bf1e            ittt    ne
c000d188:       4620            movne   r0, r4
c000d18a:       46fe            movne   lr, pc          <-- XXXXXXX
c000d18c:       46af            movne   pc, r5
c000d18e:       46e9            mov     r9, sp
c000d190:       ea4f 3959       mov.w   r9, r9, lsr #13
c000d194:       ea4f 3949       mov.w   r9, r9, lsl #13
c000d198:       e7c8            b.n     c000d12c <ret_slow_syscall>
c000d19a:       bf00            nop
c000d19c:       f3af 8000       nop.w

I have marked one instruction, and it's the significant one.

Eventually, having had a successful call to kernel_execve(), kernel_init()
returns zero.

In returning, it uses the value in 'lr' which was set by the instruction
I marked above.  Unfortunately, this causes lr to contain 0xc000d18e -
an even address.  This switches the ISA to ARM on return but with a
non word aligned PC value.

So, what do we end up executing?  Well, not the instructions above - yes
the opcodes, but they don't mean the same thing in ARM mode.  In ARM mode,
it looks like this instead:

c000d18c:       46e946af        strbtmi r4, [r9], pc, lsr #13
c000d190:       3959ea4f        ldmdbcc r9, {r0, r1, r2, r3, r6, r9, fp, sp, lr, pc}^
c000d194:       3949ea4f        stmdbcc r9, {r0, r1, r2, r3, r6, r9, fp, sp, lr, pc}^
c000d198:       bf00e7c8        svclt   0x0000e7c8
c000d19c:       8000f3af        andhi   pc, r0, pc, lsr #7
c000d1a0:       e88db092        stm     sp, {r1, r4, r7, ip, sp, pc}
c000d1a4:       46e81fff                        ; <UNDEFINED> instruction: 0x46e81fff
c000d1a8:       8a00f3ef        bhi     0xc004a16c
c000d1ac:       0a0cf08a        beq     0xc03493dc

I have included more above, because it's relevant.  The PSR flags which
we can see in the oops dump are nZCv, so Z and C are set.

None of the above ARM instructions are executed, except for two.  The
first is c000d1a0, which has no writeback, and writes below the current
stack pointer (and that data is lost when we take the next exception.)
The other instruction which is executed is c000d1ac, which takes us to...
0xc03493dc.  However, remember that bit 1 of the PC got set.  So that
makes it 0xc03493de.
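
The two address calculations in this walkthrough can be cross-checked
with a small C sketch.  It assumes the architectural rules that a
Thumb-state instruction reading PC sees its own address plus 4, and that
an ARM-state conditional branch targets its own address plus 8 plus the
sign-extended 24-bit immediate shifted left by two.  The function names
are hypothetical and this is not kernel code.

```c
#include <stdint.h>

/* Hypothetical helpers for illustration only; not part of the kernel. */

/* Value a Thumb-state instruction reads from PC: its own address + 4. */
uint32_t thumb_pc_read(uint32_t insn_addr)
{
	return insn_addr + 4;
}

/* Target of an ARM-state B<cond>: address + 8 + (signext(imm24) << 2). */
uint32_t arm_branch_target(uint32_t insn_addr, uint32_t opcode)
{
	uint32_t imm24 = opcode & 0x00ffffffu;

	/* Sign-extend the 24-bit offset field to 32 bits. */
	if (imm24 & 0x00800000u)
		imm24 |= 0xff000000u;

	return insn_addr + 8 + (imm24 << 2);
}
```

thumb_pc_read(0xc000d18a) gives 0xc000d18e, the even value that
"movne lr, pc" leaves in lr.  arm_branch_target(0xc000d1ac, 0x0a0cf08a)
gives 0xc03493dc, and with bit 1 still set in the PC that becomes
0xc03493de.
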
And that value is the value we find in the oops dump for PC.

What is the instruction here when interpreted in ARM mode?

   0:   f71e150c        ; <UNDEFINED> instruction: 0xf71e150c

and there we have our undefined instruction (remember that the 'never'
condition code, 0xf, has been deprecated and is now always executed.)

So, what we have above is a consistent and sane story for how we ended
up at such a strange place in the kernel with such an odd register dump -
with no unanswered questions about what happened to get us there.

In light of this, I'm 100% certain that the patch below will fix the
issue you're seeing - please test this and get back to me ASAP, thanks.

 arch/arm/kernel/entry-common.S |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

Tested-by: Daniel Mack

diff --git a/arch/arm/kernel/entry-common.S b/arch/arm/kernel/entry-common.S
index 417bac1..3471175 100644
--- a/arch/arm/kernel/entry-common.S
+++ b/arch/arm/kernel/entry-common.S
@@ -88,9 +88,9 @@ ENTRY(ret_from_fork)
 	bl	schedule_tail
 	cmp	r5, #0
 	movne	r0, r4
-	movne	lr, pc
+	adrne	lr, BSYM(1f)
 	movne	pc, r5
-	get_thread_info tsk
+1:	get_thread_info tsk
 	b	ret_slow_syscall
 ENDPROC(ret_from_fork)