From patchwork Tue Dec 8 17:23:32 2015
X-Patchwork-Submitter: James Morse
X-Patchwork-Id: 7800331
Message-ID: <56671214.30402@arm.com>
Date: Tue, 08 Dec 2015 17:23:32 +0000
From: James Morse
To: Jungseok Lee
Subject: Re: [PATCH v8 3/4] arm64: Add do_softirq_own_stack() and enable irq_stacks
References:
 <1449226948-14251-1-git-send-email-james.morse@arm.com>
 <1449226948-14251-4-git-send-email-james.morse@arm.com>
 <20151207224805.GA20777@MBP.local>
 <20151208114321.GD19612@arm.com>
 <4EBA6141-5CFB-4CAC-97D2-26346AAA91F0@gmail.com>
In-Reply-To: <4EBA6141-5CFB-4CAC-97D2-26346AAA91F0@gmail.com>
Cc: Catalin Marinas, Will Deacon, linux-arm-kernel@lists.infradead.org,
 AKASHI Takahiro

On 08/12/15 16:02, Jungseok Lee wrote:
> I've seen the following BUG log with CONFIG_DEBUG_SPINLOCK=y kernel.
>
> BUG: spinlock lockup suspected on CPU#1
>
> Under that option, I cannot even complete a single kernel build successfully.
> I hope I'm the only person to experience it. My Android machine is running
> well for over 12 hours now with the below change.

I can't reproduce this, can you send me your .config file (off-list)?
Do you have any other patches in your tree?
What hardware are you using?

> If I read the patches correctly, the dummy stack frame looks as follows.
>
> top ------------ <- irq_stack_ptr
>     | dummy_lr |
>     ------------
>     |   x29    |
>     ------------ <- new frame pointer (x29)
>     |   x19    |
>     ------------
>     |   xzr    |
>     ------------
>
> So, we should refer to x19 in order to retrieve frame->sp.
> But, frame->sp is xzr under the current implementation. I suspect this
> causes spinlock lockup.

This is the sort of place where it is too easy to make an off-by-one error;
I will go through it all with the debugger again tomorrow.

I'm not seeing this when testing on Juno. This would only affect the tracing
code. Are you running perf or ftrace at the same time?

I've just re-tested this with defconfig, and the following hack:
=======%<=======
While running:
> sudo ./perf record -e mem:
:x -ag -- sleep 180
and
> dd if=/dev/sda of=/dev/null bs=512 iflag=direct;

This should cause lots of interrupts from /dev/sda, and cause the tracing code
to step between irq_stack and the original task stack frequently.

The BUG_ON() doesn't fire, and the perf trace output looks correct.

My only theory is that there is an off-by-one, and it's reading what was x29
instead. This wouldn't show up in these tests, but might be a problem for
aarch32 user-space, as presumably x29==0 when it switches to aarch64 mode for
el0_irq(). I will try this tomorrow.


Thanks,

James

=======%<=======
diff --git a/arch/arm64/kernel/stacktrace.c b/arch/arm64/kernel/stacktrace.c
index b947eeffa5b2..686086e4d870 100644
--- a/arch/arm64/kernel/stacktrace.c
+++ b/arch/arm64/kernel/stacktrace.c
@@ -72,8 +72,10 @@ int notrace unwind_frame(struct stackframe *frame)
 	 * If we reach the end of the stack - and its an interrupt stack,
 	 * read the original task stack pointer from the dummy frame.
 	 */
-	if (frame->sp == irq_stack_ptr)
+	if (frame->sp == irq_stack_ptr) {
 		frame->sp = IRQ_STACK_TO_TASK_STACK(irq_stack_ptr);
+		BUG_ON(frame->sp == 0);
+	}
 
 	return 0;
 }