From patchwork Tue Jan 26 11:33:17 2016
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Ding Tianhong
X-Patchwork-Id: 8120651
Subject: Re: Unhandled level 2 translation fault on A72 board.
To: Catalin Marinas
References: <56A72246.4050105@huawei.com> <20160126110358.GA23579@localhost.localdomain>
From: Ding Tianhong
Message-ID: <56A7597D.6020609@huawei.com>
Date: Tue, 26 Jan 2016 19:33:17 +0800
User-Agent: Mozilla/5.0 (Windows NT 6.1; rv:38.0) Gecko/20100101 Thunderbird/38.5.1
In-Reply-To: <20160126110358.GA23579@localhost.localdomain>
Cc: Arnd Bergmann, Will Deacon, Linuxarm, "linux-arm-kernel@lists.infradead.org", "Guohanjun \(Hanjun Guo\)"

On 2016/1/26 19:03, Catalin Marinas wrote:
> On Tue, Jan 26, 2016 at 03:37:42PM +0800, Ding Tianhong wrote:
>> I met this problem when running the hackbench test on A72 chip board:
>>
>> sh[4779]: unhandled level 2 translation fault (11) at 0x7f96be0c80, esr 0x83000006
>> pgd = ffffffc01a1f0000
>> [7f96be0c80] *pgd=0000000084a20003, *pud=0000000084a20003, *pmd=0000000000000000
>>
>> CPU: 1 PID: 4779 Comm: sh Tainted: G O 4.1.15+ #21
>> Hardware name: Hisilicon PhosphorHi1382 EVB (DT)
>> task: ffffffc0163cc500 ti: ffffffc083abc000 task.ti: ffffffc083abc000
>> PC is at 0x7f96be0c80
>> LR is at 0x7fb2684eb4
>> pc : [<0000007f96be0c80>] lr : [<0000007fb2684eb4>] pstate: 60000000
>
> So here it's user space trying to execute from 0x7f96be0c80 (instruction
> abort).
>
>> sh[4963]: unhandled level 2 translation fault (11) at 0x00000000, esr 0x92000006
>> pgd = ffffffc0180c6000
>> [00000000] *pgd=0000000015157003, *pud=0000000015157003, *pmd=0000000000000000
>>
>> CPU: 0 PID: 4963 Comm: sh Tainted: G O 4.1.15+ #21
>> Hardware name: Hisilicon PhosphorHi1382 EVB (DT)
>> task: ffffffc0163cb980 ti: ffffffc0840c8000 task.ti: ffffffc0840c8000
>> PC is at 0x42c0c8
>> LR is at 0x42c03c
>> pc : [<000000000042c0c8>] lr : [<000000000042c03c>] pstate: 80000000
>
> And here you have a null pointer dereference.
>
>> if I run the benchmark only on the core which is in the same cluster,
>> it looks fine and no error happened, but if I enable the core which in
>> the different cluster, it will happened.
>>
>> I remember that I met the same problem on the A57 and fix it by enable
>> the [bit6] of the CPUECTLR_EL1 and enable MN, But this time, I enable
>> the same setting and looks no effort, I have no idea about this
>> problem, does A57 and A72 has so big difference on TLB?
>
> I can't tell for sure it's a TLB issue. The kernel page table dump shows
> *pmd being 0, so the fault is correctly called "level 2 translation
> fault". It also seems that there is no vma at this address, hence the
> kernel reports it as unhandled. It looks like data corruption which
> could be caused by cache or TLB incoherence. Just make sure the
> interconnect linking the two clusters is configured correctly by
> _firmware_ before Linux starts.
>

Hi Catalin:

Thanks for the reply. I tried applying the patch below as a test.

hackbench runs fine with this patch applied, so I guess the previous
thread's TLB entries may not be invalidated soon enough, but I don't
know why. Everything is fine on A57. Am I missing something?

Ding

---
 arch/arm64/kernel/process.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/arch/arm64/kernel/process.c b/arch/arm64/kernel/process.c
index 6391485..d7d8439 100644
--- a/arch/arm64/kernel/process.c
+++ b/arch/arm64/kernel/process.c
@@ -283,6 +283,13 @@ static void tls_thread_switch(struct task_struct *next)
 	: : "r" (tpidr), "r" (tpidrro));
 }
 
+static void tlb_flush_thread(struct task_struct *prev)
+{
+	/* Flush the prev task's TLB entries */
+	if (prev->mm)
+		flush_tlb_mm(prev->mm);
+}
+
 /*
  * Thread switching.
  */
@@ -296,6 +303,8 @@ struct task_struct *__switch_to(struct task_struct *prev,
 	hw_breakpoint_thread_switch(next);
 	contextidr_thread_switch(next);
 
+	tlb_flush_thread(prev);
+
 	/*
 	 * Complete any pending TLB or cache maintenance on this CPU in case
 	 * the thread migrates to a different CPU.
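
A side note on the two fault reports quoted earlier in this mail: the esr values can be decoded by hand to confirm the reading "instruction abort" versus "data abort". The following is only a minimal sketch, assuming the ARMv8-A ESR_ELx layout (exception class in bits [31:26], fault status code in bits [5:0] for aborts). The macro and helper names are hypothetical and written for illustration; they are not the kernel's own definitions.

```c
#include <stdint.h>

/* Assumed ESR_ELx field layout (ARMv8-A); names here are illustrative only. */
#define ESR_EC_SHIFT	26
#define ESR_EC_MASK	0x3fu
#define ESR_FSC_MASK	0x3fu	/* ISS[5:0], valid for instruction/data aborts */

#define ESR_EC_IABT_LOW	0x20u	/* instruction abort taken from a lower EL */
#define ESR_EC_DABT_LOW	0x24u	/* data abort taken from a lower EL */

/* Extract the exception class, bits [31:26]. */
static inline uint32_t esr_ec(uint32_t esr)
{
	return (esr >> ESR_EC_SHIFT) & ESR_EC_MASK;
}

/* Extract the fault status code, bits [5:0]. */
static inline uint32_t esr_fsc(uint32_t esr)
{
	return esr & ESR_FSC_MASK;
}

/*
 * A fault status code of the form 0b0001LL is a translation fault at
 * level LL. Returns the level (0..3), or -1 for any other code.
 */
static inline int esr_translation_fault_level(uint32_t esr)
{
	uint32_t fsc = esr_fsc(esr);

	if ((fsc & 0x3cu) != 0x04u)
		return -1;
	return (int)(fsc & 0x3u);
}

/* Human-readable name for the two abort classes seen in the reports. */
static inline const char *esr_abort_kind(uint32_t esr)
{
	switch (esr_ec(esr)) {
	case ESR_EC_IABT_LOW:
		return "instruction abort (lower EL)";
	case ESR_EC_DABT_LOW:
		return "data abort (lower EL)";
	default:
		return "other";
	}
}
```

Under these assumptions, 0x83000006 has exception class 0x20 and 0x92000006 has exception class 0x24, and both carry fault status code 0x06, a level 2 translation fault. That matches the *pmd=0 entries in both page table dumps.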