From patchwork Thu Jul 24 14:56:15 2014 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mark Salter X-Patchwork-Id: 4618441 Return-Path: X-Original-To: patchwork-linux-arm@patchwork.kernel.org Delivered-To: patchwork-parsemail@patchwork1.web.kernel.org Received: from mail.kernel.org (mail.kernel.org [198.145.19.201]) by patchwork1.web.kernel.org (Postfix) with ESMTP id DF0FA9F295 for ; Thu, 24 Jul 2014 14:59:24 +0000 (UTC) Received: from mail.kernel.org (localhost [127.0.0.1]) by mail.kernel.org (Postfix) with ESMTP id 2D3CE201D3 for ; Thu, 24 Jul 2014 14:59:20 +0000 (UTC) Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.9]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 7B1B6201BC for ; Thu, 24 Jul 2014 14:59:15 +0000 (UTC) Received: from localhost ([127.0.0.1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.80.1 #2 (Red Hat Linux)) id 1XAKST-0006Yk-3w; Thu, 24 Jul 2014 14:57:25 +0000 Received: from mx1.redhat.com ([209.132.183.28]) by bombadil.infradead.org with esmtps (Exim 4.80.1 #2 (Red Hat Linux)) id 1XAKSP-0006St-TR for linux-arm-kernel@lists.infradead.org; Thu, 24 Jul 2014 14:57:23 +0000 Received: from int-mx10.intmail.prod.int.phx2.redhat.com (int-mx10.intmail.prod.int.phx2.redhat.com [10.5.11.23]) by mx1.redhat.com (8.14.4/8.14.4) with ESMTP id s6OEulBp012486 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Thu, 24 Jul 2014 10:56:47 -0400 Received: from deneb.redhat.com (ovpn-113-127.phx2.redhat.com [10.3.113.127]) by int-mx10.intmail.prod.int.phx2.redhat.com (8.14.4/8.14.4) with ESMTP id s6OEujKS022048; Thu, 24 Jul 2014 10:56:45 -0400 From: Mark Salter To: Catalin Marinas Subject: [PATCH] arm64: fix soft lockup due to large tlb flush range Date: Thu, 24 Jul 2014 10:56:15 -0400 Message-Id: <1406213775-28617-1-git-send-email-msalter@redhat.com> In-Reply-To: <20140724142417.GE13371@arm.com> References: <20140724142417.GE13371@arm.com> X-Scanned-By: MIMEDefang 2.68 on 10.5.11.23 X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20140724_075721_995897_1CA73E16 X-CRM114-Status: GOOD ( 15.59 ) X-Spam-Score: -5.0 (-----) Cc: Laura Abbott , Will Deacon , Russell King , linux-mm@kvack.org, Mark Salter , Eric Miao , linux-arm-kernel@lists.infradead.org X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , MIME-Version: 1.0 Sender: "linux-arm-kernel" Errors-To: linux-arm-kernel-bounces+patchwork-linux-arm=patchwork.kernel.org@lists.infradead.org X-Spam-Status: No, score=-1.9 required=5.0 tests=BAYES_00,RP_MATCHES_RCVD, UNPARSEABLE_RELAY autolearn=unavailable version=3.3.1 X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on mail.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP Under certain loads, this soft lockup has been observed: BUG: soft lockup - CPU#2 stuck for 22s! [ip6tables:1016] Modules linked in: ip6t_rpfilter ip6t_REJECT cfg80211 rfkill xt_conntrack ebtable_nat ebtable_broute bridge stp llc ebtable_filter ebtables ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_security ip6table_raw ip6table_filter ip6_tables iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle iptable_security iptable_raw vfat fat efivarfs xfs libcrc32c CPU: 2 PID: 1016 Comm: ip6tables Not tainted 3.13.0-0.rc7.30.sa2.aarch64 #1 task: fffffe03e81d1400 ti: fffffe03f01f8000 task.ti: fffffe03f01f8000 PC is at __cpu_flush_kern_tlb_range+0xc/0x40 LR is at __purge_vmap_area_lazy+0x28c/0x3ac pc : [] lr : [] pstate: 80000145 sp : fffffe03f01fbb70 x29: fffffe03f01fbb70 x28: fffffe03f01f8000 x27: fffffe0000b19000 x26: 00000000000000d0 x25: 000000000000001c x24: fffffe03f01fbc50 x23: fffffe03f01fbc58 x22: fffffe03f01fbc10 x21: fffffe0000b2a3f8 x20: 0000000000000802 x19: fffffe0000b2a3c8 x18: 000003fffdf52710 x17: 000003ff9d8bb910 x16: fffffe000050fbfc x15: 0000000000005735 x14: 000003ff9d7e1a5c x13: 0000000000000000 x12: 000003ff9d7e1a5c x11: 0000000000000007 x10: fffffe0000c09af0 x9 : fffffe0000ad1000 x8 : 000000000000005c x7 : fffffe03e8624000 x6 : 0000000000000000 x5 : 0000000000000000 x4 : 0000000000000000 x3 : fffffe0000c09cc8 x2 : 0000000000000000 x1 : 000fffffdfffca80 x0 : 000fffffcd742150 The __cpu_flush_kern_tlb_range() function looks like: ENTRY(__cpu_flush_kern_tlb_range) dsb sy lsr x0, x0, #12 lsr x1, x1, #12 1: tlbi vaae1is, x0 add x0, x0, #1 cmp x0, x1 b.lo 1b dsb sy isb ret ENDPROC(__cpu_flush_kern_tlb_range) The above soft lockup shows the PC at tlbi insn with: x0 = 0x000fffffcd742150 x1 = 0x000fffffdfffca80 So __cpu_flush_kern_tlb_range has 0x128ba930 tlbi flushes left after it has already been looping for 23 seconds!. Looking up one frame at __purge_vmap_area_lazy(), there is: ... list_for_each_entry_rcu(va, &vmap_area_list, list) { if (va->flags & VM_LAZY_FREE) { if (va->va_start < *start) *start = va->va_start; if (va->va_end > *end) *end = va->va_end; nr += (va->va_end - va->va_start) >> PAGE_SHIFT; list_add_tail(&va->purge_list, &valist); va->flags |= VM_LAZY_FREEING; va->flags &= ~VM_LAZY_FREE; } } ... if (nr || force_flush) flush_tlb_kernel_range(*start, *end); So if two areas are being freed, the range passed to flush_tlb_kernel_range() may be as large as the vmalloc space. For arm64, this is ~240GB for 4k pagesize and ~2TB for 64kpage size. This patch works around this problem by adding a loop limit. If the range is larger than the limit, use flush_tlb_all() rather than flushing based on individual pages. The limit chosen is arbitrary and would be better if based on the actual size of the tlb. I looked through the ARM ARM but didn't see any easy way to get the actual tlb size, so for now the arbitrary limit is better than the soft lockup. Signed-off-by: Mark Salter --- arch/arm64/include/asm/tlbflush.h | 25 ++++++++++++++++++++++--- 1 file changed, 22 insertions(+), 3 deletions(-) diff --git a/arch/arm64/include/asm/tlbflush.h b/arch/arm64/include/asm/tlbflush.h index b9349c4..af3e572 100644 --- a/arch/arm64/include/asm/tlbflush.h +++ b/arch/arm64/include/asm/tlbflush.h @@ -98,8 +98,8 @@ static inline void flush_tlb_page(struct vm_area_struct *vma, dsb(ish); } -static inline void flush_tlb_range(struct vm_area_struct *vma, - unsigned long start, unsigned long end) +static inline void __flush_tlb_range(struct vm_area_struct *vma, + unsigned long start, unsigned long end) { unsigned long asid = (unsigned long)ASID(vma->vm_mm) << 48; unsigned long addr; @@ -112,7 +112,9 @@ static inline void flush_tlb_range(struct vm_area_struct *vma, dsb(ish); } -static inline void flush_tlb_kernel_range(unsigned long start, unsigned long end) +#define MAX_TLB_LOOP 1024 + +static inline void __flush_tlb_kernel_range(unsigned long start, unsigned long end) { unsigned long addr; start >>= 12; @@ -124,6 +126,23 @@ static inline void flush_tlb_kernel_range(unsigned long start, unsigned long end dsb(ish); } +static inline void flush_tlb_range(struct vm_area_struct *vma, + unsigned long start, unsigned long end) +{ + if (((end - start) >> PAGE_SHIFT) < MAX_TLB_LOOP) + __flush_tlb_range(vma, start, end); + else + flush_tlb_mm(vma->vm_mm); +} + +static inline void flush_tlb_kernel_range(unsigned long start, unsigned long end) +{ + if (((end - start) >> PAGE_SHIFT) < MAX_TLB_LOOP) + __flush_tlb_kernel_range(start, end); + else + flush_tlb_all(); +} + /* * On AArch64, the cache coherency is handled via the set_pte_at() function. */