From patchwork Fri May 2 15:20:35 2014
X-Patchwork-Submitter: Mark Salter
X-Patchwork-Id: 4102401
From: Mark Salter
To: Catalin Marinas, Will Deacon
Cc: linux-kernel@vger.kernel.org, linux-arm-kernel@lists.infradead.org, Mark Salter
Subject: [PATCH 2/2] arm64: fix soft lockup due to large tlb flush range
Date: Fri, 2 May 2014 11:20:35 -0400
Message-Id: <1399044035-11274-3-git-send-email-msalter@redhat.com>
In-Reply-To: <1399044035-11274-1-git-send-email-msalter@redhat.com>
References: <1399044035-11274-1-git-send-email-msalter@redhat.com>

Under certain loads, this soft lockup has been observed:

BUG: soft lockup - CPU#2 stuck for 22s! [ip6tables:1016]
Modules linked in: ip6t_rpfilter ip6t_REJECT cfg80211 rfkill xt_conntrack
 ebtable_nat ebtable_broute bridge stp llc ebtable_filter ebtables
 ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle
 ip6table_security ip6table_raw ip6table_filter ip6_tables iptable_nat
 nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack
 iptable_mangle iptable_security iptable_raw vfat fat efivarfs xfs libcrc32c
CPU: 2 PID: 1016 Comm: ip6tables Not tainted 3.13.0-0.rc7.30.sa2.aarch64 #1
task: fffffe03e81d1400 ti: fffffe03f01f8000 task.ti: fffffe03f01f8000
PC is at __cpu_flush_kern_tlb_range+0xc/0x40
LR is at __purge_vmap_area_lazy+0x28c/0x3ac
pc : [] lr : [] pstate: 80000145
sp : fffffe03f01fbb70
x29: fffffe03f01fbb70 x28: fffffe03f01f8000 x27: fffffe0000b19000
x26: 00000000000000d0 x25: 000000000000001c x24: fffffe03f01fbc50
x23: fffffe03f01fbc58 x22: fffffe03f01fbc10 x21: fffffe0000b2a3f8
x20: 0000000000000802 x19: fffffe0000b2a3c8 x18: 000003fffdf52710
x17: 000003ff9d8bb910 x16: fffffe000050fbfc x15: 0000000000005735
x14: 000003ff9d7e1a5c x13: 0000000000000000 x12: 000003ff9d7e1a5c
x11: 0000000000000007 x10: fffffe0000c09af0 x9 : fffffe0000ad1000
x8 : 000000000000005c x7 : fffffe03e8624000 x6 : 0000000000000000
x5 : 0000000000000000 x4 : 0000000000000000 x3 : fffffe0000c09cc8
x2 : 0000000000000000 x1 : 000fffffdfffca80 x0 : 000fffffcd742150

The __cpu_flush_kern_tlb_range() function looks like:

ENTRY(__cpu_flush_kern_tlb_range)
	dsb	sy
	lsr	x0, x0, #12
	lsr	x1, x1, #12
1:	tlbi	vaae1is, x0
	add	x0, x0, #1
	cmp	x0, x1
	b.lo	1b
	dsb	sy
	isb
	ret
ENDPROC(__cpu_flush_kern_tlb_range)

The above soft lockup shows the PC at the tlbi instruction with:

	x0 = 0x000fffffcd742150
	x1 = 0x000fffffdfffca80

So __cpu_flush_kern_tlb_range has 0x128ba930 tlbi flushes left after it
has already been looping for 23 seconds!

Looking up one frame at __purge_vmap_area_lazy(), there is:

	...
	list_for_each_entry_rcu(va, &vmap_area_list, list) {
		if (va->flags & VM_LAZY_FREE) {
			if (va->va_start < *start)
				*start = va->va_start;
			if (va->va_end > *end)
				*end = va->va_end;
			nr += (va->va_end - va->va_start) >> PAGE_SHIFT;
			list_add_tail(&va->purge_list, &valist);
			va->flags |= VM_LAZY_FREEING;
			va->flags &= ~VM_LAZY_FREE;
		}
	}
	...
	if (nr || force_flush)
		flush_tlb_kernel_range(*start, *end);

Since *start and *end are widened to cover every lazily freed area, freeing
just two areas at opposite ends of the vmalloc region means the range passed
to flush_tlb_kernel_range() may be as large as the vmalloc space itself. For
arm64, this is ~240GB for 4k page size and ~2TB for 64k page size.

This patch works around this problem by adding a loop limit. If the range is
larger than the limit, use flush_tlb_all() rather than flushing based on
individual pages. The limit chosen is arbitrary and would be better if based
on the actual size of the TLB. I looked through the ARM ARM but didn't see
any easy way to get the actual TLB size, so for now the arbitrary limit is
better than the soft lockup.

Signed-off-by: Mark Salter
---
 arch/arm64/include/asm/tlbflush.h | 28 +++++++++++++++++++++++++---
 1 file changed, 25 insertions(+), 3 deletions(-)
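(Editorial note, not part of the patch: the scale of the problem can be
sanity-checked with a few lines of user-space C. The register values below
are taken from the dump above; the 4K-page count for a ~240GB vmalloc
flush is simple arithmetic, not a measured figure.)

#include <stdio.h>

int main(void)
{
	unsigned long x0 = 0x000fffffcd742150UL;	/* current page index at lockup */
	unsigned long x1 = 0x000fffffdfffca80UL;	/* end page index */

	/* tlbi operations the unbounded loop still had to issue */
	printf("tlbi operations remaining: 0x%lx (%lu)\n", x1 - x0, x1 - x0);

	/* 4K pages in ~240GB of vmalloc space */
	printf("4K pages in ~240GB: %lu\n", (240UL << 30) >> 12);
	return 0;
}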
diff --git a/arch/arm64/include/asm/tlbflush.h b/arch/arm64/include/asm/tlbflush.h
index 8b48203..34ea52a 100644
--- a/arch/arm64/include/asm/tlbflush.h
+++ b/arch/arm64/include/asm/tlbflush.h
@@ -99,10 +99,32 @@ static inline void flush_tlb_page(struct vm_area_struct *vma,
 }
 
 /*
- * Convert calls to our calling convention.
+ * The flush by range functions may take a very large range.
+ * If we need to invalidate a large range, it may be better
+ * to invalidate all tlb entries at once rather than looping
+ * through and invalidating individual entries.
+ *
+ * Here, we just use a fixed (arbitrary) number. It would be
+ * better if this was based on the actual size of the tlb...
  */
-#define flush_tlb_range(vma,start,end)	__cpu_flush_user_tlb_range(start,end,vma)
-#define flush_tlb_kernel_range(s,e)	__cpu_flush_kern_tlb_range(s,e)
+#define MAX_TLB_LOOP	128
+
+static inline void flush_tlb_range(struct vm_area_struct *vma,
+				   unsigned long start, unsigned long end)
+{
+	if (((end - start) >> PAGE_SHIFT) < MAX_TLB_LOOP)
+		__cpu_flush_user_tlb_range(start, end, vma);
+	else
+		flush_tlb_mm(vma->vm_mm);
+}
+
+static inline void flush_tlb_kernel_range(unsigned long start, unsigned long end)
+{
+	if (((end - start) >> PAGE_SHIFT) < MAX_TLB_LOOP)
+		__cpu_flush_kern_tlb_range(start, end);
+	else
+		flush_tlb_all();
+}
 
 /*
  * On AArch64, the cache coherency is handled via the set_pte_at() function.
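(Editorial note: for illustration only, the following is a minimal stand-alone
sketch of the cut-over logic the patch introduces. The names
model_flush_tlb_kernel_range(), stub_flush_range() and stub_flush_all() are
hypothetical stand-ins for the real kernel primitives, used purely so the
MAX_TLB_LOOP threshold can be compiled and exercised outside the kernel.)

#include <stdio.h>

#define PAGE_SHIFT	12
#define MAX_TLB_LOOP	128

/* stand-in for the per-page __cpu_flush_kern_tlb_range() loop */
static void stub_flush_range(unsigned long start, unsigned long end)
{
	printf("per-page flush of %lu pages\n", (end - start) >> PAGE_SHIFT);
}

/* stand-in for flush_tlb_all() */
static void stub_flush_all(void)
{
	printf("full TLB invalidate\n");
}

static void model_flush_tlb_kernel_range(unsigned long start, unsigned long end)
{
	if (((end - start) >> PAGE_SHIFT) < MAX_TLB_LOOP)
		stub_flush_range(start, end);	/* small range: looping is cheap */
	else
		stub_flush_all();		/* huge range: one global flush */
}

int main(void)
{
	/* a 16-page unmap stays on the per-page path */
	model_flush_tlb_kernel_range(0x1000000UL, 0x1010000UL);
	/* a multi-GB purge falls back to the full flush */
	model_flush_tlb_kernel_range(0x1000000UL, 0x1000000UL + (4UL << 30));
	return 0;
}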