From patchwork Sat Sep 14 14:14:41 2024
X-Patchwork-Submitter: Kefeng Wang
X-Patchwork-Id: 13804406
From: Kefeng Wang
To: Catalin Marinas, Will Deacon
Cc: Ryan Roberts, Kefeng Wang, Yicong Yang
Subject: [PATCH] arm64: optimize flush tlb kernel range
Date: Sat, 14 Sep 2024 22:14:41 +0800
Message-ID: <20240914141441.2340469-1-wangkefeng.wang@huawei.com>

Currently flush_tlb_kernel_range() flushes the kernel TLB page by page as
long as the target VA range does not exceed MAX_DVM_OPS * PAGE_SIZE;
otherwise it falls back to a full TLBI ALL.
When the CPU supports TLB range operations we can do better: convert
flush_tlb_kernel_range() to use __flush_tlb_range_nosync(), as the other
TLB range flushes do, to improve performance.

Signed-off-by: Yicong Yang
Signed-off-by: Kefeng Wang
---
 arch/arm64/include/asm/tlbflush.h | 43 +++++++++++++------------------
 arch/arm64/mm/contpte.c           |  3 ++-
 2 files changed, 20 insertions(+), 26 deletions(-)

diff --git a/arch/arm64/include/asm/tlbflush.h b/arch/arm64/include/asm/tlbflush.h
index 95fbc8c05607..8537fad83999 100644
--- a/arch/arm64/include/asm/tlbflush.h
+++ b/arch/arm64/include/asm/tlbflush.h
@@ -431,12 +431,12 @@ do {									\
 #define __flush_s2_tlb_range_op(op, start, pages, stride, tlb_level) \
 	__flush_tlb_range_op(op, start, pages, stride, 0, tlb_level, false, kvm_lpa2_is_enabled());
 
-static inline void __flush_tlb_range_nosync(struct vm_area_struct *vma,
-				     unsigned long start, unsigned long end,
-				     unsigned long stride, bool last_level,
-				     int tlb_level)
+static __always_inline void __flush_tlb_range_nosync(struct mm_struct *mm,
+		unsigned long asid, unsigned long start, unsigned long end,
+		unsigned long stride, bool last_level, int tlb_level)
 {
-	unsigned long asid, pages;
+	bool tlbi_user = !!asid;
+	unsigned long pages;
 
 	start = round_down(start, stride);
 	end = round_up(end, stride);
@@ -451,21 +451,24 @@ static inline void __flush_tlb_range_nosync(struct vm_area_struct *vma,
 	if ((!system_supports_tlb_range() &&
 	     (end - start) >= (MAX_DVM_OPS * stride)) ||
 	    pages > MAX_TLBI_RANGE_PAGES) {
-		flush_tlb_mm(vma->vm_mm);
+		if (asid)
+			flush_tlb_mm(mm);
+		else
+			flush_tlb_all();
 		return;
 	}
 
 	dsb(ishst);
-	asid = ASID(vma->vm_mm);
 
 	if (last_level)
 		__flush_tlb_range_op(vale1is, start, pages, stride, asid,
-				     tlb_level, true, lpa2_is_enabled());
+				     tlb_level, tlbi_user, lpa2_is_enabled());
 	else
-		__flush_tlb_range_op(vae1is, start, pages, stride, asid,
-				     tlb_level, true, lpa2_is_enabled());
+		__flush_tlb_range_op(vae1is, start, pages, stride, asid,
+				     tlb_level, tlbi_user, lpa2_is_enabled());
 
-	mmu_notifier_arch_invalidate_secondary_tlbs(vma->vm_mm, start, end);
+	if (asid)
+		mmu_notifier_arch_invalidate_secondary_tlbs(mm, start, end);
 }
 
 static inline void __flush_tlb_range(struct vm_area_struct *vma,
@@ -473,8 +476,8 @@ static inline void __flush_tlb_range(struct vm_area_struct *vma,
 			     unsigned long stride, bool last_level,
 			     int tlb_level)
 {
-	__flush_tlb_range_nosync(vma, start, end, stride,
-				 last_level, tlb_level);
+	__flush_tlb_range_nosync(vma->vm_mm, ASID(vma->vm_mm), start, end,
+				 stride, last_level, tlb_level);
 	dsb(ish);
 }
 
@@ -492,19 +495,9 @@ static inline void flush_tlb_range(struct vm_area_struct *vma,
 static inline void flush_tlb_kernel_range(unsigned long start,
 					  unsigned long end)
 {
-	unsigned long addr;
-
-	if ((end - start) > (MAX_DVM_OPS * PAGE_SIZE)) {
-		flush_tlb_all();
-		return;
-	}
-
-	start = __TLBI_VADDR(start, 0);
-	end = __TLBI_VADDR(end, 0);
-
-	dsb(ishst);
-	for (addr = start; addr < end; addr += 1 << (PAGE_SHIFT - 12))
-		__tlbi(vaale1is, addr);
+	__flush_tlb_range_nosync(&init_mm, 0, start, end, PAGE_SIZE, false,
+				 TLBI_TTL_UNKNOWN);
 	dsb(ish);
 	isb();
 }
diff --git a/arch/arm64/mm/contpte.c b/arch/arm64/mm/contpte.c
index 55107d27d3f8..7f93f19dc50b 100644
--- a/arch/arm64/mm/contpte.c
+++ b/arch/arm64/mm/contpte.c
@@ -335,7 +335,8 @@ int contpte_ptep_clear_flush_young(struct vm_area_struct *vma,
 		 * eliding the trailing DSB applies here.
 		 */
 		addr = ALIGN_DOWN(addr, CONT_PTE_SIZE);
-		__flush_tlb_range_nosync(vma, addr, addr + CONT_PTE_SIZE,
+		__flush_tlb_range_nosync(vma->vm_mm, ASID(vma->vm_mm),
+					 addr, addr + CONT_PTE_SIZE,
 					 PAGE_SIZE, true, 3);
 	}
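
For reference, the standalone user-space sketch below models only the
flush-selection policy that flush_tlb_kernel_range() inherits from
__flush_tlb_range_nosync() with this patch applied. It is not kernel code:
PAGE_SIZE, MAX_DVM_OPS and MAX_TLBI_RANGE_PAGES are placeholder values
chosen for illustration (the real definitions live in
arch/arm64/include/asm/tlbflush.h), pick_flush() is a hypothetical helper,
and the asid/last_level plumbing and the actual TLBI instructions are
deliberately left out.

#include <stdbool.h>
#include <stdio.h>

#define PAGE_SIZE		4096UL
#define MAX_DVM_OPS		512UL		/* placeholder, not the kernel's value */
#define MAX_TLBI_RANGE_PAGES	(32UL << 16)	/* placeholder, not the kernel's value */

enum flush_kind {
	FLUSH_ALL,	/* give up on per-VA invalidation, flush everything */
	FLUSH_RANGE,	/* FEAT_TLBIRANGE: a few range ops cover the span   */
	FLUSH_PER_PAGE,	/* no range ops: one TLBI per stride-sized block    */
};

/* Hypothetical helper mirroring the size checks in __flush_tlb_range_nosync(). */
static enum flush_kind pick_flush(unsigned long start, unsigned long end,
				  unsigned long stride, bool has_range_ops)
{
	/* start/end assumed stride-aligned here; the kernel rounds them first */
	unsigned long pages = (end - start) / PAGE_SIZE;

	/* Too many individual ops, or beyond what a range op can encode. */
	if ((!has_range_ops && (end - start) >= MAX_DVM_OPS * stride) ||
	    pages > MAX_TLBI_RANGE_PAGES)
		return FLUSH_ALL;

	return has_range_ops ? FLUSH_RANGE : FLUSH_PER_PAGE;
}

int main(void)
{
	/* A 64-page kernel range: per-page TLBIs without FEAT_TLBIRANGE,
	 * range operations with it. */
	printf("without range ops: %d\n",
	       pick_flush(0, 64 * PAGE_SIZE, PAGE_SIZE, false));
	printf("with range ops:    %d\n",
	       pick_flush(0, 64 * PAGE_SIZE, PAGE_SIZE, true));
	return 0;
}

The behavioural change the model is meant to show: with the patch, a kernel
VA range on a FEAT_TLBIRANGE system no longer degenerates into per-page
TLBIs or an unconditional TLBI ALL, it is covered by range operations up to
MAX_TLBI_RANGE_PAGES, matching what __flush_tlb_range() already does for
user ranges.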