From patchwork Thu Feb 2 21:45:06 2017
X-Patchwork-Submitter: Shanker Donthineni
X-Patchwork-Id: 9553269
From: Shanker Donthineni
To: Catalin Marinas
Cc: Mark Rutland, Vikram Sethi, Suzuki K Poulose, Marc Zyngier,
 linux-kernel, James Morse, Shanker Donthineni, linux-arm-kernel
Subject: [PATCH] arm64: cache: Skip an unnecessary data cache clean PoU operation
Date: Thu, 2 Feb 2017 15:45:06 -0600
Message-Id: <1486071906-2773-1-git-send-email-shankerd@codeaurora.org>

The cache management functions always perform the data cache clean to
the PoU (point of unification), even though some systems do not require
it. There is no need to clean the data cache to the PoU if all the cache
levels below PoUIS are WT (write-through) caches. The unnecessary clean
causes a significant performance degradation when operating on a large
memory area, especially with THP on a kernel using a 64K page size.

For each online CPU, check whether the 'dc cvau' instruction is needed
and update the global variable __skip_dcache_pou. The two functions
__flush_cache_user_range() and __clean_dcache_area_pou() are patched
using the alternatives framework to skip the unnecessary code. Existing
behavior is unchanged if any CPU has a WB (write-back) cache below the
PoUIS level.

Signed-off-by: Shanker Donthineni
---
 arch/arm64/include/asm/cachetype.h |  6 ++++++
 arch/arm64/include/asm/cpucaps.h   |  3 ++-
 arch/arm64/kernel/cpufeature.c     | 12 ++++++++++++
 arch/arm64/kernel/cpuinfo.c        | 23 +++++++++++++++++++++++
 arch/arm64/mm/cache.S              |  3 +++
 5 files changed, 46 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/include/asm/cachetype.h b/arch/arm64/include/asm/cachetype.h
index f558869..f05974c 100644
--- a/arch/arm64/include/asm/cachetype.h
+++ b/arch/arm64/include/asm/cachetype.h
@@ -39,6 +39,12 @@
 
 extern unsigned long __icache_flags;
 
+extern bool __skip_dcache_pou;
+
+#define CLIDR_LOUIS_SHIFT	(21)
+#define CLIDR_LOUIS_MASK	(0x7)
+#define CLIDR_LOUIS(x)		(((x) >> CLIDR_LOUIS_SHIFT) & CLIDR_LOUIS_MASK)
+
 /*
  * NumSets, bits[27:13] - (Number of sets in cache) - 1
  * Associativity, bits[12:3] - (Associativity of cache) - 1
diff --git a/arch/arm64/include/asm/cpucaps.h b/arch/arm64/include/asm/cpucaps.h
index 4174f09..6f4ea61 100644
--- a/arch/arm64/include/asm/cpucaps.h
+++ b/arch/arm64/include/asm/cpucaps.h
@@ -35,7 +35,8 @@
 #define ARM64_HYP_OFFSET_LOW			14
 #define ARM64_MISMATCHED_CACHE_LINE_SIZE	15
 #define ARM64_HAS_NO_FPSIMD			16
+#define ARM64_SKIP_DCACHE_POU			17
 
-#define ARM64_NCAPS				17
+#define ARM64_NCAPS				18
 
 #endif /* __ASM_CPUCAPS_H */
diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
index fdf8f04..eaa86d1 100644
--- a/arch/arm64/kernel/cpufeature.c
+++ b/arch/arm64/kernel/cpufeature.c
@@ -755,6 +755,12 @@ static bool has_no_fpsimd(const struct arm64_cpu_capabilities *entry, int __unus
 					ID_AA64PFR0_FP_SHIFT) < 0;
 }
 
+static bool check_dcache_pou_skipped(const struct arm64_cpu_capabilities *entry,
+				     int __unused)
+{
+	return __skip_dcache_pou;
+}
+
 static const struct arm64_cpu_capabilities arm64_features[] = {
 	{
 		.desc = "GIC system register CPU interface",
@@ -845,6 +851,12 @@ static bool has_no_fpsimd(const struct arm64_cpu_capabilities *entry, int __unus
 		.min_field_value = 0,
 		.matches = has_no_fpsimd,
 	},
+	{
+		.desc = "Skip data cache clean PoU operation",
+		.capability = ARM64_SKIP_DCACHE_POU,
+		.def_scope = SCOPE_SYSTEM,
+		.matches = check_dcache_pou_skipped,
+	},
 	{},
 };
 
diff --git a/arch/arm64/kernel/cpuinfo.c b/arch/arm64/kernel/cpuinfo.c
index 7b7be71..4fdbb55 100644
--- a/arch/arm64/kernel/cpuinfo.c
+++ b/arch/arm64/kernel/cpuinfo.c
@@ -50,6 +50,7 @@
 };
 
 unsigned long __icache_flags;
+bool __skip_dcache_pou = true;
 
 static const char *const hwcap_str[] = {
 	"fp",
@@ -305,6 +306,25 @@ static void cpuinfo_detect_icache_policy(struct cpuinfo_arm64 *info)
 	pr_info("Detected %s I-cache on CPU%d\n", icache_policy_str[l1ip], cpu);
 }
 
+/*
+ * Check whether none of the data cache levels below LoUIS supports WB.
+ * Return false if any cache level below LoUIS has a WB cache,
+ * otherwise return true.
+ */
+static bool is_dcache_below_pou_wt(void)
+{
+	u32 louis = CLIDR_LOUIS(read_sysreg(clidr_el1));
+	u32 lvl, csidr;
+
+	for (lvl = 0; lvl < louis; lvl++) {
+		csidr = cache_get_ccsidr(lvl << 1);
+		if (csidr & CCSIDR_EL1_WRITE_BACK)
+			return false;
+	}
+
+	return true;
+}
+
 static void __cpuinfo_store_cpu(struct cpuinfo_arm64 *info)
 {
 	info->reg_cntfrq = arch_timer_get_cntfrq();
@@ -345,6 +365,9 @@ static void __cpuinfo_store_cpu(struct cpuinfo_arm64 *info)
 	}
 
 	cpuinfo_detect_icache_policy(info);
+
+	if (__skip_dcache_pou)
+		__skip_dcache_pou = is_dcache_below_pou_wt();
 }
 
 void cpuinfo_store_cpu(void)
diff --git a/arch/arm64/mm/cache.S b/arch/arm64/mm/cache.S
index 83c27b6e..bb3cdb3 100644
--- a/arch/arm64/mm/cache.S
+++ b/arch/arm64/mm/cache.S
@@ -50,6 +50,7 @@ ENTRY(flush_icache_range)
  */
 ENTRY(__flush_cache_user_range)
 	uaccess_ttbr0_enable x2, x3
+	alternative_insn "nop", "b 2f", ARM64_SKIP_DCACHE_POU
 	dcache_line_size x2, x3
 	sub	x3, x2, #1
 	bic	x4, x0, x3
@@ -60,6 +61,7 @@ user_alt 9f, "dc cvau, x4", "dc civac, x4", ARM64_WORKAROUND_CLEAN_CACHE
 	b.lo	1b
 	dsb	ish
 
+2:
 	icache_line_size x2, x3
 	sub	x3, x2, #1
 	bic	x4, x0, x3
@@ -104,6 +106,7 @@ ENDPIPROC(__flush_dcache_area)
  *	- size    - size in question
  */
 ENTRY(__clean_dcache_area_pou)
+	alternative_insn "nop", "ret", ARM64_SKIP_DCACHE_POU
 	dcache_by_line_op cvau, ish, x0, x1, x2, x3
 	ret
 ENDPROC(__clean_dcache_area_pou)
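
For readers who want to try the detection logic outside the kernel, here
is a minimal user-space sketch of the check performed by
is_dcache_below_pou_wt() and of the way __cpuinfo_store_cpu() folds the
per-CPU result into __skip_dcache_pou. It is not part of the patch. The
function names, the array standing in for per-level CCSIDR reads, and
the use of bit 30 as the write-back flag are assumptions made for
illustration; only the CLIDR LoUIS field macros are taken from the diff.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* LoUIS field of CLIDR_EL1, as defined in the patch (bits [23:21]). */
#define CLIDR_LOUIS_SHIFT 21
#define CLIDR_LOUIS_MASK  0x7u
#define CLIDR_LOUIS(x)    (((x) >> CLIDR_LOUIS_SHIFT) & CLIDR_LOUIS_MASK)

/* Assumption: write-back support is reported in CCSIDR bit 30. */
#define CCSIDR_WRITE_BACK (UINT32_C(1) << 30)

/*
 * Hypothetical stand-in for the kernel helper: instead of reading system
 * registers, take a CLIDR value and an array holding one CCSIDR value per
 * data cache level (index 0 is L1). The array must have at least LoUIS
 * entries. Returns true only if no level below LoUIS reports write-back.
 */
static bool dcache_below_louis_is_wt(uint32_t clidr,
                                     const uint32_t *ccsidr_by_level)
{
    uint32_t louis = CLIDR_LOUIS(clidr);

    assert(ccsidr_by_level != NULL);

    for (uint32_t lvl = 0; lvl < louis; lvl++) {
        if (ccsidr_by_level[lvl] & CCSIDR_WRITE_BACK)
            return false;
    }
    return true;
}

/*
 * Fold one CPU's result into the system-wide flag. The flag starts as
 * true and becomes false permanently once any CPU reports a write-back
 * cache below LoUIS, mirroring "if (skip) skip = check();" in the patch.
 */
static bool fold_skip_flag(bool skip, bool cpu_is_wt)
{
    return skip && cpu_is_wt;
}
```

A caller would start with the flag set to true, call fold_skip_flag()
once per CPU, and skip the 'dc cvau' loop only if the flag is still true
after the last CPU has been examined.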