From patchwork Fri Jan 13 21:23:48 2023
X-Patchwork-Submitter: Heiko Stübner
X-Patchwork-Id: 13101667
X-Patchwork-Delegate: palmer@dabbelt.com
From: Heiko Stuebner
To: linux-riscv@lists.infradead.org, palmer@dabbelt.com
Cc: christoph.muellner@vrull.eu, conor@kernel.org, philipp.tomsich@vrull.eu, ajones@ventanamicro.com, heiko@sntech.de, jszhang@kernel.org, Heiko Stuebner
Subject: [PATCH 1/4] RISC-V: use bit-values instead of numbers to identify patched cpu-features
Date: Fri, 13 Jan 2023 22:23:48 +0100
Message-Id: <20230113212351.3534769-2-heiko@sntech.de>
In-Reply-To: <20230113212351.3534769-1-heiko@sntech.de>
References: <20230113212351.3534769-1-heiko@sntech.de>

From: Heiko Stuebner

RISC-V cpufeatures are often based on available extensions, and sometimes even on a combination of them. Using a bit value for the errata-id gives us a simple way to also require a combination of extensions for a specific alternative patch.
Signed-off-by: Heiko Stuebner
---
 arch/riscv/include/asm/errata_list.h | 6 +++---
 arch/riscv/kernel/cpufeature.c | 12 +++++-------
 2 files changed, 8 insertions(+), 10 deletions(-)

diff --git a/arch/riscv/include/asm/errata_list.h b/arch/riscv/include/asm/errata_list.h
index 95e626b7281e..40c9e9c3295b 100644
--- a/arch/riscv/include/asm/errata_list.h
+++ b/arch/riscv/include/asm/errata_list.h
@@ -22,9 +22,9 @@
 #define ERRATA_THEAD_NUMBER 3
 #endif
 
-#define CPUFEATURE_SVPBMT 0
-#define CPUFEATURE_ZICBOM 1
-#define CPUFEATURE_ZBB 2
+#define CPUFEATURE_SVPBMT (1 << 0)
+#define CPUFEATURE_ZICBOM (1 << 1)
+#define CPUFEATURE_ZBB (1 << 2)
 #define CPUFEATURE_NUMBER 3
 
 #ifdef __ASSEMBLY__
diff --git a/arch/riscv/kernel/cpufeature.c b/arch/riscv/kernel/cpufeature.c
index 7bfc6eb9a5cf..8c83bd9d0e22 100644
--- a/arch/riscv/kernel/cpufeature.c
+++ b/arch/riscv/kernel/cpufeature.c
@@ -350,13 +350,13 @@ static u32 __init_or_module cpufeature_probe(unsigned int stage)
 	u32 cpu_req_feature = 0;
 
 	if (cpufeature_probe_svpbmt(stage))
-		cpu_req_feature |= BIT(CPUFEATURE_SVPBMT);
+		cpu_req_feature |= CPUFEATURE_SVPBMT;
 
 	if (cpufeature_probe_zicbom(stage))
-		cpu_req_feature |= BIT(CPUFEATURE_ZICBOM);
+		cpu_req_feature |= CPUFEATURE_ZICBOM;
 
 	if (cpufeature_probe_zbb(stage))
-		cpu_req_feature |= BIT(CPUFEATURE_ZBB);
+		cpu_req_feature |= CPUFEATURE_ZBB;
 
 	return cpu_req_feature;
 }
@@ -367,19 +367,17 @@ void __init_or_module riscv_cpufeature_patch_func(struct alt_entry *begin,
 {
 	u32 cpu_req_feature = cpufeature_probe(stage);
 	struct alt_entry *alt;
-	u32 tmp;
 
 	for (alt = begin; alt < end; alt++) {
 		if (alt->vendor_id != 0)
 			continue;
-		if (alt->errata_id >= CPUFEATURE_NUMBER) {
+		if (alt->errata_id & GENMASK(31, CPUFEATURE_NUMBER)) {
 			WARN(1, "This feature id:%d is not in kernel cpufeature list",
 			     alt->errata_id);
 			continue;
 		}
 
-		tmp = (1U << alt->errata_id);
-		if (cpu_req_feature & tmp) {
+		if ((cpu_req_feature & alt->errata_id) == alt->errata_id) {
 			patch_text_nosync(alt->old_ptr, alt->alt_ptr, alt->alt_len);
 			riscv_alternative_fix_offsets(alt->old_ptr, alt->alt_len,
 						      alt->old_ptr - alt->alt_ptr);

From patchwork Fri Jan 13 21:23:49 2023
X-Patchwork-Submitter: Heiko Stübner
X-Patchwork-Id: 13101670
X-Patchwork-Delegate: palmer@dabbelt.com
From: Heiko Stuebner
To: linux-riscv@lists.infradead.org, palmer@dabbelt.com
Cc: christoph.muellner@vrull.eu, conor@kernel.org, philipp.tomsich@vrull.eu, ajones@ventanamicro.com, heiko@sntech.de, jszhang@kernel.org, Heiko Stuebner
Subject: [PATCH 2/4] RISC-V: add alternative-field for bits to not match against
Date: Fri, 13 Jan 2023 22:23:49 +0100
Message-Id: <20230113212351.3534769-3-heiko@sntech.de>
In-Reply-To: <20230113212351.3534769-1-heiko@sntech.de>
References: <20230113212351.3534769-1-heiko@sntech.de>

From: Heiko Stuebner

Alternatives on RISC-V do not necessarily know about each other. An alternative is always defined by its new code plus a vendor and "errata" identifier, and this whole block then points to the old code it may replace. This is actually a nice property, as it reduces complexity and allows alternatives from different sources (cpu-features, vendor errata) to co-exist.

However, when using a bitfield of cpufeatures to support combinations, this creates the need to also specify which bits an alternative must *not* match. For example, one alternative for zbb could work on any core supporting zbb, while a still better variant could exist for zbb plus some extension-x; the plain zbb variant then must not be applied on cores that also provide extension-x.
Signed-off-by: Heiko Stuebner
---
 arch/riscv/include/asm/alternative-macros.h | 64 +++++++++++----------
 arch/riscv/include/asm/alternative.h | 1 +
 arch/riscv/include/asm/errata_list.h | 18 +++---
 arch/riscv/kernel/cpufeature.c | 3 +-
 arch/riscv/lib/strcmp.S | 2 +-
 arch/riscv/lib/strlen.S | 2 +-
 arch/riscv/lib/strncmp.S | 2 +-
 7 files changed, 48 insertions(+), 44 deletions(-)

diff --git a/arch/riscv/include/asm/alternative-macros.h b/arch/riscv/include/asm/alternative-macros.h
index 2c0f4c887289..b80ea0d15c67 100644
--- a/arch/riscv/include/asm/alternative-macros.h
+++ b/arch/riscv/include/asm/alternative-macros.h
@@ -6,18 +6,19 @@
 
 #ifdef __ASSEMBLY__
 
-.macro ALT_ENTRY oldptr newptr vendor_id errata_id new_len
+.macro ALT_ENTRY oldptr newptr new_len vendor_id errata_id errata_not
 	RISCV_PTR \oldptr
 	RISCV_PTR \newptr
 	REG_ASM \vendor_id
 	REG_ASM \new_len
 	.word \errata_id
+	.word \errata_not
 .endm
 
-.macro ALT_NEW_CONTENT vendor_id, errata_id, enable = 1, new_c : vararg
+.macro ALT_NEW_CONTENT vendor_id, errata_id, errata_not, enable = 1, new_c : vararg
 	.if \enable
 	.pushsection .alternative, "a"
-	ALT_ENTRY 886b, 888f, \vendor_id, \errata_id, 889f - 888f
+	ALT_ENTRY 886b, 888f, 889f - 888f, \vendor_id, \errata_id, \errata_not
 	.popsection
 	.subsection 1
 888 :
@@ -33,7 +34,7 @@
 	.endif
 .endm
 
-.macro ALTERNATIVE_CFG old_c, new_c, vendor_id, errata_id, enable
+.macro ALTERNATIVE_CFG old_c, new_c, vendor_id, errata_id, errata_not, enable
 886 :
 	.option push
 	.option norvc
@@ -41,13 +42,13 @@
 	\old_c
 	.option pop
 887 :
-	ALT_NEW_CONTENT \vendor_id, \errata_id, \enable, \new_c
+	ALT_NEW_CONTENT \vendor_id, \errata_id, \errata_not, \enable, \new_c
 .endm
 
-.macro ALTERNATIVE_CFG_2 old_c, new_c_1, vendor_id_1, errata_id_1, enable_1, \
-				new_c_2, vendor_id_2, errata_id_2, enable_2
-	ALTERNATIVE_CFG "\old_c", "\new_c_1", \vendor_id_1, \errata_id_1, \enable_1
-	ALT_NEW_CONTENT \vendor_id_2, \errata_id_2, \enable_2, \new_c_2
+.macro ALTERNATIVE_CFG_2 old_c, new_c_1, vendor_id_1, errata_id_1, errata_not_1, enable_1, \
+				new_c_2, vendor_id_2, errata_id_2, errata_not_2, enable_2
+	ALTERNATIVE_CFG "\old_c", "\new_c_1", \vendor_id_1, \errata_id_1, \errata_not_1, \enable_1
+	ALT_NEW_CONTENT \vendor_id_2, \errata_id_2, \errata_not_2, \enable_2, \new_c_2
 .endm
 
 #define __ALTERNATIVE_CFG(...)	ALTERNATIVE_CFG __VA_ARGS__
@@ -58,17 +59,18 @@
 #include <asm/asm.h>
 #include <linux/stringify.h>
 
-#define ALT_ENTRY(oldptr, newptr, vendor_id, errata_id, newlen) \
+#define ALT_ENTRY(oldptr, newptr, newlen, vendor_id, errata_id, errata_not) \
 	RISCV_PTR " " oldptr "\n" \
 	RISCV_PTR " " newptr "\n" \
 	REG_ASM " " vendor_id "\n" \
 	REG_ASM " " newlen "\n" \
-	".word " errata_id "\n"
+	".word " errata_id "\n" \
+	".word " errata_not "\n"
 
-#define ALT_NEW_CONTENT(vendor_id, errata_id, enable, new_c) \
+#define ALT_NEW_CONTENT(vendor_id, errata_id, errata_not, enable, new_c) \
 	".if " __stringify(enable) " == 1\n" \
 	".pushsection .alternative, \"a\"\n" \
-	ALT_ENTRY("886b", "888f", __stringify(vendor_id), __stringify(errata_id), "889f - 888f") \
+	ALT_ENTRY("886b", "888f", "889f - 888f", __stringify(vendor_id), __stringify(errata_id), __stringify(errata_not)) \
 	".popsection\n" \
 	".subsection 1\n" \
 	"888 :\n" \
@@ -83,7 +85,7 @@
 	".previous\n" \
 	".endif\n"
 
-#define __ALTERNATIVE_CFG(old_c, new_c, vendor_id, errata_id, enable) \
+#define __ALTERNATIVE_CFG(old_c, new_c, vendor_id, errata_id, errata_not, enable) \
 	"886 :\n" \
 	".option push\n" \
 	".option norvc\n" \
@@ -91,22 +93,22 @@
 	old_c "\n" \
 	".option pop\n" \
 	"887 :\n" \
-	ALT_NEW_CONTENT(vendor_id, errata_id, enable, new_c)
+	ALT_NEW_CONTENT(vendor_id, errata_id, errata_not, enable, new_c)
 
-#define __ALTERNATIVE_CFG_2(old_c, new_c_1, vendor_id_1, errata_id_1, enable_1, \
-				new_c_2, vendor_id_2, errata_id_2, enable_2) \
-	__ALTERNATIVE_CFG(old_c, new_c_1, vendor_id_1, errata_id_1, enable_1) \
-	ALT_NEW_CONTENT(vendor_id_2, errata_id_2, enable_2, new_c_2)
+#define __ALTERNATIVE_CFG_2(old_c, new_c_1, vendor_id_1, errata_id_1, errata_not_1, enable_1, \
+				new_c_2, vendor_id_2, errata_id_2, errata_not_2, enable_2) \
+	__ALTERNATIVE_CFG(old_c, new_c_1, vendor_id_1, errata_id_1, errata_not_1, enable_1) \
+	ALT_NEW_CONTENT(vendor_id_2, errata_id_2, errata_not_2, enable_2, new_c_2)
 
 #endif /* __ASSEMBLY__ */
 
-#define _ALTERNATIVE_CFG(old_c, new_c, vendor_id, errata_id, CONFIG_k) \
-	__ALTERNATIVE_CFG(old_c, new_c, vendor_id, errata_id, IS_ENABLED(CONFIG_k))
+#define _ALTERNATIVE_CFG(old_c, new_c, vendor_id, errata_id, errata_not, CONFIG_k) \
+	__ALTERNATIVE_CFG(old_c, new_c, vendor_id, errata_id, errata_not, IS_ENABLED(CONFIG_k))
 
-#define _ALTERNATIVE_CFG_2(old_c, new_c_1, vendor_id_1, errata_id_1, CONFIG_k_1, \
-				new_c_2, vendor_id_2, errata_id_2, CONFIG_k_2) \
-	__ALTERNATIVE_CFG_2(old_c, new_c_1, vendor_id_1, errata_id_1, IS_ENABLED(CONFIG_k_1), \
-				new_c_2, vendor_id_2, errata_id_2, IS_ENABLED(CONFIG_k_2))
+#define _ALTERNATIVE_CFG_2(old_c, new_c_1, vendor_id_1, errata_id_1, errata_not_1, CONFIG_k_1, \
+				new_c_2, vendor_id_2, errata_id_2, errata_not_2, CONFIG_k_2) \
+	__ALTERNATIVE_CFG_2(old_c, new_c_1, vendor_id_1, errata_id_1, errata_not_1, IS_ENABLED(CONFIG_k_1), \
+				new_c_2, vendor_id_2, errata_id_2, errata_not_2, IS_ENABLED(CONFIG_k_2))
 
 #else /* CONFIG_RISCV_ALTERNATIVE */
 #ifdef __ASSEMBLY__
@@ -148,8 +150,8 @@
  * CONFIG_k: The Kconfig of this errata. When Kconfig is disabled, the old
  * content will alwyas be executed.
  */
-#define ALTERNATIVE(old_content, new_content, vendor_id, errata_id, CONFIG_k) \
-	_ALTERNATIVE_CFG(old_content, new_content, vendor_id, errata_id, CONFIG_k)
+#define ALTERNATIVE(old_content, new_content, vendor_id, errata_id, errata_not, CONFIG_k) \
+	_ALTERNATIVE_CFG(old_content, new_content, vendor_id, errata_id, errata_not, CONFIG_k)
 
 /*
  * A vendor wants to replace an old_content, but another vendor has used
@@ -158,9 +160,9 @@
  * on the following sample code and then replace ALTERNATIVE() with
 * ALTERNATIVE_2() to append its customized content.
  */
-#define ALTERNATIVE_2(old_content, new_content_1, vendor_id_1, errata_id_1, CONFIG_k_1, \
-				new_content_2, vendor_id_2, errata_id_2, CONFIG_k_2) \
-	_ALTERNATIVE_CFG_2(old_content, new_content_1, vendor_id_1, errata_id_1, CONFIG_k_1, \
-				new_content_2, vendor_id_2, errata_id_2, CONFIG_k_2)
+#define ALTERNATIVE_2(old_content, new_content_1, vendor_id_1, errata_id_1, errata_not_1, CONFIG_k_1, \
+				new_content_2, vendor_id_2, errata_id_2, errata_not_2, CONFIG_k_2) \
+	_ALTERNATIVE_CFG_2(old_content, new_content_1, vendor_id_1, errata_id_1, errata_not_1, CONFIG_k_1, \
+				new_content_2, vendor_id_2, errata_id_2, errata_not_2, CONFIG_k_2)
 
 #endif
diff --git a/arch/riscv/include/asm/alternative.h b/arch/riscv/include/asm/alternative.h
index 1bd4027d34ca..d08c563ab7d8 100644
--- a/arch/riscv/include/asm/alternative.h
+++ b/arch/riscv/include/asm/alternative.h
@@ -36,6 +36,7 @@ struct alt_entry {
 	unsigned long vendor_id; /* cpu vendor id */
 	unsigned long alt_len; /* The replacement size */
 	unsigned int errata_id; /* The errata id */
+	unsigned int errata_not; /* Errata id not to match against */
 } __packed;
 
 struct errata_checkfunc_id {
diff --git a/arch/riscv/include/asm/errata_list.h b/arch/riscv/include/asm/errata_list.h
index 40c9e9c3295b..043b79c79824 100644
--- a/arch/riscv/include/asm/errata_list.h
+++ b/arch/riscv/include/asm/errata_list.h
@@ -32,19 +32,19 @@
 #define ALT_INSN_FAULT(x) \
 ALTERNATIVE(__stringify(RISCV_PTR do_trap_insn_fault), \
 	    __stringify(RISCV_PTR sifive_cip_453_insn_fault_trp), \
-	    SIFIVE_VENDOR_ID, ERRATA_SIFIVE_CIP_453, \
+	    SIFIVE_VENDOR_ID, ERRATA_SIFIVE_CIP_453, 0, \
 	    CONFIG_ERRATA_SIFIVE_CIP_453)
 
 #define ALT_PAGE_FAULT(x) \
 ALTERNATIVE(__stringify(RISCV_PTR do_page_fault), \
 	    __stringify(RISCV_PTR sifive_cip_453_page_fault_trp), \
-	    SIFIVE_VENDOR_ID, ERRATA_SIFIVE_CIP_453, \
+	    SIFIVE_VENDOR_ID, ERRATA_SIFIVE_CIP_453, 0, \
 	    CONFIG_ERRATA_SIFIVE_CIP_453)
 #else /* !__ASSEMBLY__ */
 
 #define ALT_FLUSH_TLB_PAGE(x) \
 asm(ALTERNATIVE("sfence.vma %0", "sfence.vma", SIFIVE_VENDOR_ID, \
-		ERRATA_SIFIVE_CIP_1200, CONFIG_ERRATA_SIFIVE_CIP_1200) \
+		ERRATA_SIFIVE_CIP_1200, 0, CONFIG_ERRATA_SIFIVE_CIP_1200) \
 		: : "r" (addr) : "memory")
 
 /*
@@ -56,9 +56,9 @@ asm(ALTERNATIVE("sfence.vma %0", "sfence.vma", SIFIVE_VENDOR_ID, \
 #define ALT_SVPBMT(_val, prot) \
 asm(ALTERNATIVE_2("li %0, 0\t\nnop", \
 		  "li %0, %1\t\nslli %0,%0,%3", 0, \
-			CPUFEATURE_SVPBMT, CONFIG_RISCV_ISA_SVPBMT, \
+			CPUFEATURE_SVPBMT, 0, CONFIG_RISCV_ISA_SVPBMT, \
 		  "li %0, %2\t\nslli %0,%0,%4", THEAD_VENDOR_ID, \
-			ERRATA_THEAD_PBMT, CONFIG_ERRATA_THEAD_PBMT) \
+			ERRATA_THEAD_PBMT, 0, CONFIG_ERRATA_THEAD_PBMT) \
 		  : "=r"(_val) \
 		  : "I"(prot##_SVPBMT >> ALT_SVPBMT_SHIFT), \
 		    "I"(prot##_THEAD >> ALT_THEAD_PBMT_SHIFT), \
@@ -82,7 +82,7 @@ asm volatile(ALTERNATIVE( \
 	"slli t3, t3, %3\n\t" \
 	"or %0, %0, t3\n\t" \
 	"2:", THEAD_VENDOR_ID, \
-		ERRATA_THEAD_PBMT, CONFIG_ERRATA_THEAD_PBMT) \
+		ERRATA_THEAD_PBMT, 0, CONFIG_ERRATA_THEAD_PBMT) \
 	: "+r"(_val) \
 	: "I"(_PAGE_MTMASK_THEAD >> ALT_THEAD_PBMT_SHIFT), \
 	  "I"(_PAGE_PMA_THEAD >> ALT_THEAD_PBMT_SHIFT), \
@@ -130,7 +130,7 @@ asm volatile(ALTERNATIVE_2( \
 	"add a0, a0, %0\n\t" \
 	"2:\n\t" \
 	"bltu a0, %2, 3b\n\t" \
-	"nop", 0, CPUFEATURE_ZICBOM, CONFIG_RISCV_ISA_ZICBOM, \
+	"nop", 0, CPUFEATURE_ZICBOM, 0, CONFIG_RISCV_ISA_ZICBOM, \
 	"mv a0, %1\n\t" \
 	"j 2f\n\t" \
 	"3:\n\t" \
@@ -139,7 +139,7 @@
 	"2:\n\t" \
 	"bltu a0, %2, 3b\n\t" \
 	THEAD_SYNC_S, THEAD_VENDOR_ID, \
-		ERRATA_THEAD_CMO, CONFIG_ERRATA_THEAD_CMO) \
+		ERRATA_THEAD_CMO, 0, CONFIG_ERRATA_THEAD_CMO) \
 	: : "r"(_cachesize), \
 	    "r"((unsigned long)(_start) & ~((_cachesize) - 1UL)), \
 	    "r"((unsigned long)(_start) + (_size)) \
@@ -152,7 +152,7 @@ asm volatile(ALTERNATIVE_2( \
 asm volatile(ALTERNATIVE( \
 	"csrr %0, " __stringify(CSR_SSCOUNTOVF), \
 	"csrr %0, " __stringify(THEAD_C9XX_CSR_SCOUNTEROF), \
-	THEAD_VENDOR_ID, ERRATA_THEAD_PMU, \
+	THEAD_VENDOR_ID, ERRATA_THEAD_PMU, 0, \
 	CONFIG_ERRATA_THEAD_PMU) \
 	: "=r" (__ovl) : \
 	: "memory")
diff --git a/arch/riscv/kernel/cpufeature.c b/arch/riscv/kernel/cpufeature.c
index 8c83bd9d0e22..a65bebdadb68 100644
--- a/arch/riscv/kernel/cpufeature.c
+++ b/arch/riscv/kernel/cpufeature.c
@@ -377,7 +377,8 @@ void __init_or_module riscv_cpufeature_patch_func(struct alt_entry *begin,
 			continue;
 		}
 
-		if ((cpu_req_feature & alt->errata_id) == alt->errata_id) {
+		if ((cpu_req_feature & alt->errata_id) == alt->errata_id &&
+		    !(cpu_req_feature & alt->errata_not)) {
 			patch_text_nosync(alt->old_ptr, alt->alt_ptr, alt->alt_len);
 			riscv_alternative_fix_offsets(alt->old_ptr, alt->alt_len,
 						      alt->old_ptr - alt->alt_ptr);
diff --git a/arch/riscv/lib/strcmp.S b/arch/riscv/lib/strcmp.S
index 8148b6418f61..ce85bbbee4b9 100644
--- a/arch/riscv/lib/strcmp.S
+++ b/arch/riscv/lib/strcmp.S
@@ -9,7 +9,7 @@
 
 /* int strcmp(const char *cs, const char *ct) */
 SYM_FUNC_START(strcmp)
-	ALTERNATIVE("nop", "j strcmp_zbb", 0, CPUFEATURE_ZBB, CONFIG_RISCV_ISA_ZBB)
+	ALTERNATIVE("nop", "j strcmp_zbb", 0, CPUFEATURE_ZBB, 0, CONFIG_RISCV_ISA_ZBB)
 
 	/*
 	 * Returns
diff --git a/arch/riscv/lib/strlen.S b/arch/riscv/lib/strlen.S
index 0f9dbf93301a..8fdd53a734b4 100644
--- a/arch/riscv/lib/strlen.S
+++ b/arch/riscv/lib/strlen.S
@@ -9,7 +9,7 @@
 
 /* int strlen(const char *s) */
 SYM_FUNC_START(strlen)
-	ALTERNATIVE("nop", "j strlen_zbb", 0, CPUFEATURE_ZBB, CONFIG_RISCV_ISA_ZBB)
+	ALTERNATIVE("nop", "j strlen_zbb", 0, CPUFEATURE_ZBB, 0, CONFIG_RISCV_ISA_ZBB)
 
 	/*
 	 * Returns
diff --git a/arch/riscv/lib/strncmp.S b/arch/riscv/lib/strncmp.S
index 7940ddab2d48..e46ad168f1e4 100644
--- a/arch/riscv/lib/strncmp.S
+++ b/arch/riscv/lib/strncmp.S
@@ -9,7 +9,7 @@
 
 /* int strncmp(const char *cs, const char *ct, size_t count) */
 SYM_FUNC_START(strncmp)
-	ALTERNATIVE("nop", "j strncmp_zbb", 0, CPUFEATURE_ZBB, CONFIG_RISCV_ISA_ZBB)
+	ALTERNATIVE("nop", "j strncmp_zbb", 0, CPUFEATURE_ZBB, 0, CONFIG_RISCV_ISA_ZBB)
 
 	/*
 	 * Returns

From patchwork Fri Jan 13 21:23:50 2023
X-Patchwork-Submitter: Heiko Stübner
X-Patchwork-Id: 13101668
X-Patchwork-Delegate: palmer@dabbelt.com
From: Heiko Stuebner
To: linux-riscv@lists.infradead.org, palmer@dabbelt.com
Cc: christoph.muellner@vrull.eu, conor@kernel.org, philipp.tomsich@vrull.eu, ajones@ventanamicro.com, heiko@sntech.de, jszhang@kernel.org, Heiko Stuebner
Subject: [PATCH 3/4] RISC-V: add cpufeature probing for fast-unaligned access
Date: Fri, 13 Jan 2023 22:23:50 +0100
Message-Id: <20230113212351.3534769-4-heiko@sntech.de>
In-Reply-To: <20230113212351.3534769-1-heiko@sntech.de>
References: <20230113212351.3534769-1-heiko@sntech.de>

From: Heiko Stuebner

Use the recently added misaligned-access speed descriptor and derive a cpufeature id from it, so that it can be used in alternative patches. We assume slow unaligned access if any cpu core does not report fast access.
Signed-off-by: Heiko Stuebner
---
 arch/riscv/include/asm/errata_list.h | 9 +++++----
 arch/riscv/kernel/cpufeature.c | 20 ++++++++++++++++++++
 2 files changed, 25 insertions(+), 4 deletions(-)

diff --git a/arch/riscv/include/asm/errata_list.h b/arch/riscv/include/asm/errata_list.h
index 043b79c79824..6ce0c22ae994 100644
--- a/arch/riscv/include/asm/errata_list.h
+++ b/arch/riscv/include/asm/errata_list.h
@@ -22,10 +22,11 @@
 #define ERRATA_THEAD_NUMBER 3
 #endif
 
-#define CPUFEATURE_SVPBMT (1 << 0)
-#define CPUFEATURE_ZICBOM (1 << 1)
-#define CPUFEATURE_ZBB (1 << 2)
-#define CPUFEATURE_NUMBER 3
+#define CPUFEATURE_SVPBMT		(1 << 0)
+#define CPUFEATURE_ZICBOM		(1 << 1)
+#define CPUFEATURE_ZBB			(1 << 2)
+#define CPUFEATURE_FAST_UNALIGNED	(1 << 3)
+#define CPUFEATURE_NUMBER		4
 
 #ifdef __ASSEMBLY__
 
diff --git a/arch/riscv/kernel/cpufeature.c b/arch/riscv/kernel/cpufeature.c
index a65bebdadb68..640b78f6aaa9 100644
--- a/arch/riscv/kernel/cpufeature.c
+++ b/arch/riscv/kernel/cpufeature.c
@@ -337,6 +337,23 @@ static bool __init_or_module cpufeature_probe_zbb(unsigned int stage)
 	return true;
 }
 
+static bool __init_or_module cpufeature_probe_fast_unaligned(unsigned int stage)
+{
+	int cpu;
+
+	if (stage == RISCV_ALTERNATIVES_EARLY_BOOT)
+		return false;
+
+	for_each_possible_cpu(cpu) {
+		long perf = per_cpu(misaligned_access_speed, cpu);
+
+		if (perf != RISCV_HWPROBE_MISALIGNED_FAST)
+			return false;
+	}
+
+	return true;
+}
+
 /*
  * Probe presence of individual extensions.
  *
@@ -358,6 +375,9 @@ static u32 __init_or_module cpufeature_probe(unsigned int stage)
 	if (cpufeature_probe_zbb(stage))
 		cpu_req_feature |= CPUFEATURE_ZBB;
 
+	if (cpufeature_probe_fast_unaligned(stage))
+		cpu_req_feature |= CPUFEATURE_FAST_UNALIGNED;
+
 	return cpu_req_feature;
 }

From patchwork Fri Jan 13 21:23:51 2023
X-Patchwork-Submitter: Heiko Stübner
X-Patchwork-Id: 13101669
X-Patchwork-Delegate: palmer@dabbelt.com
From: Heiko Stuebner
To: linux-riscv@lists.infradead.org, palmer@dabbelt.com
Cc: christoph.muellner@vrull.eu, conor@kernel.org, philipp.tomsich@vrull.eu, ajones@ventanamicro.com, heiko@sntech.de, jszhang@kernel.org, Heiko Stuebner
Subject: [PATCH 4/4] RISC-V: add strcmp variant using zbb and fast-unaligned access
Date: Fri, 13 Jan 2023 22:23:51 +0100
Message-Id: <20230113212351.3534769-5-heiko@sntech.de>
In-Reply-To: <20230113212351.3534769-1-heiko@sntech.de>
References: <20230113212351.3534769-1-heiko@sntech.de>

From: Heiko Stuebner

On cores whose hardware handles unaligned access fast, some more optimizations are possible, so add a second strcmp variant for that case.
Signed-off-by: Heiko Stuebner
---
 arch/riscv/lib/strcmp.S | 170 +++++++++++++++++++++++++++++++++++++++-
 1 file changed, 169 insertions(+), 1 deletion(-)

diff --git a/arch/riscv/lib/strcmp.S b/arch/riscv/lib/strcmp.S
index ce85bbbee4b9..53f41d032aae 100644
--- a/arch/riscv/lib/strcmp.S
+++ b/arch/riscv/lib/strcmp.S
@@ -9,7 +9,13 @@
 
 /* int strcmp(const char *cs, const char *ct) */
 SYM_FUNC_START(strcmp)
-	ALTERNATIVE("nop", "j strcmp_zbb", 0, CPUFEATURE_ZBB, 0, CONFIG_RISCV_ISA_ZBB)
+	ALTERNATIVE_2("nop",
+		      "j strcmp_zbb_unaligned", 0,
+		      CPUFEATURE_ZBB | CPUFEATURE_FAST_UNALIGNED, 0,
+		      CONFIG_RISCV_ISA_ZBB,
+		      "j strcmp_zbb", 0,
+		      CPUFEATURE_ZBB, CPUFEATURE_FAST_UNALIGNED,
+		      CONFIG_RISCV_ISA_ZBB)
 
 	/*
 	 * Returns
@@ -116,6 +122,168 @@ strcmp_zbb:
 	sub	a0, t0, t1
 	ret
 
+strcmp_zbb_unaligned:
+
+	/*
+	 * Returns
+	 *   a0 - comparison result, value like strcmp
+	 *
+	 * Parameters
+	 *   a0 - string1
+	 *   a1 - string2
+	 *
+	 * Clobbers
+	 *   a3, a4, a5, a6, a7, t0, t1, t2, t3, t4, t5
+	 */
+
+#if __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
+# error big endian is untested!
+# define CZ	ctz
+# define SHIFT	srl
+# define SHIFT2	sll
+#else
+# define CZ	ctz
+# define SHIFT	sll
+# define SHIFT2	srl
+#endif
+
+	/* a3...delta from a0 to a1. */
+	sub	a3, a1, a0
+	li	a4, -1
+	andi	a7, a3, SZREG-1
+	andi	a5, a0, SZREG-1
+	bnez	a7, 7f
+	bnez	a5, 6f
+
+	.p2align 4
+1:
+	REG_L	t0, 0(a0)
+	add	a7, a0, a3
+	addi	a0, a0, SZREG
+	REG_L	t1, 0(a7)
+
+2:
+	orc.b	t3, t0
+	bne	t3, a4, 4f
+	beq	t0, t1, 1b
+
+	/* Words don't match, and no NUL byte in one word.
+	   Get bytes in big-endian order and compare as words. */
+#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
+	rev8	t0, t0
+	rev8	t1, t1
+#endif
+	/* Synthesize (t0 >= t1) ? 1 : -1 in a branchless sequence. */
+	sltu	a0, t0, t1
+	neg	a0, a0
+	ori	a0, a0, 1
+	ret
+
+3:
+	orc.b	t3, t0
+4:
+	/* Words don't match or NUL byte in at least one word.
+	   t3 holds orc.b value of t0. */
+	xor	a7, t0, t1
+	orc.b	a7, a7
+
+	orn	a7, a7, t3
+	CZ	t5, a7
+
+#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
+	rev8	t0, t0
+	rev8	t1, t1
+#endif
+	sll	t0, t0, t5
+	sll	t1, t1, t5
+	srl	t0, t0, SZREG*8-8
+	srl	t1, t1, SZREG*8-8
+
+5:
+	sub	a0, t0, t1
+	ret
+
+	.p2align 4
+6:
+	/* Sources are mutually aligned, but are not currently at an
+	   alignment boundary. Round down the addresses and then mask off
+	   the bytes that precede the start point. */
+	andi	a0, a0, -SZREG
+	add	a7, a0, a3
+	REG_L	t0, 0(a0)
+	addi	a0, a0, SZREG
+	REG_L	t1, 0(a7)
+	/* Get number of bits to mask. */
+	sll	t5, a1, 3
+	/* Bits to mask are now 0, others are 1. */
+	SHIFT	a7, a4, t5
+	/* Or with inverted value -> masked bits become 1. */
+	orn	t0, t0, a7
+	orn	t1, t1, a7
+	j	2b
+
+7:
+	/* Skip slow loop if a0 is aligned. */
+	beqz	a5, 9f
+8:
+	/* Align a0 to 8 bytes. */
+	lbu	t0, 0(a0)
+	lbu	t1, 0(a1)
+	beqz	t0, 5b
+	bne	t0, t1, 5b
+	addi	a0, a0, 1
+	addi	a1, a1, 1
+	andi	a5, a0, SZREG-1
+	bnez	a5, 8b
+
+9:
+	/* a0 is aligned. Align a1 down and check for NUL there.
+	 * If there is no NUL, we may read the next word from a1.
+	 * If there is a NUL, we must not read a complete word from a1
+	 * because we might cross a page boundary. */
+	/* Get number of bits to mask (upper bits are ignored by shifts). */
+	sll	t5, a1, 3
+	/* a6 := align_down (a1) */
+	andi	a6, a1, -SZREG
+	REG_L	t2, 0(a6)
+	addi	a6, a6, SZREG
+
+	/* Bits to mask are now 0, others are 1. */
+	SHIFT	a7, a4, t5
+	/* Or with inverted value -> masked bits become 1. */
+	orn	t4, t2, a7
+	/* Check for NUL in next aligned word. */
+	orc.b	t4, t4
+	bne	t4, a4, 11f
+
+	.p2align 4
+10:
+	/* Read the (aligned) t0 and the unaligned t1. */
+	REG_L	t0, 0(a0)
+	addi	a0, a0, SZREG
+	REG_L	t1, 0(a1)
+	addi	a1, a1, SZREG
+	orc.b	t3, t0
+	bne	t3, a4, 4b
+	bne	t0, t1, 4b
+
+	/* Read the next aligned-down word. */
+	REG_L	t2, 0(a6)
+	addi	a6, a6, SZREG
+	orc.b	t4, t2
+	beq	t4, a4, 10b
+
+11:
+	/* a0 points to unread word (only first bytes relevant).
+	 * t2 holds next aligned-down word with NUL.
+	 * Compare the first bytes of t0 with the last bytes of t2. */
+	REG_L	t0, 0(a0)
+	/* Shift NUL bytes into t2 to become t1. */
+	SHIFT2	t1, t2, t5
+	bne	t0, t1, 3b
+	li	a0, 0
+	ret
+
 	.option pop
 #endif
 SYM_FUNC_END(strcmp)