From patchwork Mon Jan 9 18:17:51 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Heiko Stuebner X-Patchwork-Id: 13094086 X-Patchwork-Delegate: palmer@dabbelt.com Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id D1A6EC54EBD for ; Mon, 9 Jan 2023 18:27:12 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20210309; h=Sender: Content-Transfer-Encoding:Content-Type:List-Subscribe:List-Help:List-Post: List-Archive:List-Unsubscribe:List-Id:MIME-Version:References:In-Reply-To: Message-Id:Date:Subject:Cc:To:From:Reply-To:Content-ID:Content-Description: Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID: List-Owner; bh=FowlnAXeDZch+VlKxZQoi83Jaepqky4ENviiIKF0l7c=; b=X62CfTzTKKs9Zw 2GZ6SUevyFKs1wbZBZOwuvb18P24k0l3PwNEp3kbLXeJ2tE4pg080JJ8OPD6Z0H3bjRWZNDLj2h6B 28uKgD022viEnlkaUbSsYVDjii3L+fJjyUPYvg9z5ilUh3vZxqRMinrmebEvudm0/w2+UDUts3EPQ Z9WammT9Su54JjcgCT9i/F8D3lng6bW6dcnz6cU9sHwCPC/JFC8iFD5VwQdNKRiLzd2xsvm0rnDi2 1CBlqvBKcRUpRGUAElhvyqn1S4fhzH3ElFb4WGOBV3kIdjw7XMxdqlEYouQQ0rIJmz1v6j7+b5Hih JLa7pcmHYCauQB6XYLIw==; Received: from localhost ([::1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.94.2 #2 (Red Hat Linux)) id 1pEwrL-003HFj-M3; Mon, 09 Jan 2023 18:26:59 +0000 Received: from desiato.infradead.org ([2001:8b0:10b:1:d65d:64ff:fe57:4e05]) by bombadil.infradead.org with esmtps (Exim 4.94.2 #2 (Red Hat Linux)) id 1pEwio-003DW1-UJ for linux-riscv@bombadil.infradead.org; Mon, 09 Jan 2023 18:18:11 +0000 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=desiato.20200630; h=Content-Transfer-Encoding:MIME-Version :References:In-Reply-To:Message-Id:Date:Subject:Cc:To:From:Sender:Reply-To: Content-Type:Content-ID:Content-Description; bh=fYtHf1WMpqGCMN6ig5IFLz8FjZsLpTc5h53lmE7xqq4=; b=KACl+gTJ8Rgoc78DeDdyF4SjdD MUrAyDWN/8TvD/2JmZGhsTpQVWWQkvkdjCCLfnox3FPobdFKRTRL2Ekd1T0dqmOwm0IUiNTJzPcmk SaqIbP5zWLLo9FSZ7NNQxnTM4tzREAWPkb4as+FywyveRcaHY6DPeJ5WxCKeh4oa3H1yHQmzHwT8F YGdLKka4/5v1MK64U9ABUUjxtPFR7kQ4Dv82LRTCXj1fQh4RyQFFI2ULgZkx/4BUqRMlOB65P9jvF NPY75BNNqDRbCNqL3G8JOYlO1BKWMrIrih2EKJ3QstFRO+hw7Mzg/kopkuMmdx8zrkoJBDzXvRSQG 8HyAPXqQ==; Received: from gloria.sntech.de ([185.11.138.130]) by desiato.infradead.org with esmtps (Exim 4.96 #2 (Red Hat Linux)) id 1pEwib-002p21-1I for linux-riscv@lists.infradead.org; Mon, 09 Jan 2023 18:18:01 +0000 Received: from wf0783.dip.tu-dresden.de ([141.76.183.15] helo=phil.dip.tu-dresden.de) by gloria.sntech.de with esmtpsa (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1pEwib-0005hU-CS; Mon, 09 Jan 2023 19:17:57 +0100 From: Heiko Stuebner To: linux-riscv@lists.infradead.org, palmer@dabbelt.com Cc: christoph.muellner@vrull.eu, conor@kernel.org, philipp.tomsich@vrull.eu, ajones@ventanamicro.com, heiko@sntech.de, jszhang@kernel.org, Heiko Stuebner Subject: [PATCH v4 1/5] RISC-V: move some stray __RISCV_INSN_FUNCS definitions from kprobes Date: Mon, 9 Jan 2023 19:17:51 +0100 Message-Id: <20230109181755.2383085-2-heiko@sntech.de> X-Mailer: git-send-email 2.35.1 In-Reply-To: <20230109181755.2383085-1-heiko@sntech.de> References: <20230109181755.2383085-1-heiko@sntech.de> MIME-Version: 1.0 X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20230109_181757_506717_6306E878 X-CRM114-Status: GOOD ( 12.79 ) X-BeenThere: linux-riscv@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: "linux-riscv" Errors-To: linux-riscv-bounces+linux-riscv=archiver.kernel.org@lists.infradead.org From: Heiko Stuebner The __RISCV_INSN_FUNCS originally declared riscv_insn_is_* functions inside the kprobes implementation. This got moved into a central header in commit ec5f90877516 ("RISC-V: Move riscv_insn_is_* macros into a common header"). Though it looks like I overlooked two of them, so fix that. FENCE itself is an instruction defined directly by its own opcode, while the created riscv_isn_is_system function covers all instructions defined under the SYSTEM opcode. Fixes: ec5f90877516 ("RISC-V: Move riscv_insn_is_* macros into a common header") Signed-off-by: Heiko Stuebner Reviewed-by: Conor Dooley Reviewed-by: Andrew Jones --- arch/riscv/include/asm/insn.h | 10 ++++++++++ arch/riscv/kernel/probes/simulate-insn.h | 3 --- 2 files changed, 10 insertions(+), 3 deletions(-) diff --git a/arch/riscv/include/asm/insn.h b/arch/riscv/include/asm/insn.h index 98453535324a..0455b4dcb0a7 100644 --- a/arch/riscv/include/asm/insn.h +++ b/arch/riscv/include/asm/insn.h @@ -128,6 +128,7 @@ #define RVC_C2_RD_OPOFF 7 /* parts of opcode for RVG*/ +#define RVG_OPCODE_FENCE 0x0f #define RVG_OPCODE_AUIPC 0x17 #define RVG_OPCODE_BRANCH 0x63 #define RVG_OPCODE_JALR 0x67 @@ -163,6 +164,7 @@ #define RVG_MATCH_AUIPC (RVG_OPCODE_AUIPC) #define RVG_MATCH_JALR (RV_ENCODE_FUNCT3(JALR) | RVG_OPCODE_JALR) #define RVG_MATCH_JAL (RVG_OPCODE_JAL) +#define RVG_MATCH_FENCE (RVG_OPCODE_FENCE) #define RVG_MATCH_BEQ (RV_ENCODE_FUNCT3(BEQ) | RVG_OPCODE_BRANCH) #define RVG_MATCH_BNE (RV_ENCODE_FUNCT3(BNE) | RVG_OPCODE_BRANCH) #define RVG_MATCH_BLT (RV_ENCODE_FUNCT3(BLT) | RVG_OPCODE_BRANCH) @@ -182,6 +184,7 @@ #define RVG_MASK_AUIPC (RV_INSN_OPCODE_MASK) #define RVG_MASK_JALR (RV_INSN_FUNCT3_MASK | RV_INSN_OPCODE_MASK) #define RVG_MASK_JAL (RV_INSN_OPCODE_MASK) +#define RVG_MASK_FENCE (RV_INSN_OPCODE_MASK) #define RVC_MASK_C_JALR (RVC_INSN_FUNCT4_MASK | RVC_INSN_J_RS2_MASK | RVC_INSN_OPCODE_MASK) #define RVC_MASK_C_JR (RVC_INSN_FUNCT4_MASK | RVC_INSN_J_RS2_MASK | RVC_INSN_OPCODE_MASK) #define RVC_MASK_C_JAL (RVC_INSN_FUNCT3_MASK | RVC_INSN_OPCODE_MASK) @@ -233,6 +236,13 @@ __RISCV_INSN_FUNCS(c_bnez, RVC_MASK_C_BNEZ, RVC_MATCH_C_BNEZ) __RISCV_INSN_FUNCS(c_ebreak, RVC_MASK_C_EBREAK, RVC_MATCH_C_EBREAK) __RISCV_INSN_FUNCS(ebreak, RVG_MASK_EBREAK, RVG_MATCH_EBREAK) __RISCV_INSN_FUNCS(sret, RVG_MASK_SRET, RVG_MATCH_SRET) +__RISCV_INSN_FUNCS(fence, RVG_MASK_FENCE, RVG_MATCH_FENCE); + +/* special case to catch _any_ system instruction */ +static __always_inline bool riscv_insn_is_system(u32 code) +{ + return (code & RV_INSN_OPCODE_MASK) == RVG_OPCODE_SYSTEM; +} /* special case to catch _any_ branch instruction */ static __always_inline bool riscv_insn_is_branch(u32 code) diff --git a/arch/riscv/kernel/probes/simulate-insn.h b/arch/riscv/kernel/probes/simulate-insn.h index a19aaa0feb44..61e35db31001 100644 --- a/arch/riscv/kernel/probes/simulate-insn.h +++ b/arch/riscv/kernel/probes/simulate-insn.h @@ -12,9 +12,6 @@ } \ } while (0) -__RISCV_INSN_FUNCS(system, 0x7f, 0x73); -__RISCV_INSN_FUNCS(fence, 0x7f, 0x0f); - #define RISCV_INSN_SET_SIMULATE(name, code) \ do { \ if (riscv_insn_is_##name(code)) { \ From patchwork Mon Jan 9 18:17:52 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Heiko Stuebner X-Patchwork-Id: 13094084 X-Patchwork-Delegate: palmer@dabbelt.com Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 34A3CC54EBD for ; Mon, 9 Jan 2023 18:26:47 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20210309; h=Sender: Content-Transfer-Encoding:Content-Type:List-Subscribe:List-Help:List-Post: List-Archive:List-Unsubscribe:List-Id:MIME-Version:References:In-Reply-To: Message-Id:Date:Subject:Cc:To:From:Reply-To:Content-ID:Content-Description: Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID: List-Owner; bh=NlwYvyJHzAQZShwMWlgs7BmzDjIpEalDIHT/QC2Im+g=; b=sUMYHov7swB+pe wNmzS/jlv1cJOHiaSfhAiwz09OQL9J8R8t9JIPB5nG19dDh14ewSoleKDEW3c485gEIUvhFHpV6KF pDh4whRdVnUZwhajPcjv5CeXM1/cUnW3faRCKTdUr3K7ppnBSpwCV8uAhYjUk40Buxjnrv3W835Xt DTDxft8r7eXlUAwy7j+Vf4POKv6Iw6B2xiC9orFp5mzglZs3D5x+VDwVLS9bA1gKkpQGbsTNQcHli n97G7BZFZISETpmImNJ9Jnyvz3QDXcm1fq8+D6mEROnF+XV40tZucKwyKi8Px5E3diJlxvX37ejij b2S/cpJ3qQCgjQYwt0Iw==; Received: from localhost ([::1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.94.2 #2 (Red Hat Linux)) id 1pEwr1-003H3o-4U; Mon, 09 Jan 2023 18:26:39 +0000 Received: from gloria.sntech.de ([185.11.138.130]) by bombadil.infradead.org with esmtps (Exim 4.94.2 #2 (Red Hat Linux)) id 1pEwig-003DQl-Ct for linux-riscv@lists.infradead.org; Mon, 09 Jan 2023 18:18:05 +0000 Received: from wf0783.dip.tu-dresden.de ([141.76.183.15] helo=phil.dip.tu-dresden.de) by gloria.sntech.de with esmtpsa (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1pEwib-0005hU-Nl; Mon, 09 Jan 2023 19:17:57 +0100 From: Heiko Stuebner To: linux-riscv@lists.infradead.org, palmer@dabbelt.com Cc: christoph.muellner@vrull.eu, conor@kernel.org, philipp.tomsich@vrull.eu, ajones@ventanamicro.com, heiko@sntech.de, jszhang@kernel.org, Heiko Stuebner Subject: [PATCH v4 2/5] RISC-V: add helpers for J-type immediate handling Date: Mon, 9 Jan 2023 19:17:52 +0100 Message-Id: <20230109181755.2383085-3-heiko@sntech.de> X-Mailer: git-send-email 2.35.1 In-Reply-To: <20230109181755.2383085-1-heiko@sntech.de> References: <20230109181755.2383085-1-heiko@sntech.de> MIME-Version: 1.0 X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20230109_101802_529996_2255D18D X-CRM114-Status: GOOD ( 11.51 ) X-BeenThere: linux-riscv@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: "linux-riscv" Errors-To: linux-riscv-bounces+linux-riscv=archiver.kernel.org@lists.infradead.org From: Heiko Stuebner Similar to the helper for the u-type + i-type imm handling, add helper- functions for j-type immediates. While it also would be possible to open-code that bit of imm wiggling using the macros already provided in insn.h, it's way better to have consistency on how we handle this across all function encoding/decoding. Signed-off-by: Heiko Stuebner --- arch/riscv/include/asm/insn.h | 26 ++++++++++++++++++++++++++ 1 file changed, 26 insertions(+) diff --git a/arch/riscv/include/asm/insn.h b/arch/riscv/include/asm/insn.h index 0455b4dcb0a7..9eea61a3028f 100644 --- a/arch/riscv/include/asm/insn.h +++ b/arch/riscv/include/asm/insn.h @@ -301,6 +301,32 @@ static __always_inline bool riscv_insn_is_branch(u32 code) (RVC_X(x_, RVC_B_IMM_7_6_OPOFF, RVC_B_IMM_7_6_MASK) << RVC_B_IMM_7_6_OFF) | \ (RVC_IMM_SIGN(x_) << RVC_B_IMM_SIGN_OFF); }) +/* + * Get the immediate from a J-type instruction. + * + * @insn: instruction to process + * Return: immediate + */ +static inline s32 riscv_insn_extract_jtype_imm(u32 insn) +{ + return RV_EXTRACT_JTYPE_IMM(insn); +} + +/* + * Update a J-type instruction with an immediate value. + * + * @insn: pointer to the jtype instruction + * @imm: the immediate to insert into the instruction + */ +static inline void riscv_insn_insert_jtype_imm(u32 *insn, s32 imm) +{ + *insn &= ~GENMASK(31, 12); + *insn |= (((imm & (RV_J_IMM_10_1_MASK << RV_J_IMM_10_1_OFF)) << RV_I_IMM_11_0_OPOFF) | + ((imm & (RV_J_IMM_11_MASK << RV_J_IMM_11_OFF)) << RV_J_IMM_11_OPOFF) | + ((imm & (RV_J_IMM_19_12_OPOFF << RV_J_IMM_19_12_OFF)) << RV_J_IMM_19_12_OPOFF) | + ((imm & (1 << RV_J_IMM_SIGN_OFF)) << RV_J_IMM_SIGN_OPOFF)); +} + /* * Put together one immediate from a U-type and I-type instruction pair. * From patchwork Mon Jan 9 18:17:53 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Heiko Stuebner X-Patchwork-Id: 13094081 X-Patchwork-Delegate: palmer@dabbelt.com Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 56282C5479D for ; Mon, 9 Jan 2023 18:26:03 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20210309; h=Sender: Content-Transfer-Encoding:Content-Type:List-Subscribe:List-Help:List-Post: List-Archive:List-Unsubscribe:List-Id:MIME-Version:References:In-Reply-To: Message-Id:Date:Subject:Cc:To:From:Reply-To:Content-ID:Content-Description: Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID: List-Owner; bh=VUyLNfiG3aMvxmwesb9lQdygwPABSWO/Mj48vnL0Bks=; b=gwNg8XfGbVwvdR D9WAQdwI4I9KvWvTOSu950IamzGS7Yj924GctQLLBBVn4FxkPs4/r+CzjoTQX0znb5v11fbcSFkQS jtRvYX0tav+6MERTxmS3TG5Sdk/AqLQ9gCfntsH/EqIOrxE4u+D3l7LKTq7xtSGPQT/FLDTkHH0GU aqfgnc6Q06/zJvScZDcqtszp7AQsmSzkuj5aYQx8DkzT3FbXIHxLN9+FJu0CKOUEU8DMfJhAcrFyy k1SZ4YHMBAhzLqOjgiQy9nz1/bDZPXUR24ahGMP82Y/2S+yp5bMV+WjjUySH5al1ORqKWdzvFYGNO sM/da+TQoOoX//i2eGoA==; Received: from localhost ([::1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.94.2 #2 (Red Hat Linux)) id 1pEwqH-003GgQ-Ir; Mon, 09 Jan 2023 18:25:53 +0000 Received: from gloria.sntech.de ([185.11.138.130]) by bombadil.infradead.org with esmtps (Exim 4.94.2 #2 (Red Hat Linux)) id 1pEwig-003DQq-Bt for linux-riscv@lists.infradead.org; Mon, 09 Jan 2023 18:18:03 +0000 Received: from wf0783.dip.tu-dresden.de ([141.76.183.15] helo=phil.dip.tu-dresden.de) by gloria.sntech.de with esmtpsa (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1pEwic-0005hU-2g; Mon, 09 Jan 2023 19:17:58 +0100 From: Heiko Stuebner To: linux-riscv@lists.infradead.org, palmer@dabbelt.com Cc: christoph.muellner@vrull.eu, conor@kernel.org, philipp.tomsich@vrull.eu, ajones@ventanamicro.com, heiko@sntech.de, jszhang@kernel.org, Heiko Stuebner Subject: [PATCH v4 3/5] RISC-V: fix jal addresses in patched alternatives Date: Mon, 9 Jan 2023 19:17:53 +0100 Message-Id: <20230109181755.2383085-4-heiko@sntech.de> X-Mailer: git-send-email 2.35.1 In-Reply-To: <20230109181755.2383085-1-heiko@sntech.de> References: <20230109181755.2383085-1-heiko@sntech.de> MIME-Version: 1.0 X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20230109_101802_455041_94BBC7B0 X-CRM114-Status: GOOD ( 13.43 ) X-BeenThere: linux-riscv@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: "linux-riscv" Errors-To: linux-riscv-bounces+linux-riscv=archiver.kernel.org@lists.infradead.org From: Heiko Stuebner Alternatives live in a different section, so addresses used by jal instructions will point to wrong locations after the patch got applied. Similar to arm64, adjust the location to consider that offset. Signed-off-by: Heiko Stuebner Reviewed-by: Andrew Jones --- arch/riscv/kernel/alternative.c | 27 +++++++++++++++++++++++++++ 1 file changed, 27 insertions(+) diff --git a/arch/riscv/kernel/alternative.c b/arch/riscv/kernel/alternative.c index 6212ea0eed72..985c284fe9f4 100644 --- a/arch/riscv/kernel/alternative.c +++ b/arch/riscv/kernel/alternative.c @@ -79,6 +79,21 @@ static void riscv_alternative_fix_auipc_jalr(void *ptr, u32 auipc_insn, patch_text_nosync(ptr, call, sizeof(u32) * 2); } +static void riscv_alternative_fix_jal(void *ptr, u32 jal_insn, int patch_offset) +{ + s32 imm; + + /* get and adjust new target address */ + imm = riscv_insn_extract_jtype_imm(jal_insn); + imm -= patch_offset; + + /* update instructions */ + riscv_insn_insert_jtype_imm(&jal_insn, imm); + + /* patch the call place again */ + patch_text_nosync(ptr, &jal_insn, sizeof(u32)); +} + void riscv_alternative_fix_offsets(void *alt_ptr, unsigned int len, int patch_offset) { @@ -106,6 +121,18 @@ void riscv_alternative_fix_offsets(void *alt_ptr, unsigned int len, riscv_alternative_fix_auipc_jalr(alt_ptr + i * sizeof(u32), insn, insn2, patch_offset); } + + if (riscv_insn_is_jal(insn)) { + s32 imm = riscv_insn_extract_jtype_imm(insn); + + /* don't modify jumps inside the alternative block */ + if ((alt_ptr + i * sizeof(u32) + imm) >= alt_ptr && + (alt_ptr + i * sizeof(u32) + imm) < (alt_ptr + len)) + continue; + + riscv_alternative_fix_jal(alt_ptr + i * sizeof(u32), + insn, patch_offset); + } } } From patchwork Mon Jan 9 18:17:54 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Heiko Stuebner X-Patchwork-Id: 13094083 X-Patchwork-Delegate: palmer@dabbelt.com Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 0F9EFC61DB3 for ; Mon, 9 Jan 2023 18:26:28 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20210309; h=Sender: Content-Transfer-Encoding:Content-Type:List-Subscribe:List-Help:List-Post: List-Archive:List-Unsubscribe:List-Id:MIME-Version:References:In-Reply-To: Message-Id:Date:Subject:Cc:To:From:Reply-To:Content-ID:Content-Description: Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID: List-Owner; bh=BnciTdIAlyax4635b8KlMnwm1gze/XlEZFLaqXQcdm8=; b=AqR3RUG5j0PelJ ew0znZz2KPr78wmOlnsZN2qrTXsPZueigRZlCNw2fn4IFzoCaBKV2XDi9D0Gcfaj8W4q6o87vRSRq K53rvEZy5vDzTW8njtotVWN811PwA2KVLSsSqw7Zg0sA0xiwvko58CWtr+TH0mcoTLKV3J+cg0QUN 6ruDAmr0ljXQLE90TuVLj0itnDPOE+oukvMi++tlr8jS2xtqlzZV+NqtHeW6815pC9ezD41dIZxD7 blWXGXiRem9vmq4M9V4BfhnAA1/CM/FR++NawFw48F13wHgrHU/CvLt8L71KL7VbohQL3CzZCYjRy NwEJ86XINIyTy9t+zDGg==; Received: from localhost ([::1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.94.2 #2 (Red Hat Linux)) id 1pEwqg-003GsP-M0; Mon, 09 Jan 2023 18:26:18 +0000 Received: from gloria.sntech.de ([185.11.138.130]) by bombadil.infradead.org with esmtps (Exim 4.94.2 #2 (Red Hat Linux)) id 1pEwig-003DQw-D6 for linux-riscv@lists.infradead.org; Mon, 09 Jan 2023 18:18:06 +0000 Received: from wf0783.dip.tu-dresden.de ([141.76.183.15] helo=phil.dip.tu-dresden.de) by gloria.sntech.de with esmtpsa (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1pEwic-0005hU-FF; Mon, 09 Jan 2023 19:17:58 +0100 From: Heiko Stuebner To: linux-riscv@lists.infradead.org, palmer@dabbelt.com Cc: christoph.muellner@vrull.eu, conor@kernel.org, philipp.tomsich@vrull.eu, ajones@ventanamicro.com, heiko@sntech.de, jszhang@kernel.org, Heiko Stuebner Subject: [PATCH v4 4/5] RISC-V: add infrastructure to allow different str* implementations Date: Mon, 9 Jan 2023 19:17:54 +0100 Message-Id: <20230109181755.2383085-5-heiko@sntech.de> X-Mailer: git-send-email 2.35.1 In-Reply-To: <20230109181755.2383085-1-heiko@sntech.de> References: <20230109181755.2383085-1-heiko@sntech.de> MIME-Version: 1.0 X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20230109_101802_577129_F79C7DAE X-CRM114-Status: GOOD ( 17.20 ) X-BeenThere: linux-riscv@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: "linux-riscv" Errors-To: linux-riscv-bounces+linux-riscv=archiver.kernel.org@lists.infradead.org From: Heiko Stuebner Depending on supported extensions on specific RISC-V cores, optimized str* functions might make sense. This adds basic infrastructure to allow patching the function calls via alternatives later on. The Linux kernel provides standard implementations for string functions but when architectures want to extend them, they need to provide their own. The added generic string functions are done in assembler (taken from disassembling the main-kernel functions for now) to allow us to control the used registers and extend them with optimized variants. Signed-off-by: Heiko Stuebner Reviewed-by: Conor Dooley Reviewed-by: Andrew Jones --- arch/riscv/include/asm/string.h | 10 +++++++++ arch/riscv/kernel/riscv_ksyms.c | 3 +++ arch/riscv/lib/Makefile | 3 +++ arch/riscv/lib/strcmp.S | 37 ++++++++++++++++++++++++++++++ arch/riscv/lib/strlen.S | 28 +++++++++++++++++++++++ arch/riscv/lib/strncmp.S | 40 +++++++++++++++++++++++++++++++++ arch/riscv/purgatory/Makefile | 13 +++++++++++ 7 files changed, 134 insertions(+) create mode 100644 arch/riscv/lib/strcmp.S create mode 100644 arch/riscv/lib/strlen.S create mode 100644 arch/riscv/lib/strncmp.S diff --git a/arch/riscv/include/asm/string.h b/arch/riscv/include/asm/string.h index 909049366555..a96b1fea24fe 100644 --- a/arch/riscv/include/asm/string.h +++ b/arch/riscv/include/asm/string.h @@ -18,6 +18,16 @@ extern asmlinkage void *__memcpy(void *, const void *, size_t); #define __HAVE_ARCH_MEMMOVE extern asmlinkage void *memmove(void *, const void *, size_t); extern asmlinkage void *__memmove(void *, const void *, size_t); + +#define __HAVE_ARCH_STRCMP +extern asmlinkage int strcmp(const char *cs, const char *ct); + +#define __HAVE_ARCH_STRLEN +extern asmlinkage __kernel_size_t strlen(const char *); + +#define __HAVE_ARCH_STRNCMP +extern asmlinkage int strncmp(const char *cs, const char *ct, size_t count); + /* For those files which don't want to check by kasan. */ #if defined(CONFIG_KASAN) && !defined(__SANITIZE_ADDRESS__) #define memcpy(dst, src, len) __memcpy(dst, src, len) diff --git a/arch/riscv/kernel/riscv_ksyms.c b/arch/riscv/kernel/riscv_ksyms.c index 5ab1c7e1a6ed..a72879b4249a 100644 --- a/arch/riscv/kernel/riscv_ksyms.c +++ b/arch/riscv/kernel/riscv_ksyms.c @@ -12,6 +12,9 @@ EXPORT_SYMBOL(memset); EXPORT_SYMBOL(memcpy); EXPORT_SYMBOL(memmove); +EXPORT_SYMBOL(strcmp); +EXPORT_SYMBOL(strlen); +EXPORT_SYMBOL(strncmp); EXPORT_SYMBOL(__memset); EXPORT_SYMBOL(__memcpy); EXPORT_SYMBOL(__memmove); diff --git a/arch/riscv/lib/Makefile b/arch/riscv/lib/Makefile index 25d5c9664e57..6c74b0bedd60 100644 --- a/arch/riscv/lib/Makefile +++ b/arch/riscv/lib/Makefile @@ -3,6 +3,9 @@ lib-y += delay.o lib-y += memcpy.o lib-y += memset.o lib-y += memmove.o +lib-y += strcmp.o +lib-y += strlen.o +lib-y += strncmp.o lib-$(CONFIG_MMU) += uaccess.o lib-$(CONFIG_64BIT) += tishift.o diff --git a/arch/riscv/lib/strcmp.S b/arch/riscv/lib/strcmp.S new file mode 100644 index 000000000000..94440fb8390c --- /dev/null +++ b/arch/riscv/lib/strcmp.S @@ -0,0 +1,37 @@ +/* SPDX-License-Identifier: GPL-2.0-only */ + +#include +#include +#include + +/* int strcmp(const char *cs, const char *ct) */ +SYM_FUNC_START(strcmp) + /* + * Returns + * a0 - comparison result, value like strcmp + * + * Parameters + * a0 - string1 + * a1 - string2 + * + * Clobbers + * t0, t1, t2 + */ + mv t2, a1 +1: + lbu t1, 0(a0) + lbu t0, 0(a1) + addi a0, a0, 1 + addi a1, a1, 1 + beq t1, t0, 3f + li a0, 1 + bgeu t1, t0, 2f + li a0, -1 +2: + mv a1, t2 + ret +3: + bnez t1, 1b + li a0, 0 + j 2b +SYM_FUNC_END(strcmp) diff --git a/arch/riscv/lib/strlen.S b/arch/riscv/lib/strlen.S new file mode 100644 index 000000000000..09a7aaff26c8 --- /dev/null +++ b/arch/riscv/lib/strlen.S @@ -0,0 +1,28 @@ +/* SPDX-License-Identifier: GPL-2.0-only */ + +#include +#include +#include + +/* int strlen(const char *s) */ +SYM_FUNC_START(strlen) + /* + * Returns + * a0 - string length + * + * Parameters + * a0 - String to measure + * + * Clobbers: + * t0, t1 + */ + mv t1, a0 +1: + lbu t0, 0(t1) + bnez t0, 2f + sub a0, t1, a0 + ret +2: + addi t1, t1, 1 + j 1b +SYM_FUNC_END(strlen) diff --git a/arch/riscv/lib/strncmp.S b/arch/riscv/lib/strncmp.S new file mode 100644 index 000000000000..493ab6febcb2 --- /dev/null +++ b/arch/riscv/lib/strncmp.S @@ -0,0 +1,40 @@ +/* SPDX-License-Identifier: GPL-2.0-only */ + +#include +#include +#include + +/* int strncmp(const char *cs, const char *ct, size_t count) */ +SYM_FUNC_START(strncmp) + /* + * Returns + * a0 - comparison result, value like strncmp + * + * Parameters + * a0 - string1 + * a1 - string2 + * a2 - number of characters to compare + * + * Clobbers + * t0, t1, t2 + */ + li t0, 0 +1: + beq a2, t0, 4f + add t1, a0, t0 + add t2, a1, t0 + lbu t1, 0(t1) + lbu t2, 0(t2) + beq t1, t2, 3f + li a0, 1 + bgeu t1, t2, 2f + li a0, -1 +2: + ret +3: + addi t0, t0, 1 + bnez t1, 1b +4: + li a0, 0 + j 2b +SYM_FUNC_END(strncmp) diff --git a/arch/riscv/purgatory/Makefile b/arch/riscv/purgatory/Makefile index dd58e1d99397..d16bf715a586 100644 --- a/arch/riscv/purgatory/Makefile +++ b/arch/riscv/purgatory/Makefile @@ -2,6 +2,7 @@ OBJECT_FILES_NON_STANDARD := y purgatory-y := purgatory.o sha256.o entry.o string.o ctype.o memcpy.o memset.o +purgatory-y += strcmp.o strlen.o strncmp.o targets += $(purgatory-y) PURGATORY_OBJS = $(addprefix $(obj)/,$(purgatory-y)) @@ -18,6 +19,15 @@ $(obj)/memcpy.o: $(srctree)/arch/riscv/lib/memcpy.S FORCE $(obj)/memset.o: $(srctree)/arch/riscv/lib/memset.S FORCE $(call if_changed_rule,as_o_S) +$(obj)/strcmp.o: $(srctree)/arch/riscv/lib/strcmp.S FORCE + $(call if_changed_rule,as_o_S) + +$(obj)/strlen.o: $(srctree)/arch/riscv/lib/strlen.S FORCE + $(call if_changed_rule,as_o_S) + +$(obj)/strncmp.o: $(srctree)/arch/riscv/lib/strncmp.S FORCE + $(call if_changed_rule,as_o_S) + $(obj)/sha256.o: $(srctree)/lib/crypto/sha256.c FORCE $(call if_changed_rule,cc_o_c) @@ -77,6 +87,9 @@ CFLAGS_ctype.o += $(PURGATORY_CFLAGS) AFLAGS_REMOVE_entry.o += -Wa,-gdwarf-2 AFLAGS_REMOVE_memcpy.o += -Wa,-gdwarf-2 AFLAGS_REMOVE_memset.o += -Wa,-gdwarf-2 +AFLAGS_REMOVE_strcmp.o += -Wa,-gdwarf-2 +AFLAGS_REMOVE_strlen.o += -Wa,-gdwarf-2 +AFLAGS_REMOVE_strncmp.o += -Wa,-gdwarf-2 $(obj)/purgatory.ro: $(PURGATORY_OBJS) FORCE $(call if_changed,ld) From patchwork Mon Jan 9 18:17:55 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Heiko Stuebner X-Patchwork-Id: 13094085 X-Patchwork-Delegate: palmer@dabbelt.com Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 2F0AFC5479D for ; Mon, 9 Jan 2023 18:26:57 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20210309; h=Sender: Content-Transfer-Encoding:Content-Type:List-Subscribe:List-Help:List-Post: List-Archive:List-Unsubscribe:List-Id:MIME-Version:References:In-Reply-To: Message-Id:Date:Subject:Cc:To:From:Reply-To:Content-ID:Content-Description: Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID: List-Owner; bh=dBf3GWPWxS4Fm2YzMltntlmSEM4Sy386OtYKtK38c8Y=; b=acKBO2lfh6XF+t yrJT1x7hev/7I979fVr4ugV2GUMLZvgciY5j6poDxxr9WvzRM2S6lhDDBLNPvi0Dv117YLjfZoKbP gdcxMckrgtmBmSms6BpvaaidHtKJEPf0aJnaKctubxRSV/wTt/Nbqc8INLq8ozjNyeGKIyftFfNw9 t231fvtQn/bDTENEe+R3oUQJmf8d8AjoYPOk+mG+RuxW6QyOOUubMbfQTPyCyh76+8XzxMe+N/QpS yKBeJtnTaQX6oweGC/E5pM7omhJknV0Cd44U9WfyX0gFAmjdDh/HDAtT69HeLNZ9RupKGvykDkkD+ 3WRF51gM2FwhIkhM8Law==; Received: from localhost ([::1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.94.2 #2 (Red Hat Linux)) id 1pEwrB-003H9u-8K; Mon, 09 Jan 2023 18:26:49 +0000 Received: from gloria.sntech.de ([185.11.138.130]) by bombadil.infradead.org with esmtps (Exim 4.94.2 #2 (Red Hat Linux)) id 1pEwig-003DR3-8l for linux-riscv@lists.infradead.org; Mon, 09 Jan 2023 18:18:06 +0000 Received: from wf0783.dip.tu-dresden.de ([141.76.183.15] helo=phil.dip.tu-dresden.de) by gloria.sntech.de with esmtpsa (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1pEwic-0005hU-Q9; Mon, 09 Jan 2023 19:17:58 +0100 From: Heiko Stuebner To: linux-riscv@lists.infradead.org, palmer@dabbelt.com Cc: christoph.muellner@vrull.eu, conor@kernel.org, philipp.tomsich@vrull.eu, ajones@ventanamicro.com, heiko@sntech.de, jszhang@kernel.org, Heiko Stuebner Subject: [PATCH v4 5/5] RISC-V: add zbb support to string functions Date: Mon, 9 Jan 2023 19:17:55 +0100 Message-Id: <20230109181755.2383085-6-heiko@sntech.de> X-Mailer: git-send-email 2.35.1 In-Reply-To: <20230109181755.2383085-1-heiko@sntech.de> References: <20230109181755.2383085-1-heiko@sntech.de> MIME-Version: 1.0 X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20230109_101802_620431_176B4F4F X-CRM114-Status: GOOD ( 26.11 ) X-BeenThere: linux-riscv@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: "linux-riscv" Errors-To: linux-riscv-bounces+linux-riscv=archiver.kernel.org@lists.infradead.org From: Heiko Stuebner Add handling for ZBB extension and add support for using it as a variant for optimized string functions. Support for the Zbb-str-variants is limited to the GNU-assembler for now, as LLVM has not yet acquired the functionality to selectively change the arch option in assembler code. This is still under review at https://reviews.llvm.org/D123515 Co-developed-by: Christoph Muellner Signed-off-by: Christoph Muellner Signed-off-by: Heiko Stuebner Reviewed-by: Conor Dooley --- arch/riscv/Kconfig | 24 ++++++ arch/riscv/include/asm/errata_list.h | 3 +- arch/riscv/include/asm/hwcap.h | 1 + arch/riscv/include/asm/string.h | 2 + arch/riscv/kernel/cpu.c | 1 + arch/riscv/kernel/cpufeature.c | 18 +++++ arch/riscv/lib/strcmp.S | 94 ++++++++++++++++++++++ arch/riscv/lib/strlen.S | 114 +++++++++++++++++++++++++++ arch/riscv/lib/strncmp.S | 111 ++++++++++++++++++++++++++ 9 files changed, 367 insertions(+), 1 deletion(-) diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig index e2b656043abf..7c814fbf9527 100644 --- a/arch/riscv/Kconfig +++ b/arch/riscv/Kconfig @@ -416,6 +416,30 @@ config RISCV_ISA_SVPBMT If you don't know what to do here, say Y. +config TOOLCHAIN_HAS_ZBB + bool + default y + depends on !64BIT || $(cc-option,-mabi=lp64 -march=rv64ima_zbb) + depends on !32BIT || $(cc-option,-mabi=ilp32 -march=rv32ima_zbb) + depends on LLD_VERSION >= 150000 || LD_VERSION >= 23900 + depends on AS_IS_GNU + +config RISCV_ISA_ZBB + bool "Zbb extension support for bit manipulation instructions" + depends on TOOLCHAIN_HAS_ZBB + depends on !XIP_KERNEL && MMU + select RISCV_ALTERNATIVE + default y + help + Adds support to dynamically detect the presence of the ZBB + extension (basic bit manipulation) and enable its usage. + + The Zbb extension provides instructions to accelerate a number + of bit-specific operations (count bit population, sign extending, + bitrotation, etc). + + If you don't know what to do here, say Y. + config TOOLCHAIN_HAS_ZICBOM bool default y diff --git a/arch/riscv/include/asm/errata_list.h b/arch/riscv/include/asm/errata_list.h index 4180312d2a70..95e626b7281e 100644 --- a/arch/riscv/include/asm/errata_list.h +++ b/arch/riscv/include/asm/errata_list.h @@ -24,7 +24,8 @@ #define CPUFEATURE_SVPBMT 0 #define CPUFEATURE_ZICBOM 1 -#define CPUFEATURE_NUMBER 2 +#define CPUFEATURE_ZBB 2 +#define CPUFEATURE_NUMBER 3 #ifdef __ASSEMBLY__ diff --git a/arch/riscv/include/asm/hwcap.h b/arch/riscv/include/asm/hwcap.h index 86328e3acb02..b727491fb100 100644 --- a/arch/riscv/include/asm/hwcap.h +++ b/arch/riscv/include/asm/hwcap.h @@ -59,6 +59,7 @@ enum riscv_isa_ext_id { RISCV_ISA_EXT_ZIHINTPAUSE, RISCV_ISA_EXT_SSTC, RISCV_ISA_EXT_SVINVAL, + RISCV_ISA_EXT_ZBB, RISCV_ISA_EXT_ID_MAX }; static_assert(RISCV_ISA_EXT_ID_MAX <= RISCV_ISA_EXT_MAX); diff --git a/arch/riscv/include/asm/string.h b/arch/riscv/include/asm/string.h index a96b1fea24fe..17dfc4ab4c80 100644 --- a/arch/riscv/include/asm/string.h +++ b/arch/riscv/include/asm/string.h @@ -6,6 +6,8 @@ #ifndef _ASM_RISCV_STRING_H #define _ASM_RISCV_STRING_H +#include +#include #include #include diff --git a/arch/riscv/kernel/cpu.c b/arch/riscv/kernel/cpu.c index 1b9a5a66e55a..c4d1aa166f8b 100644 --- a/arch/riscv/kernel/cpu.c +++ b/arch/riscv/kernel/cpu.c @@ -162,6 +162,7 @@ arch_initcall(riscv_cpuinfo_init); * extensions by an underscore. */ static struct riscv_isa_ext_data isa_ext_arr[] = { + __RISCV_ISA_EXT_DATA(zbb, RISCV_ISA_EXT_ZBB), __RISCV_ISA_EXT_DATA(sscofpmf, RISCV_ISA_EXT_SSCOFPMF), __RISCV_ISA_EXT_DATA(sstc, RISCV_ISA_EXT_SSTC), __RISCV_ISA_EXT_DATA(svinval, RISCV_ISA_EXT_SVINVAL), diff --git a/arch/riscv/kernel/cpufeature.c b/arch/riscv/kernel/cpufeature.c index 205bbd6b1fce..bf3a791d7110 100644 --- a/arch/riscv/kernel/cpufeature.c +++ b/arch/riscv/kernel/cpufeature.c @@ -222,6 +222,7 @@ void __init riscv_fill_hwcap(void) set_bit(nr, this_isa); } } else { + SET_ISA_EXT_MAP("zbb", RISCV_ISA_EXT_ZBB); SET_ISA_EXT_MAP("sscofpmf", RISCV_ISA_EXT_SSCOFPMF); SET_ISA_EXT_MAP("svpbmt", RISCV_ISA_EXT_SVPBMT); SET_ISA_EXT_MAP("zicbom", RISCV_ISA_EXT_ZICBOM); @@ -301,6 +302,20 @@ static bool __init_or_module cpufeature_probe_zicbom(unsigned int stage) return true; } +static bool __init_or_module cpufeature_probe_zbb(unsigned int stage) +{ + if (!IS_ENABLED(CONFIG_RISCV_ISA_ZBB)) + return false; + + if (stage == RISCV_ALTERNATIVES_EARLY_BOOT) + return false; + + if (!riscv_isa_extension_available(NULL, ZBB)) + return false; + + return true; +} + /* * Probe presence of individual extensions. * @@ -318,6 +333,9 @@ static u32 __init_or_module cpufeature_probe(unsigned int stage) if (cpufeature_probe_zicbom(stage)) cpu_req_feature |= BIT(CPUFEATURE_ZICBOM); + if (cpufeature_probe_zbb(stage)) + cpu_req_feature |= BIT(CPUFEATURE_ZBB); + return cpu_req_feature; } diff --git a/arch/riscv/lib/strcmp.S b/arch/riscv/lib/strcmp.S index 94440fb8390c..5428a8f2eb84 100644 --- a/arch/riscv/lib/strcmp.S +++ b/arch/riscv/lib/strcmp.S @@ -3,9 +3,14 @@ #include #include #include +#include +#include /* int strcmp(const char *cs, const char *ct) */ SYM_FUNC_START(strcmp) + + ALTERNATIVE("nop", "j variant_zbb", 0, CPUFEATURE_ZBB, CONFIG_RISCV_ISA_ZBB) + /* * Returns * a0 - comparison result, value like strcmp @@ -34,4 +39,93 @@ SYM_FUNC_START(strcmp) bnez t1, 1b li a0, 0 j 2b + +/* + * Variant of strcmp using the ZBB extension if available + */ +#ifdef CONFIG_RISCV_ISA_ZBB +variant_zbb: +#define src1 a0 +#define result a0 +#define src2 t5 +#define data1 t0 +#define data2 t1 +#define align t2 +#define data1_orcb t3 +#define m1 t4 + +.option push +.option arch,+zbb + + /* + * Returns + * a0 - comparison result, value like strcmp + * + * Parameters + * a0 - string1 + * a1 - string2 + * + * Clobbers + * t0, t1, t2, t3, t4, t5 + */ + mv src2, a1 + + or align, src1, src2 + li m1, -1 + and align, align, SZREG-1 + bnez align, 3f + + /* Main loop for aligned string. */ + .p2align 3 +1: + REG_L data1, 0(src1) + REG_L data2, 0(src2) + orc.b data1_orcb, data1 + bne data1_orcb, m1, 2f + addi src1, src1, SZREG + addi src2, src2, SZREG + beq data1, data2, 1b + + /* + * Words don't match, and no null byte in the first + * word. Get bytes in big-endian order and compare. + */ +#ifndef CONFIG_CPU_BIG_ENDIAN + rev8 data1, data1 + rev8 data2, data2 +#endif + + /* Synthesize (data1 >= data2) ? 1 : -1 in a branchless sequence. */ + sltu result, data1, data2 + neg result, result + ori result, result, 1 + ret + +2: + /* + * Found a null byte. + * If words don't match, fall back to simple loop. + */ + bne data1, data2, 3f + + /* Otherwise, strings are equal. */ + li result, 0 + ret + + /* Simple loop for misaligned strings. */ + .p2align 3 +3: + lbu data1, 0(src1) + lbu data2, 0(src2) + addi src1, src1, 1 + addi src2, src2, 1 + bne data1, data2, 4f + bnez data1, 3b + +4: + sub result, data1, data2 + ret + +.option pop +#endif SYM_FUNC_END(strcmp) diff --git a/arch/riscv/lib/strlen.S b/arch/riscv/lib/strlen.S index 09a7aaff26c8..738efb04307d 100644 --- a/arch/riscv/lib/strlen.S +++ b/arch/riscv/lib/strlen.S @@ -3,9 +3,14 @@ #include #include #include +#include +#include /* int strlen(const char *s) */ SYM_FUNC_START(strlen) + + ALTERNATIVE("nop", "j variant_zbb", 0, CPUFEATURE_ZBB, CONFIG_RISCV_ISA_ZBB) + /* * Returns * a0 - string length @@ -25,4 +30,113 @@ SYM_FUNC_START(strlen) 2: addi t1, t1, 1 j 1b + +/* + * Variant of strlen using the ZBB extension if available + */ +#ifdef CONFIG_RISCV_ISA_ZBB +variant_zbb: + +#define src a0 +#define result a0 +#define addr t0 +#define data t1 +#define offset t2 +#define offset_bits t2 +#define valid_bytes t3 +#define m1 t3 + +#ifdef CONFIG_CPU_BIG_ENDIAN +# define CZ clz +# define SHIFT sll +#else +# define CZ ctz +# define SHIFT srl +#endif + +.option push +.option arch,+zbb + + /* + * Returns + * a0 - string length + * + * Parameters + * a0 - String to measure + * + * Clobbers + * t0, t1, t2, t3 + */ + + /* Number of irrelevant bytes in the first word. */ + andi offset, src, SZREG-1 + + /* Align pointer. */ + andi addr, src, -SZREG + + li valid_bytes, SZREG + sub valid_bytes, valid_bytes, offset + slli offset_bits, offset, RISCV_LGPTR + + /* Get the first word. */ + REG_L data, 0(addr) + + /* + * Shift away the partial data we loaded to remove the irrelevant bytes + * preceding the string with the effect of adding NUL bytes at the + * end of the string. + */ + SHIFT data, data, offset_bits + + /* Convert non-NUL into 0xff and NUL into 0x00. */ + orc.b data, data + + /* Convert non-NUL into 0x00 and NUL into 0xff. */ + not data, data + + /* + * Search for the first set bit (corresponding to a NUL byte in the + * original chunk). + */ + CZ data, data + + /* + * The first chunk is special: commpare against the number + * of valid bytes in this chunk. + */ + srli result, data, 3 + bgtu valid_bytes, result, 3f + + /* Prepare for the word comparison loop. */ + addi offset, addr, SZREG + li m1, -1 + + /* + * Our critical loop is 4 instructions and processes data in + * 4 byte or 8 byte chunks. + */ + .p2align 3 +1: + REG_L data, SZREG(addr) + addi addr, addr, SZREG + orc.b data, data + beq data, m1, 1b +2: + not data, data + CZ data, data + + /* Get number of processed words. */ + sub offset, addr, offset + + /* Add number of characters in the first word. */ + add result, result, offset + srli data, data, 3 + + /* Add number of characters in the last word. */ + add result, result, data +3: + ret + +.option pop +#endif SYM_FUNC_END(strlen) diff --git a/arch/riscv/lib/strncmp.S b/arch/riscv/lib/strncmp.S index 493ab6febcb2..851428b439dc 100644 --- a/arch/riscv/lib/strncmp.S +++ b/arch/riscv/lib/strncmp.S @@ -3,9 +3,14 @@ #include #include #include +#include +#include /* int strncmp(const char *cs, const char *ct, size_t count) */ SYM_FUNC_START(strncmp) + + ALTERNATIVE("nop", "j variant_zbb", 0, CPUFEATURE_ZBB, CONFIG_RISCV_ISA_ZBB) + /* * Returns * a0 - comparison result, value like strncmp @@ -37,4 +42,110 @@ SYM_FUNC_START(strncmp) 4: li a0, 0 j 2b + +/* + * Variant of strncmp using the ZBB extension if available + */ +#ifdef CONFIG_RISCV_ISA_ZBB +variant_zbb: + +#define src1 a0 +#define result a0 +#define src2 t6 +#define len a2 +#define data1 t0 +#define data2 t1 +#define align t2 +#define data1_orcb t3 +#define limit t4 +#define m1 t5 + +.option push +.option arch,+zbb + + /* + * Returns + * a0 - comparison result, like strncmp + * + * Parameters + * a0 - string1 + * a1 - string2 + * a2 - number of characters to compare + * + * Clobbers + * t0, t1, t2, t3, t4, t5, t6 + */ + mv src2, a1 + + or align, src1, src2 + li m1, -1 + and align, align, SZREG-1 + add limit, src1, len + bnez align, 4f + + /* Adjust limit for fast-path. */ + addi limit, limit, -SZREG + + /* Main loop for aligned string. */ + .p2align 3 +1: + bgt src1, limit, 3f + REG_L data1, 0(src1) + REG_L data2, 0(src2) + orc.b data1_orcb, data1 + bne data1_orcb, m1, 2f + addi src1, src1, SZREG + addi src2, src2, SZREG + beq data1, data2, 1b + + /* + * Words don't match, and no null byte in the first + * word. Get bytes in big-endian order and compare. + */ +#ifndef CONFIG_CPU_BIG_ENDIAN + rev8 data1, data1 + rev8 data2, data2 +#endif + + /* Synthesize (data1 >= data2) ? 1 : -1 in a branchless sequence. */ + sltu result, data1, data2 + neg result, result + ori result, result, 1 + ret + +2: + /* + * Found a null byte. + * If words don't match, fall back to simple loop. + */ + bne data1, data2, 3f + + /* Otherwise, strings are equal. */ + li result, 0 + ret + + /* Simple loop for misaligned strings. */ +3: + /* Restore limit for slow-path. */ + addi limit, limit, SZREG + .p2align 3 +4: + bge src1, limit, 6f + lbu data1, 0(src1) + lbu data2, 0(src2) + addi src1, src1, 1 + addi src2, src2, 1 + bne data1, data2, 5f + bnez data1, 4b + +5: + sub result, data1, data2 + ret + +6: + li result, 0 + ret + +.option pop +#endif SYM_FUNC_END(strncmp)