From patchwork Sun Feb 16 22:55:27 2025
From: Eric Biggers <ebiggers@kernel.org>
To: linux-kernel@vger.kernel.org
Cc: linux-crypto@vger.kernel.org, linux-riscv@lists.infradead.org,
 Zhihang Shao, Ard Biesheuvel, Xiao Wang, Charlie Jenkins
Subject: [PATCH 1/4] riscv/crc: add "template" for Zbc optimized CRC functions
Date: Sun, 16 Feb 2025 14:55:27 -0800
Message-ID: <20250216225530.306980-2-ebiggers@kernel.org>
In-Reply-To: <20250216225530.306980-1-ebiggers@kernel.org>
References: <20250216225530.306980-1-ebiggers@kernel.org>
"linux-riscv" Errors-To: linux-riscv-bounces+linux-riscv=archiver.kernel.org@lists.infradead.org From: Eric Biggers Add a "template" crc-clmul-template.h that can generate RISC-V Zbc optimized CRC functions. Each generated CRC function is parameterized by CRC length and bit order, and it accepts a pointer to the constants struct required for the specific CRC polynomial desired. Update gen-crc-consts.py to support generating the needed constants structs. This makes it possible to easily wire up a Zbc optimized implementation of almost any CRC. The design generally follows what I did for x86, but it is simplified by using RISC-V's scalar carryless multiplication Zbc, which has no equivalent on x86. RISC-V's clmulr instruction is also helpful. A potential switch to Zvbc (or support for Zvbc alongside Zbc) is left for future work. For long messages Zvbc should be fastest, but it would need to be shown to be worthwhile over just using Zbc which is significantly more convenient to use, especially in the kernel context. Compared to the existing Zbc-optimized CRC32 code and the earlier proposed Zbc-optimized CRC-T10DIF code (https://lore.kernel.org/r/20250211071101.181652-1-zhihang.shao.iscas@gmail.com), this submission deduplicates the code among CRC variants and is significantly more optimized. It uses "folding" to take better advantage of instruction-level parallelism (to a more limited extent than x86 for now, but it could be extended to more), it reworks the Barrett reduction to eliminate unnecessary instructions, and it documents all the math used and makes all the constants reproducible. Signed-off-by: Eric Biggers --- arch/riscv/lib/crc-clmul-template.h | 265 ++++++++++++++++++++++++++++ scripts/gen-crc-consts.py | 55 +++++- 2 files changed, 319 insertions(+), 1 deletion(-) create mode 100644 arch/riscv/lib/crc-clmul-template.h diff --git a/arch/riscv/lib/crc-clmul-template.h b/arch/riscv/lib/crc-clmul-template.h new file mode 100644 index 000000000000..77187e7f1762 --- /dev/null +++ b/arch/riscv/lib/crc-clmul-template.h @@ -0,0 +1,265 @@ +/* SPDX-License-Identifier: GPL-2.0-or-later */ +/* Copyright 2025 Google LLC */ + +/* + * This file is a "template" that generates a CRC function optimized using the + * RISC-V Zbc (scalar carryless multiplication) extension. The includer of this + * file must define the following parameters to specify the type of CRC: + * + * crc_t: the data type of the CRC, e.g. u32 for a 32-bit CRC + * LSB_CRC: 0 for a msb (most-significant-bit) first CRC, i.e. natural + * mapping between bits and polynomial coefficients + * 1 for a lsb (least-significant-bit) first CRC, i.e. reflected + * mapping between bits and polynomial coefficients + */ + +#include +#include + +#define CRC_BITS (8 * sizeof(crc_t)) /* a.k.a. 
+
+static inline unsigned long clmul(unsigned long a, unsigned long b)
+{
+	unsigned long res;
+
+	asm(".option push\n"
+	    ".option arch,+zbc\n"
+	    "clmul %0, %1, %2\n"
+	    ".option pop\n"
+	    : "=r" (res) : "r" (a), "r" (b));
+	return res;
+}
+
+static inline unsigned long clmulh(unsigned long a, unsigned long b)
+{
+	unsigned long res;
+
+	asm(".option push\n"
+	    ".option arch,+zbc\n"
+	    "clmulh %0, %1, %2\n"
+	    ".option pop\n"
+	    : "=r" (res) : "r" (a), "r" (b));
+	return res;
+}
+
+static inline unsigned long clmulr(unsigned long a, unsigned long b)
+{
+	unsigned long res;
+
+	asm(".option push\n"
+	    ".option arch,+zbc\n"
+	    "clmulr %0, %1, %2\n"
+	    ".option pop\n"
+	    : "=r" (res) : "r" (a), "r" (b));
+	return res;
+}
+
+/*
+ * crc_load_long() loads one "unsigned long" of aligned data bytes, producing a
+ * polynomial whose bit order matches the CRC's bit order.
+ */
+#ifdef CONFIG_64BIT
+# if LSB_CRC
+#  define crc_load_long(x)	le64_to_cpup(x)
+# else
+#  define crc_load_long(x)	be64_to_cpup(x)
+# endif
+#else
+# if LSB_CRC
+#  define crc_load_long(x)	le32_to_cpup(x)
+# else
+#  define crc_load_long(x)	be32_to_cpup(x)
+# endif
+#endif
+
+/* XOR @crc into the end of @msgpoly that represents the high-order terms. */
+static inline unsigned long
+crc_clmul_prep(crc_t crc, unsigned long msgpoly)
+{
+#if LSB_CRC
+	return msgpoly ^ crc;
+#else
+	return msgpoly ^ ((unsigned long)crc << (BITS_PER_LONG - CRC_BITS));
+#endif
+}
+
+/*
+ * Multiply the long-sized @msgpoly by x^n (a.k.a. x^CRC_BITS) and reduce it
+ * modulo the generator polynomial G.  This gives the CRC of @msgpoly.
+ */
+static inline crc_t
+crc_clmul_long(unsigned long msgpoly, const struct crc_clmul_consts *consts)
+{
+	unsigned long tmp;
+
+	/*
+	 * First step of Barrett reduction with integrated multiplication by
+	 * x^n: calculate floor((msgpoly * x^n) / G).  This is the value by
+	 * which G needs to be multiplied to cancel out the x^n and higher
+	 * terms of msgpoly * x^n.  Do it using the following formula:
+	 *
+	 * msb-first:
+	 *    floor((msgpoly * floor(x^(BITS_PER_LONG-1+n) / G)) / x^(BITS_PER_LONG-1))
+	 * lsb-first:
+	 *    floor((msgpoly * floor(x^(BITS_PER_LONG-1+n) / G) * x) / x^BITS_PER_LONG)
+	 *
+	 * barrett_reduction_const_1 contains floor(x^(BITS_PER_LONG-1+n) / G),
+	 * which fits a long exactly.  Using any lower power of x there would
+	 * not carry enough precision through the calculation, while using any
+	 * higher power of x would require extra instructions to handle a wider
+	 * multiplication.  In the msb-first case, using this power of x
+	 * results in needing a floored division by x^(BITS_PER_LONG-1), which
+	 * matches what clmulr produces.  In the lsb-first case, a factor of x
+	 * gets implicitly introduced by each carryless multiplication (shown
+	 * as '* x' above), and the floored division instead needs to be by
+	 * x^BITS_PER_LONG which matches what clmul produces.
+	 */
+#if LSB_CRC
+	tmp = clmul(msgpoly, consts->barrett_reduction_const_1);
+#else
+	tmp = clmulr(msgpoly, consts->barrett_reduction_const_1);
+#endif
+
+	/*
+	 * Second step of Barrett reduction:
+	 *
+	 *    crc := (msgpoly * x^n) + (G * floor((msgpoly * x^n) / G))
+	 *
+	 * This reduces (msgpoly * x^n) modulo G by adding the appropriate
+	 * multiple of G to it.  The result uses only the x^0..x^(n-1) terms.
+ * HOWEVER, since the unreduced value (msgpoly * x^n) is zero in those + * terms in the first place, it is more efficient to do the equivalent: + * + * crc := ((G - x^n) * floor((msgpoly * x^n) / G)) mod x^n + * + * In the lsb-first case further modify it to the following which avoids + * a shift, as the crc ends up in the physically low n bits from clmulr: + * + * product := ((G - x^n) * x^(BITS_PER_LONG - n)) * floor((msgpoly * x^n) / G) * x + * crc := floor(product / x^(BITS_PER_LONG + 1 - n)) mod x^n + * + * barrett_reduction_const_2 contains the constant multiplier (G - x^n) + * or (G - x^n) * x^(BITS_PER_LONG - n) from the formulas above. The + * cast of the result to crc_t is essential, as it applies the mod x^n! + */ +#if LSB_CRC + return clmulr(tmp, consts->barrett_reduction_const_2); +#else + return clmul(tmp, consts->barrett_reduction_const_2); +#endif +} + +/* Update @crc with the data from @msgpoly. */ +static inline crc_t +crc_clmul_update_long(crc_t crc, unsigned long msgpoly, + const struct crc_clmul_consts *consts) +{ + return crc_clmul_long(crc_clmul_prep(crc, msgpoly), consts); +} + +/* Update @crc with 1 <= @len < sizeof(unsigned long) bytes of data. */ +static inline crc_t +crc_clmul_update_partial(crc_t crc, const u8 *p, size_t len, + const struct crc_clmul_consts *consts) +{ + unsigned long msgpoly; + size_t i; + +#if LSB_CRC + msgpoly = (unsigned long)p[0] << (BITS_PER_LONG - 8); + for (i = 1; i < len; i++) + msgpoly = (msgpoly >> 8) ^ ((unsigned long)p[i] << (BITS_PER_LONG - 8)); +#else + msgpoly = p[0]; + for (i = 1; i < len; i++) + msgpoly = (msgpoly << 8) ^ p[i]; +#endif + + if (len >= sizeof(crc_t)) { + #if LSB_CRC + msgpoly ^= (unsigned long)crc << (BITS_PER_LONG - 8*len); + #else + msgpoly ^= (unsigned long)crc << (8*len - CRC_BITS); + #endif + return crc_clmul_long(msgpoly, consts); + } +#if LSB_CRC + msgpoly ^= (unsigned long)crc << (BITS_PER_LONG - 8*len); + return crc_clmul_long(msgpoly, consts) ^ (crc >> (8*len)); +#else + msgpoly ^= crc >> (CRC_BITS - 8*len); + return crc_clmul_long(msgpoly, consts) ^ (crc << (8*len)); +#endif +} + +static inline crc_t +crc_clmul(crc_t crc, const void *p, size_t len, + const struct crc_clmul_consts *consts) +{ + size_t align; + + /* This implementation assumes that the CRC fits in an unsigned long. */ + BUILD_BUG_ON(sizeof(crc_t) > sizeof(unsigned long)); + + /* If the buffer is not long-aligned, align it. */ + align = (unsigned long)p % sizeof(unsigned long); + if (align && len) { + align = min(sizeof(unsigned long) - align, len); + crc = crc_clmul_update_partial(crc, p, align, consts); + p += align; + len -= align; + } + + if (len >= 4 * sizeof(unsigned long)) { + unsigned long m0, m1; + + m0 = crc_clmul_prep(crc, crc_load_long(p)); + m1 = crc_load_long(p + sizeof(unsigned long)); + p += 2 * sizeof(unsigned long); + len -= 2 * sizeof(unsigned long); + /* + * Main loop. Each iteration starts with a message polynomial + * (x^BITS_PER_LONG)*m0 + m1, then logically extends it by two + * more longs of data to form x^(3*BITS_PER_LONG)*m0 + + * x^(2*BITS_PER_LONG)*m1 + x^BITS_PER_LONG*m2 + m3, then + * "folds" that back into a congruent (modulo G) value that uses + * just m0 and m1 again. This is done by multiplying m0 by the + * precomputed constant (x^(3*BITS_PER_LONG) mod G) and m1 by + * the precomputed constant (x^(2*BITS_PER_LONG) mod G), then + * adding the results to m2 and m3 as appropriate. 
Each such + * multiplication produces a result twice the length of a long, + * which in RISC-V is two instructions clmul and clmulh. + * + * This could be changed to fold across more than 2 longs at a + * time if there is a CPU that can take advantage of it. + */ + do { + unsigned long p0, p1, p2, p3; + + p0 = clmulh(m0, consts->fold_across_2_longs_const_hi); + p1 = clmul(m0, consts->fold_across_2_longs_const_hi); + p2 = clmulh(m1, consts->fold_across_2_longs_const_lo); + p3 = clmul(m1, consts->fold_across_2_longs_const_lo); + m0 = (LSB_CRC ? p1 ^ p3 : p0 ^ p2) ^ crc_load_long(p); + m1 = (LSB_CRC ? p0 ^ p2 : p1 ^ p3) ^ + crc_load_long(p + sizeof(unsigned long)); + + p += 2 * sizeof(unsigned long); + len -= 2 * sizeof(unsigned long); + } while (len >= 2 * sizeof(unsigned long)); + + crc = crc_clmul_long(m0, consts); + crc = crc_clmul_update_long(crc, m1, consts); + } + + while (len >= sizeof(unsigned long)) { + crc = crc_clmul_update_long(crc, crc_load_long(p), consts); + p += sizeof(unsigned long); + len -= sizeof(unsigned long); + } + + if (len) + crc = crc_clmul_update_partial(crc, p, len, consts); + + return crc; +} diff --git a/scripts/gen-crc-consts.py b/scripts/gen-crc-consts.py index aa678a50897d..f9b44fc3a03f 100755 --- a/scripts/gen-crc-consts.py +++ b/scripts/gen-crc-consts.py @@ -103,10 +103,61 @@ def gen_slicebyN_tables(variants, n): s += (' ' if s else '') + next_entry if s: print(f'\t{s}') print('};') +def print_riscv_const(v, bits_per_long, name, val, desc): + print(f'\t.{name} = {fmt_poly(v, val, bits_per_long)}, /* {desc} */') + +def do_gen_riscv_clmul_consts(v, bits_per_long): + (G, n, lsb) = (v.G, v.bits, v.lsb) + + pow_of_x = 3 * bits_per_long - (1 if lsb else 0) + print_riscv_const(v, bits_per_long, 'fold_across_2_longs_const_hi', + reduce(1 << pow_of_x, G), f'x^{pow_of_x} mod G') + pow_of_x = 2 * bits_per_long - (1 if lsb else 0) + print_riscv_const(v, bits_per_long, 'fold_across_2_longs_const_lo', + reduce(1 << pow_of_x, G), f'x^{pow_of_x} mod G') + + pow_of_x = bits_per_long - 1 + n + print_riscv_const(v, bits_per_long, 'barrett_reduction_const_1', + div(1 << pow_of_x, G), f'floor(x^{pow_of_x} / G)') + + val = G - (1 << n) + desc = f'G - x^{n}' + if lsb: + val <<= bits_per_long - n + desc = f'({desc}) * x^{bits_per_long - n}' + print_riscv_const(v, bits_per_long, 'barrett_reduction_const_2', val, desc) + +def gen_riscv_clmul_consts(variants): + print('') + print('struct crc_clmul_consts {'); + print('\tunsigned long fold_across_2_longs_const_hi;'); + print('\tunsigned long fold_across_2_longs_const_lo;'); + print('\tunsigned long barrett_reduction_const_1;'); + print('\tunsigned long barrett_reduction_const_2;'); + print('};'); + for v in variants: + print(''); + if v.bits > 32: + print_header(v, 'Constants') + print('#ifdef CONFIG_64BIT') + print(f'static const struct crc_clmul_consts {v.name}_consts __maybe_unused = {{') + do_gen_riscv_clmul_consts(v, 64) + print('};') + print('#endif') + else: + print_header(v, 'Constants') + print(f'static const struct crc_clmul_consts {v.name}_consts __maybe_unused = {{') + print('#ifdef CONFIG_64BIT') + do_gen_riscv_clmul_consts(v, 64) + print('#else') + do_gen_riscv_clmul_consts(v, 32) + print('#endif') + print('};') + # Generate constants for carryless multiplication based CRC computation. def gen_x86_pclmul_consts(variants): # These are the distances, in bits, to generate folding constants for. 
     FOLD_DISTANCES = [2048, 1024, 512, 256, 128]
@@ -211,11 +262,11 @@ def parse_crc_variants(vars_string):
         variants.append(CrcVariant(bits, generator_poly, bit_order))
     return variants
 
 if len(sys.argv) != 3:
     sys.stderr.write(f'Usage: {sys.argv[0]} CONSTS_TYPE[,CONSTS_TYPE]... CRC_VARIANT[,CRC_VARIANT]...\n')
-    sys.stderr.write(' CONSTS_TYPE can be sliceby[1-8] or x86_pclmul\n')
+    sys.stderr.write(' CONSTS_TYPE can be sliceby[1-8], riscv_clmul, or x86_pclmul\n')
     sys.stderr.write(' CRC_VARIANT is crc${num_bits}_${bit_order}_${generator_poly_as_hex}\n')
     sys.stderr.write(' E.g. crc16_msb_0x8bb7 or crc32_lsb_0xedb88320\n')
     sys.stderr.write(' Polynomial must use the given bit_order and exclude x^{num_bits}\n')
     sys.exit(1)
@@ -230,9 +281,11 @@ print(' */')
 consts_types = sys.argv[1].split(',')
 variants = parse_crc_variants(sys.argv[2])
 for consts_type in consts_types:
     if consts_type.startswith('sliceby'):
         gen_slicebyN_tables(variants, int(consts_type.removeprefix('sliceby')))
+    elif consts_type == 'riscv_clmul':
+        gen_riscv_clmul_consts(variants)
     elif consts_type == 'x86_pclmul':
         gen_x86_pclmul_consts(variants)
     else:
         raise ValueError(f'Unknown consts_type: {consts_type}')
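Before moving on to the patches that use this template, here is a rough,
self-contained userspace cross-check of the two Barrett reduction steps
documented above.  It is a sketch rather than part of the series: it models
the Zbc clmul/clmulr instructions in portable C, plugs in the 64-bit
lsb-first CRC32 constants that the next patch generates with this script, and
compares one-long-at-a-time updates against a bit-at-a-time reference.  It
assumes a little-endian host; all names are illustrative.

#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Bit-by-bit model of Zbc: full 128-bit carryless product of a and b */
static void clmul128(uint64_t a, uint64_t b, uint64_t *hi, uint64_t *lo)
{
	uint64_t h = 0, l = 0;

	for (int i = 0; i < 64; i++) {
		if ((b >> i) & 1) {
			l ^= a << i;
			if (i)
				h ^= a >> (64 - i);
		}
	}
	*hi = h;
	*lo = l;
}

static uint64_t clmul(uint64_t a, uint64_t b)	/* low 64 product bits */
{ uint64_t h, l; clmul128(a, b, &h, &l); return l; }

static uint64_t clmulr(uint64_t a, uint64_t b)	/* product bits 126..63 */
{ uint64_t h, l; clmul128(a, b, &h, &l); return (h << 1) | (l >> 63); }

/* 64-bit crc32_lsb_0xedb88320 constants, as generated in the next patch */
#define BR1 0xb4e5b025f7011641ULL	/* floor(x^95 / G) */
#define BR2 0x00000000edb88320ULL	/* (G - x^32) * x^32 */

/* crc_clmul_update_long() for the lsb-first case */
static uint32_t crc32_le_update_long(uint32_t crc, uint64_t msgpoly)
{
	uint64_t tmp = clmul(msgpoly ^ crc, BR1);	/* Barrett step 1 */

	return (uint32_t)clmulr(tmp, BR2);		/* Barrett step 2 */
}

int main(void)
{
	const uint8_t msg[] = "0123456789abcdef";	/* 16 data bytes */
	uint32_t crc = 0, ref = 0;
	uint64_t w;

	for (int i = 0; i < 16; i += 8) {
		memcpy(&w, &msg[i], 8);	/* crc_load_long() on a LE host */
		crc = crc32_le_update_long(crc, w);
	}
	for (int i = 0; i < 16; i++) {	/* bit-at-a-time reference */
		ref ^= msg[i];
		for (int j = 0; j < 8; j++)
			ref = (ref >> 1) ^ ((ref & 1) ? 0xedb88320u : 0);
	}
	printf("clmul %08x vs bitwise %08x: %s\n", crc, ref,
	       crc == ref ? "match" : "MISMATCH");
	return 0;
}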
From patchwork Sun Feb 16 22:55:28 2025
From: Eric Biggers <ebiggers@kernel.org>
To: linux-kernel@vger.kernel.org
Cc: linux-crypto@vger.kernel.org, linux-riscv@lists.infradead.org,
 Zhihang Shao, Ard Biesheuvel, Xiao Wang, Charlie Jenkins
Subject: [PATCH 2/4] riscv/crc32: reimplement the CRC32 functions using new
 template
Date: Sun, 16 Feb 2025 14:55:28 -0800
Message-ID: <20250216225530.306980-3-ebiggers@kernel.org>
In-Reply-To: <20250216225530.306980-1-ebiggers@kernel.org>
References: <20250216225530.306980-1-ebiggers@kernel.org>

Delete the previous Zbc optimized CRC32 code, and re-implement it using
the new template.  The new implementation is more optimized and shares
more code among CRC variants.

Signed-off-by: Eric Biggers <ebiggers@kernel.org>
---
 arch/riscv/lib/Makefile           |   1 +
 arch/riscv/lib/crc-clmul-consts.h |  72 +++++++
 arch/riscv/lib/crc-clmul.h        |  15 ++
 arch/riscv/lib/crc32-riscv.c      | 310 ------------------------------
 arch/riscv/lib/crc32.c            |  53 +++++
 arch/riscv/lib/crc32_lsb.c        |  18 ++
 arch/riscv/lib/crc32_msb.c        |  18 ++
 7 files changed, 177 insertions(+), 310 deletions(-)
 create mode 100644 arch/riscv/lib/crc-clmul-consts.h
 create mode 100644 arch/riscv/lib/crc-clmul.h
 delete mode 100644 arch/riscv/lib/crc32-riscv.c
 create mode 100644 arch/riscv/lib/crc32.c
 create mode 100644 arch/riscv/lib/crc32_lsb.c
 create mode 100644 arch/riscv/lib/crc32_msb.c

diff --git a/arch/riscv/lib/Makefile b/arch/riscv/lib/Makefile
index 79368a895fee..7b32d3e88337 100644
--- a/arch/riscv/lib/Makefile
+++ b/arch/riscv/lib/Makefile
@@ -14,8 +14,9 @@ lib-$(CONFIG_RISCV_ISA_V)	+= uaccess_vector.o
 endif
 lib-$(CONFIG_MMU)	+= uaccess.o
 lib-$(CONFIG_64BIT)	+= tishift.o
 lib-$(CONFIG_RISCV_ISA_ZICBOZ)	+= clear_page.o
 obj-$(CONFIG_CRC32_ARCH) += crc32-riscv.o
+crc32-riscv-y := crc32.o crc32_msb.o crc32_lsb.o
 obj-$(CONFIG_FUNCTION_ERROR_INJECTION) += error-inject.o
 lib-$(CONFIG_RISCV_ISA_V)  += xor.o
 lib-$(CONFIG_RISCV_ISA_V)  += riscv_v_helpers.o
diff --git a/arch/riscv/lib/crc-clmul-consts.h b/arch/riscv/lib/crc-clmul-consts.h
new file mode 100644
index 000000000000..6fdf10648a20
--- /dev/null
+++ b/arch/riscv/lib/crc-clmul-consts.h
@@ -0,0 +1,72 @@
+/* SPDX-License-Identifier: GPL-2.0-or-later */
+/*
+ * CRC constants generated by:
+ *
+ *	./scripts/gen-crc-consts.py riscv_clmul crc32_msb_0x04c11db7,crc32_lsb_0xedb88320,crc32_lsb_0x82f63b78
+ *
+ * Do not edit manually.
+ */ + +struct crc_clmul_consts { + unsigned long fold_across_2_longs_const_hi; + unsigned long fold_across_2_longs_const_lo; + unsigned long barrett_reduction_const_1; + unsigned long barrett_reduction_const_2; +}; + +/* + * Constants generated for most-significant-bit-first CRC-32 using + * G(x) = x^32 + x^26 + x^23 + x^22 + x^16 + x^12 + x^11 + x^10 + x^8 + x^7 + + * x^5 + x^4 + x^2 + x^1 + x^0 + */ +static const struct crc_clmul_consts crc32_msb_0x04c11db7_consts __maybe_unused = { +#ifdef CONFIG_64BIT + .fold_across_2_longs_const_hi = 0x00000000c5b9cd4c, /* x^192 mod G */ + .fold_across_2_longs_const_lo = 0x00000000e8a45605, /* x^128 mod G */ + .barrett_reduction_const_1 = 0x826880efa40da72d, /* floor(x^95 / G) */ + .barrett_reduction_const_2 = 0x0000000004c11db7, /* G - x^32 */ +#else + .fold_across_2_longs_const_hi = 0xf200aa66, /* x^96 mod G */ + .fold_across_2_longs_const_lo = 0x490d678d, /* x^64 mod G */ + .barrett_reduction_const_1 = 0x826880ef, /* floor(x^63 / G) */ + .barrett_reduction_const_2 = 0x04c11db7, /* G - x^32 */ +#endif +}; + +/* + * Constants generated for least-significant-bit-first CRC-32 using + * G(x) = x^32 + x^26 + x^23 + x^22 + x^16 + x^12 + x^11 + x^10 + x^8 + x^7 + + * x^5 + x^4 + x^2 + x^1 + x^0 + */ +static const struct crc_clmul_consts crc32_lsb_0xedb88320_consts __maybe_unused = { +#ifdef CONFIG_64BIT + .fold_across_2_longs_const_hi = 0x65673b4600000000, /* x^191 mod G */ + .fold_across_2_longs_const_lo = 0x9ba54c6f00000000, /* x^127 mod G */ + .barrett_reduction_const_1 = 0xb4e5b025f7011641, /* floor(x^95 / G) */ + .barrett_reduction_const_2 = 0x00000000edb88320, /* (G - x^32) * x^32 */ +#else + .fold_across_2_longs_const_hi = 0xccaa009e, /* x^95 mod G */ + .fold_across_2_longs_const_lo = 0xb8bc6765, /* x^63 mod G */ + .barrett_reduction_const_1 = 0xf7011641, /* floor(x^63 / G) */ + .barrett_reduction_const_2 = 0xedb88320, /* (G - x^32) * x^0 */ +#endif +}; + +/* + * Constants generated for least-significant-bit-first CRC-32 using + * G(x) = x^32 + x^28 + x^27 + x^26 + x^25 + x^23 + x^22 + x^20 + x^19 + x^18 + + * x^14 + x^13 + x^11 + x^10 + x^9 + x^8 + x^6 + x^0 + */ +static const struct crc_clmul_consts crc32_lsb_0x82f63b78_consts __maybe_unused = { +#ifdef CONFIG_64BIT + .fold_across_2_longs_const_hi = 0x3743f7bd00000000, /* x^191 mod G */ + .fold_across_2_longs_const_lo = 0x3171d43000000000, /* x^127 mod G */ + .barrett_reduction_const_1 = 0x4869ec38dea713f1, /* floor(x^95 / G) */ + .barrett_reduction_const_2 = 0x0000000082f63b78, /* (G - x^32) * x^32 */ +#else + .fold_across_2_longs_const_hi = 0x493c7d27, /* x^95 mod G */ + .fold_across_2_longs_const_lo = 0xdd45aab8, /* x^63 mod G */ + .barrett_reduction_const_1 = 0xdea713f1, /* floor(x^63 / G) */ + .barrett_reduction_const_2 = 0x82f63b78, /* (G - x^32) * x^0 */ +#endif +}; diff --git a/arch/riscv/lib/crc-clmul.h b/arch/riscv/lib/crc-clmul.h new file mode 100644 index 000000000000..62bd41080785 --- /dev/null +++ b/arch/riscv/lib/crc-clmul.h @@ -0,0 +1,15 @@ +/* SPDX-License-Identifier: GPL-2.0-or-later */ +/* Copyright 2025 Google LLC */ + +#ifndef _RISCV_CRC_CLMUL_H +#define _RISCV_CRC_CLMUL_H + +#include +#include "crc-clmul-consts.h" + +u32 crc32_msb_clmul(u32 crc, const void *p, size_t len, + const struct crc_clmul_consts *consts); +u32 crc32_lsb_clmul(u32 crc, const void *p, size_t len, + const struct crc_clmul_consts *consts); + +#endif /* _RISCV_CRC_CLMUL_H */ diff --git a/arch/riscv/lib/crc32-riscv.c b/arch/riscv/lib/crc32-riscv.c deleted file mode 100644 index b5cb752847c4..000000000000 
--- a/arch/riscv/lib/crc32-riscv.c +++ /dev/null @@ -1,310 +0,0 @@ -// SPDX-License-Identifier: GPL-2.0-only -/* - * Accelerated CRC32 implementation with Zbc extension. - * - * Copyright (C) 2024 Intel Corporation - */ - -#include -#include -#include - -#include -#include -#include -#include -#include -#include - -/* - * Refer to https://www.corsix.org/content/barrett-reduction-polynomials for - * better understanding of how this math works. - * - * let "+" denotes polynomial add (XOR) - * let "-" denotes polynomial sub (XOR) - * let "*" denotes polynomial multiplication - * let "/" denotes polynomial floor division - * let "S" denotes source data, XLEN bit wide - * let "P" denotes CRC32 polynomial - * let "T" denotes 2^(XLEN+32) - * let "QT" denotes quotient of T/P, with the bit for 2^XLEN being implicit - * - * crc32(S, P) - * => S * (2^32) - S * (2^32) / P * P - * => lowest 32 bits of: S * (2^32) / P * P - * => lowest 32 bits of: S * (2^32) * (T / P) / T * P - * => lowest 32 bits of: S * (2^32) * quotient / T * P - * => lowest 32 bits of: S * quotient / 2^XLEN * P - * => lowest 32 bits of: (clmul_high_part(S, QT) + S) * P - * => clmul_low_part(clmul_high_part(S, QT) + S, P) - * - * In terms of below implementations, the BE case is more intuitive, since the - * higher order bit sits at more significant position. - */ - -#if __riscv_xlen == 64 -/* Slide by XLEN bits per iteration */ -# define STEP_ORDER 3 - -/* Each below polynomial quotient has an implicit bit for 2^XLEN */ - -/* Polynomial quotient of (2^(XLEN+32))/CRC32_POLY, in LE format */ -# define CRC32_POLY_QT_LE 0x5a72d812fb808b20 - -/* Polynomial quotient of (2^(XLEN+32))/CRC32C_POLY, in LE format */ -# define CRC32C_POLY_QT_LE 0xa434f61c6f5389f8 - -/* Polynomial quotient of (2^(XLEN+32))/CRC32_POLY, in BE format, it should be - * the same as the bit-reversed version of CRC32_POLY_QT_LE - */ -# define CRC32_POLY_QT_BE 0x04d101df481b4e5a - -static inline u64 crc32_le_prep(u32 crc, unsigned long const *ptr) -{ - return (u64)crc ^ (__force u64)__cpu_to_le64(*ptr); -} - -static inline u32 crc32_le_zbc(unsigned long s, u32 poly, unsigned long poly_qt) -{ - u32 crc; - - /* We don't have a "clmulrh" insn, so use clmul + slli instead. */ - asm volatile (".option push\n" - ".option arch,+zbc\n" - "clmul %0, %1, %2\n" - "slli %0, %0, 1\n" - "xor %0, %0, %1\n" - "clmulr %0, %0, %3\n" - "srli %0, %0, 32\n" - ".option pop\n" - : "=&r" (crc) - : "r" (s), - "r" (poly_qt), - "r" ((u64)poly << 32) - :); - return crc; -} - -static inline u64 crc32_be_prep(u32 crc, unsigned long const *ptr) -{ - return ((u64)crc << 32) ^ (__force u64)__cpu_to_be64(*ptr); -} - -#elif __riscv_xlen == 32 -# define STEP_ORDER 2 -/* Each quotient should match the upper half of its analog in RV64 */ -# define CRC32_POLY_QT_LE 0xfb808b20 -# define CRC32C_POLY_QT_LE 0x6f5389f8 -# define CRC32_POLY_QT_BE 0x04d101df - -static inline u32 crc32_le_prep(u32 crc, unsigned long const *ptr) -{ - return crc ^ (__force u32)__cpu_to_le32(*ptr); -} - -static inline u32 crc32_le_zbc(unsigned long s, u32 poly, unsigned long poly_qt) -{ - u32 crc; - - /* We don't have a "clmulrh" insn, so use clmul + slli instead. 
*/ - asm volatile (".option push\n" - ".option arch,+zbc\n" - "clmul %0, %1, %2\n" - "slli %0, %0, 1\n" - "xor %0, %0, %1\n" - "clmulr %0, %0, %3\n" - ".option pop\n" - : "=&r" (crc) - : "r" (s), - "r" (poly_qt), - "r" (poly) - :); - return crc; -} - -static inline u32 crc32_be_prep(u32 crc, unsigned long const *ptr) -{ - return crc ^ (__force u32)__cpu_to_be32(*ptr); -} - -#else -# error "Unexpected __riscv_xlen" -#endif - -static inline u32 crc32_be_zbc(unsigned long s) -{ - u32 crc; - - asm volatile (".option push\n" - ".option arch,+zbc\n" - "clmulh %0, %1, %2\n" - "xor %0, %0, %1\n" - "clmul %0, %0, %3\n" - ".option pop\n" - : "=&r" (crc) - : "r" (s), - "r" (CRC32_POLY_QT_BE), - "r" (CRC32_POLY_BE) - :); - return crc; -} - -#define STEP (1 << STEP_ORDER) -#define OFFSET_MASK (STEP - 1) - -typedef u32 (*fallback)(u32 crc, unsigned char const *p, size_t len); - -static inline u32 crc32_le_unaligned(u32 crc, unsigned char const *p, - size_t len, u32 poly, - unsigned long poly_qt) -{ - size_t bits = len * 8; - unsigned long s = 0; - u32 crc_low = 0; - - for (int i = 0; i < len; i++) - s = ((unsigned long)*p++ << (__riscv_xlen - 8)) | (s >> 8); - - s ^= (unsigned long)crc << (__riscv_xlen - bits); - if (__riscv_xlen == 32 || len < sizeof(u32)) - crc_low = crc >> bits; - - crc = crc32_le_zbc(s, poly, poly_qt); - crc ^= crc_low; - - return crc; -} - -static inline u32 crc32_le_generic(u32 crc, unsigned char const *p, size_t len, - u32 poly, unsigned long poly_qt, - fallback crc_fb) -{ - size_t offset, head_len, tail_len; - unsigned long const *p_ul; - unsigned long s; - - asm goto(ALTERNATIVE("j %l[legacy]", "nop", 0, - RISCV_ISA_EXT_ZBC, 1) - : : : : legacy); - - /* Handle the unaligned head. */ - offset = (unsigned long)p & OFFSET_MASK; - if (offset && len) { - head_len = min(STEP - offset, len); - crc = crc32_le_unaligned(crc, p, head_len, poly, poly_qt); - p += head_len; - len -= head_len; - } - - tail_len = len & OFFSET_MASK; - len = len >> STEP_ORDER; - p_ul = (unsigned long const *)p; - - for (int i = 0; i < len; i++) { - s = crc32_le_prep(crc, p_ul); - crc = crc32_le_zbc(s, poly, poly_qt); - p_ul++; - } - - /* Handle the tail bytes. */ - p = (unsigned char const *)p_ul; - if (tail_len) - crc = crc32_le_unaligned(crc, p, tail_len, poly, poly_qt); - - return crc; - -legacy: - return crc_fb(crc, p, len); -} - -u32 crc32_le_arch(u32 crc, const u8 *p, size_t len) -{ - return crc32_le_generic(crc, p, len, CRC32_POLY_LE, CRC32_POLY_QT_LE, - crc32_le_base); -} -EXPORT_SYMBOL(crc32_le_arch); - -u32 crc32c_arch(u32 crc, const u8 *p, size_t len) -{ - return crc32_le_generic(crc, p, len, CRC32C_POLY_LE, - CRC32C_POLY_QT_LE, crc32c_base); -} -EXPORT_SYMBOL(crc32c_arch); - -static inline u32 crc32_be_unaligned(u32 crc, unsigned char const *p, - size_t len) -{ - size_t bits = len * 8; - unsigned long s = 0; - u32 crc_low = 0; - - s = 0; - for (int i = 0; i < len; i++) - s = *p++ | (s << 8); - - if (__riscv_xlen == 32 || len < sizeof(u32)) { - s ^= crc >> (32 - bits); - crc_low = crc << bits; - } else { - s ^= (unsigned long)crc << (bits - 32); - } - - crc = crc32_be_zbc(s); - crc ^= crc_low; - - return crc; -} - -u32 crc32_be_arch(u32 crc, const u8 *p, size_t len) -{ - size_t offset, head_len, tail_len; - unsigned long const *p_ul; - unsigned long s; - - asm goto(ALTERNATIVE("j %l[legacy]", "nop", 0, - RISCV_ISA_EXT_ZBC, 1) - : : : : legacy); - - /* Handle the unaligned head. 
*/ - offset = (unsigned long)p & OFFSET_MASK; - if (offset && len) { - head_len = min(STEP - offset, len); - crc = crc32_be_unaligned(crc, p, head_len); - p += head_len; - len -= head_len; - } - - tail_len = len & OFFSET_MASK; - len = len >> STEP_ORDER; - p_ul = (unsigned long const *)p; - - for (int i = 0; i < len; i++) { - s = crc32_be_prep(crc, p_ul); - crc = crc32_be_zbc(s); - p_ul++; - } - - /* Handle the tail bytes. */ - p = (unsigned char const *)p_ul; - if (tail_len) - crc = crc32_be_unaligned(crc, p, tail_len); - - return crc; - -legacy: - return crc32_be_base(crc, p, len); -} -EXPORT_SYMBOL(crc32_be_arch); - -u32 crc32_optimizations(void) -{ - if (riscv_has_extension_likely(RISCV_ISA_EXT_ZBC)) - return CRC32_LE_OPTIMIZATION | - CRC32_BE_OPTIMIZATION | - CRC32C_OPTIMIZATION; - return 0; -} -EXPORT_SYMBOL(crc32_optimizations); - -MODULE_LICENSE("GPL"); -MODULE_DESCRIPTION("Accelerated CRC32 implementation with Zbc extension"); diff --git a/arch/riscv/lib/crc32.c b/arch/riscv/lib/crc32.c new file mode 100644 index 000000000000..a3188b7d9c40 --- /dev/null +++ b/arch/riscv/lib/crc32.c @@ -0,0 +1,53 @@ +// SPDX-License-Identifier: GPL-2.0-or-later +/* + * RISC-V optimized CRC32 functions + * + * Copyright 2025 Google LLC + */ + +#include +#include +#include +#include + +#include "crc-clmul.h" + +u32 crc32_le_arch(u32 crc, const u8 *p, size_t len) +{ + if (riscv_has_extension_likely(RISCV_ISA_EXT_ZBC)) + return crc32_lsb_clmul(crc, p, len, + &crc32_lsb_0xedb88320_consts); + return crc32_le_base(crc, p, len); +} +EXPORT_SYMBOL(crc32_le_arch); + +u32 crc32_be_arch(u32 crc, const u8 *p, size_t len) +{ + if (riscv_has_extension_likely(RISCV_ISA_EXT_ZBC)) + return crc32_msb_clmul(crc, p, len, + &crc32_msb_0x04c11db7_consts); + return crc32_be_base(crc, p, len); +} +EXPORT_SYMBOL(crc32_be_arch); + +u32 crc32c_arch(u32 crc, const u8 *p, size_t len) +{ + if (riscv_has_extension_likely(RISCV_ISA_EXT_ZBC)) + return crc32_lsb_clmul(crc, p, len, + &crc32_lsb_0x82f63b78_consts); + return crc32c_base(crc, p, len); +} +EXPORT_SYMBOL(crc32c_arch); + +u32 crc32_optimizations(void) +{ + if (riscv_has_extension_likely(RISCV_ISA_EXT_ZBC)) + return CRC32_LE_OPTIMIZATION | + CRC32_BE_OPTIMIZATION | + CRC32C_OPTIMIZATION; + return 0; +} +EXPORT_SYMBOL(crc32_optimizations); + +MODULE_DESCRIPTION("RISC-V optimized CRC32 functions"); +MODULE_LICENSE("GPL"); diff --git a/arch/riscv/lib/crc32_lsb.c b/arch/riscv/lib/crc32_lsb.c new file mode 100644 index 000000000000..72fd67e7470c --- /dev/null +++ b/arch/riscv/lib/crc32_lsb.c @@ -0,0 +1,18 @@ +// SPDX-License-Identifier: GPL-2.0-or-later +/* + * RISC-V optimized least-significant-bit-first CRC32 + * + * Copyright 2025 Google LLC + */ + +#include "crc-clmul.h" + +typedef u32 crc_t; +#define LSB_CRC 1 +#include "crc-clmul-template.h" + +u32 crc32_lsb_clmul(u32 crc, const void *p, size_t len, + const struct crc_clmul_consts *consts) +{ + return crc_clmul(crc, p, len, consts); +} diff --git a/arch/riscv/lib/crc32_msb.c b/arch/riscv/lib/crc32_msb.c new file mode 100644 index 000000000000..fdbeaccc369f --- /dev/null +++ b/arch/riscv/lib/crc32_msb.c @@ -0,0 +1,18 @@ +// SPDX-License-Identifier: GPL-2.0-or-later +/* + * RISC-V optimized most-significant-bit-first CRC32 + * + * Copyright 2025 Google LLC + */ + +#include "crc-clmul.h" + +typedef u32 crc_t; +#define LSB_CRC 0 +#include "crc-clmul-template.h" + +u32 crc32_msb_clmul(u32 crc, const void *p, size_t len, + const struct crc_clmul_consts *consts) +{ + return crc_clmul(crc, p, len, consts); +} From patchwork Sun Feb 
16 22:55:29 2025
From: Eric Biggers <ebiggers@kernel.org>
To: linux-kernel@vger.kernel.org
Cc: linux-crypto@vger.kernel.org, linux-riscv@lists.infradead.org,
 Zhihang Shao, Ard Biesheuvel, Xiao Wang, Charlie Jenkins
Subject: [PATCH 3/4] riscv/crc-t10dif: add Zbc optimized CRC-T10DIF function
Date: Sun, 16 Feb 2025 14:55:29 -0800
Message-ID: <20250216225530.306980-4-ebiggers@kernel.org>
In-Reply-To: <20250216225530.306980-1-ebiggers@kernel.org>
References: <20250216225530.306980-1-ebiggers@kernel.org>

Wire up crc_t10dif_arch() for RISC-V using crc-clmul-template.h.  This
greatly improves CRC-T10DIF performance on Zbc-capable CPUs.

Signed-off-by: Eric Biggers <ebiggers@kernel.org>
---
 arch/riscv/Kconfig                |  1 +
 arch/riscv/lib/Makefile           |  2 ++
 arch/riscv/lib/crc-clmul-consts.h | 20 +++++++++++++++++++-
 arch/riscv/lib/crc-clmul.h        |  2 ++
 arch/riscv/lib/crc-t10dif.c       | 24 ++++++++++++++++++++++++
 arch/riscv/lib/crc16_msb.c        | 18 ++++++++++++++++++
 6 files changed, 66 insertions(+), 1 deletion(-)
 create mode 100644 arch/riscv/lib/crc-t10dif.c
 create mode 100644 arch/riscv/lib/crc16_msb.c

diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
index 7612c52e9b1e..db1cf9666dfd 100644
--- a/arch/riscv/Kconfig
+++ b/arch/riscv/Kconfig
@@ -23,10 +23,11 @@ config RISCV
 	select ARCH_ENABLE_MEMORY_HOTREMOVE if MEMORY_HOTPLUG
 	select ARCH_ENABLE_SPLIT_PMD_PTLOCK if PGTABLE_LEVELS > 2
 	select ARCH_ENABLE_THP_MIGRATION if TRANSPARENT_HUGEPAGE
 	select ARCH_HAS_BINFMT_FLAT
 	select ARCH_HAS_CRC32 if RISCV_ISA_ZBC
+	select ARCH_HAS_CRC_T10DIF if RISCV_ISA_ZBC
 	select ARCH_HAS_CURRENT_STACK_POINTER
 	select ARCH_HAS_DEBUG_VIRTUAL if MMU
 	select ARCH_HAS_DEBUG_VM_PGTABLE
 	select ARCH_HAS_DEBUG_WX
 	select ARCH_HAS_FAST_MULTIPLIER
diff --git a/arch/riscv/lib/Makefile b/arch/riscv/lib/Makefile
index 7b32d3e88337..06d9552b9c8b 100644
--- a/arch/riscv/lib/Makefile
+++ b/arch/riscv/lib/Makefile
@@ -15,8 +15,10 @@ endif
 lib-$(CONFIG_MMU)	+= uaccess.o
 lib-$(CONFIG_64BIT)	+= tishift.o
 lib-$(CONFIG_RISCV_ISA_ZICBOZ)	+= clear_page.o
 obj-$(CONFIG_CRC32_ARCH) += crc32-riscv.o
 crc32-riscv-y := crc32.o crc32_msb.o crc32_lsb.o
+obj-$(CONFIG_CRC_T10DIF_ARCH) += crc-t10dif-riscv.o
+crc-t10dif-riscv-y := crc-t10dif.o crc16_msb.o
 obj-$(CONFIG_FUNCTION_ERROR_INJECTION) += error-inject.o
 lib-$(CONFIG_RISCV_ISA_V)  += xor.o
 lib-$(CONFIG_RISCV_ISA_V)  += riscv_v_helpers.o
diff --git a/arch/riscv/lib/crc-clmul-consts.h b/arch/riscv/lib/crc-clmul-consts.h
index 6fdf10648a20..b3a02b9096cd 100644
--- a/arch/riscv/lib/crc-clmul-consts.h
+++ b/arch/riscv/lib/crc-clmul-consts.h
@@ -1,10 +1,10 @@
 /* SPDX-License-Identifier: GPL-2.0-or-later */
 /*
  * CRC constants generated by:
  *
- *	./scripts/gen-crc-consts.py riscv_clmul crc32_msb_0x04c11db7,crc32_lsb_0xedb88320,crc32_lsb_0x82f63b78
+ *	./scripts/gen-crc-consts.py riscv_clmul crc16_msb_0x8bb7,crc32_msb_0x04c11db7,crc32_lsb_0xedb88320,crc32_lsb_0x82f63b78
  *
  * Do not edit manually.
 */
 
 struct crc_clmul_consts {
@@ -12,10 +12,28 @@ struct crc_clmul_consts {
 	unsigned long fold_across_2_longs_const_lo;
 	unsigned long barrett_reduction_const_1;
 	unsigned long barrett_reduction_const_2;
 };
 
+/*
+ * Constants generated for most-significant-bit-first CRC-16 using
+ * G(x) = x^16 + x^15 + x^11 + x^9 + x^8 + x^7 + x^5 + x^4 + x^2 + x^1 + x^0
+ */
+static const struct crc_clmul_consts crc16_msb_0x8bb7_consts __maybe_unused = {
+#ifdef CONFIG_64BIT
+	.fold_across_2_longs_const_hi = 0x0000000000001faa, /* x^192 mod G */
+	.fold_across_2_longs_const_lo = 0x000000000000a010, /* x^128 mod G */
+	.barrett_reduction_const_1 = 0xfb2d2bfc0e99d245, /* floor(x^79 / G) */
+	.barrett_reduction_const_2 = 0x0000000000008bb7, /* G - x^16 */
+#else
+	.fold_across_2_longs_const_hi = 0x00005890, /* x^96 mod G */
+	.fold_across_2_longs_const_lo = 0x0000f249, /* x^64 mod G */
+	.barrett_reduction_const_1 = 0xfb2d2bfc, /* floor(x^47 / G) */
+	.barrett_reduction_const_2 = 0x00008bb7, /* G - x^16 */
+#endif
+};
+
 /*
  * Constants generated for most-significant-bit-first CRC-32 using
  * G(x) = x^32 + x^26 + x^23 + x^22 + x^16 + x^12 + x^11 + x^10 + x^8 + x^7 +
  *	  x^5 + x^4 + x^2 + x^1 + x^0
  */
diff --git a/arch/riscv/lib/crc-clmul.h b/arch/riscv/lib/crc-clmul.h
index 62bd41080785..162c1b12b219 100644
--- a/arch/riscv/lib/crc-clmul.h
+++ b/arch/riscv/lib/crc-clmul.h
@@ -5,10 +5,12 @@
 #define _RISCV_CRC_CLMUL_H
 
 #include <linux/types.h>
 #include "crc-clmul-consts.h"
 
+u16 crc16_msb_clmul(u16 crc, const void *p, size_t len,
+		    const struct crc_clmul_consts *consts);
 u32 crc32_msb_clmul(u32 crc, const void *p, size_t len,
 		    const struct crc_clmul_consts *consts);
 u32 crc32_lsb_clmul(u32 crc, const void *p, size_t len,
 		    const struct crc_clmul_consts *consts);
 
diff --git a/arch/riscv/lib/crc-t10dif.c b/arch/riscv/lib/crc-t10dif.c
new file mode 100644
index 000000000000..e6b0051ccd86
--- /dev/null
+++ b/arch/riscv/lib/crc-t10dif.c
@@ -0,0 +1,24 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+/*
+ * RISC-V optimized CRC-T10DIF function
+ *
+ * Copyright 2025 Google LLC
+ */
+
+#include <asm/hwcap.h>
+#include <asm/alternative-macros.h>
+#include <linux/crc-t10dif.h>
+#include <linux/module.h>
+
+#include "crc-clmul.h"
+
+u16 crc_t10dif_arch(u16 crc, const u8 *p, size_t len)
+{
+	if (riscv_has_extension_likely(RISCV_ISA_EXT_ZBC))
+		return crc16_msb_clmul(crc, p, len, &crc16_msb_0x8bb7_consts);
+	return crc_t10dif_generic(crc, p, len);
+}
+EXPORT_SYMBOL(crc_t10dif_arch);
+
+MODULE_DESCRIPTION("RISC-V optimized CRC-T10DIF function");
+MODULE_LICENSE("GPL");
diff --git a/arch/riscv/lib/crc16_msb.c b/arch/riscv/lib/crc16_msb.c
new file mode 100644
index 000000000000..554d295e95f5
--- /dev/null
+++ b/arch/riscv/lib/crc16_msb.c
@@ -0,0 +1,18 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+/*
+ * RISC-V optimized most-significant-bit-first CRC16
+ *
+ * Copyright 2025 Google LLC
+ */
+
+#include "crc-clmul.h"
+
+typedef u16 crc_t;
+#define LSB_CRC 0
+#include "crc-clmul-template.h"
+
+u16 crc16_msb_clmul(u16 crc, const void *p, size_t len,
+		    const struct crc_clmul_consts *consts)
+{
+	return crc_clmul(crc, p, len, consts);
+}
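The msb-first path can be cross-checked the same way as the lsb-first one.
Below is a rough userspace sketch, not part of the patch, that feeds one
big-endian long through the template's clmulr-then-clmul Barrett sequence
using the 64-bit crc16_msb_0x8bb7 constants above and compares the result
with bit-at-a-time CRC-T10DIF.  It assumes a little-endian host and a
GCC/Clang-style __builtin_bswap64; all names are illustrative.

#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Bit-by-bit model of Zbc: full 128-bit carryless product of a and b */
static void clmul128(uint64_t a, uint64_t b, uint64_t *hi, uint64_t *lo)
{
	uint64_t h = 0, l = 0;

	for (int i = 0; i < 64; i++) {
		if ((b >> i) & 1) {
			l ^= a << i;
			if (i)
				h ^= a >> (64 - i);
		}
	}
	*hi = h;
	*lo = l;
}

static uint64_t clmul(uint64_t a, uint64_t b)	/* low 64 product bits */
{ uint64_t h, l; clmul128(a, b, &h, &l); return l; }

static uint64_t clmulr(uint64_t a, uint64_t b)	/* product bits 126..63 */
{ uint64_t h, l; clmul128(a, b, &h, &l); return (h << 1) | (l >> 63); }

/* 64-bit crc16_msb_0x8bb7 constants from the hunk above */
#define BR1 0xfb2d2bfc0e99d245ULL	/* floor(x^79 / G) */
#define BR2 0x0000000000008bb7ULL	/* G - x^16 */

/* One msb-first template step: multiply by x^16 and reduce mod G */
static uint16_t crc16_msb_update_long(uint16_t crc, uint64_t msgpoly)
{
	msgpoly ^= (uint64_t)crc << 48;	/* crc_clmul_prep(), msb case */
	return (uint16_t)clmul(clmulr(msgpoly, BR1), BR2);
}

/* Bit-at-a-time CRC-T10DIF (raw, as crc_t10dif_generic() computes it) */
static uint16_t crc16_msb_ref(uint16_t crc, const uint8_t *p, size_t len)
{
	while (len--) {
		crc ^= (uint16_t)(*p++ << 8);
		for (int i = 0; i < 8; i++)
			crc = (crc & 0x8000) ? (uint16_t)(crc << 1) ^ 0x8bb7
					     : (uint16_t)(crc << 1);
	}
	return crc;
}

int main(void)
{
	const uint8_t msg[8] = { 1, 2, 3, 4, 5, 6, 7, 8 };
	uint64_t w;

	memcpy(&w, msg, 8);
	w = __builtin_bswap64(w);	/* crc_load_long() = be64 on a LE host */
	printf("clmul:   %04x\n", crc16_msb_update_long(0, w));
	printf("bitwise: %04x\n", crc16_msb_ref(0, msg, 8));
	return 0;
}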
From patchwork Sun Feb 16 22:55:30 2025
From: Eric Biggers <ebiggers@kernel.org>
To: linux-kernel@vger.kernel.org
Cc: linux-crypto@vger.kernel.org, linux-riscv@lists.infradead.org,
 Zhihang Shao, Ard Biesheuvel, Xiao Wang, Charlie Jenkins
Subject: [PATCH 4/4] riscv/crc64: add Zbc optimized CRC64 functions
Date: Sun, 16 Feb 2025 14:55:30 -0800
Message-ID: <20250216225530.306980-5-ebiggers@kernel.org>
In-Reply-To: <20250216225530.306980-1-ebiggers@kernel.org>
References: <20250216225530.306980-1-ebiggers@kernel.org>

Wire up crc64_be_arch() and crc64_nvme_arch() for 64-bit RISC-V using
crc-clmul-template.h.  This greatly improves the performance of these
CRCs on Zbc-capable CPUs in 64-bit kernels.

These optimized CRC64 functions are not yet supported in 32-bit
kernels, since crc-clmul-template.h assumes that the CRC fits in an
unsigned long.
That implementation limitation could be addressed, but it would add a fair bit of complexity, so it has been omitted for now. Signed-off-by: Eric Biggers --- arch/riscv/Kconfig | 1 + arch/riscv/lib/Makefile | 2 ++ arch/riscv/lib/crc-clmul-consts.h | 34 ++++++++++++++++++++++++++++++- arch/riscv/lib/crc-clmul.h | 6 ++++++ arch/riscv/lib/crc64.c | 34 +++++++++++++++++++++++++++++++ arch/riscv/lib/crc64_lsb.c | 18 ++++++++++++++++ arch/riscv/lib/crc64_msb.c | 18 ++++++++++++++++ 7 files changed, 112 insertions(+), 1 deletion(-) create mode 100644 arch/riscv/lib/crc64.c create mode 100644 arch/riscv/lib/crc64_lsb.c create mode 100644 arch/riscv/lib/crc64_msb.c diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig index db1cf9666dfd..e10dda2d0bfe 100644 --- a/arch/riscv/Kconfig +++ b/arch/riscv/Kconfig @@ -23,10 +23,11 @@ config RISCV select ARCH_ENABLE_MEMORY_HOTREMOVE if MEMORY_HOTPLUG select ARCH_ENABLE_SPLIT_PMD_PTLOCK if PGTABLE_LEVELS > 2 select ARCH_ENABLE_THP_MIGRATION if TRANSPARENT_HUGEPAGE select ARCH_HAS_BINFMT_FLAT select ARCH_HAS_CRC32 if RISCV_ISA_ZBC + select ARCH_HAS_CRC64 if 64BIT && RISCV_ISA_ZBC select ARCH_HAS_CRC_T10DIF if RISCV_ISA_ZBC select ARCH_HAS_CURRENT_STACK_POINTER select ARCH_HAS_DEBUG_VIRTUAL if MMU select ARCH_HAS_DEBUG_VM_PGTABLE select ARCH_HAS_DEBUG_WX diff --git a/arch/riscv/lib/Makefile b/arch/riscv/lib/Makefile index 06d9552b9c8b..b1c46153606a 100644 --- a/arch/riscv/lib/Makefile +++ b/arch/riscv/lib/Makefile @@ -15,10 +15,12 @@ endif lib-$(CONFIG_MMU) += uaccess.o lib-$(CONFIG_64BIT) += tishift.o lib-$(CONFIG_RISCV_ISA_ZICBOZ) += clear_page.o obj-$(CONFIG_CRC32_ARCH) += crc32-riscv.o crc32-riscv-y := crc32.o crc32_msb.o crc32_lsb.o +obj-$(CONFIG_CRC64_ARCH) += crc64-riscv.o +crc64-riscv-y := crc64.o crc64_msb.o crc64_lsb.o obj-$(CONFIG_CRC_T10DIF_ARCH) += crc-t10dif-riscv.o crc-t10dif-riscv-y := crc-t10dif.o crc16_msb.o obj-$(CONFIG_FUNCTION_ERROR_INJECTION) += error-inject.o lib-$(CONFIG_RISCV_ISA_V) += xor.o lib-$(CONFIG_RISCV_ISA_V) += riscv_v_helpers.o diff --git a/arch/riscv/lib/crc-clmul-consts.h b/arch/riscv/lib/crc-clmul-consts.h index b3a02b9096cd..8d73449235ef 100644 --- a/arch/riscv/lib/crc-clmul-consts.h +++ b/arch/riscv/lib/crc-clmul-consts.h @@ -1,10 +1,10 @@ /* SPDX-License-Identifier: GPL-2.0-or-later */ /* * CRC constants generated by: * - * ./scripts/gen-crc-consts.py riscv_clmul crc16_msb_0x8bb7,crc32_msb_0x04c11db7,crc32_lsb_0xedb88320,crc32_lsb_0x82f63b78 + * ./scripts/gen-crc-consts.py riscv_clmul crc16_msb_0x8bb7,crc32_msb_0x04c11db7,crc32_lsb_0xedb88320,crc32_lsb_0x82f63b78,crc64_msb_0x42f0e1eba9ea3693,crc64_lsb_0x9a6c9329ac4bc9b5 * * Do not edit manually. 
*/ struct crc_clmul_consts { @@ -86,5 +86,37 @@ static const struct crc_clmul_consts crc32_lsb_0x82f63b78_consts __maybe_unused .fold_across_2_longs_const_lo = 0xdd45aab8, /* x^63 mod G */ .barrett_reduction_const_1 = 0xdea713f1, /* floor(x^63 / G) */ .barrett_reduction_const_2 = 0x82f63b78, /* (G - x^32) * x^0 */ #endif }; + +/* + * Constants generated for most-significant-bit-first CRC-64 using + * G(x) = x^64 + x^62 + x^57 + x^55 + x^54 + x^53 + x^52 + x^47 + x^46 + x^45 + + * x^40 + x^39 + x^38 + x^37 + x^35 + x^33 + x^32 + x^31 + x^29 + x^27 + + * x^24 + x^23 + x^22 + x^21 + x^19 + x^17 + x^13 + x^12 + x^10 + x^9 + + * x^7 + x^4 + x^1 + x^0 + */ +#ifdef CONFIG_64BIT +static const struct crc_clmul_consts crc64_msb_0x42f0e1eba9ea3693_consts __maybe_unused = { + .fold_across_2_longs_const_hi = 0x4eb938a7d257740e, /* x^192 mod G */ + .fold_across_2_longs_const_lo = 0x05f5c3c7eb52fab6, /* x^128 mod G */ + .barrett_reduction_const_1 = 0xabc694e836627c39, /* floor(x^127 / G) */ + .barrett_reduction_const_2 = 0x42f0e1eba9ea3693, /* G - x^64 */ +}; +#endif + +/* + * Constants generated for least-significant-bit-first CRC-64 using + * G(x) = x^64 + x^63 + x^61 + x^59 + x^58 + x^56 + x^55 + x^52 + x^49 + x^48 + + * x^47 + x^46 + x^44 + x^41 + x^37 + x^36 + x^34 + x^32 + x^31 + x^28 + + * x^26 + x^23 + x^22 + x^19 + x^16 + x^13 + x^12 + x^10 + x^9 + x^6 + + * x^4 + x^3 + x^0 + */ +#ifdef CONFIG_64BIT +static const struct crc_clmul_consts crc64_lsb_0x9a6c9329ac4bc9b5_consts __maybe_unused = { + .fold_across_2_longs_const_hi = 0xeadc41fd2ba3d420, /* x^191 mod G */ + .fold_across_2_longs_const_lo = 0x21e9761e252621ac, /* x^127 mod G */ + .barrett_reduction_const_1 = 0x27ecfa329aef9f77, /* floor(x^127 / G) */ + .barrett_reduction_const_2 = 0x9a6c9329ac4bc9b5, /* (G - x^64) * x^0 */ +}; +#endif diff --git a/arch/riscv/lib/crc-clmul.h b/arch/riscv/lib/crc-clmul.h index 162c1b12b219..dd1736245815 100644 --- a/arch/riscv/lib/crc-clmul.h +++ b/arch/riscv/lib/crc-clmul.h @@ -11,7 +11,13 @@ u16 crc16_msb_clmul(u16 crc, const void *p, size_t len, const struct crc_clmul_consts *consts); u32 crc32_msb_clmul(u32 crc, const void *p, size_t len, const struct crc_clmul_consts *consts); u32 crc32_lsb_clmul(u32 crc, const void *p, size_t len, const struct crc_clmul_consts *consts); +#ifdef CONFIG_64BIT +u64 crc64_msb_clmul(u64 crc, const void *p, size_t len, + const struct crc_clmul_consts *consts); +u64 crc64_lsb_clmul(u64 crc, const void *p, size_t len, + const struct crc_clmul_consts *consts); +#endif #endif /* _RISCV_CRC_CLMUL_H */ diff --git a/arch/riscv/lib/crc64.c b/arch/riscv/lib/crc64.c new file mode 100644 index 000000000000..f0015a27836a --- /dev/null +++ b/arch/riscv/lib/crc64.c @@ -0,0 +1,34 @@ +// SPDX-License-Identifier: GPL-2.0-or-later +/* + * RISC-V optimized CRC64 functions + * + * Copyright 2025 Google LLC + */ + +#include +#include +#include +#include + +#include "crc-clmul.h" + +u64 crc64_be_arch(u64 crc, const u8 *p, size_t len) +{ + if (riscv_has_extension_likely(RISCV_ISA_EXT_ZBC)) + return crc64_msb_clmul(crc, p, len, + &crc64_msb_0x42f0e1eba9ea3693_consts); + return crc64_be_generic(crc, p, len); +} +EXPORT_SYMBOL(crc64_be_arch); + +u64 crc64_nvme_arch(u64 crc, const u8 *p, size_t len) +{ + if (riscv_has_extension_likely(RISCV_ISA_EXT_ZBC)) + return crc64_lsb_clmul(crc, p, len, + &crc64_lsb_0x9a6c9329ac4bc9b5_consts); + return crc64_nvme_generic(crc, p, len); +} +EXPORT_SYMBOL(crc64_nvme_arch); + +MODULE_DESCRIPTION("RISC-V optimized CRC64 functions"); +MODULE_LICENSE("GPL"); diff --git 
a/arch/riscv/lib/crc64_lsb.c b/arch/riscv/lib/crc64_lsb.c new file mode 100644 index 000000000000..c5371bb85d90 --- /dev/null +++ b/arch/riscv/lib/crc64_lsb.c @@ -0,0 +1,18 @@ +// SPDX-License-Identifier: GPL-2.0-or-later +/* + * RISC-V optimized least-significant-bit-first CRC64 + * + * Copyright 2025 Google LLC + */ + +#include "crc-clmul.h" + +typedef u64 crc_t; +#define LSB_CRC 1 +#include "crc-clmul-template.h" + +u64 crc64_lsb_clmul(u64 crc, const void *p, size_t len, + const struct crc_clmul_consts *consts) +{ + return crc_clmul(crc, p, len, consts); +} diff --git a/arch/riscv/lib/crc64_msb.c b/arch/riscv/lib/crc64_msb.c new file mode 100644 index 000000000000..1925d1dbe225 --- /dev/null +++ b/arch/riscv/lib/crc64_msb.c @@ -0,0 +1,18 @@ +// SPDX-License-Identifier: GPL-2.0-or-later +/* + * RISC-V optimized most-significant-bit-first CRC64 + * + * Copyright 2025 Google LLC + */ + +#include "crc-clmul.h" + +typedef u64 crc_t; +#define LSB_CRC 0 +#include "crc-clmul-template.h" + +u64 crc64_msb_clmul(u64 crc, const void *p, size_t len, + const struct crc_clmul_consts *consts) +{ + return crc_clmul(crc, p, len, consts); +}
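
To tie the series together, here is a last hedged sketch, again portable
userspace C rather than kernel code: it performs one iteration of
crc-clmul-template.h's fold-across-2-longs main loop for the lsb-first
CRC64-NVME polynomial, using the fold and Barrett constants from this patch,
then checks 32 bytes against a bit-at-a-time reference.  It assumes a
little-endian host; all names are illustrative.

#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Bit-by-bit model of Zbc: full 128-bit carryless product of a and b */
static void clmul128(uint64_t a, uint64_t b, uint64_t *hi, uint64_t *lo)
{
	uint64_t h = 0, l = 0;

	for (int i = 0; i < 64; i++) {
		if ((b >> i) & 1) {
			l ^= a << i;
			if (i)
				h ^= a >> (64 - i);
		}
	}
	*hi = h;
	*lo = l;
}

static uint64_t clmul(uint64_t a, uint64_t b)
{ uint64_t h, l; clmul128(a, b, &h, &l); return l; }

static uint64_t clmulh(uint64_t a, uint64_t b)
{ uint64_t h, l; clmul128(a, b, &h, &l); return h; }

static uint64_t clmulr(uint64_t a, uint64_t b)
{ uint64_t h, l; clmul128(a, b, &h, &l); return (h << 1) | (l >> 63); }

/* crc64_lsb_0x9a6c9329ac4bc9b5 constants from this patch */
#define FOLD_HI	0xeadc41fd2ba3d420ULL	/* x^191 mod G */
#define FOLD_LO	0x21e9761e252621acULL	/* x^127 mod G */
#define BR1	0x27ecfa329aef9f77ULL	/* floor(x^127 / G) */
#define BR2	0x9a6c9329ac4bc9b5ULL	/* (G - x^64) * x^0 */

/* crc_clmul_update_long() for the lsb-first case; n == BITS_PER_LONG here */
static uint64_t crc64_le_update_long(uint64_t crc, uint64_t msgpoly)
{
	return clmulr(clmul(msgpoly ^ crc, BR1), BR2);
}

static uint64_t load_le64(const uint8_t *p)
{
	uint64_t v;

	memcpy(&v, p, 8);	/* crc_load_long() on a little-endian host */
	return v;
}

int main(void)
{
	uint8_t msg[32];
	uint64_t m0, m1, p0, p1, p2, p3, crc, ref = 0;

	for (int i = 0; i < 32; i++)
		msg[i] = (uint8_t)(i * 37 + 5);

	/* One main-loop iteration: fold (m0, m1) over the next two longs */
	m0 = load_le64(&msg[0]);	/* crc_clmul_prep() with crc == 0 */
	m1 = load_le64(&msg[8]);
	p0 = clmulh(m0, FOLD_HI);
	p1 = clmul(m0, FOLD_HI);
	p2 = clmulh(m1, FOLD_LO);
	p3 = clmul(m1, FOLD_LO);
	m0 = (p1 ^ p3) ^ load_le64(&msg[16]);	/* LSB_CRC ordering */
	m1 = (p0 ^ p2) ^ load_le64(&msg[24]);

	/* Reduce the two remaining longs, as after the loop in crc_clmul() */
	crc = crc64_le_update_long(0, m0);
	crc = crc64_le_update_long(crc, m1);

	/* Bit-at-a-time reference for the same raw (uninverted) CRC64 */
	for (int i = 0; i < 32; i++) {
		ref ^= msg[i];
		for (int j = 0; j < 8; j++)
			ref = (ref >> 1) ^ ((ref & 1) ? 0x9a6c9329ac4bc9b5ULL : 0);
	}
	printf("fold+barrett %016llx vs bitwise %016llx: %s\n",
	       (unsigned long long)crc, (unsigned long long)ref,
	       crc == ref ? "match" : "MISMATCH");
	return 0;
}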