From patchwork Sun Aug 27 01:26:06 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Charlie Jenkins
X-Patchwork-Id: 13366809
From: Charlie Jenkins
Date: Sat, 26 Aug 2023 18:26:06 -0700
Subject: [PATCH 1/5] riscv: Checksum header
Message-Id: <20230826-optimize_checksum-v1-1-937501b4522a@rivosinc.com>
References: <20230826-optimize_checksum-v1-0-937501b4522a@rivosinc.com>
In-Reply-To: <20230826-optimize_checksum-v1-0-937501b4522a@rivosinc.com>
To: linux-riscv@lists.infradead.org, linux-kernel@vger.kernel.org
Cc: Paul Walmsley, Palmer Dabbelt, Albert Ou, Charlie Jenkins
X-Mailer: b4 0.12.3

Provide checksum algorithms designed to leverage RISC-V instructions
such as rotate. On 64-bit, the wider registers allow some of the
overflow checking to be skipped. Add a Kconfig option for the Zba
extension and add Zba and Zbb to -march.

Signed-off-by: Charlie Jenkins
---
 arch/riscv/Kconfig                | 23 +++++++++++
 arch/riscv/Makefile               |  2 +
 arch/riscv/include/asm/checksum.h | 86 +++++++++++++++++++++++++++++++++++++++
 3 files changed, 111 insertions(+)

diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
index 4c07b9189c86..8d7e475ca28d 100644
--- a/arch/riscv/Kconfig
+++ b/arch/riscv/Kconfig
@@ -507,6 +507,29 @@ config RISCV_ISA_V_DEFAULT_ENABLE
 	  If you don't know what to do here, say Y.
+config TOOLCHAIN_HAS_ZBA
+	bool
+	default y
+	depends on !64BIT || $(cc-option,-mabi=lp64 -march=rv64ima_zba)
+	depends on !32BIT || $(cc-option,-mabi=ilp32 -march=rv32ima_zba)
+	depends on LLD_VERSION >= 150000 || LD_VERSION >= 23900
+	depends on AS_HAS_OPTION_ARCH
+
+config RISCV_ISA_ZBA
+	bool "Zba extension support for bit manipulation instructions"
+	depends on TOOLCHAIN_HAS_ZBA
+	depends on MMU
+	depends on RISCV_ALTERNATIVE
+	default y
+	help
+	  Adds support to dynamically detect the presence of the Zba
+	  extension (address generation) and enable its usage.
+
+	  The Zba extension provides shift-and-add instructions that
+	  accelerate address generation when indexing into arrays.
+
+	  If you don't know what to do here, say Y.
+
 config TOOLCHAIN_HAS_ZBB
 	bool
 	default y
diff --git a/arch/riscv/Makefile b/arch/riscv/Makefile
index 6ec6d52a4180..51fa3f67fc9a 100644
--- a/arch/riscv/Makefile
+++ b/arch/riscv/Makefile
@@ -61,6 +61,8 @@ riscv-march-$(CONFIG_ARCH_RV64I) := rv64ima
 riscv-march-$(CONFIG_FPU)		:= $(riscv-march-y)fd
 riscv-march-$(CONFIG_RISCV_ISA_C)	:= $(riscv-march-y)c
 riscv-march-$(CONFIG_RISCV_ISA_V)	:= $(riscv-march-y)v
+riscv-march-$(CONFIG_RISCV_ISA_ZBA)	:= $(riscv-march-y)_zba
+riscv-march-$(CONFIG_RISCV_ISA_ZBB)	:= $(riscv-march-y)_zbb
 
 ifdef CONFIG_TOOLCHAIN_NEEDS_OLD_ISA_SPEC
 KBUILD_CFLAGS += -Wa,-misa-spec=2.2
diff --git a/arch/riscv/include/asm/checksum.h b/arch/riscv/include/asm/checksum.h
new file mode 100644
index 000000000000..cd98f8cde888
--- /dev/null
+++ b/arch/riscv/include/asm/checksum.h
@@ -0,0 +1,86 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * IP checksum routines
+ *
+ * Copyright (C) 2023 Rivos Inc.
+ */
+#ifndef __ASM_RISCV_CHECKSUM_H
+#define __ASM_RISCV_CHECKSUM_H
+
+#include
+#include
+
+/* Default version is sufficient for 32 bit */
+#ifdef CONFIG_64BIT
+#define _HAVE_ARCH_IPV6_CSUM
+__sum16 csum_ipv6_magic(const struct in6_addr *saddr,
+			const struct in6_addr *daddr,
+			__u32 len, __u8 proto, __wsum sum);
+#endif
+
+/*
+ * Fold a partial checksum without adding pseudo headers
+ */
+static inline __sum16 csum_fold(__wsum sum)
+{
+	sum += (sum >> 16) | (sum << 16);
+	return (__force __sum16)(~(sum >> 16));
+}
+
+#define csum_fold csum_fold
+
+/*
+ * This is a version of ip_compute_csum() optimized for IP headers,
+ * which always checksum on 4 octet boundaries.
+ * Optimized for 32 and 64 bit platforms, with and without vector, with and
+ * without the bitmanip extensions zba/zbb.
+ */
+#ifdef CONFIG_32BIT
+static inline __sum16 ip_fast_csum(const void *iph, unsigned int ihl)
+{
+	__wsum csum = 0;
+	int pos = 0;
+
+	do {
+		csum += ((const __wsum *)iph)[pos];
+		csum += csum < ((const __wsum *)iph)[pos];
+	} while (++pos < ihl);
+	return csum_fold(csum);
+}
+#else
+
+/*
+ * Quickly compute an IP checksum with the assumption that IPv4 headers will
+ * always be in multiples of 32-bits, and have an ihl of at least 5.
+ * @ihl is the number of 32 bit segments and must be greater than or equal to 5.
+ * @iph is also assumed to be word aligned.
+ */
+static inline __sum16 ip_fast_csum(const void *iph, unsigned int ihl)
+{
+	unsigned long beginning;
+	unsigned long csum = 0;
+	unsigned int pos = 4;
+
+	beginning = ((const unsigned long *)iph)[0];
+	beginning += ((const unsigned long *)iph)[1];
+	beginning += beginning < ((const unsigned long *)iph)[1];
+
+	do {
+		csum += ((const unsigned int *)iph)[pos];
+	} while (++pos < ihl);
+	csum += beginning;
+	csum += csum < beginning;
+	csum += (csum >> 32) | (csum << 32); /* Fold to 32 bits with end-around carry */
+	return csum_fold((__force __wsum)(csum >> 32));
+}
+#endif
+#define ip_fast_csum ip_fast_csum
+
+#ifdef CONFIG_64BIT
+extern unsigned int do_csum(const unsigned char *buff, int len);
+#define do_csum do_csum
+#endif
+
+#include
+
+#endif

From patchwork Sun Aug 27 01:26:07 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Charlie Jenkins
X-Patchwork-Id: 13366810
From: Charlie Jenkins
Date: Sat, 26 Aug 2023 18:26:07 -0700
Subject: [PATCH 2/5] riscv: Add checksum library
Message-Id: <20230826-optimize_checksum-v1-2-937501b4522a@rivosinc.com>
References: <20230826-optimize_checksum-v1-0-937501b4522a@rivosinc.com>
In-Reply-To: <20230826-optimize_checksum-v1-0-937501b4522a@rivosinc.com>
To: linux-riscv@lists.infradead.org, linux-kernel@vger.kernel.org
Cc: Paul Walmsley, Palmer Dabbelt, Albert Ou, Charlie Jenkins
X-Mailer: b4 0.12.3

Provide 32-bit and 64-bit versions of do_csum. The 32-bit build loads
from the buffer in groups of 32 bits, and the 64-bit build loads in
groups of 64 bits.
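
For anyone who wants to check the arithmetic outside the kernel, here is
a minimal userspace sketch of the two idioms do_csum relies on: the add
with end-around carry (csum += data; csum += csum < data;) and the
rotate-and-add fold from 64 to 32 to 16 bits. The helper names
(csum_add64, csum_fold64, example_csum) are made up for illustration and
are not part of this series. The sketch assumes a buffer whose length is
a multiple of 8 and leaves out the misaligned head and tail handling.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: ones'-complement add of two 64-bit words. */
static uint64_t csum_add64(uint64_t sum, uint64_t data)
{
	sum += data;
	sum += sum < data;	/* the add wrapped, so feed the carry back in */
	return sum;
}

/* Hypothetical helper: fold 64 -> 32 -> 16 bits by rotate-and-add. */
static uint16_t csum_fold64(uint64_t sum)
{
	uint32_t s32;

	/* Upper half now holds hi + lo plus the end-around carry. */
	sum += (sum >> 32) | (sum << 32);
	s32 = (uint32_t)(sum >> 32);

	/* Same trick again on the 32-bit value. */
	s32 += (s32 >> 16) | (s32 << 16);
	return (uint16_t)(s32 >> 16);
}

/*
 * Hypothetical reference loop: sum a buffer 64 bits at a time.
 * Assumes len is a multiple of 8; any trailing bytes are ignored.
 * Returns the folded, non-inverted 16-bit sum.
 */
static uint16_t example_csum(const unsigned char *buf, size_t len)
{
	uint64_t sum = 0, word;
	size_t i;

	for (i = 0; i + 8 <= len; i += 8) {
		memcpy(&word, buf + i, sizeof(word));	/* alignment-safe load */
		sum = csum_add64(sum, word);
	}
	return csum_fold64(sum);
}
```

As with do_csum, the result is not inverted; csum_fold() applies the
final complement in the kernel.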

Signed-off-by: Charlie Jenkins
---
 arch/riscv/include/asm/checksum.h |   2 -
 arch/riscv/lib/Makefile           |   1 +
 arch/riscv/lib/csum.c             | 118 ++++++++++++++++++++++++++++++++++++++
 3 files changed, 119 insertions(+), 2 deletions(-)

diff --git a/arch/riscv/include/asm/checksum.h b/arch/riscv/include/asm/checksum.h
index cd98f8cde888..af49b3409576 100644
--- a/arch/riscv/include/asm/checksum.h
+++ b/arch/riscv/include/asm/checksum.h
@@ -76,10 +76,8 @@ static inline __sum16 ip_fast_csum(const void *iph, unsigned int ihl)
 #endif
 #define ip_fast_csum ip_fast_csum
 
-#ifdef CONFIG_64BIT
 extern unsigned int do_csum(const unsigned char *buff, int len);
 #define do_csum do_csum
-#endif
 
 #include
 
diff --git a/arch/riscv/lib/Makefile b/arch/riscv/lib/Makefile
index 26cb2502ecf8..2aa1a4ad361f 100644
--- a/arch/riscv/lib/Makefile
+++ b/arch/riscv/lib/Makefile
@@ -6,6 +6,7 @@ lib-y += memmove.o
 lib-y			+= strcmp.o
 lib-y			+= strlen.o
 lib-y			+= strncmp.o
+lib-y			+= csum.o
 lib-$(CONFIG_MMU)	+= uaccess.o
 lib-$(CONFIG_64BIT)	+= tishift.o
 lib-$(CONFIG_RISCV_ISA_ZICBOZ)	+= clear_page.o
diff --git a/arch/riscv/lib/csum.c b/arch/riscv/lib/csum.c
new file mode 100644
index 000000000000..2037041ce8a0
--- /dev/null
+++ b/arch/riscv/lib/csum.c
@@ -0,0 +1,118 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * IP checksum library
+ *
+ * Influenced by arch/arm64/lib/csum.c
+ * Copyright (C) 2023 Rivos Inc.
+ */
+#include
+#include
+#include
+#include
+
+#include
+
+/* Default version is sufficient for 32 bit */
+#ifdef CONFIG_64BIT
+__sum16 csum_ipv6_magic(const struct in6_addr *saddr,
+			const struct in6_addr *daddr,
+			__u32 len, __u8 proto, __wsum csum)
+{
+	unsigned long sum, ulen, uproto;
+
+	uproto = (unsigned long)htonl(proto);
+	ulen = (unsigned long)htonl(len);
+	sum = (unsigned long)csum;
+
+	sum += *(const unsigned long *)saddr->s6_addr;
+	sum += sum < csum;
+
+	sum += *((const unsigned long *)saddr->s6_addr + 1);
+	sum += sum < *((const unsigned long *)saddr->s6_addr + 1);
+
+	sum += *(const unsigned long *)daddr->s6_addr;
+	sum += sum < *(const unsigned long *)daddr->s6_addr;
+
+	sum += *((const unsigned long *)daddr->s6_addr + 1);
+	sum += sum < *((const unsigned long *)daddr->s6_addr + 1);
+
+	sum += ulen;
+	sum += sum < ulen;
+
+	sum += uproto;
+	sum += sum < uproto;
+
+	sum += (sum >> 32) | (sum << 32);
+	sum >>= 32;
+	return csum_fold((__force __wsum)sum);
+}
+EXPORT_SYMBOL(csum_ipv6_magic);
+#endif
+
+#ifdef CONFIG_32BIT
+typedef unsigned int csum_t;
+#define OFFSET_MASK 3
+#else
+typedef unsigned long csum_t;
+#define OFFSET_MASK 7
+#endif
+
+/*
+ * Perform a checksum on an arbitrary memory address.
+ * Algorithm accounts for buff being misaligned.
+ * If not aligned on a word boundary, will read the whole word but not use
+ * the bytes that it shouldn't. The same thing will occur on the tail-end of the
+ * read.
+ */
+unsigned int __no_sanitize_address do_csum(const unsigned char *buff, int len)
+{
+	unsigned int offset, shift;
+	csum_t csum = 0, data;	/* csum must start at zero; it is only ever added to */
+	const csum_t *ptr;
+
+	if (unlikely(len <= 0))
+		return 0;
+	/*
+	 * To align the address, grab the whole first word in buff.
+	 * Since the read stays within one aligned word, it never crosses a page
+	 * or cache line.
+	 * Directly call KASAN with the alignment we will be using.
+	 */
+	offset = (csum_t)buff & OFFSET_MASK;
+	kasan_check_read(buff, len);
+	ptr = (const csum_t *)(buff - offset);
+	len = len + offset - sizeof(csum_t);
+
+	/*
+	 * RISC-V is always little endian, so clear the low-order bytes that
+	 * precede buff.
+	 */
+	shift = offset * 8;
+	data = *ptr;
+	data = (data >> shift) << shift;
+
+	while (len > 0) {
+		csum += data;
+		csum += csum < data;
+		len -= sizeof(csum_t);
+		ptr += 1;
+		data = *ptr;
+	}
+
+	/*
+	 * Mask off the over-read bytes on the tail if the buffer does not end
+	 * on a word boundary.
+	 */
+	shift = len * -8;
+	data = (data << shift) >> shift;
+	csum += data;
+	csum += csum < data;
+
+#ifdef CONFIG_64BIT
+	csum += (csum >> 32) | (csum << 32);
+	csum >>= 32;
+#endif
+	csum = (unsigned int)csum + (((unsigned int)csum >> 16) | ((unsigned int)csum << 16));
+	if (offset & 1)
+		return (unsigned short)swab32(csum);
+	return csum >> 16;
+}

From patchwork Sun Aug 27 01:26:08 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Charlie Jenkins
X-Patchwork-Id: 13366814
From: Charlie Jenkins
Date: Sat, 26 Aug 2023 18:26:08 -0700
Subject: [PATCH 3/5] riscv: Vector checksum header
Message-Id: <20230826-optimize_checksum-v1-3-937501b4522a@rivosinc.com>
References: <20230826-optimize_checksum-v1-0-937501b4522a@rivosinc.com>
In-Reply-To: <20230826-optimize_checksum-v1-0-937501b4522a@rivosinc.com>
To: linux-riscv@lists.infradead.org, linux-kernel@vger.kernel.org
Cc: Paul Walmsley, Palmer Dabbelt, Albert Ou, Charlie Jenkins
X-Mailer: b4 0.12.3

This patch is not ready for merge as vector support in the kernel is limited.
However, the code has been tested in QEMU, so the algorithms do work. It
is written in assembly rather than using the GCC vector intrinsics
because the intrinsics did not produce optimal code.

Signed-off-by: Charlie Jenkins
---
 arch/riscv/include/asm/checksum.h | 81 +++++++++++++++++++++++++++++++++++++++
 1 file changed, 81 insertions(+)

diff --git a/arch/riscv/include/asm/checksum.h b/arch/riscv/include/asm/checksum.h
index af49b3409576..7e31c0ad6346 100644
--- a/arch/riscv/include/asm/checksum.h
+++ b/arch/riscv/include/asm/checksum.h
@@ -10,6 +10,10 @@
 #include
 #include
 
+#ifdef CONFIG_RISCV_ISA_V
+#include
+#endif
+
 /* Default version is sufficient for 32 bit */
 #ifdef CONFIG_64BIT
 #define _HAVE_ARCH_IPV6_CSUM
@@ -36,6 +40,46 @@ static inline __sum16 csum_fold(__wsum sum)
  * without the bitmanip extensions zba/zbb.
  */
 #ifdef CONFIG_32BIT
+#ifdef CONFIG_RISCV_ISA_V
+static inline __sum16 ip_fast_csum(const void *iph, unsigned int ihl)
+{
+	vuint64m1_t prev_buffer;
+	vuint32m1_t curr_buffer;
+	unsigned int vl;
+	unsigned int high_result;
+	unsigned int low_result;
+
+	asm("vsetivli x0, 1, e64, ta, ma \n\t\
+	vmv.v.i %[prev_buffer], 0 \n\t\
+	1: \n\t\
+	vsetvli %[vl], %[ihl], e32, m1, ta, ma \n\t\
+	vle32.v %[curr_buffer], (%[iph]) \n\t\
+	vwredsumu.vs %[prev_buffer], %[curr_buffer], %[prev_buffer] \n\t\
+	sub %[ihl], %[ihl], %[vl] \n\t"
+#ifdef CONFIG_RISCV_ISA_ZBA
+	"sh2add %[iph], %[vl], %[iph] \n\t"
+#else
+	"slli %[vl], %[vl], 2 \n\
+	add %[iph], %[vl], %[iph] \n\t"
+#endif
+	"bnez %[ihl], 1b \n\
+	vsetivli x0, 1, e64, m1, ta, ma \n\
+	vmv.x.s %[low_result], %[prev_buffer] \n\
+	addi %[vl], x0, 32 \n\
+	vsrl.vx %[prev_buffer], %[prev_buffer], %[vl] \n\
+	vmv.x.s %[high_result], %[prev_buffer]"
+	: [vl] "=&r" (vl), [prev_buffer] "=&vd" (prev_buffer),
+		[curr_buffer] "=&vd" (curr_buffer),
+		[high_result] "=&r" (high_result),
+		[low_result] "=&r" (low_result)
+	: [iph] "r" (iph), [ihl] "r" (ihl));
+
+	high_result += low_result;
+	high_result += high_result < low_result;
+	return csum_fold((__force __wsum)(high_result));
+}
+
+#else
 static inline __sum16 ip_fast_csum(const void *iph, unsigned int ihl)
 {
 	__wsum csum = 0;
@@ -47,8 +91,44 @@ static inline __sum16 ip_fast_csum(const void *iph, unsigned int ihl)
 	} while (++pos < ihl);
 	return csum_fold(csum);
 }
+#endif
+#else
+
+#ifdef CONFIG_RISCV_ISA_V
+static inline __sum16 ip_fast_csum(const void *iph, unsigned int ihl)
+{
+	vuint64m1_t prev_buffer;
+	vuint32m1_t curr_buffer;
+	unsigned long vl;
+	unsigned long result;
+
+	asm("vsetivli x0, 1, e64, ta, ma \n\
+	vmv.v.i %[prev_buffer], 0 \n\
+	1: \n\
+	# Setup 32-bit sum of iph \n\
+	vsetvli %[vl], %[ihl], e32, m1, ta, ma \n\
+	vle32.v %[curr_buffer], (%[iph]) \n\
+	# Sum each 32-bit segment of iph that can fit into a vector reg \n\
+	vwredsumu.vs %[prev_buffer], %[curr_buffer], %[prev_buffer] \n\
+	subw %[ihl], %[ihl], %[vl] \n\t"
+#ifdef CONFIG_RISCV_ISA_ZBA
+	"sh2add %[iph], %[vl], %[iph] \n\t"
 #else
+	"slli %[vl], %[vl], 2 \n\
+	add %[iph], %[vl], %[iph] \n\t"
+#endif
+	"# If not all of iph could fit into vector reg, do another sum \n\
+	bnez %[ihl], 1b \n\
+	vsetvli x0, x0, e64, m1, ta, ma \n\
+	vmv.x.s %[result], %[prev_buffer]"
+	: [vl] "=&r" (vl), [prev_buffer] "=&vd" (prev_buffer),
+		[curr_buffer] "=&vd" (curr_buffer), [result] "=&r" (result)
+	: [iph] "r" (iph), [ihl] "r" (ihl));
 
+	result += (result >> 32) | (result << 32);
+	return csum_fold((__force __wsum)(result >> 32));
+}
+#else
 /*
  * Quickly compute an IP checksum with the assumption that IPv4 headers will
  * always be in multiples of 32-bits, and have an ihl of at least 5.
@@ -74,6 +154,7 @@ static inline __sum16 ip_fast_csum(const void *iph, unsigned int ihl)
 	return csum_fold((__force __wsum)(csum >> 32));
 }
 #endif
+#endif
 #define ip_fast_csum ip_fast_csum
 
 extern unsigned int do_csum(const unsigned char *buff, int len);

From patchwork Sun Aug 27 01:26:09 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Charlie Jenkins
X-Patchwork-Id: 13366811
From: Charlie Jenkins
Date: Sat, 26 Aug 2023 18:26:09 -0700
Subject: [PATCH 4/5] riscv: Vector checksum library
Message-Id: <20230826-optimize_checksum-v1-4-937501b4522a@rivosinc.com>
References: <20230826-optimize_checksum-v1-0-937501b4522a@rivosinc.com>
In-Reply-To: <20230826-optimize_checksum-v1-0-937501b4522a@rivosinc.com>
To: linux-riscv@lists.infradead.org, linux-kernel@vger.kernel.org
Cc: Paul Walmsley, Palmer Dabbelt, Albert Ou, Charlie Jenkins
X-Mailer: b4 0.12.3

This patch is not ready for merge as vector support in the kernel is
limited. However, the code has been tested in QEMU, so the algorithms do
work. When vector support is more mature, I will do more thorough
testing of this code. It is written in assembly rather than using the
GCC vector intrinsics because the intrinsics did not produce optimal
code.
Signed-off-by: Charlie Jenkins
---
 arch/riscv/lib/csum.c | 165 ++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 165 insertions(+)

diff --git a/arch/riscv/lib/csum.c b/arch/riscv/lib/csum.c
index 2037041ce8a0..049a10596008 100644
--- a/arch/riscv/lib/csum.c
+++ b/arch/riscv/lib/csum.c
@@ -12,6 +12,10 @@
 
 #include
 
+#ifdef CONFIG_RISCV_ISA_V
+#include
+#endif
+
 /* Default version is sufficient for 32 bit */
 #ifdef CONFIG_64BIT
 __sum16 csum_ipv6_magic(const struct in6_addr *saddr,
@@ -64,6 +68,166 @@ typedef unsigned long csum_t;
  * the bytes that it shouldn't. The same thing will occur on the tail-end of the
  * read.
  */
+#ifdef CONFIG_RISCV_ISA_V
+#ifdef CONFIG_32BIT
+unsigned int __no_sanitize_address do_csum(const unsigned char *buff, int len)
+{
+	vuint64m1_t prev_buffer;
+	vuint32m1_t curr_buffer;
+	unsigned int shift;
+	unsigned int vl, high_result, low_result, result, csum, offset;
+	unsigned int tail_seg;
+	const unsigned int *ptr;
+
+	if (len <= 0)
+		return 0;
+
+	/*
+	 * To align the address, grab the whole first byte in buff.
+	 * Directly call KASAN with the alignment we will be using.
+	 */
+	offset = (unsigned int)buff & OFFSET_MASK;
+	kasan_check_read(buff, len);
+	ptr = (const unsigned int *)(buff - offset);
+	len += offset;
+
+	// Read the tail segment
+	tail_seg = len % 4;
+	csum = 0;
+	if (tail_seg) {
+		shift = (4 - tail_seg) * 8;
+		csum = *(unsigned int *)((const unsigned char *)ptr + len - tail_seg);
+		csum = ((unsigned int)csum << shift) >> shift;
+		len -= tail_seg;
+	}
+
+	unsigned long start_mask = (unsigned int)(~(~0U << offset));
+
+	asm("vsetvli %[vl], %[len], e8, m1, ta, ma	\n\
+	# clear out mask and vector registers since we switch up sizes	\n\
+	vmclr.m v0	\n\
+	vmclr.m %[prev_buffer]	\n\
+	vmclr.m %[curr_buffer]	\n\
+	# Mask out the leading bits of a misaligned address	\n\
+	vsetivli x0, 1, e64, m1, ta, ma	\n\
+	vmv.s.x %[prev_buffer], %[csum]	\n\
+	vmv.s.x v0, %[start_mask]	\n\
+	vsetvli %[vl], %[len], e8, m1, ta, ma	\n\
+	vmnot.m v0, v0	\n\
+	vle8.v %[curr_buffer], (%[buff]), v0.t	\n\
+	j 2f	\n\
+	# Iterate through the buff and sum all words	\n\
+	1:	\n\
+	vsetvli %[vl], %[len], e8, m1, ta, ma	\n\
+	vle8.v %[curr_buffer], (%[buff])	\n\
+	2:	\n\
+	vsetvli x0, x0, e32, m1, ta, ma	\n\
+	vwredsumu.vs %[prev_buffer], %[curr_buffer], %[prev_buffer]	\n\
+	sub %[len], %[len], %[vl]	\n\t"
+#ifdef CONFIG_RISCV_ISA_ZBA
+	"sh2add %[buff], %[vl], %[buff]	\n\t"
+#else
+	"slli %[vl], %[vl], 2	\n\
+	add %[buff], %[vl], %[buff]	\n\t"
+#endif
+	"bnez %[len], 1b	\n\
+	vsetvli x0, x0, e64, m1, ta, ma	\n\
+	vmv.x.s %[low_result], %[prev_buffer]	\n\
+	addi %[vl], x0, 32	\n\
+	vsrl.vx %[prev_buffer], %[prev_buffer], %[vl]	\n\
+	vmv.x.s %[high_result], %[prev_buffer]"
+	: [vl] "=&r" (vl), [prev_buffer] "=&vd" (prev_buffer),
+	  [curr_buffer] "=&vd" (curr_buffer),
+	  [high_result] "=&r" (high_result),
+	  [low_result] "=&r" (low_result)
+	: [buff] "r" (ptr), [len] "r" (len), [start_mask] "r" (start_mask),
+	  [csum] "r" (csum));
+
+	high_result += low_result;
+	high_result += high_result < low_result;
+	result = high_result + ((high_result >> 16) | (high_result << 16));
+	if (offset & 1)
+		return (unsigned short)swab32(result);
+	return result >> 16;
+}
+#else
+unsigned int __no_sanitize_address do_csum(const unsigned char *buff, int len)
+{
+	vuint64m1_t prev_buffer;
+	vuint32m1_t curr_buffer;
+	unsigned int shift;
+	unsigned long vl, result, csum, offset;
+	unsigned int tail_seg;
+	const unsigned long *ptr;
+
+	if (len <= 0)
+		return 0;
+
+	/*
+	 * To align the address, grab the whole first byte in buff.
+	 * Directly call KASAN with the alignment we will be using.
+	 */
+	offset = (unsigned long)buff & 7;
+	kasan_check_read(buff, len);
+	ptr = (const unsigned long *)(buff - offset);
+	len += offset;
+
+	// Read the tail segment
+	tail_seg = len % 4;
+	csum = 0;
+	if (tail_seg) {
+		shift = (4 - tail_seg) * 8;
+		csum = *(unsigned int *)((const unsigned char *)ptr + len - tail_seg);
+		csum = ((unsigned int)csum << shift) >> shift;
+		len -= tail_seg;
+	}
+
+	unsigned long start_mask = (unsigned int)(~(~0U << offset));
+
+	asm("vsetvli %[vl], %[len], e8, m1, ta, ma	\n\
+	# clear out mask and vector registers since we switch up sizes	\n\
+	vmclr.m v0	\n\
+	vmclr.m %[prev_buffer]	\n\
+	vmclr.m %[curr_buffer]	\n\
+	# Mask out the leading bits of a misaligned address	\n\
+	vsetivli x0, 1, e64, m1, ta, ma	\n\
+	vmv.s.x %[prev_buffer], %[csum]	\n\
+	vmv.s.x v0, %[start_mask]	\n\
+	vsetvli %[vl], %[len], e8, m1, ta, ma	\n\
+	vmnot.m v0, v0	\n\
+	vle8.v %[curr_buffer], (%[buff]), v0.t	\n\
+	j 2f	\n\
+	# Iterate through the buff and sum all words	\n\
+	1:	\n\
+	vsetvli %[vl], %[len], e8, m1, ta, ma	\n\
+	vle8.v %[curr_buffer], (%[buff])	\n\
+	2:	\n\
+	vsetvli x0, x0, e32, m1, ta, ma	\n\
+	vwredsumu.vs %[prev_buffer], %[curr_buffer], %[prev_buffer]	\n\
+	subw %[len], %[len], %[vl]	\n\t"
+#ifdef CONFIG_RISCV_ISA_ZBA
+	"sh2add %[buff], %[vl], %[buff]	\n\t"
+#else
+	"slli %[vl], %[vl], 2	\n\
+	add %[buff], %[vl], %[buff]	\n\t"
+#endif
+	"bnez %[len], 1b	\n\
+	vsetvli x0, x0, e64, m1, ta, ma	\n\
+	vmv.x.s %[result],
%[prev_buffer]" + : [vl] "=&r" (vl), [prev_buffer] "=&vd" (prev_buffer), + [curr_buffer] "=&vd" (curr_buffer), [result] "=&r" (result) + : [buff] "r" (ptr), [len] "r" (len), [start_mask] "r" (start_mask), + [csum] "r" (csum)); + + result += (result >> 32) | (result << 32); + result >>= 32; + result = (unsigned int)result + (((unsigned int)result >> 16) | ((unsigned int)result << 16)); + if (offset & 1) + return (unsigned short)swab32(result); + return result >> 16; +} +#endif +#else unsigned int __no_sanitize_address do_csum(const unsigned char *buff, int len) { unsigned int offset, shift; @@ -116,3 +280,4 @@ unsigned int __no_sanitize_address do_csum(const unsigned char *buff, int len) return (unsigned short)swab32(csum); return csum >> 16; } +#endif From patchwork Sun Aug 27 01:26:10 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Charlie Jenkins X-Patchwork-Id: 13366812 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id C8DE3C83F11 for ; Sun, 27 Aug 2023 01:27:03 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20210309; h=Sender: Content-Transfer-Encoding:Content-Type:List-Subscribe:List-Help:List-Post: List-Archive:List-Unsubscribe:List-Id:Cc:To:In-Reply-To:References:Message-Id :MIME-Version:Subject:Date:From:Reply-To:Content-ID:Content-Description: Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID: List-Owner; bh=iLJZPM1rbQBMJuZwlnVI7SxoA6Kzc2r3oc3NWxyDUQw=; b=lb4UK6J8mVA6jJ yUhNQIa4qowvk2JPeLOzvcMWOICfshFkLeB84CpHY2q+iiCv4DFtGPCrb32/mMAqqVWP2DRDSjoZY 
9w2pKwHURWYkU8PWerAqVzHFpoOKG98NyhSH+670oUXk4lXd6ofQre6gSWUPZGOjJfjWVphxpjp9G jD8oF9lid5h+UDMUHQIl7vrhBZldQpkcM4oGnlV8m/0yYVSIXSWOLWODyrEjqQo1RnBQB81jimjQ0 DJQq+zLx0v37lsBeWmxvbNkwXnzLKylfVQviA9VxSMn5/UwZ/jroS/nPjuIrSTsMVrGbTPWvtQObE RZwvImSq5pvPcNgfLGtA==; Received: from localhost ([::1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.96 #2 (Red Hat Linux)) id 1qa4YN-007McQ-01; Sun, 27 Aug 2023 01:26:59 +0000 Received: from mail-oo1-xc34.google.com ([2607:f8b0:4864:20::c34]) by bombadil.infradead.org with esmtps (Exim 4.96 #2 (Red Hat Linux)) id 1qa4YH-007Maj-2t for linux-riscv@lists.infradead.org; Sun, 27 Aug 2023 01:26:55 +0000 Received: by mail-oo1-xc34.google.com with SMTP id 006d021491bc7-573675e6b43so506633eaf.0 for ; Sat, 26 Aug 2023 18:26:53 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=rivosinc-com.20221208.gappssmtp.com; s=20221208; t=1693099612; x=1693704412; h=cc:to:in-reply-to:references:message-id:content-transfer-encoding :mime-version:subject:date:from:from:to:cc:subject:date:message-id :reply-to; bh=51CzmLExobdNojQV/8xztk6TetLzPUmBlLjWC0Th8zU=; b=m7qZdRv407cpgoxyHhwtM859LwQH/Zj8rhFGzC87zGoRGDL45ix/rP1aZuL80cBWFm DUqvuwdRK0wTD5TJr3ZfPhOnWAh3BCiy5VGCdvcChj/wCUedKz5HdvRU27lBiuAIELJ0 ZMq6o52tP0c7qLP0g/Sm+nTo7Ckrbjfq2u9VICWZmLbhxAYLUdeVbcqtgw/75jNalkG9 dIIPK+aj+lhf0l00Dlc/Pe3RXZT+zrxBBvhrj8EVnS7Q0BUX16rDy5d2s+vW733lRQ6+ tK2PDtGdRL5EbT50ZkEcq4fqEaQDmeQdJ32G3r78HfqCB+JVWBwVsYN7iatWqV2bP+gm TfVg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1693099612; x=1693704412; h=cc:to:in-reply-to:references:message-id:content-transfer-encoding :mime-version:subject:date:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=51CzmLExobdNojQV/8xztk6TetLzPUmBlLjWC0Th8zU=; b=i5RhiaBIMkCgQs1VTaWI02OHbPzs751CrOOm5BUn3qCVmFoxK00Asy/GNspOjrHuI0 FM6mTA7gceyEapG2DmAB3VyRHX1E6Q1MjkZ4DsOOv10y/v6QFeMdnOob5k1grSTbTy14 
Awx2C8t96ojkAwbEf5eacAIo2lRSevbGFFGloBX20iMIK3vuxXJrvs7EziLNQc6XhyFx q3bCjLijsmplFSe5tIIEiPNJzcO0TguMLG5PbtbCqBADgsO++3158BazkO0qi7rd1vW8 W9lzrt20Ta45U2tgor8EmKg8R7W+wZ7x5gjQ0jpXelWtdRfgF6sXplJNhcxvHNjDFZUH K79g== X-Gm-Message-State: AOJu0YzPalk09ZkG2GKoj9r3Gw+JcHCZeKJ35oLLjas6PhuPQ7xpziZm nTVwAMcJq/kA1U+JH2wP+7IpjQ== X-Google-Smtp-Source: AGHT+IFvk3wdXz8svKDCr5C9cRFRuEBUhu9SfN/ssh0HFACRuc4nx5PCahf2QTTUSsQ8BSkBngKdng== X-Received: by 2002:a05:6358:248b:b0:132:d42f:8e19 with SMTP id m11-20020a056358248b00b00132d42f8e19mr23623473rwc.31.1693099612527; Sat, 26 Aug 2023 18:26:52 -0700 (PDT) Received: from charlie.ba.rivosinc.com ([66.220.2.162]) by smtp.gmail.com with ESMTPSA id jf6-20020a170903268600b001b869410ed2sm4357404plb.72.2023.08.26.18.26.51 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sat, 26 Aug 2023 18:26:51 -0700 (PDT) From: Charlie Jenkins Date: Sat, 26 Aug 2023 18:26:10 -0700 Subject: [PATCH 5/5] riscv: Test checksum functions MIME-Version: 1.0 Message-Id: <20230826-optimize_checksum-v1-5-937501b4522a@rivosinc.com> References: <20230826-optimize_checksum-v1-0-937501b4522a@rivosinc.com> In-Reply-To: <20230826-optimize_checksum-v1-0-937501b4522a@rivosinc.com> To: linux-riscv@lists.infradead.org, linux-kernel@vger.kernel.org Cc: Paul Walmsley , Palmer Dabbelt , Albert Ou , Charlie Jenkins X-Mailer: b4 0.12.3 X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20230826_182653_932138_BAEAB46C X-CRM114-Status: GOOD ( 15.73 ) X-BeenThere: linux-riscv@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: "linux-riscv" Errors-To: linux-riscv-bounces+linux-riscv=archiver.kernel.org@lists.infradead.org Add Kconfig support for riscv specific testing modules. This was created to supplement lib/checksum_kunit.c, and add tests for ip_fast_csum and csum_ipv6_magic. 
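One way to sanity-check the expected values used by these tests outside the kernel is a small user-space reference. The sketch below is not part of the patch: ref_csum_fold and ref_ip_compute_csum are hypothetical stand-ins that assume the generic ones'-complement (Internet checksum) definitions and little-endian byte order, as on RISC-V.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for csum_fold(): fold 32 bits to 16, then invert. */
static uint16_t ref_csum_fold(uint32_t sum)
{
	sum = (sum & 0xffff) + (sum >> 16);
	sum = (sum & 0xffff) + (sum >> 16);	/* absorb the carry of the first fold */
	return (uint16_t)~sum;
}

/* Hypothetical stand-in for ip_compute_csum() on a little-endian machine. */
static uint16_t ref_ip_compute_csum(const unsigned char *buf, size_t len)
{
	uint32_t sum = 0;
	size_t i;

	/* Assumption of this sketch: short buffers, so 32 bits cannot overflow. */
	assert(len < 65536);

	/* Sum 16-bit little-endian words. */
	for (i = 0; i + 1 < len; i += 2)
		sum += (uint32_t)buf[i] | ((uint32_t)buf[i + 1] << 8);

	/* A trailing odd byte counts as the low byte of a final word. */
	if (len & 1)
		sum += buf[len - 1];

	return ref_csum_fold(sum);
}
```

With these definitions, ref_csum_fold(1226127848) gives 0x7d02 and the four-byte buffer { 0xd3, 0x43, 0xad, 0x46 } gives 0x757f, matching the values expected by test_csum_fold and test_do_csum in the test file this patch adds.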
Signed-off-by: Charlie Jenkins
---
 arch/riscv/Kconfig.debug              |   1 +
 arch/riscv/lib/Kconfig.debug          |  31 ++++++++++
 arch/riscv/lib/Makefile               |   2 +
 arch/riscv/lib/riscv_checksum_kunit.c | 111 ++++++++++++++++++++++++++++++++++
 4 files changed, 145 insertions(+)

diff --git a/arch/riscv/Kconfig.debug b/arch/riscv/Kconfig.debug
index e69de29bb2d1..53a84ec4f91f 100644
--- a/arch/riscv/Kconfig.debug
+++ b/arch/riscv/Kconfig.debug
@@ -0,0 +1 @@
+source "arch/riscv/lib/Kconfig.debug"
diff --git a/arch/riscv/lib/Kconfig.debug b/arch/riscv/lib/Kconfig.debug
new file mode 100644
index 000000000000..15fc83b68340
--- /dev/null
+++ b/arch/riscv/lib/Kconfig.debug
@@ -0,0 +1,31 @@
+# SPDX-License-Identifier: GPL-2.0-only
+menu "riscv Testing and Coverage"
+
+menuconfig RUNTIME_TESTING_MENU
+	bool "Runtime Testing"
+	def_bool y
+	help
+	  Enable riscv runtime testing.
+
+if RUNTIME_TESTING_MENU
+
+config RISCV_CHECKSUM_KUNIT
+	tristate "KUnit test riscv checksum functions at runtime" if !KUNIT_ALL_TESTS
+	depends on KUNIT
+	default KUNIT_ALL_TESTS
+	help
+	  Enable this option to test the checksum functions at boot.
+
+	  KUnit tests run during boot and output the results to the debug log
+	  in TAP format (http://testanything.org/). Only useful for kernel devs
+	  running the KUnit test harness, and not intended for inclusion into a
+	  production build.
+
+	  For more information on KUnit and unit tests in general please refer
+	  to the KUnit documentation in Documentation/dev-tools/kunit/.
+
+	  If unsure, say N.
+
+endif # RUNTIME_TESTING_MENU
+
+endmenu # "riscv Testing and Coverage"
diff --git a/arch/riscv/lib/Makefile b/arch/riscv/lib/Makefile
index 2aa1a4ad361f..1535a8c81430 100644
--- a/arch/riscv/lib/Makefile
+++ b/arch/riscv/lib/Makefile
@@ -12,3 +12,5 @@ lib-$(CONFIG_64BIT)	+= tishift.o
 lib-$(CONFIG_RISCV_ISA_ZICBOZ)	+= clear_page.o
 
 obj-$(CONFIG_FUNCTION_ERROR_INJECTION) += error-inject.o
+
+obj-$(CONFIG_RISCV_CHECKSUM_KUNIT) += riscv_checksum_kunit.o
diff --git a/arch/riscv/lib/riscv_checksum_kunit.c b/arch/riscv/lib/riscv_checksum_kunit.c
new file mode 100644
index 000000000000..05b4710c907f
--- /dev/null
+++ b/arch/riscv/lib/riscv_checksum_kunit.c
@@ -0,0 +1,111 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Test cases for checksum
+ */
+
+#include
+
+#include
+#include
+#include
+
+#define CHECK_EQ(lhs, rhs) KUNIT_ASSERT_EQ(test, lhs, rhs)
+
+static void test_csum_fold(struct kunit *test)
+{
+	unsigned int one = 1226127848;
+	unsigned int two = 446627905;
+	unsigned int three = 3644783064;
+	unsigned int four = 361842745;
+	unsigned int five = 4281073503;
+	unsigned int max = -1;
+
+	CHECK_EQ(0x7d02, csum_fold(one));
+	CHECK_EQ(0xe51f, csum_fold(two));
+	CHECK_EQ(0x2ce8, csum_fold(three));
+	CHECK_EQ(0xa235, csum_fold(four));
+	CHECK_EQ(0x174, csum_fold(five));
+	CHECK_EQ(0x0, csum_fold(max));
+}
+
+static void test_ip_fast_csum(struct kunit *test)
+{
+	unsigned char average[] = { 0x1c, 0x00, 0x00, 0x45, 0x00, 0x00, 0x68,
+				    0x74, 0x00, 0x00, 0x11, 0x80, 0x01, 0x64,
+				    0xa8, 0xc0, 0xe9, 0x9c, 0x46, 0xab };
+	unsigned char larger[] = { 0xa3, 0xde, 0x43, 0x41, 0x11, 0x19,
+				   0x2f, 0x73, 0x00, 0x00, 0xf1, 0xc5,
+				   0x31, 0xbb, 0xaa, 0xc1, 0x23, 0x5f,
+				   0x32, 0xde, 0x65, 0x39, 0xfe, 0xbc };
+	unsigned char overflow[] = { 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+				     0xff, 0x01, 0x00, 0x00, 0x00, 0x00, 0x00,
+				     0x00, 0x00, 0x00, 0x00, 0x00, 0x00 };
+	unsigned char max[] = { 0xff, 0xff, 0xff, 0xff, 0xff, 0xfd, 0xff, 0xff,
+				0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+				0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+				0xff, 0xff, 0xff, 0xff, 0xff, 0xfd, 0xff, 0xff,
+				0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+				0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+				0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff };
+
+	CHECK_EQ(0x598f, ip_fast_csum(average, 5));
+	CHECK_EQ(0xdd4f, ip_fast_csum(larger, 6));
+	CHECK_EQ(0xfffe, ip_fast_csum(overflow, 5));
+	CHECK_EQ(0x400, ip_fast_csum(max, 14));
+}
+
+static void test_csum_ipv6_magic(struct kunit *test)
+{
+	struct in6_addr saddr = {
+		.s6_addr = { 0xf8, 0x43, 0x43, 0xf0, 0xdc, 0xa0, 0x39, 0x92,
+			     0x43, 0x67, 0x12, 0x03, 0xe3, 0x32, 0xfe, 0xed }};
+	struct in6_addr daddr = {
+		.s6_addr = { 0xa8, 0x23, 0x46, 0xdc, 0xc8, 0x2d, 0xaa, 0xe3,
+			     0xdc, 0x66, 0x72, 0x43, 0xe2, 0x12, 0xee, 0xfd }};
+	u32 len = 1 << 10;
+	u8 proto = 17;
+	__wsum csum = 53;
+
+	CHECK_EQ(0x2fbb, csum_ipv6_magic(&saddr, &daddr, len, proto, csum));
+}
+
+static void test_do_csum(struct kunit *test)
+{
+	unsigned char very_small[] = {0x32};
+	unsigned char small[] = {0xd3, 0x43, 0xad, 0x46};
+	unsigned char medium[] = {
+		0xa0, 0x13, 0xaa, 0xa6, 0x53, 0xac, 0xa3, 0x43
+	};
+	unsigned char *misaligned = medium + 1;
+	unsigned char large[] = {
+		0xa0, 0x13, 0xaa, 0xa6, 0x53, 0xac, 0xa3, 0x43,
+		0xa0, 0x13, 0xaa, 0xa6, 0x53, 0xac, 0xa3, 0x43,
+		0xa0, 0x13, 0xaa, 0xa6, 0x53, 0xac, 0xa3, 0x43,
+		0xa0, 0x13, 0xaa, 0xa6, 0x53, 0xac, 0xa3, 0x43
+	};
+	unsigned char *large_misaligned = large + 3;
+
+	CHECK_EQ(0xffcd, ip_compute_csum(very_small, 1));
+	CHECK_EQ(0x757f, ip_compute_csum(small, 4));
+	CHECK_EQ(0x5e56, ip_compute_csum(misaligned, 7));
+	CHECK_EQ(0x469d, ip_compute_csum(large, 29));
+	CHECK_EQ(0x43ae, ip_compute_csum(large_misaligned, 28));
+}
+
+static struct kunit_case __refdata riscv_checksum_test_cases[] = {
+	KUNIT_CASE(test_csum_fold),
+	KUNIT_CASE(test_ip_fast_csum),
+	KUNIT_CASE(test_csum_ipv6_magic),
+	KUNIT_CASE(test_do_csum),
+	{}
+};
+
+static struct kunit_suite riscv_checksum_test_suite = {
+	.name = "riscv_checksum",
+	.test_cases = riscv_checksum_test_cases,
+};
+
+kunit_test_suites(&riscv_checksum_test_suite);
+
+MODULE_AUTHOR("Charlie Jenkins ");
+MODULE_LICENSE("GPL");