From patchwork Fri Oct 18 07:53:48 2024
X-Patchwork-Submitter: Ard Biesheuvel
X-Patchwork-Id: 13841339
Date: Fri, 18 Oct 2024 09:53:48 +0200
Message-ID: <20241018075347.2821102-5-ardb+git@google.com>
Subject: [PATCH v4 0/3] arm64: Speed up CRC-32 using PMULL instructions
From: Ard Biesheuvel
To: linux-arm-kernel@lists.infradead.org
Cc: linux-kernel@vger.kernel.org, linux-crypto@vger.kernel.org,
 herbert@gondor.apana.org.au, will@kernel.org, catalin.marinas@arm.com,
 Ard Biesheuvel, Eric Biggers, Kees Cook

From: Ard Biesheuvel

The CRC-32 code is library code, and is not part of the crypto
subsystem. This means that callers may not generally be aware of the
kind of implementation that backs it, and so we've refrained from using
FP/SIMD code in the past, as it disables preemption, and this may incur
scheduling latencies that the caller did not anticipate.

This was solved a while ago, and on arm64, kernel mode FP/SIMD no longer
disables preemption.

This means we can happily use PMULL instructions in the CRC-32 library
code, which permits an optimization to be implemented that results in a
speedup of 2 - 2.8x for inputs >1k in size (on Apple M2).

Patch #1 implements some prepwork to handle the scalar CRC-32
alternatives patching in C code.

Changes since v3:
- fix broken crc32be version
- add patch to tidy up existing code for reuse
- add 4-way code to existing .S file

Changes since v2:
- drop alternatives.h #include (#1)
- drop unneeded branch (#2)
- fix comment max -> min (#2)
- add Eric's Rb

Changes since v1:
- rename crc32-pmull.S to crc32-4way.S and avoid pmull in the function
  names to avoid confusion about the nature of the implementation;
- polish the asm a bit, and add some comments
- don't return via the scalar code if len dropped to 0 after calling the
  4-way code.

Cc: Eric Biggers
Cc: Kees Cook

Ard Biesheuvel (3):
  arm64/lib: Handle CRC-32 alternative in C code
  arm64/crc32: Reorganize bit/byte ordering macros
  arm64/crc32: Implement 4-way interleave using PMULL

 arch/arm64/lib/Makefile     |   2 +-
 arch/arm64/lib/crc32-glue.c |  82 +++++
 arch/arm64/lib/crc32.S      | 344 ++++++++++++++++----
 3 files changed, 356 insertions(+), 72 deletions(-)
 create mode 100644 arch/arm64/lib/crc32-glue.c
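
For readers unfamiliar with the idea behind the 4-way interleave named in
patch #3, here is a rough user-space sketch of the underlying arithmetic.
It is not the code from this series. The helper names crc32_le_ref(),
crc32_shift(), crc32_4way() and crc32_4way_selftest() are made up for
illustration, and the "shift" step is done by feeding zero bytes one at a
time, where a real implementation would presumably use a carryless
multiply (PMULL) by a precomputed constant. It relies only on the CRC
update being linear over GF(2): four quarters of the buffer can be
processed as independent streams and folded together afterwards.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Bitwise CRC-32 update (reflected polynomial 0xEDB88320). No implicit
 * inversion of the seed or result; the caller applies ~ where needed.
 */
static uint32_t crc32_le_ref(uint32_t crc, const uint8_t *p, size_t len)
{
	while (len--) {
		crc ^= *p++;
		for (int i = 0; i < 8; i++)
			crc = (crc >> 1) ^ (0xEDB88320u & -(crc & 1));
	}
	return crc;
}

/*
 * Hypothetical helper: advance 'crc' as if 'len' zero bytes followed,
 * i.e. multiply by x^(8 * len) modulo the CRC polynomial. Done the slow
 * way here, purely for illustration.
 */
static uint32_t crc32_shift(uint32_t crc, size_t len)
{
	static const uint8_t zero = 0;

	while (len--)
		crc = crc32_le_ref(crc, &zero, 1);
	return crc;
}

/*
 * Split the input into four equal streams, compute each CRC
 * independently (only the first stream carries the seed), then fold
 * the partial results together. Leftover bytes go through the plain
 * path.
 */
static uint32_t crc32_4way(uint32_t crc, const uint8_t *p, size_t len)
{
	size_t n = len / 4;
	uint32_t c0 = crc32_le_ref(crc, p, n);
	uint32_t c1 = crc32_le_ref(0, p + n, n);
	uint32_t c2 = crc32_le_ref(0, p + 2 * n, n);
	uint32_t c3 = crc32_le_ref(0, p + 3 * n, n);

	crc = crc32_shift(c0, 3 * n) ^ crc32_shift(c1, 2 * n) ^
	      crc32_shift(c2, n) ^ c3;

	return crc32_le_ref(crc, p + 4 * n, len - 4 * n);
}

/* Compare the 4-way result against the plain one for various lengths. */
static void crc32_4way_selftest(void)
{
	uint8_t buf[1031];

	for (size_t i = 0; i < sizeof(buf); i++)
		buf[i] = (uint8_t)(i * 31 + 7);

	for (size_t len = 0; len <= sizeof(buf); len += 13)
		assert(crc32_4way(~0u, buf, len) ==
		       crc32_le_ref(~0u, buf, len));
}
```

Calling crc32_4way_selftest() from a small main() checks that the folded
result matches the byte-at-a-time one. The four per-stream computations
have no data dependency on one another, which is what would allow them to
be interleaved in a single loop.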