From patchwork Mon Sep 30 16:10:46 2024
X-Patchwork-Submitter: Kristina Martsenko
X-Patchwork-Id: 13816674
From: Kristina Martsenko
To: linux-arm-kernel@lists.infradead.org
Cc: Catalin Marinas, Will Deacon, Mark Rutland, Robin Murphy, Marc Zyngier
Subject: [PATCH 0/5] arm64: Use memory copy instructions in kernel routines
Date: Mon, 30 Sep 2024 17:10:46 +0100
Message-Id: <20240930161051.3777828-1-kristina.martsenko@arm.com>
X-Mailer: git-send-email 2.34.1

Hi,

Here is a small series to make memcpy() and related functions use the
memory copy/set instructions (Armv8.8 FEAT_MOPS).

The kernel uses several library routines for copying or initializing
memory, for example copy_to_user() and memset(). These routines have
been optimized to make their load/store sequence perform well across a
range of CPUs. However the chosen sequence can't be the fastest possible
for every CPU microarchitecture nor for heterogeneous systems, and needs
to be rewritten periodically as hardware changes.

Future arm64 CPUs will have CPY* and SET* instructions that can copy (or
set) a block of memory of arbitrary size and alignment. The kernel
currently supports using these instructions in userspace applications
[1] and KVM guests [2] but does not use them within the kernel.

CPUs are expected to implement the CPY/SET instructions close to
optimally for their microarchitecture (i.e.
close to the performance of the best load/store sequence performing a
generic copy/set). Using the instructions in the kernel's copy/set
routines would therefore make the routines optimal and avoid the need to
rewrite them. It could also lead to a performance improvement for some
CPUs and systems.

This series makes the memcpy(), memmove() and memset() routines use the
CPY/SET instructions, as well as copy_page() and clear_page(). I'll send
a follow-up series to update the usercopy routines (copy_to_user() etc)
"soon", as it needs a bit more work.

The patches were tested on an Arm FVP.

Thanks,
Kristina

[1] https://lore.kernel.org/lkml/20230509142235.3284028-1-kristina.martsenko@arm.com/
[2] https://lore.kernel.org/linux-arm-kernel/20230922112508.1774352-1-kristina.martsenko@arm.com/

Kristina Martsenko (5):
  arm64: probes: Disable kprobes/uprobes on MOPS instructions
  arm64: mops: Handle MOPS exceptions from EL1
  arm64: mops: Document booting requirement for HCR_EL2.MCE2
  arm64: lib: Use MOPS for memcpy() routines
  arm64: lib: Use MOPS for copy_page() and clear_page()

 Documentation/arch/arm64/booting.rst    |  3 +++
 arch/arm64/Kconfig                      |  3 +++
 arch/arm64/include/asm/debug-monitors.h |  1 +
 arch/arm64/include/asm/exception.h      |  1 +
 arch/arm64/include/asm/insn.h           |  1 +
 arch/arm64/kernel/debug-monitors.c      |  5 +++++
 arch/arm64/kernel/entry-common.c        | 12 ++++++++++++
 arch/arm64/kernel/probes/decode-insn.c  |  7 +++++--
 arch/arm64/kernel/traps.c               |  7 +++++++
 arch/arm64/lib/clear_page.S             | 13 +++++++++++++
 arch/arm64/lib/copy_page.S              | 13 +++++++++++++
 arch/arm64/lib/memcpy.S                 | 19 ++++++++++++++++++-
 arch/arm64/lib/memset.S                 | 20 +++++++++++++++++++-
 13 files changed, 101 insertions(+), 4 deletions(-)

base-commit: 9852d85ec9d492ebef56dc5f229416c925758edc