From patchwork Mon Oct 29 08:06:35 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Vincent Whitchurch X-Patchwork-Id: 10658823 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id E87611734 for ; Mon, 29 Oct 2018 08:08:10 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id D431829571 for ; Mon, 29 Oct 2018 08:08:10 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id C8573296FF; Mon, 29 Oct 2018 08:08:10 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-2.9 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,MAILING_LIST_MULTI,RCVD_IN_DNSWL_NONE autolearn=ham version=3.3.1 Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id 033E129571 for ; Mon, 29 Oct 2018 08:08:09 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20170209; h=Sender: Content-Transfer-Encoding:Content-Type:MIME-Version:Cc:List-Subscribe: List-Help:List-Post:List-Archive:List-Unsubscribe:List-Id:Message-Id:Date: Subject:To:From:Reply-To:Content-ID:Content-Description:Resent-Date: Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID:In-Reply-To: References:List-Owner; bh=4uAp7sDGVcDIn2V/B0zrHE+wWhj46/ZWDSmhCm7e9sY=; b=TAx xVK6fgTbjt6IYh4mAZpi5dMwdj56Ex6mvEC620rt11+FZxX8wu9y9egVnN6kE4XeHk9yD16rGYvjo wjKKa2JHCSlM9hueBT8LukMaeLTpY98ZODxPU7k6ky5Z4yH7XcVf0tAogUIX8pZqN8u6MR2sg5Emi si+ZbqDARO3appwiJsU7UliqT0lJKClJvaWYku3ENy3PyjV/LIlx12I3dL2AQizBAwzND8tmMadl7 TDnNRdhZojEo/bgd71rME+y8cuDbTvztw8YKTYvaf89F3IW3n+5jqnVyFl/aWzMhBPcaoiZ16RML9 ENuVvTB0Z64UeFUg1l1QxpyhLDQfHZQ==; Received: from localhost ([127.0.0.1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.90_1 #2 (Red Hat Linux)) id 1gH2ai-0003np-Ev; Mon, 29 Oct 2018 08:08:04 +0000 Received: from bastet.se.axis.com ([195.60.68.11]) by bombadil.infradead.org with esmtps (Exim 4.90_1 #2 (Red Hat Linux)) id 1gH2a4-0003VF-P5 for linux-arm-kernel@lists.infradead.org; Mon, 29 Oct 2018 08:07:27 +0000 Received: from localhost (localhost [127.0.0.1]) by bastet.se.axis.com (Postfix) with ESMTP id 8C2BD182EC; Mon, 29 Oct 2018 09:07:10 +0100 (CET) X-Virus-Scanned: Debian amavisd-new at bastet.se.axis.com Received: from bastet.se.axis.com ([IPv6:::ffff:127.0.0.1]) by localhost (bastet.se.axis.com [::ffff:127.0.0.1]) (amavisd-new, port 10024) with LMTP id WTsfDHSdbaEX; Mon, 29 Oct 2018 09:07:09 +0100 (CET) Received: from boulder03.se.axis.com (boulder03.se.axis.com [10.0.8.17]) by bastet.se.axis.com (Postfix) with ESMTPS id 6B97E1832D; Mon, 29 Oct 2018 09:07:09 +0100 (CET) Received: from boulder03.se.axis.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 1C53A1E06E; Mon, 29 Oct 2018 09:07:09 +0100 (CET) Received: from boulder03.se.axis.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 10D3A1E06A; Mon, 29 Oct 2018 09:07:09 +0100 (CET) Received: from seth.se.axis.com (unknown [10.0.2.172]) by boulder03.se.axis.com (Postfix) with ESMTP; Mon, 29 Oct 2018 09:07:09 +0100 (CET) Received: from lnxartpec.se.axis.com (lnxartpec.se.axis.com [10.88.4.9]) by seth.se.axis.com (Postfix) with ESMTP id 054F121F6; Mon, 29 Oct 2018 09:07:09 +0100 (CET) Received: by lnxartpec.se.axis.com (Postfix, from userid 10564) id 00600801EE; Mon, 29 Oct 2018 09:07:08 +0100 (CET) From: Vincent Whitchurch To: linux@arm.linux.org.uk Subject: [PATCH v4] ARM: Optimise copy_{from/to}_user for !CPU_USE_DOMAINS Date: Mon, 29 Oct 2018 09:06:35 +0100 Message-Id: <20181029080635.28973-1-vincent.whitchurch@axis.com> X-Mailer: git-send-email 2.11.0 X-TM-AS-GCONF: 00 X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20181029_010725_139895_3AA9821E X-CRM114-Status: UNSURE ( 9.62 ) X-CRM114-Notice: Please train this message. X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Vincent Whitchurch , linux-arm-kernel@lists.infradead.org, nico@linaro.org MIME-Version: 1.0 Sender: "linux-arm-kernel" Errors-To: linux-arm-kernel-bounces+patchwork-linux-arm=patchwork.kernel.org@lists.infradead.org X-Virus-Scanned: ClamAV using ClamSMTP ARMv6+ processors do not use CONFIG_CPU_USE_DOMAINS and use privileged ldr/str instructions in copy_{from/to}_user. They are currently unnecessarily using single ldr/str instructions and can use ldm/stm instructions instead like memcpy does (but with appropriate fixup tables). This speeds up a "dd if=foo of=bar bs=32k" on a tmpfs filesystem by about 4% on my Cortex-A9. before:134217728 bytes (128.0MB) copied, 0.543848 seconds, 235.4MB/s before:134217728 bytes (128.0MB) copied, 0.538610 seconds, 237.6MB/s before:134217728 bytes (128.0MB) copied, 0.544356 seconds, 235.1MB/s before:134217728 bytes (128.0MB) copied, 0.544364 seconds, 235.1MB/s before:134217728 bytes (128.0MB) copied, 0.537130 seconds, 238.3MB/s before:134217728 bytes (128.0MB) copied, 0.533443 seconds, 240.0MB/s before:134217728 bytes (128.0MB) copied, 0.545691 seconds, 234.6MB/s before:134217728 bytes (128.0MB) copied, 0.534695 seconds, 239.4MB/s before:134217728 bytes (128.0MB) copied, 0.540561 seconds, 236.8MB/s before:134217728 bytes (128.0MB) copied, 0.541025 seconds, 236.6MB/s after:134217728 bytes (128.0MB) copied, 0.520445 seconds, 245.9MB/s after:134217728 bytes (128.0MB) copied, 0.527846 seconds, 242.5MB/s after:134217728 bytes (128.0MB) copied, 0.519510 seconds, 246.4MB/s after:134217728 bytes (128.0MB) copied, 0.527231 seconds, 242.8MB/s after:134217728 bytes (128.0MB) copied, 0.525030 seconds, 243.8MB/s after:134217728 bytes (128.0MB) copied, 0.524236 seconds, 244.2MB/s after:134217728 bytes (128.0MB) copied, 0.523659 seconds, 244.4MB/s after:134217728 bytes (128.0MB) copied, 0.525018 seconds, 243.8MB/s after:134217728 bytes (128.0MB) copied, 0.519249 seconds, 246.5MB/s after:134217728 bytes (128.0MB) copied, 0.518527 seconds, 246.9MB/s Signed-off-by: Vincent Whitchurch Reviewed-by: Nicolas Pitre --- v4: Do not modify str1b/ldr1b v3: Explicitly add IT instructions to fix fault handling on Thumb-2 v2: Group *_SHIFT #defines with respective .macro implementations arch/arm/include/asm/assembler.h | 6 ++++-- arch/arm/lib/copy_from_user.S | 23 ++++++++++++++++++++++- arch/arm/lib/copy_to_user.S | 27 ++++++++++++++++++++++----- 3 files changed, 48 insertions(+), 8 deletions(-) diff --git a/arch/arm/include/asm/assembler.h b/arch/arm/include/asm/assembler.h index b17ee03d280b..da16d31c7ef9 100644 --- a/arch/arm/include/asm/assembler.h +++ b/arch/arm/include/asm/assembler.h @@ -243,13 +243,15 @@ .endm #endif -#define USER(x...) \ +#define USERL(l, x...) \ 9999: x; \ .pushsection __ex_table,"a"; \ .align 3; \ - .long 9999b,9001f; \ + .long 9999b,l; \ .popsection +#define USER(x...) USERL(9001f, x) + #ifdef CONFIG_SMP #define ALT_SMP(instr...) \ 9998: instr diff --git a/arch/arm/lib/copy_from_user.S b/arch/arm/lib/copy_from_user.S index a826df3d3814..3f79001830fe 100644 --- a/arch/arm/lib/copy_from_user.S +++ b/arch/arm/lib/copy_from_user.S @@ -34,12 +34,13 @@ * Number of bytes NOT copied. */ +#ifdef CONFIG_CPU_USE_DOMAINS + #ifndef CONFIG_THUMB2_KERNEL #define LDR1W_SHIFT 0 #else #define LDR1W_SHIFT 1 #endif -#define STR1W_SHIFT 0 .macro ldr1w ptr reg abort ldrusr \reg, \ptr, 4, abort=\abort @@ -57,10 +58,30 @@ ldr4w \ptr, \reg5, \reg6, \reg7, \reg8, \abort .endm +#else + +#define LDR1W_SHIFT 0 + + .macro ldr1w ptr reg abort + USERL(\abort, W(ldr) \reg, [\ptr], #4) + .endm + + .macro ldr4w ptr reg1 reg2 reg3 reg4 abort + USERL(\abort, ldmia \ptr!, {\reg1, \reg2, \reg3, \reg4}) + .endm + + .macro ldr8w ptr reg1 reg2 reg3 reg4 reg5 reg6 reg7 reg8 abort + USERL(\abort, ldmia \ptr!, {\reg1, \reg2, \reg3, \reg4, \reg5, \reg6, \reg7, \reg8}) + .endm + +#endif /* CONFIG_CPU_USE_DOMAINS */ + .macro ldr1b ptr reg cond=al abort ldrusr \reg, \ptr, 1, \cond, abort=\abort .endm +#define STR1W_SHIFT 0 + .macro str1w ptr reg abort W(str) \reg, [\ptr], #4 .endm diff --git a/arch/arm/lib/copy_to_user.S b/arch/arm/lib/copy_to_user.S index caf5019d8161..5cb28f6c7941 100644 --- a/arch/arm/lib/copy_to_user.S +++ b/arch/arm/lib/copy_to_user.S @@ -35,11 +35,6 @@ */ #define LDR1W_SHIFT 0 -#ifndef CONFIG_THUMB2_KERNEL -#define STR1W_SHIFT 0 -#else -#define STR1W_SHIFT 1 -#endif .macro ldr1w ptr reg abort W(ldr) \reg, [\ptr], #4 @@ -57,6 +52,14 @@ ldr\cond\()b \reg, [\ptr], #1 .endm +#ifdef CONFIG_CPU_USE_DOMAINS + +#ifndef CONFIG_THUMB2_KERNEL +#define STR1W_SHIFT 0 +#else +#define STR1W_SHIFT 1 +#endif + .macro str1w ptr reg abort strusr \reg, \ptr, 4, abort=\abort .endm @@ -72,6 +75,20 @@ str1w \ptr, \reg8, \abort .endm +#else + +#define STR1W_SHIFT 0 + + .macro str1w ptr reg abort + USERL(\abort, W(str) \reg, [\ptr], #4) + .endm + + .macro str8w ptr reg1 reg2 reg3 reg4 reg5 reg6 reg7 reg8 abort + USERL(\abort, stmia \ptr!, {\reg1, \reg2, \reg3, \reg4, \reg5, \reg6, \reg7, \reg8}) + .endm + +#endif /* CONFIG_CPU_USE_DOMAINS */ + .macro str1b ptr reg cond=al abort strusr \reg, \ptr, 1, \cond, abort=\abort .endm