From patchwork Wed Oct 9 18:50:32 2013
X-Patchwork-Submitter: Ard Biesheuvel
X-Patchwork-Id: 3011251
From: Ard Biesheuvel
To: linux-arm-kernel@lists.infradead.org
Cc: Ard Biesheuvel, linux@arm.linux.org.uk, dave.martin@arm.com, nico@linaro.org
Subject: [RFC v2 PATCH 2/4] ARM64: add support for kernel mode NEON in atomic context
Date: Wed, 9 Oct 2013 20:50:32 +0200
Message-Id: <1381344634-14917-3-git-send-email-ard.biesheuvel@linaro.org>
In-Reply-To: <1381344634-14917-1-git-send-email-ard.biesheuvel@linaro.org>
References: <1381344634-14917-1-git-send-email-ard.biesheuvel@linaro.org>

This patch adds
kernel_neon_begin_atomic() and kernel_neon_end_atomic(), which may be called
from any context. In the !in_interrupt() case, they simply call their
non-_atomic counterparts. In atomic context, they stack and unstack,
respectively, the number of NEON registers declared when setting up the stack
area using DEFINE_NEON_STACK_REGS().

Signed-off-by: Ard Biesheuvel
---
 arch/arm64/include/asm/fpsimd.h       | 16 +++++++++++++++
 arch/arm64/include/asm/fpsimdmacros.h | 37 +++++++++++++++++++++++++++++++++++
 arch/arm64/include/asm/neon.h         | 31 +++++++++++++++++++++++++++++
 arch/arm64/kernel/entry-fpsimd.S      | 24 +++++++++++++++++++++++
 arch/arm64/kernel/fpsimd.c            |  3 +++
 5 files changed, 111 insertions(+)

diff --git a/arch/arm64/include/asm/fpsimd.h b/arch/arm64/include/asm/fpsimd.h
index c43b4ac..3a741b0 100644
--- a/arch/arm64/include/asm/fpsimd.h
+++ b/arch/arm64/include/asm/fpsimd.h
@@ -39,6 +39,19 @@ struct fpsimd_state {
 	};
 };
 
+/*
+ * Variable sized struct for stacking the bottom n FP/SIMD registers.
+ * Mainly intended for kernel use of v8 Crypto Extensions which only
+ * needs a few registers and may need to execute in atomic context.
+ */
+struct fpsimd_partial_state {
+	const u32 num_regs;
+	u32 fpsr;
+	u32 fpcr;
+	__uint128_t vregs[] __aligned(16);
+} __aligned(16);
+
+
 #if defined(__KERNEL__) && defined(CONFIG_COMPAT)
 /* Masks for extracting the FPSR and FPCR from the FPSCR */
 #define VFP_FPSCR_STAT_MASK 0xf800009f
@@ -55,6 +68,9 @@ struct task_struct;
 extern void fpsimd_save_state(struct fpsimd_state *state);
 extern void fpsimd_load_state(struct fpsimd_state *state);
 
+extern void fpsimd_save_partial_state(struct fpsimd_partial_state *state);
+extern void fpsimd_load_partial_state(struct fpsimd_partial_state *state);
+
 extern void fpsimd_thread_switch(struct task_struct *next);
 extern void fpsimd_flush_thread(void);
 
diff --git a/arch/arm64/include/asm/fpsimdmacros.h b/arch/arm64/include/asm/fpsimdmacros.h
index bbec599..1b47587 100644
--- a/arch/arm64/include/asm/fpsimdmacros.h
+++ b/arch/arm64/include/asm/fpsimdmacros.h
@@ -62,3 +62,40 @@
 	ldr w\tmpnr, [\state, #16 * 2 + 4]
 	msr fpcr, x\tmpnr
 .endm
+
+.altmacro
+.macro q2op, op, q1, q2, state
+	\op q\q1, q\q2, [\state, #-(16 * \q1) - 16]
+.endm
+
+.macro fpsimd_save_partial state, tmpnr1, tmpnr2
+	mrs x\tmpnr1, fpsr
+	mrs x\tmpnr2, fpcr
+	stp w\tmpnr1, w\tmpnr2, [\state, #4]
+	adr x\tmpnr1, 0f
+	ldr w\tmpnr2, [\state]
+	add \state, \state, x\tmpnr2, lsl #4
+	sub x\tmpnr1, x\tmpnr1, x\tmpnr2, lsl #1
+	br x\tmpnr1
+	.irp qa, 30, 28, 26, 24, 22, 20, 18, 16, 14, 12, 10, 8, 6, 4, 2, 0
+	qb = \qa + 1
+	q2op stp, \qa, %qb, \state
+	.endr
+0:
+.endm
+
+.macro fpsimd_restore_partial state, tmpnr1, tmpnr2
+	ldp w\tmpnr1, w\tmpnr2, [\state, #4]
+	msr fpsr, x\tmpnr1
+	msr fpcr, x\tmpnr2
+	adr x\tmpnr1, 0f
+	ldr w\tmpnr2, [\state]
+	add \state, \state, x\tmpnr2, lsl #4
+	sub x\tmpnr1, x\tmpnr1, x\tmpnr2, lsl #1
+	br x\tmpnr1
+	.irp qa, 30, 28, 26, 24, 22, 20, 18, 16, 14, 12, 10, 8, 6, 4, 2, 0
+	qb = \qa + 1
+	q2op ldp, \qa, %qb, \state
+	.endr
+0:
+.endm
diff --git a/arch/arm64/include/asm/neon.h b/arch/arm64/include/asm/neon.h
index b0cc58a9..1c8600a 100644
--- a/arch/arm64/include/asm/neon.h
+++ b/arch/arm64/include/asm/neon.h
@@ -8,7 +8,38 @@
  * published by the Free Software Foundation.
  */
 
+#include
+#include
+#include
+
 #define cpu_has_neon() (1)
 
+#define DEFINE_NEON_STACK_REGS(a, num) \
+	struct { \
+		struct fpsimd_partial_state regs; \
+		__uint128_t vregs[(num) > 32 ? 32 : ((num) + 1) & ~1U]; \
+	} a = { .regs.num_regs = sizeof(a.vregs) / sizeof(__uint128_t) }
+
+#define DEFINE_NEON_STACK_REGS_ALL(name) DEFINE_NEON_STACK_REGS(name, 32)
+
 void kernel_neon_begin(void);
 void kernel_neon_end(void);
+
+static inline void __kernel_neon_begin_atomic(struct fpsimd_partial_state *regs)
+{
+	if (!in_interrupt())
+		kernel_neon_begin();
+	else
+		fpsimd_save_partial_state(regs);
+}
+
+static inline void __kernel_neon_end_atomic(struct fpsimd_partial_state *regs)
+{
+	if (!in_interrupt())
+		kernel_neon_end();
+	else
+		fpsimd_load_partial_state(regs);
+}
+
+#define kernel_neon_begin_atomic(a) __kernel_neon_begin_atomic(&(a).regs)
+#define kernel_neon_end_atomic(a) __kernel_neon_end_atomic(&(a).regs)
diff --git a/arch/arm64/kernel/entry-fpsimd.S b/arch/arm64/kernel/entry-fpsimd.S
index 6a27cd6..82cf648 100644
--- a/arch/arm64/kernel/entry-fpsimd.S
+++ b/arch/arm64/kernel/entry-fpsimd.S
@@ -41,3 +41,27 @@ ENTRY(fpsimd_load_state)
 	fpsimd_restore x0, 8
 	ret
 ENDPROC(fpsimd_load_state)
+
+#ifdef CONFIG_KERNEL_MODE_NEON
+
+/*
+ * Save the bottom n FP registers.
+ *
+ * x0 - pointer to struct fpsimd_partial_state
+ */
+ENTRY(fpsimd_save_partial_state)
+	fpsimd_save_partial x0, 8, 9
+	ret
+ENDPROC(fpsimd_save_partial_state)
+
+/*
+ * Load the bottom n FP registers.
+ *
+ * x0 - pointer to struct fpsimd_partial_state
+ */
+ENTRY(fpsimd_load_partial_state)
+	fpsimd_restore_partial x0, 8, 9
+	ret
+ENDPROC(fpsimd_load_partial_state)
+
+#endif
diff --git a/arch/arm64/kernel/fpsimd.c b/arch/arm64/kernel/fpsimd.c
index 1f2e4d5..69c7962 100644
--- a/arch/arm64/kernel/fpsimd.c
+++ b/arch/arm64/kernel/fpsimd.c
@@ -109,6 +109,9 @@ void kernel_neon_end(void)
 }
 EXPORT_SYMBOL(kernel_neon_end);
 
+EXPORT_SYMBOL(fpsimd_load_partial_state);
+EXPORT_SYMBOL(fpsimd_save_partial_state);
+
 #endif /* CONFIG_KERNEL_MODE_NEON */
 
 /*
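
The index arithmetic in DEFINE_NEON_STACK_REGS() and in the
fpsimd_save_partial/fpsimd_restore_partial macros is compact enough that a
small model may help when reviewing it. The following is only a user-space
sketch, not kernel code: the function names neon_stack_slots(),
branch_back_bytes() and pair_offset() are hypothetical and do not appear in
the patch. It assumes the layout implied by struct fpsimd_partial_state,
i.e. a 16-byte header followed by 16-byte register slots.

```c
#include <assert.h>

/*
 * User-space model of the arithmetic in the patch. All function names
 * here are hypothetical and exist only for illustration.
 */

/*
 * Number of 16-byte slots reserved by DEFINE_NEON_STACK_REGS(a, num):
 * capped at 32, otherwise rounded up to an even count because the
 * registers are moved in pairs with stp/ldp.
 */
static unsigned int neon_stack_slots(unsigned int num)
{
	return num > 32 ? 32 : (num + 1) & ~1U;
}

/*
 * Distance in bytes that the computed branch backs up from label 0.
 * One 4-byte stp/ldp handles two registers, so num_regs registers need
 * num_regs * 2 bytes of instructions (the "lsl #1" in the macro).
 */
static unsigned int branch_back_bytes(unsigned int num_regs)
{
	return num_regs << 1;
}

/*
 * Byte offset from the start of the struct at which the pair
 * (q<qa>, q<qa+1>) is stored. The macro first advances the state
 * pointer by num_regs * 16 and then addresses
 * [state, #-(16 * qa) - 16].
 */
static unsigned int pair_offset(unsigned int num_regs, unsigned int qa)
{
	assert(qa % 2 == 0 && qa + 2 <= num_regs);
	return 16 * num_regs - 16 * qa - 16;
}
```

Under these assumptions, a four-register save area makes the branch skip all
but the last two stp instructions, and the pair q2/q3 lands at offset 16,
directly after the 16-byte header, with q0/q1 following at offset 48.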