From patchwork Thu May 8 11:23:04 2014 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ard Biesheuvel X-Patchwork-Id: 4135361 Return-Path: X-Original-To: patchwork-linux-arm@patchwork.kernel.org Delivered-To: patchwork-parsemail@patchwork1.web.kernel.org Received: from mail.kernel.org (mail.kernel.org [198.145.19.201]) by patchwork1.web.kernel.org (Postfix) with ESMTP id 123B09F1E1 for ; Thu, 8 May 2014 11:26:30 +0000 (UTC) Received: from mail.kernel.org (localhost [127.0.0.1]) by mail.kernel.org (Postfix) with ESMTP id 0A496201E7 for ; Thu, 8 May 2014 11:26:29 +0000 (UTC) Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.9]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id E89EC201DC for ; Thu, 8 May 2014 11:26:27 +0000 (UTC) Received: from localhost ([127.0.0.1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.80.1 #2 (Red Hat Linux)) id 1WiMQY-0005Ib-A8; Thu, 08 May 2014 11:23:50 +0000 Received: from mail-ee0-f43.google.com ([74.125.83.43]) by bombadil.infradead.org with esmtps (Exim 4.80.1 #2 (Red Hat Linux)) id 1WiMQL-0005Bj-3v for linux-arm-kernel@lists.infradead.org; Thu, 08 May 2014 11:23:38 +0000 Received: by mail-ee0-f43.google.com with SMTP id d17so1605372eek.2 for ; Thu, 08 May 2014 04:23:14 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=tcBZe9O0kcozKt5D1KnOax2m/rdMWLIspP3g/YrHwB8=; b=ECl3HeRsBXuHu50nVjHsoLnQZq6yPnLKf8nL6KZzO+QqXQD5BmKFIHNx9QHvoJz2eL FoCO4jreLIM98NEJWrAVuJut09APcYKFw2h4TjudS9HfmbDhnpMk2uJypHmO+7GPLGpZ L6/k/D+8HhKvggRmMTGMWJ9YbPLgyY9dFSKTbyd4bV0BeTpZKrvqJrb3w7gz+kzdwVSF ivSSxZK/XHEtJxQByPhn2nBLOpKaBaB9+hjSPtn4lW7rxi5zvyDW+sl8/I/8zyf0UkQD JYaTnAxF0qzKSINMPSUpymeSXA6xUMnHfzhiAcWFvCpqWJtmv5zoLVO1UH9c0Ohe1Df1 PiKg== X-Gm-Message-State: ALoCoQnGk3R6ErZGAv/tHm4AGFURhVrsVHjY8ErrcpAVQaX/tpeJID+Xq1XUjUC908N7FU8efG6u X-Received: by 10.15.34.197 with SMTP id e45mr4723379eev.112.1399548194642; Thu, 08 May 2014 04:23:14 -0700 (PDT) Received: from ards-macbook-pro.lan (ip51cdd08e.speed.planet.nl. [81.205.208.142]) by mx.google.com with ESMTPSA id x45sm3620801eeu.23.2014.05.08.04.23.13 for (version=TLSv1.1 cipher=ECDHE-RSA-RC4-SHA bits=128/128); Thu, 08 May 2014 04:23:14 -0700 (PDT) From: Ard Biesheuvel To: catalin.marinas@arm.com Subject: [PATCH 3/3] arm64: add support for kernel mode NEON in interrupt context Date: Thu, 8 May 2014 13:23:04 +0200 Message-Id: <1399548184-9534-3-git-send-email-ard.biesheuvel@linaro.org> X-Mailer: git-send-email 1.8.3.2 In-Reply-To: <1399548184-9534-1-git-send-email-ard.biesheuvel@linaro.org> References: <1399548184-9534-1-git-send-email-ard.biesheuvel@linaro.org> X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20140508_042337_495730_7F7CB824 X-CRM114-Status: GOOD ( 18.69 ) X-Spam-Score: -0.7 (/) Cc: linux-arm-kernel@lists.infradead.org, Ard Biesheuvel X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , MIME-Version: 1.0 Sender: "linux-arm-kernel" Errors-To: linux-arm-kernel-bounces+patchwork-linux-arm=patchwork.kernel.org@lists.infradead.org X-Spam-Status: No, score=-2.5 required=5.0 tests=BAYES_00,RP_MATCHES_RCVD, UNPARSEABLE_RELAY autolearn=unavailable version=3.3.1 X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on mail.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP This patch modifies kernel_neon_begin() and kernel_neon_end(), so they may be called from any context. To address the case where only a couple of registers are needed, kernel_neon_begin_partial(u32) is introduced which takes as a parameter the number of bottom 'n' NEON q-registers required. To mark the end of such a partial section, the regular kernel_neon_end() should be used. Signed-off-by: Ard Biesheuvel --- Changes relative to previous version: - removed explicit __aligned(16) from struct fpsimd_partial_state - moved num_regs from offset #0 to offset #8 arch/arm64/include/asm/fpsimd.h | 15 ++++++++++++ arch/arm64/include/asm/fpsimdmacros.h | 35 ++++++++++++++++++++++++++++ arch/arm64/include/asm/neon.h | 6 ++++- arch/arm64/kernel/entry-fpsimd.S | 24 +++++++++++++++++++ arch/arm64/kernel/fpsimd.c | 44 ++++++++++++++++++++++++----------- 5 files changed, 109 insertions(+), 15 deletions(-) diff --git a/arch/arm64/include/asm/fpsimd.h b/arch/arm64/include/asm/fpsimd.h index 7a900142dbc8..50f559f574fe 100644 --- a/arch/arm64/include/asm/fpsimd.h +++ b/arch/arm64/include/asm/fpsimd.h @@ -41,6 +41,17 @@ struct fpsimd_state { unsigned int cpu; }; +/* + * Struct for stacking the bottom 'n' FP/SIMD registers. + */ +struct fpsimd_partial_state { + u32 fpsr; + u32 fpcr; + u32 num_regs; + __uint128_t vregs[32]; +}; + + #if defined(__KERNEL__) && defined(CONFIG_COMPAT) /* Masks for extracting the FPSR and FPCR from the FPSCR */ #define VFP_FPSCR_STAT_MASK 0xf800009f @@ -66,6 +77,10 @@ extern void fpsimd_update_current_state(struct fpsimd_state *state); extern void fpsimd_flush_task_state(struct task_struct *target); +extern void fpsimd_save_partial_state(struct fpsimd_partial_state *state, + u32 num_regs); +extern void fpsimd_load_partial_state(struct fpsimd_partial_state *state); + #endif #endif diff --git a/arch/arm64/include/asm/fpsimdmacros.h b/arch/arm64/include/asm/fpsimdmacros.h index bbec599c96bd..768414d55e64 100644 --- a/arch/arm64/include/asm/fpsimdmacros.h +++ b/arch/arm64/include/asm/fpsimdmacros.h @@ -62,3 +62,38 @@ ldr w\tmpnr, [\state, #16 * 2 + 4] msr fpcr, x\tmpnr .endm + +.altmacro +.macro fpsimd_save_partial state, numnr, tmpnr1, tmpnr2 + mrs x\tmpnr1, fpsr + str w\numnr, [\state, #8] + mrs x\tmpnr2, fpcr + stp w\tmpnr1, w\tmpnr2, [\state] + adr x\tmpnr1, 0f + add \state, \state, x\numnr, lsl #4 + sub x\tmpnr1, x\tmpnr1, x\numnr, lsl #1 + br x\tmpnr1 + .irp qa, 30, 28, 26, 24, 22, 20, 18, 16, 14, 12, 10, 8, 6, 4, 2, 0 + .irp qb, %(qa + 1) + stp q\qa, q\qb, [\state, # -16 * \qa - 16] + .endr + .endr +0: +.endm + +.macro fpsimd_restore_partial state, tmpnr1, tmpnr2 + ldp w\tmpnr1, w\tmpnr2, [\state] + msr fpsr, x\tmpnr1 + msr fpcr, x\tmpnr2 + adr x\tmpnr1, 0f + ldr w\tmpnr2, [\state, #8] + add \state, \state, x\tmpnr2, lsl #4 + sub x\tmpnr1, x\tmpnr1, x\tmpnr2, lsl #1 + br x\tmpnr1 + .irp qa, 30, 28, 26, 24, 22, 20, 18, 16, 14, 12, 10, 8, 6, 4, 2, 0 + .irp qb, %(qa + 1) + ldp q\qa, q\qb, [\state, # -16 * \qa - 16] + .endr + .endr +0: +.endm diff --git a/arch/arm64/include/asm/neon.h b/arch/arm64/include/asm/neon.h index b0cc58a97780..13ce4cc18e26 100644 --- a/arch/arm64/include/asm/neon.h +++ b/arch/arm64/include/asm/neon.h @@ -8,7 +8,11 @@ * published by the Free Software Foundation. */ +#include + #define cpu_has_neon() (1) -void kernel_neon_begin(void); +#define kernel_neon_begin() kernel_neon_begin_partial(32) + +void kernel_neon_begin_partial(u32 num_regs); void kernel_neon_end(void); diff --git a/arch/arm64/kernel/entry-fpsimd.S b/arch/arm64/kernel/entry-fpsimd.S index 6a27cd6dbfa6..d358ccacfc00 100644 --- a/arch/arm64/kernel/entry-fpsimd.S +++ b/arch/arm64/kernel/entry-fpsimd.S @@ -41,3 +41,27 @@ ENTRY(fpsimd_load_state) fpsimd_restore x0, 8 ret ENDPROC(fpsimd_load_state) + +#ifdef CONFIG_KERNEL_MODE_NEON + +/* + * Save the bottom n FP registers. + * + * x0 - pointer to struct fpsimd_partial_state + */ +ENTRY(fpsimd_save_partial_state) + fpsimd_save_partial x0, 1, 8, 9 + ret +ENDPROC(fpsimd_load_partial_state) + +/* + * Load the bottom n FP registers. + * + * x0 - pointer to struct fpsimd_partial_state + */ +ENTRY(fpsimd_load_partial_state) + fpsimd_restore_partial x0, 8, 9 + ret +ENDPROC(fpsimd_load_partial_state) + +#endif diff --git a/arch/arm64/kernel/fpsimd.c b/arch/arm64/kernel/fpsimd.c index 5ae89303c3ab..ad8aebb1cdef 100644 --- a/arch/arm64/kernel/fpsimd.c +++ b/arch/arm64/kernel/fpsimd.c @@ -218,29 +218,45 @@ void fpsimd_flush_task_state(struct task_struct *t) #ifdef CONFIG_KERNEL_MODE_NEON +static DEFINE_PER_CPU(struct fpsimd_partial_state, hardirq_fpsimdstate); +static DEFINE_PER_CPU(struct fpsimd_partial_state, softirq_fpsimdstate); + /* * Kernel-side NEON support functions */ -void kernel_neon_begin(void) +void kernel_neon_begin_partial(u32 num_regs) { - /* Avoid using the NEON in interrupt context */ - BUG_ON(in_interrupt()); - preempt_disable(); + if (in_interrupt()) { + struct fpsimd_partial_state *s = this_cpu_ptr( + in_irq() ? &hardirq_fpsimdstate : &softirq_fpsimdstate); - /* - * Save the userland FPSIMD state if we have one and if we haven't done - * so already. Clear fpsimd_last_state to indicate that there is no - * longer userland FPSIMD state in the registers. - */ - if (current->mm && !test_and_set_thread_flag(TIF_FOREIGN_FPSTATE)) - fpsimd_save_state(¤t->thread.fpsimd_state); - this_cpu_write(fpsimd_last_state, NULL); + BUG_ON(num_regs > 32); + fpsimd_save_partial_state(s, roundup(num_regs, 2)); + } else { + /* + * Save the userland FPSIMD state if we have one and if we + * haven't done so already. Clear fpsimd_last_state to indicate + * that there is no longer userland FPSIMD state in the + * registers. + */ + preempt_disable(); + if (current->mm && + !test_and_set_thread_flag(TIF_FOREIGN_FPSTATE)) + fpsimd_save_state(¤t->thread.fpsimd_state); + this_cpu_write(fpsimd_last_state, NULL); + } } -EXPORT_SYMBOL(kernel_neon_begin); +EXPORT_SYMBOL(kernel_neon_begin_partial); void kernel_neon_end(void) { - preempt_enable(); + if (in_interrupt()) { + struct fpsimd_partial_state *s = this_cpu_ptr( + in_irq() ? &hardirq_fpsimdstate : &softirq_fpsimdstate); + fpsimd_load_partial_state(s); + } else { + preempt_enable(); + } } EXPORT_SYMBOL(kernel_neon_end);