From patchwork Thu Apr 9 16:53:59 2015
X-Patchwork-Submitter: Marc Zyngier
X-Patchwork-Id: 6189551
From: Marc Zyngier
To: linux-arm-kernel@lists.infradead.org, kvm@vger.kernel.org, kvmarm@lists.cs.columbia.edu
Cc: Christoffer Dall
Subject: [RFC PATCH] arm64: KVM: remove fpsimd save/restore from the world switch
Date: Thu, 9 Apr 2015 17:53:59 +0100
Message-Id: <1428598439-5217-1-git-send-email-marc.zyngier@arm.com>

The world switch spends quite some time dealing with the FP/SIMD
registers, as the state is quite sizeable (32 128bit registers,
plus some crumbs on the side). We save/restore them on each
entry/exit, so that both the host and the guest always see
the state they expect.

But let's face it: the host kernel doesn't care. It is the host
userspace that actually cares about FP. An obvious improvement is
to remove the save/restore from the world switch, and only
perform it when we're about to enter/exit the guest (by plugging
it into vcpu_load/vcpu_put).

The effect is pretty spectacular when running hackbench (which
is the only benchmark worth looking at):

Without this patch:
Running with 50*40 (== 2000) tasks.
Time: 36.756
Running with 50*40 (== 2000) tasks.
Time: 36.679
Running with 50*40 (== 2000) tasks.
Time: 36.699

With this patch:
Running with 50*40 (== 2000) tasks.
Time: 30.947
Running with 50*40 (== 2000) tasks.
Time: 30.868
Running with 50*40 (== 2000) tasks.
Time: 30.961

This is on a HiKey board (8*A53), with a 4 vcpu guest.

Signed-off-by: Marc Zyngier
---
 arch/arm/include/asm/kvm_host.h   |  3 +++
 arch/arm/kvm/arm.c                |  2 ++
 arch/arm64/include/asm/kvm_asm.h  |  4 ++++
 arch/arm64/include/asm/kvm_host.h |  3 +++
 arch/arm64/kvm/Makefile           |  1 +
 arch/arm64/kvm/fpsimd.S           | 39 ++++++++++++++++++++++++++++++++++++
 arch/arm64/kvm/handle_fpsimd.c    | 42 +++++++++++++++++++++++++++++++++++++++
 arch/arm64/kvm/hyp.S              | 27 -------------------------
 8 files changed, 94 insertions(+), 27 deletions(-)
 create mode 100644 arch/arm64/kvm/fpsimd.S
 create mode 100644 arch/arm64/kvm/handle_fpsimd.c

diff --git a/arch/arm/include/asm/kvm_host.h b/arch/arm/include/asm/kvm_host.h
index d71607c..65cf1d1 100644
--- a/arch/arm/include/asm/kvm_host.h
+++ b/arch/arm/include/asm/kvm_host.h
@@ -226,6 +226,9 @@ static inline void vgic_arch_setup(const struct vgic_params *vgic)
 int kvm_perf_init(void);
 int kvm_perf_teardown(void);
 
+static inline void kvm_fpsimd_flush_hwstate(struct kvm_vcpu *vcpu) {}
+static inline void kvm_fpsimd_sync_hwstate(struct kvm_vcpu *vcpu) {}
+
 void kvm_mmu_wp_memory_region(struct kvm *kvm, int slot);
 
 struct kvm_vcpu *kvm_mpidr_to_vcpu(struct kvm *kvm, unsigned long mpidr);
diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
index 6f53645..ff1213c 100644
--- a/arch/arm/kvm/arm.c
+++ b/arch/arm/kvm/arm.c
@@ -287,6 +287,7 @@ void kvm_arch_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
 	vcpu->cpu = cpu;
 	vcpu->arch.host_cpu_context = this_cpu_ptr(kvm_host_cpu_state);
 
+	kvm_fpsimd_flush_hwstate(vcpu);
 	kvm_arm_set_running_vcpu(vcpu);
 }
 
@@ -299,6 +300,7 @@ void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu)
 	 */
 	vcpu->cpu = -1;
 
+	kvm_fpsimd_sync_hwstate(vcpu);
 	kvm_arm_set_running_vcpu(NULL);
 }
 
diff --git a/arch/arm64/include/asm/kvm_asm.h b/arch/arm64/include/asm/kvm_asm.h
index 4f7310f..eafb0c3 100644
--- a/arch/arm64/include/asm/kvm_asm.h
+++ b/arch/arm64/include/asm/kvm_asm.h
@@ -137,6 +137,10 @@ extern char __restore_vgic_v2_state[];
 extern char __save_vgic_v3_state[];
 extern char __restore_vgic_v3_state[];
 
+struct kvm_cpu_context;
+extern void __kvm_save_fpsimd(struct kvm_cpu_context *);
+extern void __kvm_restore_fpsimd(struct kvm_cpu_context *);
+
 #endif
 
 #endif /* __ARM_KVM_ASM_H__ */
diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index f0f58c9..2b968e5 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -201,6 +201,9 @@ int handle_exit(struct kvm_vcpu *vcpu, struct kvm_run *run,
 int kvm_perf_init(void);
 int kvm_perf_teardown(void);
 
+void kvm_fpsimd_flush_hwstate(struct kvm_vcpu *vcpu);
+void kvm_fpsimd_sync_hwstate(struct kvm_vcpu *vcpu);
+
 struct kvm_vcpu *kvm_mpidr_to_vcpu(struct kvm *kvm, unsigned long mpidr);
 
 static inline void __cpu_init_hyp_mode(phys_addr_t boot_pgd_ptr,
diff --git a/arch/arm64/kvm/Makefile b/arch/arm64/kvm/Makefile
index d5904f8..6d9c2b7 100644
--- a/arch/arm64/kvm/Makefile
+++ b/arch/arm64/kvm/Makefile
@@ -18,6 +18,7 @@ kvm-$(CONFIG_KVM_ARM_HOST) += $(ARM)/psci.o $(ARM)/perf.o
 kvm-$(CONFIG_KVM_ARM_HOST) += emulate.o inject_fault.o regmap.o
 kvm-$(CONFIG_KVM_ARM_HOST) += hyp.o hyp-init.o handle_exit.o
 kvm-$(CONFIG_KVM_ARM_HOST) += guest.o reset.o sys_regs.o sys_regs_generic_v8.o
+kvm-$(CONFIG_KVM_ARM_HOST) += fpsimd.o handle_fpsimd.o
 
 kvm-$(CONFIG_KVM_ARM_HOST) += $(KVM)/arm/vgic.o
 kvm-$(CONFIG_KVM_ARM_HOST) += $(KVM)/arm/vgic-v2.o
diff --git a/arch/arm64/kvm/fpsimd.S b/arch/arm64/kvm/fpsimd.S
new file mode 100644
index 0000000..458a1a7
--- /dev/null
+++ b/arch/arm64/kvm/fpsimd.S
@@ -0,0 +1,39 @@
+/*
+ * Copyright (C) 2012,2013 - ARM Ltd
+ * Author: Marc Zyngier
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program. If not, see .
+ */
+
+#include
+
+#include
+#include
+
+#define CPU_GP_REG_OFFSET(x) (CPU_GP_REGS + x)
+
+ENTRY(__kvm_save_fpsimd)
+	// x0: cpu context address
+	// x1, x2: tmp regs
+	add x1, x0, #CPU_GP_REG_OFFSET(CPU_FP_REGS)
+	fpsimd_save x1, 2
+	ret
+END(__kvm_save_fpsimd)
+
+ENTRY(__kvm_restore_fpsimd)
+	// x0: cpu context address
+	// x1, x2: tmp regs
+	add x1, x0, #CPU_GP_REG_OFFSET(CPU_FP_REGS)
+	fpsimd_restore x1, 2
+	ret
+END(__kvm_restore_fpsimd)
diff --git a/arch/arm64/kvm/handle_fpsimd.c b/arch/arm64/kvm/handle_fpsimd.c
new file mode 100644
index 0000000..3d34cc9
--- /dev/null
+++ b/arch/arm64/kvm/handle_fpsimd.c
@@ -0,0 +1,42 @@
+/*
+ * Copyright (C) 2015 - ARM Ltd
+ * Author: Marc Zyngier
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program. If not, see .
+ */
+
+#include
+
+void kvm_fpsimd_flush_hwstate(struct kvm_vcpu *vcpu)
+{
+	unsigned long flags;
+
+	local_irq_save(flags);
+
+	__kvm_save_fpsimd(vcpu->arch.host_cpu_context);
+	__kvm_restore_fpsimd(&vcpu->arch.ctxt);
+
+	local_irq_restore(flags);
+}
+
+void kvm_fpsimd_sync_hwstate(struct kvm_vcpu *vcpu)
+{
+	unsigned long flags;
+
+	local_irq_save(flags);
+
+	__kvm_save_fpsimd(&vcpu->arch.ctxt);
+	__kvm_restore_fpsimd(vcpu->arch.host_cpu_context);
+
+	local_irq_restore(flags);
+}
diff --git a/arch/arm64/kvm/hyp.S b/arch/arm64/kvm/hyp.S
index 5befd01..425c1ad 100644
--- a/arch/arm64/kvm/hyp.S
+++ b/arch/arm64/kvm/hyp.S
@@ -21,7 +21,6 @@
 #include
 #include
 #include
-#include
 #include
 #include
 #include
@@ -102,20 +101,6 @@
 	restore_common_regs
 .endm
 
-.macro save_fpsimd
-	// x2: cpu context address
-	// x3, x4: tmp regs
-	add x3, x2, #CPU_GP_REG_OFFSET(CPU_FP_REGS)
-	fpsimd_save x3, 4
-.endm
-
-.macro restore_fpsimd
-	// x2: cpu context address
-	// x3, x4: tmp regs
-	add x3, x2, #CPU_GP_REG_OFFSET(CPU_FP_REGS)
-	fpsimd_restore x3, 4
-.endm
-
 .macro save_guest_regs
 	// x0 is the vcpu address
 	// x1 is the return code, do not corrupt!
@@ -904,14 +889,6 @@ __restore_debug:
 	restore_debug
 	ret
 
-__save_fpsimd:
-	save_fpsimd
-	ret
-
-__restore_fpsimd:
-	restore_fpsimd
-	ret
-
 /*
  * u64 __kvm_vcpu_run(struct kvm_vcpu *vcpu);
  *
@@ -932,7 +909,6 @@ ENTRY(__kvm_vcpu_run)
 	kern_hyp_va x2
 
 	save_host_regs
-	bl __save_fpsimd
 	bl __save_sysregs
 
 	compute_debug_state 1f
@@ -948,7 +924,6 @@ ENTRY(__kvm_vcpu_run)
 	add x2, x0, #VCPU_CONTEXT
 
 	bl __restore_sysregs
-	bl __restore_fpsimd
 
 	skip_debug_state x3, 1f
 	bl __restore_debug
@@ -967,7 +942,6 @@ __kvm_vcpu_return:
 	add x2, x0, #VCPU_CONTEXT
 
 	save_guest_regs
-	bl __save_fpsimd
 	bl __save_sysregs
 
 	skip_debug_state x3, 1f
@@ -986,7 +960,6 @@ __kvm_vcpu_return:
 	kern_hyp_va x2
 
 	bl __restore_sysregs
-	bl __restore_fpsimd
 
 	skip_debug_state x3, 1f
 	// Clear the dirty flag for the next run, as all the state has