From patchwork Mon Feb 20 18:36:04 2017 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Waiman Long X-Patchwork-Id: 9583473 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork.web.codeaurora.org (Postfix) with ESMTP id 89D77604A0 for ; Mon, 20 Feb 2017 18:38:28 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 87404284D8 for ; Mon, 20 Feb 2017 18:38:28 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 7B9BB28807; Mon, 20 Feb 2017 18:38:28 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-4.2 required=2.0 tests=BAYES_00, RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from lists.xenproject.org (lists.xenproject.org [192.237.175.120]) (using TLSv1.2 with cipher AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id CFD95284D8 for ; Mon, 20 Feb 2017 18:38:27 +0000 (UTC) Received: from localhost ([127.0.0.1] helo=lists.xenproject.org) by lists.xenproject.org with esmtp (Exim 4.84_2) (envelope-from ) id 1cfsow-0006lW-La; Mon, 20 Feb 2017 18:36:22 +0000 Received: from mail6.bemta3.messagelabs.com ([195.245.230.39]) by lists.xenproject.org with esmtp (Exim 4.84_2) (envelope-from ) id 1cfsov-0006kq-JQ for xen-devel@lists.xenproject.org; Mon, 20 Feb 2017 18:36:21 +0000 Received: from [85.158.137.68] by server-2.bemta-3.messagelabs.com id 08/8A-16699-4273BA85; Mon, 20 Feb 2017 18:36:20 +0000 X-Brightmail-Tracker: H4sIAAAAAAAAA+NgFnrMLMWRWlGSWpSXmKPExsVysWW7jK6K+eo Ig80/RS2+b5nM5MDocfjDFZYAxijWzLyk/IoE1ox3m/8wFbTIVyz8HtXAOFuqi5GLQ0hgDpPE 9AMPWCCcZYwS3z93MXUxcnKwCahJ/LnVyQqSEBFYxyQx4cIfRhCHWWAns8SpC5OYQaqEBWIkX rV+BrNZBFQlHnTPZQGxeQUcJdp3doPZnAJOEps/vwGyOYBWOEosm2YKEpYQ0JY4dv4pG4Tdxy ixd6rKBEaeBYwMqxg1ilOLylKLdI0N9JKKMtMzSnITM3N0DQ2M9XJTi4sT01NzEpOK9ZLzczc xAj1fz8DAuIOx84TfIUZJDiYlUd47S1ZFCPEl5adUZiQWZ8QXleakFh9ilOHgUJLg5TRbHSEk WJSanlqRlpkDDEGYtAQHj5IIbwdImre4IDG3ODMdInWKUVFKnJcPJCEAksgozYNrg4X9JUZZK WFeRgYGBiGegtSi3MwSVPlXjOIcjErCvEEgU3gy80rgpr8CWswEtPimx0qQxSWJCCmpBkbh7Z fEUl+s8AhnS54k1R6lq3C8J213993/IkV6e68xuZrPOzgnQWPyY9N8lY38vt8Pzd+hcGVy6uL pWx4+mDff/PVr3r0cnv/69fvVDh3fWCLU+fdaycQJ4eeL5V5VfVOeNelYGqtCoIJq9zV7psT5 5+cl9d8MecBQdJ0nQPX6K4H6oIiwr/+UWIozEg21mIuKEwHQdVgTdgIAAA== X-Env-Sender: longman@redhat.com X-Msg-Ref: server-13.tower-31.messagelabs.com!1487615778!85910905!1 X-Originating-IP: [209.132.183.28] X-SpamReason: No, hits=0.0 required=7.0 tests=sa_preprocessor: VHJ1c3RlZCBJUDogMjA5LjEzMi4xODMuMjggPT4gNTQwNjQ=\n X-StarScan-Received: X-StarScan-Version: 9.2.3; banners=-,-,- X-VirusChecked: Checked Received: (qmail 19017 invoked from network); 20 Feb 2017 18:36:20 -0000 Received: from mx1.redhat.com (HELO mx1.redhat.com) (209.132.183.28) by server-13.tower-31.messagelabs.com with DHE-RSA-AES256-GCM-SHA384 encrypted SMTP; 20 Feb 2017 18:36:20 -0000 Received: from smtp.corp.redhat.com (int-mx16.intmail.prod.int.phx2.redhat.com [10.5.11.28]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 3E8203A7680; Mon, 20 Feb 2017 18:36:18 +0000 (UTC) Received: from llong.com (dhcp-17-80.bos.redhat.com [10.18.17.80]) by smtp.corp.redhat.com (Postfix) with ESMTP id 0014616999; Mon, 20 Feb 2017 18:36:14 +0000 (UTC) From: Waiman Long To: Jeremy Fitzhardinge , Chris Wright , Alok Kataria , Rusty Russell , Peter Zijlstra , Ingo Molnar , Thomas Gleixner , "H. Peter Anvin" Date: Mon, 20 Feb 2017 13:36:04 -0500 Message-Id: <1487615764-1343-3-git-send-email-longman@redhat.com> In-Reply-To: <1487615764-1343-1-git-send-email-longman@redhat.com> References: <1487615764-1343-1-git-send-email-longman@redhat.com> X-Scanned-By: MIMEDefang 2.74 on 10.5.11.28 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.29]); Mon, 20 Feb 2017 18:36:19 +0000 (UTC) Cc: linux-arch@vger.kernel.org, Juergen Gross , kvm@vger.kernel.org, =?UTF-8?q?Radim=20Kr=C4=8Dm=C3=A1=C5=99?= , Pan Xinhui , x86@kernel.org, linux-kernel@vger.kernel.org, virtualization@lists.linux-foundation.org, Waiman Long , Paolo Bonzini , xen-devel@lists.xenproject.org, Boris Ostrovsky Subject: [Xen-devel] [PATCH v5 2/2] x86/kvm: Provide optimized version of vcpu_is_preempted() for x86-64 X-BeenThere: xen-devel@lists.xen.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: Xen developer discussion List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , MIME-Version: 1.0 Errors-To: xen-devel-bounces@lists.xen.org Sender: "Xen-devel" X-Virus-Scanned: ClamAV using ClamSMTP It was found when running fio sequential write test with a XFS ramdisk on a KVM guest running on a 2-socket x86-64 system, the %CPU times as reported by perf were as follows: 69.75% 0.59% fio [k] down_write 69.15% 0.01% fio [k] call_rwsem_down_write_failed 67.12% 1.12% fio [k] rwsem_down_write_failed 63.48% 52.77% fio [k] osq_lock 9.46% 7.88% fio [k] __raw_callee_save___kvm_vcpu_is_preempt 3.93% 3.93% fio [k] __kvm_vcpu_is_preempted Making vcpu_is_preempted() a callee-save function has a relatively high cost on x86-64 primarily due to at least one more cacheline of data access from the saving and restoring of registers (8 of them) to and from stack as well as one more level of function call. To reduce this performance overhead, an optimized assembly version of the the __raw_callee_save___kvm_vcpu_is_preempt() function is provided for x86-64. With this patch applied on a KVM guest on a 2-socekt 16-core 32-thread system with 16 parallel jobs (8 on each socket), the aggregrate bandwidth of the fio test on an XFS ramdisk were as follows: I/O Type w/o patch with patch -------- --------- ---------- random read 8141.2 MB/s 8497.1 MB/s seq read 8229.4 MB/s 8304.2 MB/s random write 1675.5 MB/s 1701.5 MB/s seq write 1681.3 MB/s 1699.9 MB/s There are some increases in the aggregated bandwidth because of the patch. The perf data now became: 70.78% 0.58% fio [k] down_write 70.20% 0.01% fio [k] call_rwsem_down_write_failed 69.70% 1.17% fio [k] rwsem_down_write_failed 59.91% 55.42% fio [k] osq_lock 10.14% 10.14% fio [k] __kvm_vcpu_is_preempted The assembly code was verified by using a test kernel module to compare the output of C __kvm_vcpu_is_preempted() and that of assembly __raw_callee_save___kvm_vcpu_is_preempt() to verify that they matched. Suggested-by: Peter Zijlstra Signed-off-by: Waiman Long --- arch/x86/kernel/asm-offsets_64.c | 9 +++++++++ arch/x86/kernel/kvm.c | 24 ++++++++++++++++++++++++ 2 files changed, 33 insertions(+) diff --git a/arch/x86/kernel/asm-offsets_64.c b/arch/x86/kernel/asm-offsets_64.c index 210927e..99332f5 100644 --- a/arch/x86/kernel/asm-offsets_64.c +++ b/arch/x86/kernel/asm-offsets_64.c @@ -13,6 +13,10 @@ #include }; +#if defined(CONFIG_KVM_GUEST) && defined(CONFIG_PARAVIRT_SPINLOCKS) +#include +#endif + int main(void) { #ifdef CONFIG_PARAVIRT @@ -22,6 +26,11 @@ int main(void) BLANK(); #endif +#if defined(CONFIG_KVM_GUEST) && defined(CONFIG_PARAVIRT_SPINLOCKS) + OFFSET(KVM_STEAL_TIME_preempted, kvm_steal_time, preempted); + BLANK(); +#endif + #define ENTRY(entry) OFFSET(pt_regs_ ## entry, pt_regs, entry) ENTRY(bx); ENTRY(cx); diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c index 85ed343..14f65a5 100644 --- a/arch/x86/kernel/kvm.c +++ b/arch/x86/kernel/kvm.c @@ -589,6 +589,7 @@ static void kvm_wait(u8 *ptr, u8 val) local_irq_restore(flags); } +#ifdef CONFIG_X86_32 __visible bool __kvm_vcpu_is_preempted(long cpu) { struct kvm_steal_time *src = &per_cpu(steal_time, cpu); @@ -597,6 +598,29 @@ __visible bool __kvm_vcpu_is_preempted(long cpu) } PV_CALLEE_SAVE_REGS_THUNK(__kvm_vcpu_is_preempted); +#else + +#include + +extern bool __raw_callee_save___kvm_vcpu_is_preempted(long); + +/* + * Hand-optimize version for x86-64 to avoid 8 64-bit register saving and + * restoring to/from the stack. + */ +asm( +".pushsection .text;" +".global __raw_callee_save___kvm_vcpu_is_preempted;" +".type __raw_callee_save___kvm_vcpu_is_preempted, @function;" +"__raw_callee_save___kvm_vcpu_is_preempted:" +"movq __per_cpu_offset(,%rdi,8), %rax;" +"cmpb $0, " __stringify(KVM_STEAL_TIME_preempted) "+steal_time(%rax);" +"setne %al;" +"ret;" +".popsection"); + +#endif + /* * Setup pv_lock_ops to exploit KVM_FEATURE_PV_UNHALT if present. */