[1/4] KVM: x86: Introduce .pcpu_is_idle() stub infrastructure

This patch series aims to fix a performance issue caused by the current
para-virtualized scheduling design.

The current para-virtualized scheduling design uses the 'preempted' field of
kvm_steal_time to avoid scheduling tasks on a preempted vCPU.
However, when the pCPU on which the preempted vCPU most recently ran is idle,
this results in low CPU utilization and, consequently, poor performance.

The new 'is_idle' field of kvm_steal_time precisely reports the status of
the pCPU on which the preempted vCPU most recently ran, which can be used
to improve CPU utilization.

pcpu_is_idle() is used to get the value of 'is_idle' of kvm_steal_time.
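
The idea can be sketched in plain C as follows. This is a minimal,
user-space illustration only: struct steal_time_sketch, the sketch_*
helpers and the combined policy in sketch_should_avoid_vcpu() are
hypothetical names and assumptions made for this example, not the code
in this series. Only the two fields discussed above are modeled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical, simplified stand-in for the guest-visible steal-time
 * record. The real kvm_steal_time has more fields and a fixed ABI layout.
 */
struct steal_time_sketch {
	uint8_t preempted;	/* nonzero: the vCPU is currently preempted */
	uint8_t is_idle;	/* nonzero: the pCPU it last ran on is idle */
};

/* Existing-style check: is the vCPU marked as preempted? */
static bool sketch_vcpu_is_preempted(const struct steal_time_sketch *st)
{
	assert(st != NULL);
	return st->preempted != 0;
}

/* New-style check: return the value of the 'is_idle' field. */
static bool sketch_pcpu_is_idle(const struct steal_time_sketch *st)
{
	assert(st != NULL);
	return st->is_idle != 0;
}

/*
 * Assumed policy for illustration: avoid placing a task on a vCPU only
 * when it is preempted AND its pCPU is busy. A preempted vCPU whose pCPU
 * is idle can be rescheduled quickly, so it is not avoided.
 */
static bool sketch_should_avoid_vcpu(const struct steal_time_sketch *st)
{
	return sketch_vcpu_is_preempted(st) && !sketch_pcpu_is_idle(st);
}
```

With only the 'preempted' check, a vCPU with preempted=1 is always
avoided. With the additional 'is_idle' check, the same vCPU is avoided
only if is_idle=0.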

Experiments on a VM with 16 vCPUs show that the patch reduces execution
time by around 50% to 80% for most PARSEC benchmarks.
This also holds true for a VM with 112 vCPUs.

Experiments on 2 VMs with 112 vCPUs each show that the patch reduces
execution time by around 20% to 80% for most PARSEC benchmarks.

Test environment:
-- PowerEdge R740
-- 56C-112T CPU Intel(R) Xeon(R) Gold 6238R CPU
-- Host 190G DRAM
-- QEMU 5.0.0
-- PARSEC 3.0 Native Inputs
-- Host is idle during the test
-- Host and guest kernels are both kernel-5.14.0

Results:
1. 1 VM, 16 VCPU, 16 THREAD.
   Host Topology: sockets=2 cores=28 threads=2
   VM Topology:   sockets=1 cores=16 threads=1
   Command: <path to parsec>/bin/parsecmgmt -a run -p <benchmark> -i native -n 16
   Statistics below are the real time of running each benchmark (lower is better).

			before patch    after patch	improvements
bodytrack		52.866s		22.619s		57.21%
fluidanimate		84.009s		38.148s		54.59%
streamcluster		270.17s		42.726s		84.19%
splash2x.ocean_cp	31.932s		9.539s		70.13%
splash2x.ocean_ncp	36.063s		14.189s		60.65%
splash2x.volrend	134.587s	21.79s		83.81%
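
The "improvements" column is consistent with the relative reduction in
real time, (before - after) / before. A minimal sketch of that
arithmetic, with improvement_percent() being a hypothetical helper name
used only for this illustration:

```c
#include <assert.h>

/*
 * Relative reduction in execution time, in percent.
 * before_s and after_s are real times in seconds; before_s must be > 0.
 */
static double improvement_percent(double before_s, double after_s)
{
	assert(before_s > 0.0);
	return (before_s - after_s) / before_s * 100.0;
}
```

For example, bodytrack gives (52.866 - 22.619) / 52.866, which is about
57.21%.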

2. 1 VM, 112 VCPU. Some benchmarks require the number of threads to be a power of 2,
so we run them with 64 threads and 128 threads.
   Host Topology: sockets=2 cores=28 threads=2
   VM Topology:   sockets=1 cores=112 threads=1
   Command: <path to parsec>/bin/parsecmgmt -a run -p <benchmark> -i native -n <64,112,128>
   Statistics below are the real time of running each benchmark (lower is better).

                        		before patch    after patch     improvements
fluidanimate(64 thread)			124.235s	27.924s		77.52%
fluidanimate(128 thread)		169.127s	64.541s		61.84%
streamcluster(112 thread)		861.879s	496.66s		42.37%
splash2x.ocean_cp(64 thread)		46.415s		18.527s		60.08%
splash2x.ocean_cp(128 thread)		53.647s		28.929s		46.08%
splash2x.ocean_ncp(64 thread)		47.613s		19.576s		58.89%
splash2x.ocean_ncp(128 thread)		54.94s		29.199s		46.85%
splash2x.volrend(112 thread)		801.384s	144.824s	81.93%

3. 2 VMs, each VM: 112 VCPU. Some benchmarks require the number of threads to
be a power of 2, so we run them with 64 threads and 128 threads.
   Host Topology: sockets=2 cores=28 threads=2
   VM Topology:   sockets=1 cores=112 threads=1
   Command: <path to parsec>/bin/parsecmgmt -a run -p <benchmark> -i native -n <64,112,128>
   Statistics below are the average real time of running each benchmark in the 2 VMs (lower is better).

                                        before patch    after patch	improvements
fluidanimate(64 thread)			135.2125s	49.827s		63.15%
fluidanimate(128 thread)		178.309s	86.964s		51.23%
splash2x.ocean_cp(64 thread)		47.4505s	20.314s		57.19%
splash2x.ocean_cp(128 thread)		55.5645s	30.6515s	44.84%
splash2x.ocean_ncp(64 thread)		49.9775s	23.489s		53.00%
splash2x.ocean_ncp(128 thread)		56.847s		28.545s		49.79%
splash2x.volrend(112 thread)		838.939s	239.632s	71.44%

Due to space limits, only representative statistics are listed here.

--
Authors: Tianqiang Xu, Dingji Li, Zeyu Mi
	 Shanghai Jiao Tong University

Signed-off-by: Tianqiang Xu <skyele@sjtu.edu.cn>
---
 arch/x86/hyperv/hv_spinlock.c         |  7 +++++++
 arch/x86/include/asm/cpufeatures.h    |  1 +
 arch/x86/include/asm/kvm_host.h       |  1 +
 arch/x86/include/asm/paravirt.h       |  8 ++++++++
 arch/x86/include/asm/paravirt_types.h |  1 +
 arch/x86/include/asm/qspinlock.h      |  7 +++++++
 arch/x86/include/uapi/asm/kvm_para.h  |  4 +++-
 arch/x86/kernel/asm-offsets_64.c      |  1 +
 arch/x86/kernel/kvm.c                 | 21 +++++++++++++++++++++
 arch/x86/kernel/paravirt-spinlocks.c  | 15 +++++++++++++++
 arch/x86/kernel/paravirt.c            |  2 ++
 11 files changed, 67 insertions(+), 1 deletion(-)

Message ID	20210831015919.13006-1-skyele@sjtu.edu.cn (mailing list archive)
State	New, archived
Headers	From: Tianqiang Xu <skyele@sjtu.edu.cn>
	To: x86@kernel.org
	Cc: pbonzini@redhat.com, seanjc@google.com, vkuznets@redhat.com, wanpengli@tencent.com, jmattson@google.com, joro@8bytes.org, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, kvm@vger.kernel.org, hpa@zytor.com, jarkko@kernel.org, dave.hansen@linux.intel.com, linux-kernel@vger.kernel.org, linux-sgx@vger.kernel.org, Tianqiang Xu <skyele@sjtu.edu.cn>
	Subject: [PATCH 1/4] KVM: x86: Introduce .pcpu_is_idle() stub infrastructure
	Date: Tue, 31 Aug 2021 09:59:16 +0800
	Message-Id: <20210831015919.13006-1-skyele@sjtu.edu.cn>
Series	[1/4] KVM: x86: Introduce .pcpu_is_idle() stub infrastructure
	[2/4] Scheduler changes
	[3/4] KVM host implementation
	[4/4] KVM guest implementation
