From patchwork Sun Sep 16 12:46:31 2018
X-Patchwork-Submitter: Liran Alon
X-Patchwork-Id: 10601695
From: Liran Alon
To: qemu-devel@nongnu.org
Cc: ehabkost@redhat.com, kvm@vger.kernel.org, mtosatti@redhat.com,
 Liran Alon, pbonzini@redhat.com, rth@twiddle.net, jmattson@google.com
Date: Sun, 16 Sep 2018 15:46:31 +0300
Message-Id: <20180916124631.39016-3-liran.alon@oracle.com>
In-Reply-To: <20180916124631.39016-1-liran.alon@oracle.com>
References: <20180916124631.39016-1-liran.alon@oracle.com>
Subject: [Qemu-devel] [QEMU PATCH v2 2/2] KVM: i386: Add support for save
 and restore nested state

Kernel commit 8fcc4b5923af ("kvm: nVMX: Introduce KVM_CAP_NESTED_STATE")
introduced new IOCTLs to extract and restore KVM internal state used to
run a VM that is in VMX operation.

Utilize these IOCTLs to add support for migration of VMs which are
running nested hypervisors.

Reviewed-by: Nikita Leshchenko
Reviewed-by: Patrick Colp
Reviewed-by: Mihai Carabas
Signed-off-by: Liran Alon
---
 accel/kvm/kvm-all.c   | 15 +++++++++++
 include/sysemu/kvm.h  |  1 +
 target/i386/cpu.h     |  2 ++
 target/i386/kvm.c     | 58 ++++++++++++++++++++++++++++++++++++++++
 target/i386/machine.c | 73 +++++++++++++++++++++++++++++++++++++++++++++++++++
 5 files changed, 149 insertions(+)

diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
index de12f78eb8e4..fe6377ce9bcc 100644
--- a/accel/kvm/kvm-all.c
+++ b/accel/kvm/kvm-all.c
@@ -87,6 +87,7 @@ struct KVMState
 #ifdef KVM_CAP_SET_GUEST_DEBUG
     struct kvm_sw_breakpoint_head kvm_sw_breakpoints;
 #endif
+    uint32_t max_nested_state_len;
     int many_ioeventfds;
     int intx_set_mask;
     bool sync_mmu;
@@ -1628,6 +1629,15 @@ static int kvm_init(MachineState *ms)
     s->debugregs = kvm_check_extension(s, KVM_CAP_DEBUGREGS);
 #endif
 
+    ret = kvm_check_extension(s, KVM_CAP_NESTED_STATE);
+    if (ret < 0) {
+        fprintf(stderr,
+                "kvm failed to get max size of nested state (%d)\n",
+                ret);
+        goto err;
+    }
+    s->max_nested_state_len = (uint32_t)ret;
+
 #ifdef KVM_CAP_IRQ_ROUTING
     kvm_direct_msi_allowed = (kvm_check_extension(s, KVM_CAP_SIGNAL_MSI) > 0);
 #endif
@@ -2187,6 +2197,11 @@ int kvm_has_debugregs(void)
     return kvm_state->debugregs;
 }
 
+uint32_t kvm_max_nested_state_length(void)
+{
+    return kvm_state->max_nested_state_len;
+}
+
 int kvm_has_many_ioeventfds(void)
 {
     if (!kvm_enabled()) {
diff --git a/include/sysemu/kvm.h b/include/sysemu/kvm.h
index 0b64b8e06786..352c7fd4e3d2 100644
--- a/include/sysemu/kvm.h
+++ b/include/sysemu/kvm.h
@@ -210,6 +210,7 @@ bool kvm_has_sync_mmu(void);
 int kvm_has_vcpu_events(void);
 int kvm_has_robust_singlestep(void);
 int kvm_has_debugregs(void);
+uint32_t kvm_max_nested_state_length(void);
 int kvm_has_pit_state2(void);
 int kvm_has_many_ioeventfds(void);
 int kvm_has_gsi_routing(void);
diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index 6e4c2b02f947..3b97b5b280f0 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -1330,6 +1330,8 @@ typedef struct CPUX86State {
 #if defined(CONFIG_KVM) || defined(CONFIG_HVF)
     void *xsave_buf;
 #endif
+    struct kvm_nested_state *nested_state;
+    uint32_t nested_state_len; /* needed for migration */
 #if defined(CONFIG_HVF)
     HVFX86EmulatorState *hvf_emul;
 #endif
diff --git a/target/i386/kvm.c b/target/i386/kvm.c
index c1cd8c461fe4..aeb55b5ed6f5 100644
--- a/target/i386/kvm.c
+++ b/target/i386/kvm.c
@@ -1191,6 +1191,22 @@ int kvm_arch_init_vcpu(CPUState *cs)
     if (has_xsave) {
         env->xsave_buf = qemu_memalign(4096, sizeof(struct kvm_xsave));
     }
+
+    env->nested_state_len = kvm_max_nested_state_length();
+    if (env->nested_state_len > 0) {
+        uint32_t min_nested_state_len =
+            offsetof(struct kvm_nested_state, size) + sizeof(uint32_t);
+
+        /*
+         * Verify nested state length covers at least the size
+         * field of struct kvm_nested_state
+         */
+        assert(env->nested_state_len >= min_nested_state_len);
+
+        env->nested_state = g_malloc0(env->nested_state_len);
+        env->nested_state->size = env->nested_state_len;
+    }
+
     cpu->kvm_msr_buf = g_malloc0(MSR_BUF_SIZE);
 
     if (!(env->features[FEAT_8000_0001_EDX] & CPUID_EXT2_RDTSCP)) {
@@ -2867,6 +2883,39 @@ static int kvm_get_debugregs(X86CPU *cpu)
     return 0;
 }
 
+static int kvm_put_nested_state(X86CPU *cpu)
+{
+    CPUX86State *env = &cpu->env;
+
+    if (kvm_max_nested_state_length() == 0) {
+        return 0;
+    }
+
+    assert(env->nested_state->size <= env->nested_state_len);
+    return kvm_vcpu_ioctl(CPU(cpu), KVM_SET_NESTED_STATE, env->nested_state);
+}
+
+static int kvm_get_nested_state(X86CPU *cpu)
+{
+    CPUX86State *env = &cpu->env;
+
+    if (kvm_max_nested_state_length() == 0) {
+        return 0;
+    }
+
+
+    /*
+     * It is possible that migration restored a smaller size into
+     * nested_state->size than what our kernel supports.
+     * We preserve the migration origin's nested_state->size for the
+     * call to KVM_SET_NESTED_STATE, but want our next call to
+     * KVM_GET_NESTED_STATE to use the max size our kernel supports.
+     */
+    env->nested_state->size = env->nested_state_len;
+
+    return kvm_vcpu_ioctl(CPU(cpu), KVM_GET_NESTED_STATE, env->nested_state);
+}
+
 int kvm_arch_put_registers(CPUState *cpu, int level)
 {
     X86CPU *x86_cpu = X86_CPU(cpu);
@@ -2874,6 +2923,11 @@ int kvm_arch_put_registers(CPUState *cpu, int level)
 
     assert(cpu_is_stopped(cpu) || qemu_cpu_is_self(cpu));
 
+    ret = kvm_put_nested_state(x86_cpu);
+    if (ret < 0) {
+        return ret;
+    }
+
     if (level >= KVM_PUT_RESET_STATE) {
         ret = kvm_put_msr_feature_control(x86_cpu);
         if (ret < 0) {
@@ -2989,6 +3043,10 @@ int kvm_arch_get_registers(CPUState *cs)
     if (ret < 0) {
         goto out;
     }
+    ret = kvm_get_nested_state(cpu);
+    if (ret < 0) {
+        goto out;
+    }
     ret = 0;
  out:
     cpu_sync_bndcs_hflags(&cpu->env);
diff --git a/target/i386/machine.c b/target/i386/machine.c
index 084c2c73a8f7..781de40dfcbe 100644
--- a/target/i386/machine.c
+++ b/target/i386/machine.c
@@ -842,6 +842,78 @@ static const VMStateDescription vmstate_tsc_khz = {
     }
 };
 
+static int nested_state_post_load(void *opaque, int version_id)
+{
+    X86CPU *cpu = opaque;
+    CPUX86State *env = &cpu->env;
+    uint32_t min_nested_state_len =
+        offsetof(struct kvm_nested_state, size) + sizeof(uint32_t);
+    uint32_t max_nested_state_len = kvm_max_nested_state_length();
+
+    /*
+     * If our kernel doesn't support setting nested state
+     * and we have received nested state from migration stream,
+     * we need to fail migration
+     */
+    if (max_nested_state_len == 0) {
+        error_report("Received nested state when "
+                     "kernel cannot restore it");
+        return -EINVAL;
+    }
+
+    /*
+     * Verify that the size of received buffer covers the
+     * struct size field and that the size specified
+     * in given struct is set to no more than the size
+     * that our kernel supports
+     */
+    if (env->nested_state_len < min_nested_state_len) {
+        error_report("Received nested state size less than min: "
+                     "len=%" PRIu32 ", min=%" PRIu32,
+                     env->nested_state_len, min_nested_state_len);
+        return -EINVAL;
+    }
+    if (env->nested_state->size > max_nested_state_len) {
+        error_report("Received unsupported nested state size: "
+                     "nested_state->size=%" PRIu32 ", max=%" PRIu32,
+                     env->nested_state->size, max_nested_state_len);
+        return -EINVAL;
+    }
+
+    /*
+     * Reallocate nested_state buffer to always remain
+     * in max size which our kernel can support
+     */
+    env->nested_state_len = max_nested_state_len;
+    env->nested_state = g_realloc(env->nested_state,
+                                  env->nested_state_len);
+    assert(env->nested_state);
+
+    return 0;
+}
+
+static bool nested_state_needed(void *opaque)
+{
+    X86CPU *cpu = opaque;
+    CPUX86State *env = &cpu->env;
+    return (env->nested_state_len > 0);
+}
+
+static const VMStateDescription vmstate_nested_state = {
+    .name = "cpu/nested_state",
+    .version_id = 1,
+    .minimum_version_id = 1,
+    .post_load = nested_state_post_load,
+    .needed = nested_state_needed,
+    .fields = (VMStateField[]) {
+        VMSTATE_UINT32(env.nested_state_len, X86CPU),
+        VMSTATE_VBUFFER_ALLOC_UINT32(env.nested_state, X86CPU,
+                                     0, NULL,
+                                     env.nested_state_len),
+        VMSTATE_END_OF_LIST()
+    }
+};
+
 static bool mcg_ext_ctl_needed(void *opaque)
 {
     X86CPU *cpu = opaque;
@@ -1080,6 +1152,7 @@ VMStateDescription vmstate_x86_cpu = {
         &vmstate_msr_intel_pt,
         &vmstate_msr_virt_ssbd,
         &vmstate_svm_npt,
+        &vmstate_nested_state,
         NULL
     }
 };
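
For readers following the size checks in nested_state_post_load(), the
sketch below restates them as a standalone function that can be compiled
outside of QEMU. It is only an illustration under stated assumptions:
struct demo_nested_state is a simplified stand-in for the kernel's
struct kvm_nested_state (only the leading flags/format/size fields are
modelled), and demo_check_nested_state() is a hypothetical helper that is
not part of QEMU or of this patch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Simplified stand-in for struct kvm_nested_state (assumption: only the
 * header fields up to and including 'size' matter for these checks).
 */
struct demo_nested_state {
    uint16_t flags;
    uint16_t format;
    uint32_t size;   /* size the sender claims for the whole state blob */
};

/*
 * Hypothetical helper mirroring the three checks made on the destination:
 *   recv_len - number of bytes received from the migration stream
 *   max_len  - max size reported by the destination kernel (0 = unsupported)
 * Returns 0 if the state can be handed to the kernel, -EINVAL otherwise.
 */
static int demo_check_nested_state(const struct demo_nested_state *st,
                                   uint32_t recv_len, uint32_t max_len)
{
    /* Smallest buffer that still contains the 'size' field. */
    uint32_t min_len =
        offsetof(struct demo_nested_state, size) + sizeof(uint32_t);

    assert(st != NULL);

    /* Destination kernel cannot restore nested state at all. */
    if (max_len == 0) {
        return -EINVAL;
    }
    /* Buffer too short: st->size must not be trusted or read as valid. */
    if (recv_len < min_len) {
        return -EINVAL;
    }
    /* Sender's state is larger than what this kernel accepts. */
    if (st->size > max_len) {
        return -EINVAL;
    }
    return 0;
}
```

With a state whose size field is 128, calling
demo_check_nested_state(&st, 128, 4096) returns 0, while a max_len of 0,
a recv_len of 4, or a max_len of 64 each return -EINVAL. The ordering
matters: the received length is validated before the size field is
consulted, which is the same order the patch uses.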