From patchwork Sat Feb 19 18:49:52 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Isaku Yamahata X-Patchwork-Id: 12752393 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id AB6DDC433EF for ; Sat, 19 Feb 2022 18:50:41 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S243059AbiBSSu4 (ORCPT ); Sat, 19 Feb 2022 13:50:56 -0500 Received: from mxb-00190b01.gslb.pphosted.com ([23.128.96.19]:45458 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S243002AbiBSSuS (ORCPT ); Sat, 19 Feb 2022 13:50:18 -0500 Received: from mga06.intel.com (mga06.intel.com [134.134.136.31]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 1E79E6D4F3; Sat, 19 Feb 2022 10:49:59 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1645296599; x=1676832599; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=PpzT2l4IRKyPLoKbunBwpB5ZpT0TW/2JfAd52uG3wrU=; b=YfOD/Jbob0I/UjtbsmZMInW9Sm2R4mTYu125JBjBSpTYPsh2qHdE3qUH 4AfWh4j34LoCQvgxOkT9BsjaRaiJOFT+RvXp+h/Isdz5xoKseF7DHuti7 lm00elFWonk8S7cfDxCzfMMIWdQVOGh76JvN0CCDfGcy1MLTD3VkWaEeW 6+NU2l4KcLQTa/94ZRB7CpE7ti5k5oLrnJMfRi+ok05Ee+HdrCzHRgsZJ F0q9kbTJWfMOjFBSq0l9f6n2X8bD+7SXK6yT9TEFuc/hFFIXDtlFOzOwu Zy5fKCii7R25+Bug5dGmv4vneABBVftXW6ZZ51vQbrq+5CYHQuppLjqu3 Q==; X-IronPort-AV: E=McAfee;i="6200,9189,10263"; a="312059003" X-IronPort-AV: E=Sophos;i="5.88,381,1635231600"; d="scan'208";a="312059003" Received: from orsmga006.jf.intel.com ([10.7.209.51]) by orsmga104.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 19 Feb 2022 10:49:57 -0800 X-IronPort-AV: E=Sophos;i="5.88,381,1635231600"; d="scan'208";a="507137045" Received: from ls.sc.intel.com (HELO localhost) ([143.183.96.54]) by orsmga006-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 19 Feb 2022 10:49:57 -0800 From: isaku.yamahata@intel.com To: Thomas Gleixner , Borislav Petkov , Paolo Bonzini , Jim Mattson , erdemaktas@google.com, Connor Kuehl , Sean Christopherson , linux-kernel@vger.kernel.org, kvm@vger.kernel.org Cc: isaku.yamahata@intel.com, isaku.yamahata@gmail.com Subject: [PATCH v4 7/8] KVM: x86: Introduce vm_type to differentiate default VMs from confidential VMs Date: Sat, 19 Feb 2022 10:49:52 -0800 Message-Id: <24bfec5cbb1ecb8b9626aea7379f7f1bd62ea954.1645266955.git.isaku.yamahata@intel.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org From: Sean Christopherson Unlike default VMs, confidential VMs (Intel TDX and AMD SEV-ES) don't allow some operations (e.g., memory read/write, register state access, etc). Introduce vm_type to track the type of the VM to x86 KVM. Other arch KVMs already use vm_type, KVM_INIT_VM accepts vm_type, and x86 KVM callback vm_init accepts vm_type. So follow them. Further, a different policy can be made based on vm_type. Define KVM_X86_DEFAULT_VM for default VM as default and define KVM_X86_TDX_VM for Intel TDX VM. The wrapper function will be defined as "bool is_td(kvm) { return vm_type == VM_TYPE_TDX; }" Add a capability KVM_CAP_VM_TYPES to effectively allow device model, e.g. qemu, to query what VM types are supported by KVM. This (introduce a new capability and add vm_type) is chosen to align with other arch KVMs that have VM types already. Other arch KVMs uses different name to query supported vm types and there is no common name for it, so new name was chosen. Co-developed-by: Xiaoyao Li Signed-off-by: Xiaoyao Li Signed-off-by: Sean Christopherson Signed-off-by: Isaku Yamahata --- Documentation/virt/kvm/api.rst | 15 +++++++++++++++ arch/x86/include/asm/kvm-x86-ops.h | 1 + arch/x86/include/asm/kvm_host.h | 2 ++ arch/x86/include/uapi/asm/kvm.h | 3 +++ arch/x86/kvm/svm/svm.c | 6 ++++++ arch/x86/kvm/vmx/main.c | 1 + arch/x86/kvm/vmx/vmx.c | 5 +++++ arch/x86/kvm/vmx/x86_ops.h | 1 + arch/x86/kvm/x86.c | 9 ++++++++- include/uapi/linux/kvm.h | 1 + tools/arch/x86/include/uapi/asm/kvm.h | 3 +++ tools/include/uapi/linux/kvm.h | 1 + 12 files changed, 47 insertions(+), 1 deletion(-) diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst index a4267104db50..22076afcbc8e 100644 --- a/Documentation/virt/kvm/api.rst +++ b/Documentation/virt/kvm/api.rst @@ -147,15 +147,30 @@ described as 'basic' will be available. The new VM has no virtual cpus and no memory. You probably want to use 0 as machine type. +X86: +^^^^ + +Supported vm type can be queried from KVM_CAP_VM_TYPES, which returns the +bitmap of supported vm types. The 1-setting of bit @n means vm type with +value @n is supported. + +S390: +^^^^^ + In order to create user controlled virtual machines on S390, check KVM_CAP_S390_UCONTROL and use the flag KVM_VM_S390_UCONTROL as privileged user (CAP_SYS_ADMIN). +MIPS: +^^^^^ + To use hardware assisted virtualization on MIPS (VZ ASE) rather than the default trap & emulate implementation (which changes the virtual memory layout to fit in user mode), check KVM_CAP_MIPS_VZ and use the flag KVM_VM_MIPS_VZ. +ARM64: +^^^^^^ On arm64, the physical address size for a VM (IPA Size limit) is limited to 40bits by default. The limit can be configured if the host supports the diff --git a/arch/x86/include/asm/kvm-x86-ops.h b/arch/x86/include/asm/kvm-x86-ops.h index d39e0de06be2..8125d43d3566 100644 --- a/arch/x86/include/asm/kvm-x86-ops.h +++ b/arch/x86/include/asm/kvm-x86-ops.h @@ -18,6 +18,7 @@ KVM_X86_OP_NULL(hardware_unsetup) KVM_X86_OP_NULL(cpu_has_accelerated_tpr) KVM_X86_OP(has_emulated_msr) KVM_X86_OP(vcpu_after_set_cpuid) +KVM_X86_OP(is_vm_type_supported) KVM_X86_OP(vm_init) KVM_X86_OP_NULL(vm_destroy) KVM_X86_OP(vcpu_create) diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h index 6dcccb304775..c99e8e014642 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -1048,6 +1048,7 @@ struct kvm_x86_msr_filter { #define APICV_INHIBIT_REASON_ABSENT 7 struct kvm_arch { + unsigned long vm_type; unsigned long n_used_mmu_pages; unsigned long n_requested_mmu_pages; unsigned long n_max_mmu_pages; @@ -1322,6 +1323,7 @@ struct kvm_x86_ops { bool (*has_emulated_msr)(struct kvm *kvm, u32 index); void (*vcpu_after_set_cpuid)(struct kvm_vcpu *vcpu); + bool (*is_vm_type_supported)(unsigned long vm_type); unsigned int vm_size; int (*vm_init)(struct kvm *kvm); void (*vm_destroy)(struct kvm *kvm); diff --git a/arch/x86/include/uapi/asm/kvm.h b/arch/x86/include/uapi/asm/kvm.h index bf6e96011dfe..71a5851475e7 100644 --- a/arch/x86/include/uapi/asm/kvm.h +++ b/arch/x86/include/uapi/asm/kvm.h @@ -525,4 +525,7 @@ struct kvm_pmu_event_filter { #define KVM_VCPU_TSC_CTRL 0 /* control group for the timestamp counter (TSC) */ #define KVM_VCPU_TSC_OFFSET 0 /* attribute for the TSC offset */ +#define KVM_X86_DEFAULT_VM 0 +#define KVM_X86_TDX_VM 1 + #endif /* _ASM_X86_KVM_H */ diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c index a290efb272ad..ba05532495fb 100644 --- a/arch/x86/kvm/svm/svm.c +++ b/arch/x86/kvm/svm/svm.c @@ -4439,6 +4439,11 @@ static void svm_vm_destroy(struct kvm *kvm) sev_vm_destroy(kvm); } +static bool svm_is_vm_type_supported(unsigned long type) +{ + return type == KVM_X86_DEFAULT_VM; +} + static int svm_vm_init(struct kvm *kvm) { if (!pause_filter_count || !pause_filter_thresh) @@ -4466,6 +4471,7 @@ static struct kvm_x86_ops svm_x86_ops __initdata = { .vcpu_free = svm_free_vcpu, .vcpu_reset = svm_vcpu_reset, + .is_vm_type_supported = svm_is_vm_type_supported, .vm_size = sizeof(struct kvm_svm), .vm_init = svm_vm_init, .vm_destroy = svm_vm_destroy, diff --git a/arch/x86/kvm/vmx/main.c b/arch/x86/kvm/vmx/main.c index 28a7597d0782..77da926ee505 100644 --- a/arch/x86/kvm/vmx/main.c +++ b/arch/x86/kvm/vmx/main.c @@ -29,6 +29,7 @@ struct kvm_x86_ops vt_x86_ops __initdata = { .cpu_has_accelerated_tpr = report_flexpriority, .has_emulated_msr = vmx_has_emulated_msr, + .is_vm_type_supported = vmx_is_vm_type_supported, .vm_size = sizeof(struct kvm_vmx), .vm_init = vmx_vm_init, diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c index 7ae1f259c103..ad9e3bae1a6c 100644 --- a/arch/x86/kvm/vmx/vmx.c +++ b/arch/x86/kvm/vmx/vmx.c @@ -7079,6 +7079,11 @@ int vmx_vcpu_create(struct kvm_vcpu *vcpu) return err; } +bool vmx_is_vm_type_supported(unsigned long type) +{ + return type == KVM_X86_DEFAULT_VM; +} + #define L1TF_MSG_SMT "L1TF CPU bug present and SMT on, data leak possible. See CVE-2018-3646 and https://www.kernel.org/doc/html/latest/admin-guide/hw-vuln/l1tf.html for details.\n" #define L1TF_MSG_L1D "L1TF CPU bug present and virtualization mitigation disabled, data leak possible. See CVE-2018-3646 and https://www.kernel.org/doc/html/latest/admin-guide/hw-vuln/l1tf.html for details.\n" diff --git a/arch/x86/kvm/vmx/x86_ops.h b/arch/x86/kvm/vmx/x86_ops.h index 1bad27e592b5..f7327bc73be0 100644 --- a/arch/x86/kvm/vmx/x86_ops.h +++ b/arch/x86/kvm/vmx/x86_ops.h @@ -25,6 +25,7 @@ void vmx_hardware_unsetup(void); int vmx_hardware_enable(void); void vmx_hardware_disable(void); bool report_flexpriority(void); +bool vmx_is_vm_type_supported(unsigned long type); int vmx_vm_init(struct kvm *kvm); int vmx_vcpu_create(struct kvm_vcpu *vcpu); int vmx_vcpu_pre_run(struct kvm_vcpu *vcpu); diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 7131d735b1ef..ccb192ab954a 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -4331,6 +4331,11 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext) r = sizeof(struct kvm_xsave); break; } + case KVM_CAP_VM_TYPES: + r = BIT(KVM_X86_DEFAULT_VM); + if (static_call(kvm_x86_is_vm_type_supported)(KVM_X86_TDX_VM)) + r |= BIT(KVM_X86_TDX_VM); + break; default: break; } @@ -11561,9 +11566,11 @@ int kvm_arch_init_vm(struct kvm *kvm, unsigned long type) int ret; unsigned long flags; - if (type) + if (!static_call(kvm_x86_is_vm_type_supported)(type)) return -EINVAL; + kvm->arch.vm_type = type; + ret = kvm_page_track_init(kvm); if (ret) return ret; diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h index 5191b57e1562..e77edf37c4d1 100644 --- a/include/uapi/linux/kvm.h +++ b/include/uapi/linux/kvm.h @@ -1134,6 +1134,7 @@ struct kvm_ppc_resize_hpt { #define KVM_CAP_VM_GPA_BITS 207 #define KVM_CAP_XSAVE2 208 #define KVM_CAP_SYS_ATTRIBUTES 209 +#define KVM_CAP_VM_TYPES 210 #ifdef KVM_CAP_IRQ_ROUTING diff --git a/tools/arch/x86/include/uapi/asm/kvm.h b/tools/arch/x86/include/uapi/asm/kvm.h index bf6e96011dfe..71a5851475e7 100644 --- a/tools/arch/x86/include/uapi/asm/kvm.h +++ b/tools/arch/x86/include/uapi/asm/kvm.h @@ -525,4 +525,7 @@ struct kvm_pmu_event_filter { #define KVM_VCPU_TSC_CTRL 0 /* control group for the timestamp counter (TSC) */ #define KVM_VCPU_TSC_OFFSET 0 /* attribute for the TSC offset */ +#define KVM_X86_DEFAULT_VM 0 +#define KVM_X86_TDX_VM 1 + #endif /* _ASM_X86_KVM_H */ diff --git a/tools/include/uapi/linux/kvm.h b/tools/include/uapi/linux/kvm.h index 5191b57e1562..e77edf37c4d1 100644 --- a/tools/include/uapi/linux/kvm.h +++ b/tools/include/uapi/linux/kvm.h @@ -1134,6 +1134,7 @@ struct kvm_ppc_resize_hpt { #define KVM_CAP_VM_GPA_BITS 207 #define KVM_CAP_XSAVE2 208 #define KVM_CAP_SYS_ATTRIBUTES 209 +#define KVM_CAP_VM_TYPES 210 #ifdef KVM_CAP_IRQ_ROUTING