From patchwork Fri Apr 26 06:43:28 2013 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Nakajima, Jun" X-Patchwork-Id: 2491101 Return-Path: X-Original-To: patchwork-kvm@patchwork.kernel.org Delivered-To: patchwork-process-083081@patchwork2.kernel.org Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by patchwork2.kernel.org (Postfix) with ESMTP id CFEDDDF230 for ; Fri, 26 Apr 2013 06:44:01 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1759475Ab3DZGn6 (ORCPT ); Fri, 26 Apr 2013 02:43:58 -0400 Received: from mail-da0-f46.google.com ([209.85.210.46]:62792 "EHLO mail-da0-f46.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1754657Ab3DZGn5 (ORCPT ); Fri, 26 Apr 2013 02:43:57 -0400 Received: by mail-da0-f46.google.com with SMTP id x4so1244296daj.33 for ; Thu, 25 Apr 2013 23:43:56 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20120113; h=x-received:from:to:subject:date:message-id:x-mailer:in-reply-to :references:x-gm-message-state; bh=5UxmL5UZtYrgns52r4xmOLAOHjpj9q58p858KT8ulXk=; b=jDaNRpcbGkcXRuW7FGBW5fDXPnTnW84VlmTec4YvS2xn5AhocFNqPGxxGkCV+ZHO4B cNGiBIBnlzPZr5wz3QST9w1zmEpdS7aWah0xYE1ybM6eNIY5yiPr1O8XX5CnzW7OQzql Dpd5b+e81XXOXxaicZ9s8ENxPn7LOAxZ5hJRriWxBVAFkdvVlbC+injLPPZCk/eGyfcT Ie5Gd7zty9m+MsXcto1KxdFAJPbau84/89in+OFoomdDGbMehzngWxO4h5EVSRaJ7Mze Jk7Dqwk/XFfDyqJNdIjWsPVKNwNHDQgOs06h8h1Lp6+kjEYWzbvLh2UYYfNkJCUqeZpc UiDw== X-Received: by 10.66.8.133 with SMTP id r5mr32865393paa.219.1366958636474; Thu, 25 Apr 2013 23:43:56 -0700 (PDT) Received: from localhost (c-98-207-34-191.hsd1.ca.comcast.net. [98.207.34.191]) by mx.google.com with ESMTPSA id c5sm10417124pbl.37.2013.04.25.23.43.54 for (version=TLSv1.2 cipher=RC4-SHA bits=128/128); Thu, 25 Apr 2013 23:43:55 -0700 (PDT) From: Jun Nakajima To: kvm@vger.kernel.org Subject: [PATCH 08/11] nEPT: Nested INVEPT Date: Thu, 25 Apr 2013 23:43:28 -0700 Message-Id: <1366958611-6935-8-git-send-email-jun.nakajima@intel.com> X-Mailer: git-send-email 1.8.2.1.610.g562af5b In-Reply-To: <1366958611-6935-7-git-send-email-jun.nakajima@intel.com> References: <1366958611-6935-1-git-send-email-jun.nakajima@intel.com> <1366958611-6935-2-git-send-email-jun.nakajima@intel.com> <1366958611-6935-3-git-send-email-jun.nakajima@intel.com> <1366958611-6935-4-git-send-email-jun.nakajima@intel.com> <1366958611-6935-5-git-send-email-jun.nakajima@intel.com> <1366958611-6935-6-git-send-email-jun.nakajima@intel.com> <1366958611-6935-7-git-send-email-jun.nakajima@intel.com> X-Gm-Message-State: ALoCoQnt+x96f4FhTMd3mgjiWf59pjJqNKwt3BrT4UkpYLn1g6VaFUg9asdaD+cnOJ85NilLgRT5 Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org If we let L1 use EPT, we should probably also support the INVEPT instruction. In our current nested EPT implementation, when L1 changes its EPT table for L2 (i.e., EPT12), L0 modifies the shadow EPT table (EPT02), and in the course of this modification already calls INVEPT. Therefore, when L1 calls INVEPT, we don't really need to do anything. In particular we *don't* need to call the real INVEPT again. All we do in our INVEPT is verify the validity of the call, and its parameters, and then do nothing. In KVM Forum 2010, Dong et al. presented "Nested Virtualization Friendly KVM" and classified our current nested EPT implementation as "shadow-like virtual EPT". He recommended instead a different approach, which he called "VTLB-like virtual EPT". If we had taken that alternative approach, INVEPT would have had a bigger role: L0 would only rebuild the shadow EPT table when L1 calls INVEPT. Signed-off-by: Nadav Har'El Signed-off-by: Jun Nakajima Signed-off-by: Xinhao Xu --- arch/x86/include/asm/vmx.h | 4 +- arch/x86/include/uapi/asm/vmx.h | 1 + arch/x86/kvm/vmx.c | 83 +++++++++++++++++++++++++++++++++++++++++ 3 files changed, 87 insertions(+), 1 deletion(-) diff --git a/arch/x86/include/asm/vmx.h b/arch/x86/include/asm/vmx.h index b6fbf86..0ce54f3 100644 --- a/arch/x86/include/asm/vmx.h +++ b/arch/x86/include/asm/vmx.h @@ -376,7 +376,9 @@ enum vmcs_field { #define VMX_EPTP_WB_BIT (1ull << 14) #define VMX_EPT_2MB_PAGE_BIT (1ull << 16) #define VMX_EPT_1GB_PAGE_BIT (1ull << 17) -#define VMX_EPT_AD_BIT (1ull << 21) +#define VMX_EPT_INVEPT_BIT (1ull << 20) +#define VMX_EPT_AD_BIT (1ull << 21) +#define VMX_EPT_EXTENT_INDIVIDUAL_BIT (1ull << 24) #define VMX_EPT_EXTENT_CONTEXT_BIT (1ull << 25) #define VMX_EPT_EXTENT_GLOBAL_BIT (1ull << 26) diff --git a/arch/x86/include/uapi/asm/vmx.h b/arch/x86/include/uapi/asm/vmx.h index 2871fcc..5662cef 100644 --- a/arch/x86/include/uapi/asm/vmx.h +++ b/arch/x86/include/uapi/asm/vmx.h @@ -65,6 +65,7 @@ #define EXIT_REASON_EOI_INDUCED 45 #define EXIT_REASON_EPT_VIOLATION 48 #define EXIT_REASON_EPT_MISCONFIG 49 +#define EXIT_REASON_INVEPT 50 #define EXIT_REASON_WBINVD 54 #define EXIT_REASON_XSETBV 55 #define EXIT_REASON_APIC_WRITE 56 diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c index 76df3a8..c67eb06 100644 --- a/arch/x86/kvm/vmx.c +++ b/arch/x86/kvm/vmx.c @@ -5879,6 +5879,87 @@ static int handle_vmptrst(struct kvm_vcpu *vcpu) return 1; } +/* Emulate the INVEPT instruction */ +static int handle_invept(struct kvm_vcpu *vcpu) +{ + u32 vmx_instruction_info; + unsigned long type; + gva_t gva; + struct x86_exception e; + struct { + u64 eptp, gpa; + } operand; + + if (!(nested_vmx_secondary_ctls_high & SECONDARY_EXEC_ENABLE_EPT) || + !(nested_vmx_ept_caps & VMX_EPT_INVEPT_BIT)) { + kvm_queue_exception(vcpu, UD_VECTOR); + return 1; + } + + if (!nested_vmx_check_permission(vcpu)) + return 1; + + if (!kvm_read_cr0_bits(vcpu, X86_CR0_PE)) { + kvm_queue_exception(vcpu, UD_VECTOR); + return 1; + } + + /* According to the Intel VMX instruction reference, the memory + * operand is read even if it isn't needed (e.g., for type==global) + */ + vmx_instruction_info = vmcs_read32(VMX_INSTRUCTION_INFO); + if (get_vmx_mem_address(vcpu, vmcs_readl(EXIT_QUALIFICATION), + vmx_instruction_info, &gva)) + return 1; + if (kvm_read_guest_virt(&vcpu->arch.emulate_ctxt, gva, &operand, + sizeof(operand), &e)) { + kvm_inject_page_fault(vcpu, &e); + return 1; + } + + type = kvm_register_read(vcpu, (vmx_instruction_info >> 28) & 0xf); + + switch (type) { + case VMX_EPT_EXTENT_GLOBAL: + if (!(nested_vmx_ept_caps & VMX_EPT_EXTENT_GLOBAL_BIT)) + nested_vmx_failValid(vcpu, + VMXERR_INVALID_OPERAND_TO_INVEPT_INVVPID); + else { + /* + * Do nothing: when L1 changes EPT12, we already + * update EPT02 (the shadow EPT table) and call INVEPT. + * So when L1 calls INVEPT, there's nothing left to do. + */ + nested_vmx_succeed(vcpu); + } + break; + case VMX_EPT_EXTENT_CONTEXT: + if (!(nested_vmx_ept_caps & VMX_EPT_EXTENT_CONTEXT_BIT)) + nested_vmx_failValid(vcpu, + VMXERR_INVALID_OPERAND_TO_INVEPT_INVVPID); + else { + /* Do nothing */ + nested_vmx_succeed(vcpu); + } + break; + case VMX_EPT_EXTENT_INDIVIDUAL_ADDR: + if (!(nested_vmx_ept_caps & VMX_EPT_EXTENT_INDIVIDUAL_BIT)) + nested_vmx_failValid(vcpu, + VMXERR_INVALID_OPERAND_TO_INVEPT_INVVPID); + else { + /* Do nothing */ + nested_vmx_succeed(vcpu); + } + break; + default: + nested_vmx_failValid(vcpu, + VMXERR_INVALID_OPERAND_TO_INVEPT_INVVPID); + } + + skip_emulated_instruction(vcpu); + return 1; +} + /* * The exit handlers return 1 if the exit was handled fully and guest execution * may resume. Otherwise they set the kvm_run parameter to indicate what needs @@ -5923,6 +6004,7 @@ static int (*const kvm_vmx_exit_handlers[])(struct kvm_vcpu *vcpu) = { [EXIT_REASON_PAUSE_INSTRUCTION] = handle_pause, [EXIT_REASON_MWAIT_INSTRUCTION] = handle_invalid_op, [EXIT_REASON_MONITOR_INSTRUCTION] = handle_invalid_op, + [EXIT_REASON_INVEPT] = handle_invept, }; static const int kvm_vmx_max_exit_handlers = @@ -6107,6 +6189,7 @@ static bool nested_vmx_exit_handled(struct kvm_vcpu *vcpu) case EXIT_REASON_VMPTRST: case EXIT_REASON_VMREAD: case EXIT_REASON_VMRESUME: case EXIT_REASON_VMWRITE: case EXIT_REASON_VMOFF: case EXIT_REASON_VMON: + case EXIT_REASON_INVEPT: /* * VMX instructions trap unconditionally. This allows L1 to * emulate them for its L2 guest, i.e., allows 3-level nesting!