From patchwork Tue Aug  4 04:20:38 2020
From: Aaron Lewis
To: jmattson@google.com, graf@amazon.com
Cc: pshier@google.com, oupton@google.com, kvm@vger.kernel.org
Date: Mon,  3 Aug 2020 21:20:38 -0700
Message-Id: <20200804042043.3592620-2-aaronlewis@google.com>
In-Reply-To: <20200804042043.3592620-1-aaronlewis@google.com>
Subject: [PATCH 1/6] KVM: x86: Add ioctl for accepting a userspace provided MSR list

Add the KVM_SET_EXIT_MSRS ioctl to allow userspace to pass in a list of
MSRs that force an exit to userspace when the guest uses rdmsr or wrmsr.
KVM_SET_EXIT_MSRS must be called before any vCPUs are created, to
protect the 'user_exit_msrs' list from being mutated while vCPUs are
running.

Add KVM_CAP_SET_MSR_EXITS to identify that the feature exists.
Signed-off-by: Aaron Lewis
Reviewed-by: Oliver Upton
---
 Documentation/virt/kvm/api.rst  | 24 +++++++++++++++++++
 arch/x86/include/asm/kvm_host.h |  2 ++
 arch/x86/kvm/x86.c              | 41 +++++++++++++++++++++++++++++++++
 include/uapi/linux/kvm.h        |  2 ++
 4 files changed, 69 insertions(+)

diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
index 320788f81a05..7d8167c165aa 100644
--- a/Documentation/virt/kvm/api.rst
+++ b/Documentation/virt/kvm/api.rst
@@ -1006,6 +1006,30 @@ such as migration.
 :Parameters: struct kvm_vcpu_event (out)
 :Returns: 0 on success, -1 on error
 
+4.32 KVM_SET_EXIT_MSRS
+----------------------
+
+:Capability: KVM_CAP_SET_MSR_EXITS
+:Architectures: x86
+:Type: vm ioctl
+:Parameters: struct kvm_msr_list (in)
+:Returns: 0 on success, -1 on error
+
+Sets the userspace MSR list which is used to track which MSRs KVM should send
+to userspace to be serviced when the guest executes rdmsr or wrmsr.
+
+This ioctl needs to be called before vCPUs are set up, otherwise the list of
+MSRs will not be accepted and an EINVAL error will be returned. Also, if a
+list of MSRs has already been supplied, calling this ioctl again will return
+an EEXIST error.
+
+::
+
+  struct kvm_msr_list {
+	__u32 nmsrs;
+	__u32 indices[0];
+  };
+
 X86:
 ^^^^
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index be5363b21540..510055471dd0 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1004,6 +1004,8 @@ struct kvm_arch {
 	struct kvm_pmu_event_filter *pmu_event_filter;
 	struct task_struct *nx_lpage_recovery_thread;
+
+	struct kvm_msr_list *user_exit_msrs;
 };
 
 struct kvm_vm_stat {
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 88c593f83b28..46a0fb9e0869 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -3419,6 +3419,42 @@ static inline bool kvm_can_mwait_in_guest(void)
 		boot_cpu_has(X86_FEATURE_ARAT);
 }
 
+static int kvm_vm_ioctl_set_exit_msrs(struct kvm *kvm,
+			struct kvm_msr_list __user *user_msr_list)
+{
+	struct kvm_msr_list *msr_list, hdr;
+	size_t indices_size;
+
+	if (kvm->arch.user_exit_msrs != NULL)
+		return -EEXIST;
+
+	if (kvm->created_vcpus)
+		return -EINVAL;
+
+	if (copy_from_user(&hdr, user_msr_list, sizeof(hdr)))
+		return -EFAULT;
+
+	if (hdr.nmsrs >= MAX_IO_MSRS)
+		return -E2BIG;
+
+	indices_size = sizeof(hdr.indices[0]) * hdr.nmsrs;
+	msr_list = kvzalloc(sizeof(struct kvm_msr_list) + indices_size,
+			    GFP_KERNEL_ACCOUNT);
+	if (!msr_list)
+		return -ENOMEM;
+	msr_list->nmsrs = hdr.nmsrs;
+
+	if (copy_from_user(msr_list->indices, user_msr_list->indices,
+			   indices_size)) {
+		kvfree(msr_list);
+		return -EFAULT;
+	}
+
+	kvm->arch.user_exit_msrs = msr_list;
+
+	return 0;
+}
+
 int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
 {
 	int r = 0;
@@ -3476,6 +3512,7 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
 	case KVM_CAP_MSR_PLATFORM_INFO:
 	case KVM_CAP_EXCEPTION_PAYLOAD:
 	case KVM_CAP_SET_GUEST_DEBUG:
+	case KVM_CAP_SET_MSR_EXITS:
 		r = 1;
 		break;
 	case KVM_CAP_SYNC_REGS:
@@ -5261,6 +5298,10 @@ long kvm_arch_vm_ioctl(struct file *filp,
 		r = 0;
 		break;
 	}
+	case KVM_SET_EXIT_MSRS: {
+		r = kvm_vm_ioctl_set_exit_msrs(kvm, argp);
+		break;
+	}
 	case KVM_MEMORY_ENCRYPT_OP: {
 		r = -ENOTTY;
 		if (kvm_x86_ops.mem_enc_op)
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index 4fdf30316582..de4638c1bd15 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -1031,6 +1031,7 @@ struct kvm_ppc_resize_hpt {
 #define KVM_CAP_PPC_SECURE_GUEST 181
 #define KVM_CAP_HALT_POLL 182
 #define KVM_CAP_ASYNC_PF_INT 183
+#define KVM_CAP_SET_MSR_EXITS 185
 
 #ifdef KVM_CAP_IRQ_ROUTING
@@ -1371,6 +1372,7 @@ struct kvm_s390_ucas_mapping {
 /* Available with KVM_CAP_PMU_EVENT_FILTER */
 #define KVM_SET_PMU_EVENT_FILTER  _IOW(KVMIO,  0xb2, struct kvm_pmu_event_filter)
 #define KVM_PPC_SVM_OFF		  _IO(KVMIO,  0xb3)
+#define KVM_SET_EXIT_MSRS	  _IOW(KVMIO,  0xb4, struct kvm_msr_list)
 
 /* ioctl for vm fd */
 #define KVM_CREATE_DEVICE	  _IOWR(KVMIO,  0xe0, struct kvm_create_device)
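For context, a minimal sketch of how a VMM might drive the proposed ioctl,
assuming only the uapi above; 'vm_fd' and the choice of MSR index are
illustrative, not part of the series:

	#include <linux/kvm.h>
	#include <stdlib.h>
	#include <string.h>
	#include <sys/ioctl.h>

	static int set_exit_msrs(int vm_fd)
	{
		/* One tracked MSR; any index the VMM wants forwarded. */
		const __u32 msrs[] = { 0x10a /* illustrative index */ };
		struct kvm_msr_list *list;
		int r;

		list = calloc(1, sizeof(*list) + sizeof(msrs));
		if (!list)
			return -1;
		list->nmsrs = sizeof(msrs) / sizeof(msrs[0]);
		memcpy(list->indices, msrs, sizeof(msrs));

		/* Must happen before KVM_CREATE_VCPU, else -EINVAL. */
		r = ioctl(vm_fd, KVM_SET_EXIT_MSRS, list);
		free(list);
		return r;
	}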
From patchwork Tue Aug  4 04:20:39 2020
From: Aaron Lewis
To: jmattson@google.com, graf@amazon.com
Cc: pshier@google.com, oupton@google.com, kvm@vger.kernel.org
Date: Mon,  3 Aug 2020 21:20:39 -0700
Message-Id: <20200804042043.3592620-3-aaronlewis@google.com>
In-Reply-To: <20200804042043.3592620-1-aaronlewis@google.com>
Subject: [PATCH 2/6] KVM: x86: Add support for exiting to userspace on rdmsr or wrmsr

Add support for exiting to userspace on a rdmsr or wrmsr instruction if
the MSR being read from or written to is in the user_exit_msrs list.

Signed-off-by: Aaron Lewis
---
 Documentation/virt/kvm/api.rst | 29 +++++++++++-
 arch/x86/kvm/trace.h           | 24 ++++++++++
 arch/x86/kvm/x86.c             | 83 ++++++++++++++++++++++++++++++++++
 include/trace/events/kvm.h     |  2 +-
 include/uapi/linux/kvm.h       | 10 ++++
 5 files changed, 145 insertions(+), 3 deletions(-)

diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
index 7d8167c165aa..8b7078707e0a 100644
--- a/Documentation/virt/kvm/api.rst
+++ b/Documentation/virt/kvm/api.rst
@@ -4885,8 +4885,9 @@ to the byte array.
 
 .. note::
 
-      For KVM_EXIT_IO, KVM_EXIT_MMIO, KVM_EXIT_OSI, KVM_EXIT_PAPR and
-      KVM_EXIT_EPR the corresponding
+      For KVM_EXIT_IO, KVM_EXIT_MMIO, KVM_EXIT_OSI, KVM_EXIT_PAPR,
+      KVM_EXIT_EPR, KVM_EXIT_X86_RDMSR, and KVM_EXIT_X86_WRMSR the
+      corresponding
 operations are complete (and guest state is consistent) only after userspace
 has re-entered the kernel with KVM_RUN.  The kernel side will first finish
@@ -5179,6 +5180,30 @@ Note that KVM does not skip the faulting instruction as it does for
 KVM_EXIT_MMIO, but userspace has to emulate any change to the processing state
 if it decides to decode and emulate the instruction.
 
+::
+
+		/* KVM_EXIT_X86_RDMSR */
+		/* KVM_EXIT_X86_WRMSR */
+		struct {
+			__u8 inject_gp; /* out */
+			__u8 pad[3];
+			__u32 index; /* i.e. ecx; out */
+			__u64 data; /* in (wrmsr) / out (rdmsr) */
+		} msr;
+
+If the exit_reason is KVM_EXIT_X86_RDMSR, a rdmsr instruction in the guest
+needs to be processed by userspace. If the exit_reason is KVM_EXIT_X86_WRMSR,
+a wrmsr instruction in the guest needs to be processed by userspace.
+
+Userspace can tell KVM to inject a #GP into the guest by setting the
+'inject_gp' flag. Setting the flag to 1 tells KVM to inject a #GP into the
+guest. Setting the flag to 0 tells KVM not to inject a #GP into the guest.
+
+The MSR being processed is indicated by 'index'. If a read is being
+processed, the 'data' field is expected to be filled out by userspace (as an
+out parameter). If a write is being processed, the 'data' field will contain
+the updated value of the MSR (as an in parameter).
+
 ::
 
 		/* Fix the size of the union. */
diff --git a/arch/x86/kvm/trace.h b/arch/x86/kvm/trace.h
index b66432b015d2..d03143ebd6f0 100644
--- a/arch/x86/kvm/trace.h
+++ b/arch/x86/kvm/trace.h
@@ -367,6 +367,30 @@ TRACE_EVENT(kvm_msr,
 #define trace_kvm_msr_read_ex(ecx) trace_kvm_msr(0, ecx, 0, true)
 #define trace_kvm_msr_write_ex(ecx, data) trace_kvm_msr(1, ecx, data, true)
 
+TRACE_EVENT(kvm_userspace_msr,
+	TP_PROTO(bool is_write, u8 inject_gp, u32 index, u64 data),
+	TP_ARGS(is_write, inject_gp, index, data),
+
+	TP_STRUCT__entry(
+		__field(bool, is_write)
+		__field(u8, inject_gp)
+		__field(u32, index)
+		__field(u64, data)
+	),
+
+	TP_fast_assign(
+		__entry->is_write = is_write;
+		__entry->inject_gp = inject_gp;
+		__entry->index = index;
+		__entry->data = data;
+	),
+
+	TP_printk("userspace %s %x = 0x%llx, %s",
+		  __entry->is_write ? "wrmsr" : "rdmsr",
+		  __entry->index, __entry->data,
+		  __entry->inject_gp ? "inject_gp" : "no_gp")
+);
+
 /*
  * Tracepoint for guest CR access.
  */
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 46a0fb9e0869..47619b49818a 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -108,6 +108,8 @@ static void __kvm_set_rflags(struct kvm_vcpu *vcpu, unsigned long rflags);
 static void store_regs(struct kvm_vcpu *vcpu);
 static int sync_regs(struct kvm_vcpu *vcpu);
 
+bool kvm_msr_user_exit(struct kvm *kvm, u32 index);
+
 struct kvm_x86_ops kvm_x86_ops __read_mostly;
 EXPORT_SYMBOL_GPL(kvm_x86_ops);
 
@@ -1549,11 +1551,61 @@ int kvm_set_msr(struct kvm_vcpu *vcpu, u32 index, u64 data)
 }
 EXPORT_SYMBOL_GPL(kvm_set_msr);
 
+/*
+ * On success, returns 1 so that __vcpu_run() will happen next. On
+ * error, returns 0.
+ */
+static int complete_userspace_msr(struct kvm_vcpu *vcpu, bool is_write)
+{
+	u32 ecx = vcpu->run->msr.index;
+	u64 data = vcpu->run->msr.data;
+
+	trace_kvm_userspace_msr(is_write,
+				vcpu->run->msr.inject_gp,
+				vcpu->run->msr.index,
+				vcpu->run->msr.data);
+
+	if (vcpu->run->msr.inject_gp) {
+		trace_kvm_msr(is_write, ecx, data, true);
+		kvm_inject_gp(vcpu, 0);
+		return 1;
+	}
+
+	trace_kvm_msr(is_write, ecx, data, false);
+	if (!is_write) {
+		kvm_rax_write(vcpu, data & -1u);
+		kvm_rdx_write(vcpu, (data >> 32) & -1u);
+	}
+
+	return kvm_skip_emulated_instruction(vcpu);
+}
+
+static int complete_userspace_rdmsr(struct kvm_vcpu *vcpu)
+{
+	return complete_userspace_msr(vcpu, false);
+}
+
+static int complete_userspace_wrmsr(struct kvm_vcpu *vcpu)
+{
+	return complete_userspace_msr(vcpu, true);
+}
+
 int kvm_emulate_rdmsr(struct kvm_vcpu *vcpu)
 {
 	u32 ecx = kvm_rcx_read(vcpu);
 	u64 data;
 
+	if (kvm_msr_user_exit(vcpu->kvm, ecx)) {
+		vcpu->run->exit_reason = KVM_EXIT_X86_RDMSR;
+		vcpu->run->msr.index = ecx;
+		vcpu->run->msr.data = 0;
+		vcpu->run->msr.inject_gp = 0;
+		memset(vcpu->run->msr.pad, 0, sizeof(vcpu->run->msr.pad));
+		vcpu->arch.complete_userspace_io =
+			complete_userspace_rdmsr;
+		return 0;
+	}
+
 	if (kvm_get_msr(vcpu, ecx, &data)) {
 		trace_kvm_msr_read_ex(ecx);
 		kvm_inject_gp(vcpu, 0);
@@ -1573,6 +1625,17 @@ int kvm_emulate_wrmsr(struct kvm_vcpu *vcpu)
 	u32 ecx = kvm_rcx_read(vcpu);
 	u64 data = kvm_read_edx_eax(vcpu);
 
+	if (kvm_msr_user_exit(vcpu->kvm, ecx)) {
+		vcpu->run->exit_reason = KVM_EXIT_X86_WRMSR;
+		vcpu->run->msr.index = ecx;
+		vcpu->run->msr.data = data;
+		vcpu->run->msr.inject_gp = 0;
+		memset(vcpu->run->msr.pad, 0, sizeof(vcpu->run->msr.pad));
+		vcpu->arch.complete_userspace_io =
+			complete_userspace_wrmsr;
+		return 0;
+	}
+
 	if (kvm_set_msr(vcpu, ecx, data)) {
 		trace_kvm_msr_write_ex(ecx, data);
 		kvm_inject_gp(vcpu, 0);
@@ -3455,6 +3518,25 @@ static int kvm_vm_ioctl_set_exit_msrs(struct kvm *kvm,
 	return 0;
 }
 
+bool kvm_msr_user_exit(struct kvm *kvm, u32 index)
+{
+	struct kvm_msr_list *exit_msrs;
+	int i;
+
+	exit_msrs = kvm->arch.user_exit_msrs;
+
+	if (!exit_msrs)
+		return false;
+
+	for (i = 0; i < exit_msrs->nmsrs; ++i) {
+		if (exit_msrs->indices[i] == index)
+			return true;
+	}
+
+	return false;
+}
+EXPORT_SYMBOL_GPL(kvm_msr_user_exit);
+
 int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
 {
 	int r = 0;
@@ -10762,3 +10844,4 @@ EXPORT_TRACEPOINT_SYMBOL_GPL(kvm_avic_unaccelerated_access);
 EXPORT_TRACEPOINT_SYMBOL_GPL(kvm_avic_incomplete_ipi);
 EXPORT_TRACEPOINT_SYMBOL_GPL(kvm_avic_ga_log);
 EXPORT_TRACEPOINT_SYMBOL_GPL(kvm_apicv_update_request);
+EXPORT_TRACEPOINT_SYMBOL_GPL(kvm_userspace_msr);
diff --git a/include/trace/events/kvm.h b/include/trace/events/kvm.h
index 2c735a3e6613..19f33a704174 100644
--- a/include/trace/events/kvm.h
+++ b/include/trace/events/kvm.h
@@ -17,7 +17,7 @@
 	ERSN(NMI), ERSN(INTERNAL_ERROR), ERSN(OSI), ERSN(PAPR_HCALL),	\
 	ERSN(S390_UCONTROL), ERSN(WATCHDOG), ERSN(S390_TSCH), ERSN(EPR),\
 	ERSN(SYSTEM_EVENT), ERSN(S390_STSI), ERSN(IOAPIC_EOI),		\
-	ERSN(HYPERV)
+	ERSN(HYPERV), ERSN(X86_RDMSR), ERSN(X86_WRMSR)
 
 TRACE_EVENT(kvm_userspace_exit,
 	    TP_PROTO(__u32 reason, int errno),
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index de4638c1bd15..2b7d21e6338c 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -248,6 +248,8 @@ struct kvm_hyperv_exit {
 #define KVM_EXIT_IOAPIC_EOI       26
 #define KVM_EXIT_HYPERV           27
 #define KVM_EXIT_ARM_NISV         28
+#define KVM_EXIT_X86_RDMSR        29
+#define KVM_EXIT_X86_WRMSR        30
 
 /* For KVM_EXIT_INTERNAL_ERROR */
 /* Emulate instruction failed. */
@@ -412,6 +414,14 @@ struct kvm_run {
 			__u64 esr_iss;
 			__u64 fault_ipa;
 		} arm_nisv;
+		/* KVM_EXIT_X86_RDMSR */
+		/* KVM_EXIT_X86_WRMSR */
+		struct {
+			__u8 inject_gp;
+			__u8 pad[3];
+			__u32 index;
+			__u64 data;
+		} msr;
 		/* Fix the size of the union. */
 		char padding[256];
 	};
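The userspace half of the new exit is then a small addition to the VMM's run
loop; a sketch, assuming 'run' is the vCPU's mmap'ed struct kvm_run and
emulate_rdmsr()/emulate_wrmsr() are hypothetical VMM helpers that return
non-zero on failure:

	switch (run->exit_reason) {
	case KVM_EXIT_X86_RDMSR:
		/* Fill 'data' for the guest, or ask KVM for a #GP. */
		if (emulate_rdmsr(run->msr.index, &run->msr.data))
			run->msr.inject_gp = 1;
		break;
	case KVM_EXIT_X86_WRMSR:
		/* 'data' holds the value the guest tried to write. */
		if (emulate_wrmsr(run->msr.index, run->msr.data))
			run->msr.inject_gp = 1;
		break;
	}
	/* Either way, the guest state is committed on the next KVM_RUN. */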
From patchwork Tue Aug  4 04:20:40 2020
From: Aaron Lewis
To: jmattson@google.com, graf@amazon.com
Cc: pshier@google.com, oupton@google.com, kvm@vger.kernel.org
Date: Mon,  3 Aug 2020 21:20:40 -0700
Message-Id: <20200804042043.3592620-4-aaronlewis@google.com>
In-Reply-To: <20200804042043.3592620-1-aaronlewis@google.com>
Subject: [PATCH 3/6] KVM: x86: Prepare MSR bitmaps for userspace tracked MSRs

Prepare vmx and svm for a subsequent change that ensures the MSR bitmap
allows a vm exit for all MSRs that userspace is tracking.

Signed-off-by: Aaron Lewis
Reviewed-by: Oliver Upton
---
 arch/x86/kvm/svm/svm.c    | 48 +++++++++++-----------
 arch/x86/kvm/vmx/nested.c |  2 +-
 arch/x86/kvm/vmx/vmx.c    | 83 +++++++++++++++++++--------------------
 arch/x86/kvm/vmx/vmx.h    |  2 +-
 4 files changed, 67 insertions(+), 68 deletions(-)
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index c0da4dd78ac5..eb673b59f7b7 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -564,7 +564,7 @@ static bool valid_msr_intercept(u32 index)
 	return false;
 }
 
-static bool msr_write_intercepted(struct kvm_vcpu *vcpu, unsigned msr)
+static bool msr_write_intercepted(struct kvm_vcpu *vcpu, u32 msr)
 {
 	u8 bit_write;
 	unsigned long tmp;
@@ -583,9 +583,11 @@ static bool msr_write_intercepted(struct kvm_vcpu *vcpu, unsigned msr)
 	return !!test_bit(bit_write, &tmp);
 }
 
-static void set_msr_interception(u32 *msrpm, unsigned msr,
-				 int read, int write)
+static void set_msr_interception(struct kvm_vcpu *vcpu, u32 msr, int read,
+				 int write)
 {
+	struct vcpu_svm *svm = to_svm(vcpu);
+	u32 *msrpm = svm->msrpm;
 	u8 bit_read, bit_write;
 	unsigned long tmp;
 	u32 offset;
@@ -609,7 +611,7 @@ static void set_msr_interception(u32 *msrpm, unsigned msr,
 	msrpm[offset] = tmp;
 }
 
-static void svm_vcpu_init_msrpm(u32 *msrpm)
+static void svm_vcpu_init_msrpm(struct kvm_vcpu *vcpu, u32 *msrpm)
 {
 	int i;
 
@@ -619,7 +621,7 @@ static void svm_vcpu_init_msrpm(u32 *msrpm)
 		if (!direct_access_msrs[i].always)
 			continue;
 
-		set_msr_interception(msrpm, direct_access_msrs[i].index, 1, 1);
+		set_msr_interception(vcpu, direct_access_msrs[i].index, 1, 1);
 	}
 }
 
@@ -666,26 +668,26 @@ static void init_msrpm_offsets(void)
 	}
 }
 
-static void svm_enable_lbrv(struct vcpu_svm *svm)
+static void svm_enable_lbrv(struct kvm_vcpu *vcpu)
 {
-	u32 *msrpm = svm->msrpm;
+	struct vcpu_svm *svm = to_svm(vcpu);
 
 	svm->vmcb->control.virt_ext |= LBR_CTL_ENABLE_MASK;
-	set_msr_interception(msrpm, MSR_IA32_LASTBRANCHFROMIP, 1, 1);
-	set_msr_interception(msrpm, MSR_IA32_LASTBRANCHTOIP, 1, 1);
-	set_msr_interception(msrpm, MSR_IA32_LASTINTFROMIP, 1, 1);
-	set_msr_interception(msrpm, MSR_IA32_LASTINTTOIP, 1, 1);
+	set_msr_interception(vcpu, MSR_IA32_LASTBRANCHFROMIP, 1, 1);
+	set_msr_interception(vcpu, MSR_IA32_LASTBRANCHTOIP, 1, 1);
+	set_msr_interception(vcpu, MSR_IA32_LASTINTFROMIP, 1, 1);
+	set_msr_interception(vcpu, MSR_IA32_LASTINTTOIP, 1, 1);
 }
 
-static void svm_disable_lbrv(struct vcpu_svm *svm)
+static void svm_disable_lbrv(struct kvm_vcpu *vcpu)
 {
-	u32 *msrpm = svm->msrpm;
+	struct vcpu_svm *svm = to_svm(vcpu);
 
 	svm->vmcb->control.virt_ext &= ~LBR_CTL_ENABLE_MASK;
-	set_msr_interception(msrpm, MSR_IA32_LASTBRANCHFROMIP, 0, 0);
-	set_msr_interception(msrpm, MSR_IA32_LASTBRANCHTOIP, 0, 0);
-	set_msr_interception(msrpm, MSR_IA32_LASTINTFROMIP, 0, 0);
-	set_msr_interception(msrpm, MSR_IA32_LASTINTTOIP, 0, 0);
+	set_msr_interception(vcpu, MSR_IA32_LASTBRANCHFROMIP, 0, 0);
+	set_msr_interception(vcpu, MSR_IA32_LASTBRANCHTOIP, 0, 0);
+	set_msr_interception(vcpu, MSR_IA32_LASTINTFROMIP, 0, 0);
+	set_msr_interception(vcpu, MSR_IA32_LASTINTTOIP, 0, 0);
 }
 
 void disable_nmi_singlestep(struct vcpu_svm *svm)
@@ -1196,10 +1198,10 @@ static int svm_create_vcpu(struct kvm_vcpu *vcpu)
 	clear_page(svm->nested.hsave);
 
 	svm->msrpm = page_address(msrpm_pages);
-	svm_vcpu_init_msrpm(svm->msrpm);
+	svm_vcpu_init_msrpm(vcpu, svm->msrpm);
 
 	svm->nested.msrpm = page_address(nested_msrpm_pages);
-	svm_vcpu_init_msrpm(svm->nested.msrpm);
+	svm_vcpu_init_msrpm(vcpu, svm->nested.msrpm);
 	svm->vmcb = page_address(page);
 	clear_page(svm->vmcb);
@@ -2540,7 +2542,7 @@ static int svm_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr)
 		 * We update the L1 MSR bit as well since it will end up
 		 * touching the MSR anyway now.
 		 */
-		set_msr_interception(svm->msrpm, MSR_IA32_SPEC_CTRL, 1, 1);
+		set_msr_interception(vcpu, MSR_IA32_SPEC_CTRL, 1, 1);
 		break;
 	case MSR_IA32_PRED_CMD:
 		if (!msr->host_initiated &&
@@ -2555,7 +2557,7 @@ static int svm_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr)
 			break;
 
 		wrmsrl(MSR_IA32_PRED_CMD, PRED_CMD_IBPB);
-		set_msr_interception(svm->msrpm, MSR_IA32_PRED_CMD, 0, 1);
+		set_msr_interception(vcpu, MSR_IA32_PRED_CMD, 0, 1);
 		break;
 	case MSR_AMD64_VIRT_SPEC_CTRL:
 		if (!msr->host_initiated &&
@@ -2619,9 +2621,9 @@ static int svm_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr)
 		svm->vmcb->save.dbgctl = data;
 		mark_dirty(svm->vmcb, VMCB_LBR);
 		if (data & (1ULL<<0))
-			svm_enable_lbrv(svm);
+			svm_enable_lbrv(vcpu);
 		else
-			svm_disable_lbrv(svm);
+			svm_disable_lbrv(vcpu);
 		break;
 	case MSR_VM_HSAVE_PA:
 		svm->nested.hsave_msr = data;
diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index d4a4cec034d0..9313814d9e91 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -4704,7 +4704,7 @@ static int enter_vmx_operation(struct kvm_vcpu *vcpu)
 
 	if (vmx_pt_mode_is_host_guest()) {
 		vmx->pt_desc.guest.ctl = 0;
-		pt_update_intercept_for_msr(vmx);
+		pt_update_intercept_for_msr(vcpu);
 	}
 
 	return 0;
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 13745f2a5ecd..1313e47a5a1e 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -342,7 +342,7 @@ module_param_cb(vmentry_l1d_flush, &vmentry_l1d_flush_ops, NULL, 0644);
 static bool guest_state_valid(struct kvm_vcpu *vcpu);
 static u32 vmx_segment_access_rights(struct kvm_segment *var);
-static __always_inline void vmx_disable_intercept_for_msr(unsigned long *msr_bitmap,
+static __always_inline void vmx_disable_intercept_for_msr(struct kvm_vcpu *vcpu,
 							  u32 msr, int type);
 
 void vmx_vmexit(void);
@@ -2081,7 +2081,7 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 		 * in the merging. We update the vmcs01 here for L1 as well
 		 * since it will end up touching the MSR anyway now.
 		 */
-		vmx_disable_intercept_for_msr(vmx->vmcs01.msr_bitmap,
+		vmx_disable_intercept_for_msr(vcpu,
 					      MSR_IA32_SPEC_CTRL,
 					      MSR_TYPE_RW);
 		break;
@@ -2117,8 +2117,7 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 		 * vmcs02.msr_bitmap here since it gets completely overwritten
 		 * in the merging.
 		 */
-		vmx_disable_intercept_for_msr(vmx->vmcs01.msr_bitmap, MSR_IA32_PRED_CMD,
-					      MSR_TYPE_W);
+		vmx_disable_intercept_for_msr(vcpu, MSR_IA32_PRED_CMD, MSR_TYPE_W);
 		break;
 	case MSR_IA32_CR_PAT:
 		if (!kvm_pat_valid(data))
@@ -2168,7 +2167,7 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 			return 1;
 		vmcs_write64(GUEST_IA32_RTIT_CTL, data);
 		vmx->pt_desc.guest.ctl = data;
-		pt_update_intercept_for_msr(vmx);
+		pt_update_intercept_for_msr(vcpu);
 		break;
 	case MSR_IA32_RTIT_STATUS:
 		if (!pt_can_write_msr(vmx))
@@ -3691,9 +3690,11 @@ void free_vpid(int vpid)
 	spin_unlock(&vmx_vpid_lock);
 }
 
-static __always_inline void vmx_disable_intercept_for_msr(unsigned long *msr_bitmap,
+static __always_inline void vmx_disable_intercept_for_msr(struct kvm_vcpu *vcpu,
 							  u32 msr, int type)
 {
+	struct vcpu_vmx *vmx = to_vmx(vcpu);
+	unsigned long *msr_bitmap = vmx->vmcs01.msr_bitmap;
 	int f = sizeof(unsigned long);
 
 	if (!cpu_has_vmx_msr_bitmap())
@@ -3729,9 +3730,11 @@ static __always_inline void vmx_disable_intercept_for_msr(struct kvm_vcpu *vcpu,
 	}
 }
 
-static __always_inline void vmx_enable_intercept_for_msr(unsigned long *msr_bitmap,
+static __always_inline void vmx_enable_intercept_for_msr(struct kvm_vcpu *vcpu,
 							 u32 msr, int type)
 {
+	struct vcpu_vmx *vmx = to_vmx(vcpu);
+	unsigned long *msr_bitmap = vmx->vmcs01.msr_bitmap;
 	int f = sizeof(unsigned long);
 
 	if (!cpu_has_vmx_msr_bitmap())
@@ -3767,13 +3770,13 @@ static __always_inline void vmx_enable_intercept_for_msr(struct kvm_vcpu *vcpu,
 	}
 }
 
-static __always_inline void vmx_set_intercept_for_msr(unsigned long *msr_bitmap,
-						      u32 msr, int type, bool value)
+static __always_inline void vmx_set_intercept_for_msr(struct kvm_vcpu *vcpu,
+						      u32 msr, int type, bool value)
 {
 	if (value)
-		vmx_enable_intercept_for_msr(msr_bitmap, msr, type);
+		vmx_enable_intercept_for_msr(vcpu, msr, type);
 	else
-		vmx_disable_intercept_for_msr(msr_bitmap, msr, type);
+		vmx_disable_intercept_for_msr(vcpu, msr, type);
 }
 
 static u8 vmx_msr_bitmap_mode(struct kvm_vcpu *vcpu)
@@ -3791,8 +3794,8 @@ static u8 vmx_msr_bitmap_mode(struct kvm_vcpu *vcpu)
 	return mode;
 }
 
-static void vmx_update_msr_bitmap_x2apic(unsigned long *msr_bitmap,
-					 u8 mode)
+static void vmx_update_msr_bitmap_x2apic(struct kvm_vcpu *vcpu,
+					 unsigned long *msr_bitmap, u8 mode)
 {
 	int msr;
 
@@ -3807,11 +3810,11 @@ static void vmx_update_msr_bitmap_x2apic(unsigned long *msr_bitmap,
 		 * TPR reads and writes can be virtualized even if virtual interrupt
 		 * delivery is not in use.
 		 */
-		vmx_disable_intercept_for_msr(msr_bitmap, X2APIC_MSR(APIC_TASKPRI), MSR_TYPE_RW);
+		vmx_disable_intercept_for_msr(vcpu, X2APIC_MSR(APIC_TASKPRI), MSR_TYPE_RW);
 		if (mode & MSR_BITMAP_MODE_X2APIC_APICV) {
-			vmx_enable_intercept_for_msr(msr_bitmap, X2APIC_MSR(APIC_TMCCT), MSR_TYPE_R);
-			vmx_disable_intercept_for_msr(msr_bitmap, X2APIC_MSR(APIC_EOI), MSR_TYPE_W);
-			vmx_disable_intercept_for_msr(msr_bitmap, X2APIC_MSR(APIC_SELF_IPI), MSR_TYPE_W);
+			vmx_enable_intercept_for_msr(vcpu, X2APIC_MSR(APIC_TMCCT), MSR_TYPE_RW);
+			vmx_disable_intercept_for_msr(vcpu, X2APIC_MSR(APIC_EOI), MSR_TYPE_W);
+			vmx_disable_intercept_for_msr(vcpu, X2APIC_MSR(APIC_SELF_IPI), MSR_TYPE_W);
 		}
 	}
 }
@@ -3827,30 +3830,24 @@ void vmx_update_msr_bitmap(struct kvm_vcpu *vcpu)
 		return;
 
 	if (changed & (MSR_BITMAP_MODE_X2APIC | MSR_BITMAP_MODE_X2APIC_APICV))
-		vmx_update_msr_bitmap_x2apic(msr_bitmap, mode);
+		vmx_update_msr_bitmap_x2apic(vcpu, msr_bitmap, mode);
 
 	vmx->msr_bitmap_mode = mode;
 }
 
-void pt_update_intercept_for_msr(struct vcpu_vmx *vmx)
+void pt_update_intercept_for_msr(struct kvm_vcpu *vcpu)
 {
-	unsigned long *msr_bitmap = vmx->vmcs01.msr_bitmap;
+	struct vcpu_vmx *vmx = to_vmx(vcpu);
 	bool flag = !(vmx->pt_desc.guest.ctl & RTIT_CTL_TRACEEN);
 	u32 i;
 
-	vmx_set_intercept_for_msr(msr_bitmap, MSR_IA32_RTIT_STATUS,
-				  MSR_TYPE_RW, flag);
-	vmx_set_intercept_for_msr(msr_bitmap, MSR_IA32_RTIT_OUTPUT_BASE,
-				  MSR_TYPE_RW, flag);
-	vmx_set_intercept_for_msr(msr_bitmap, MSR_IA32_RTIT_OUTPUT_MASK,
-				  MSR_TYPE_RW, flag);
-	vmx_set_intercept_for_msr(msr_bitmap, MSR_IA32_RTIT_CR3_MATCH,
-				  MSR_TYPE_RW, flag);
+	vmx_set_intercept_for_msr(vcpu, MSR_IA32_RTIT_STATUS, MSR_TYPE_RW, flag);
+	vmx_set_intercept_for_msr(vcpu, MSR_IA32_RTIT_OUTPUT_BASE, MSR_TYPE_RW, flag);
+	vmx_set_intercept_for_msr(vcpu, MSR_IA32_RTIT_OUTPUT_MASK, MSR_TYPE_RW, flag);
+	vmx_set_intercept_for_msr(vcpu, MSR_IA32_RTIT_CR3_MATCH, MSR_TYPE_RW, flag);
 	for (i = 0; i < vmx->pt_desc.addr_range; i++) {
-		vmx_set_intercept_for_msr(msr_bitmap,
-			MSR_IA32_RTIT_ADDR0_A + i * 2, MSR_TYPE_RW, flag);
-		vmx_set_intercept_for_msr(msr_bitmap,
-			MSR_IA32_RTIT_ADDR0_B + i * 2, MSR_TYPE_RW, flag);
+		vmx_set_intercept_for_msr(vcpu, MSR_IA32_RTIT_ADDR0_A + i * 2, MSR_TYPE_RW, flag);
+		vmx_set_intercept_for_msr(vcpu, MSR_IA32_RTIT_ADDR0_B + i * 2, MSR_TYPE_RW, flag);
 	}
 }
 
@@ -6905,18 +6902,18 @@ static int vmx_create_vcpu(struct kvm_vcpu *vcpu)
 		goto free_pml;
 
 	msr_bitmap = vmx->vmcs01.msr_bitmap;
-	vmx_disable_intercept_for_msr(msr_bitmap, MSR_IA32_TSC, MSR_TYPE_R);
-	vmx_disable_intercept_for_msr(msr_bitmap, MSR_FS_BASE, MSR_TYPE_RW);
-	vmx_disable_intercept_for_msr(msr_bitmap, MSR_GS_BASE, MSR_TYPE_RW);
-	vmx_disable_intercept_for_msr(msr_bitmap, MSR_KERNEL_GS_BASE, MSR_TYPE_RW);
-	vmx_disable_intercept_for_msr(msr_bitmap, MSR_IA32_SYSENTER_CS, MSR_TYPE_RW);
-	vmx_disable_intercept_for_msr(msr_bitmap, MSR_IA32_SYSENTER_ESP, MSR_TYPE_RW);
-	vmx_disable_intercept_for_msr(msr_bitmap, MSR_IA32_SYSENTER_EIP, MSR_TYPE_RW);
+	vmx_disable_intercept_for_msr(vcpu, MSR_IA32_TSC, MSR_TYPE_R);
+	vmx_disable_intercept_for_msr(vcpu, MSR_FS_BASE, MSR_TYPE_RW);
+	vmx_disable_intercept_for_msr(vcpu, MSR_GS_BASE, MSR_TYPE_RW);
+	vmx_disable_intercept_for_msr(vcpu, MSR_KERNEL_GS_BASE, MSR_TYPE_RW);
+	vmx_disable_intercept_for_msr(vcpu, MSR_IA32_SYSENTER_CS, MSR_TYPE_RW);
+	vmx_disable_intercept_for_msr(vcpu, MSR_IA32_SYSENTER_ESP, MSR_TYPE_RW);
+	vmx_disable_intercept_for_msr(vcpu, MSR_IA32_SYSENTER_EIP, MSR_TYPE_RW);
 	if (kvm_cstate_in_guest(vcpu->kvm)) {
-		vmx_disable_intercept_for_msr(msr_bitmap, MSR_CORE_C1_RES, MSR_TYPE_R);
-		vmx_disable_intercept_for_msr(msr_bitmap, MSR_CORE_C3_RESIDENCY, MSR_TYPE_R);
-		vmx_disable_intercept_for_msr(msr_bitmap, MSR_CORE_C6_RESIDENCY, MSR_TYPE_R);
-		vmx_disable_intercept_for_msr(msr_bitmap, MSR_CORE_C7_RESIDENCY, MSR_TYPE_R);
+		vmx_disable_intercept_for_msr(vcpu, MSR_CORE_C1_RES, MSR_TYPE_R);
+		vmx_disable_intercept_for_msr(vcpu, MSR_CORE_C3_RESIDENCY, MSR_TYPE_R);
+		vmx_disable_intercept_for_msr(vcpu, MSR_CORE_C6_RESIDENCY, MSR_TYPE_R);
+		vmx_disable_intercept_for_msr(vcpu, MSR_CORE_C7_RESIDENCY, MSR_TYPE_R);
 	}
 	vmx->msr_bitmap_mode = 0;
diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
index 639798e4a6ca..b3c74f0fe8a1 100644
--- a/arch/x86/kvm/vmx/vmx.h
+++ b/arch/x86/kvm/vmx/vmx.h
@@ -350,7 +350,7 @@ bool vmx_get_nmi_mask(struct kvm_vcpu *vcpu);
 void vmx_set_nmi_mask(struct kvm_vcpu *vcpu, bool masked);
 void vmx_set_virtual_apic_mode(struct kvm_vcpu *vcpu);
 struct shared_msr_entry *find_msr_entry(struct vcpu_vmx *vmx, u32 msr);
-void pt_update_intercept_for_msr(struct vcpu_vmx *vmx);
+void pt_update_intercept_for_msr(struct kvm_vcpu *vcpu);
 void vmx_update_host_rsp(struct vcpu_vmx *vmx, unsigned long host_rsp);
 int vmx_find_msr_index(struct vmx_msrs *m, u32 msr);
 int vmx_handle_memory_failure(struct kvm_vcpu *vcpu, int r,
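As a reminder of the layout the SVM code above indexes (APM vol. 2, 15.11),
each MSR owns two consecutive bits in the 8 KiB permission map, read then
write, sixteen MSRs per u32; a sketch of the bit math used by
set_msr_interception(), with svm_msrpm_offset() left abstract:

	static void msrpm_bit_positions(u32 msr, u8 *bit_read, u8 *bit_write)
	{
		*bit_read  = 2 * (msr & 0x0f);		/* read-intercept bit */
		*bit_write = 2 * (msr & 0x0f) + 1;	/* write-intercept bit */
	}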
From patchwork Tue Aug  4 04:20:41 2020
From: Aaron Lewis
To: jmattson@google.com, graf@amazon.com
Cc: pshier@google.com, oupton@google.com, kvm@vger.kernel.org
Date: Mon,  3 Aug 2020 21:20:41 -0700
Message-Id: <20200804042043.3592620-5-aaronlewis@google.com>
In-Reply-To: <20200804042043.3592620-1-aaronlewis@google.com>
Subject: [PATCH 4/6] KVM: x86: Ensure the MSR bitmap never clears userspace tracked MSRs

SDM volume 3: 24.6.9 "MSR-Bitmap Address" and APM volume 2: 15.11 "MSR
intercepts" describe MSR permission bitmaps. Permission bitmaps are used
to control whether an execution of rdmsr or wrmsr will cause a vm exit.
Userspace tracked MSRs must cause a vm exit, so that the host is able to
forward the MSR to userspace. This change adds vmx/svm support to ensure
the permission bitmap is properly set to cause a vm exit to the host
when rdmsr or wrmsr is used on one of the userspace tracked MSRs. Also,
to avoid repeatedly setting them, kvm_make_request() is used to coalesce
these updates into a single call.

Signed-off-by: Aaron Lewis
Reviewed-by: Oliver Upton
---
 arch/x86/include/asm/kvm_host.h |  3 ++
 arch/x86/kvm/svm/svm.c          | 49 ++++++++++++++++++++++++++-------
 arch/x86/kvm/vmx/vmx.c          | 13 ++++++++-
 arch/x86/kvm/x86.c              | 16 +++++++++++
 4 files changed, 70 insertions(+), 11 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 510055471dd0..07a85f5f0b8a 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -87,6 +87,7 @@
 #define KVM_REQ_HV_TLB_FLUSH \
 	KVM_ARCH_REQ_FLAGS(27, KVM_REQUEST_NO_WAKEUP)
 #define KVM_REQ_APF_READY		KVM_ARCH_REQ(28)
+#define KVM_REQ_USER_MSR_UPDATE		KVM_ARCH_REQ(29)
 
 #define CR0_RESERVED_BITS                                               \
 	(~(unsigned long)(X86_CR0_PE | X86_CR0_MP | X86_CR0_EM | X86_CR0_TS \
@@ -1271,6 +1272,8 @@ struct kvm_x86_ops {
 	int (*enable_direct_tlbflush)(struct kvm_vcpu *vcpu);
 
 	void (*migrate_timers)(struct kvm_vcpu *vcpu);
+
+	void (*set_user_msr_intercept)(struct kvm_vcpu *vcpu, u32 msr);
 };
 
 struct kvm_x86_nested_ops {
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index eb673b59f7b7..c560d283b2af 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -583,13 +583,27 @@ static bool msr_write_intercepted(struct kvm_vcpu *vcpu, u32 msr)
 	return !!test_bit(bit_write, &tmp);
 }
 
+static void __set_msr_interception(u32 *msrpm, u32 msr, int read, int write,
+				   u32 offset)
+{
+	u8 bit_read, bit_write;
+	unsigned long tmp;
+
+	bit_read = 2 * (msr & 0x0f);
+	bit_write = 2 * (msr & 0x0f) + 1;
+	tmp = msrpm[offset];
+
+	read ? clear_bit(bit_read, &tmp) : set_bit(bit_read, &tmp);
+	write ? clear_bit(bit_write, &tmp) : set_bit(bit_write, &tmp);
+
+	msrpm[offset] = tmp;
+}
+
 static void set_msr_interception(struct kvm_vcpu *vcpu, u32 msr, int read,
 				 int write)
 {
 	struct vcpu_svm *svm = to_svm(vcpu);
 	u32 *msrpm = svm->msrpm;
-	u8 bit_read, bit_write;
-	unsigned long tmp;
 	u32 offset;
 
 	/*
@@ -598,17 +612,30 @@ static void set_msr_interception(struct kvm_vcpu *vcpu, u32 msr, int read,
 	 */
 	WARN_ON(!valid_msr_intercept(msr));
 
-	offset = svm_msrpm_offset(msr);
-	bit_read = 2 * (msr & 0x0f);
-	bit_write = 2 * (msr & 0x0f) + 1;
-	tmp = msrpm[offset];
-
+	offset = svm_msrpm_offset(msr);
 	BUG_ON(offset == MSR_INVALID);
 
-	read ? clear_bit(bit_read, &tmp) : set_bit(bit_read, &tmp);
-	write ? clear_bit(bit_write, &tmp) : set_bit(bit_write, &tmp);
+	__set_msr_interception(msrpm, msr, read, write, offset);
 
-	msrpm[offset] = tmp;
+	if (read || write)
+		kvm_make_request(KVM_REQ_USER_MSR_UPDATE, vcpu);
+}
+
+static void set_user_msr_interception(struct kvm_vcpu *vcpu, u32 msr, int read,
+				      int write)
+{
+	struct vcpu_svm *svm = to_svm(vcpu);
+	u32 *msrpm = svm->msrpm;
+	u32 offset;
+
+	offset = svm_msrpm_offset(msr);
+	if (offset != MSR_INVALID)
+		__set_msr_interception(msrpm, msr, read, write, offset);
+}
+
+void svm_set_user_msr_intercept(struct kvm_vcpu *vcpu, u32 msr)
+{
+	set_user_msr_interception(vcpu, msr, 0, 0);
 }
 
 static void svm_vcpu_init_msrpm(struct kvm_vcpu *vcpu, u32 *msrpm)
@@ -4088,6 +4115,8 @@ static struct kvm_x86_ops svm_x86_ops __initdata = {
 	.need_emulation_on_page_fault = svm_need_emulation_on_page_fault,
 
 	.apic_init_signal_blocked = svm_apic_init_signal_blocked,
+
+	.set_user_msr_intercept = svm_set_user_msr_intercept,
 };
 
 static struct kvm_x86_init_ops svm_init_ops __initdata = {
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 1313e47a5a1e..3d3d9eaeca64 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -3728,6 +3728,10 @@ static __always_inline void vmx_disable_intercept_for_msr(struct kvm_vcpu *vcpu,
 			__clear_bit(msr, msr_bitmap + 0xc00 / f);
 
 	}
+
+	if (type & MSR_TYPE_R || type & MSR_TYPE_W) {
+		kvm_make_request(KVM_REQ_USER_MSR_UPDATE, vcpu);
+	}
 }
 
 static __always_inline void vmx_enable_intercept_for_msr(struct kvm_vcpu *vcpu,
@@ -3795,7 +3799,7 @@ static u8 vmx_msr_bitmap_mode(struct kvm_vcpu *vcpu)
 }
 
 static void vmx_update_msr_bitmap_x2apic(struct kvm_vcpu *vcpu,
-					 unsigned long *msr_bitmap, u8 mode)
+					 unsigned long *msr_bitmap, u8 mode)
 {
 	int msr;
 
@@ -3819,6 +3823,11 @@ static void vmx_update_msr_bitmap_x2apic(struct kvm_vcpu *vcpu,
 	}
 }
 
+void vmx_set_user_msr_intercept(struct kvm_vcpu *vcpu, u32 msr)
+{
+	vmx_enable_intercept_for_msr(vcpu, msr, MSR_TYPE_RW);
+}
+
 void vmx_update_msr_bitmap(struct kvm_vcpu *vcpu)
 {
 	struct vcpu_vmx *vmx = to_vmx(vcpu);
@@ -7965,6 +7974,8 @@ static struct kvm_x86_ops vmx_x86_ops __initdata = {
 	.need_emulation_on_page_fault = vmx_need_emulation_on_page_fault,
 	.apic_init_signal_blocked = vmx_apic_init_signal_blocked,
 	.migrate_timers = vmx_migrate_timers,
+
+	.set_user_msr_intercept = vmx_set_user_msr_intercept,
 };
 
 static __init int hardware_setup(void)
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 47619b49818a..45bf59f94d34 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -3537,6 +3537,19 @@ bool kvm_msr_user_exit(struct kvm *kvm, u32 index)
 }
 EXPORT_SYMBOL_GPL(kvm_msr_user_exit);
 
+static void kvm_set_user_msr_intercepts(struct kvm_vcpu *vcpu)
+{
+	struct kvm_msr_list *msr_list = vcpu->kvm->arch.user_exit_msrs;
+	u32 i, msr;
+
+	if (msr_list) {
+		for (i = 0; i < msr_list->nmsrs; i++) {
+			msr = msr_list->indices[i];
+			kvm_x86_ops.set_user_msr_intercept(vcpu, msr);
+		}
+	}
+}
+
 int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
 {
 	int r = 0;
@@ -8553,6 +8566,9 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
 			kvm_vcpu_update_apicv(vcpu);
 		if (kvm_check_request(KVM_REQ_APF_READY, vcpu))
 			kvm_check_async_pf_completion(vcpu);
+
+		if (kvm_check_request(KVM_REQ_USER_MSR_UPDATE, vcpu))
+			kvm_set_user_msr_intercepts(vcpu);
 	}
 
 	if (kvm_check_request(KVM_REQ_EVENT, vcpu) || req_int_win) {
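The net effect is the usual two-phase KVM request pattern: any path that
opens a hole in a permission bitmap queues KVM_REQ_USER_MSR_UPDATE, and the
intercepts for every tracked MSR are re-asserted at most once, just before
the next guest entry; schematically, using only functions from the diff
above:

	/* Producer: runs whenever vmx/svm clears an intercept bit. */
	kvm_make_request(KVM_REQ_USER_MSR_UPDATE, vcpu);

	/* Consumer: vcpu_enter_guest(), once per coalesced batch. */
	if (kvm_check_request(KVM_REQ_USER_MSR_UPDATE, vcpu))
		kvm_set_user_msr_intercepts(vcpu);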
From patchwork Tue Aug  4 04:20:42 2020
From: Aaron Lewis
To: jmattson@google.com, graf@amazon.com
Cc: pshier@google.com, oupton@google.com, kvm@vger.kernel.org
Date: Mon,  3 Aug 2020 21:20:42 -0700
Message-Id: <20200804042043.3592620-6-aaronlewis@google.com>
In-Reply-To: <20200804042043.3592620-1-aaronlewis@google.com>
Subject: [PATCH 5/6] selftests: kvm: Fix the segment descriptor layout to match the actual layout

Fix the layout of 'struct desc64' to match the layout described in the
SDM Vol 3, 3.4.5 "Segment Descriptors", Figure 3-8. The test added later
in this series relies on this and crashes if the layout is not correct.

Signed-off-by: Aaron Lewis
---
 tools/testing/selftests/kvm/include/x86_64/processor.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/testing/selftests/kvm/include/x86_64/processor.h b/tools/testing/selftests/kvm/include/x86_64/processor.h
index 82b7fe16a824..0a65e7bb5249 100644
--- a/tools/testing/selftests/kvm/include/x86_64/processor.h
+++ b/tools/testing/selftests/kvm/include/x86_64/processor.h
@@ -59,7 +59,7 @@ struct gpr64_regs {
 struct desc64 {
 	uint16_t limit0;
 	uint16_t base0;
-	unsigned base1:8, s:1, type:4, dpl:2, p:1;
+	unsigned base1:8, type:4, s:1, dpl:2, p:1;
 	unsigned limit1:4, avl:1, l:1, db:1, g:1, base2:8;
 	uint32_t base3;
 	uint32_t zero1;
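The corrected bitfield order packs to the 16-byte system-descriptor format
of SDM Figure 3-8; a compile-time sanity check a test could carry (the
assertion is illustrative, not part of the series):

	#include <assert.h>

	/* limit0(16) base0(16) | base1(8) type(4) s(1) dpl(2) p(1) |
	 * limit1(4) avl(1) l(1) db(1) g(1) base2(8) | base3(32) zero1(32)
	 */
	static_assert(sizeof(struct desc64) == 16, "desc64 must be 16 bytes");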
From patchwork Tue Aug  4 04:20:43 2020
From: Aaron Lewis
To: jmattson@google.com, graf@amazon.com
Cc: pshier@google.com, oupton@google.com, kvm@vger.kernel.org
Date: Mon,  3 Aug 2020 21:20:43 -0700
Message-Id: <20200804042043.3592620-7-aaronlewis@google.com>
In-Reply-To: <20200804042043.3592620-1-aaronlewis@google.com>
Subject: [PATCH 6/6] selftests: kvm: Add test to exercise userspace MSR list

Add a selftest to verify that, when the KVM_SET_EXIT_MSRS ioctl is
called with an MSR list, the guest exits to the host and then to
userspace whenever an MSR in that list is read from or written to.

This test uses 3 MSRs to exercise the new feature:
  1. MSR_IA32_XSS, an MSR the kernel knows about.
  2. MSR_IA32_FLUSH_CMD, an MSR the kernel does not know about.
  3. MSR_NON_EXISTENT, an MSR invented in this test for the purposes of
     passing a fake MSR from the guest to userspace, and having the
     guest be able to read from and write to it, with userspace handling
     it. KVM just acts as a pass-through.

Userspace is also able to inject a #GP. This is demonstrated when
MSR_IA32_XSS and MSR_IA32_FLUSH_CMD are misused in the test: when this
happens, a #GP is initiated in userspace to be thrown in the guest. To
be able to handle this, exception handling was added to selftests, and
a #GP exception handler is registered in the test. This allows the test
to inject a #GP and handle it gracefully.
Signed-off-by: Aaron Lewis
---
 tools/testing/selftests/kvm/.gitignore        |   1 +
 tools/testing/selftests/kvm/Makefile          |  20 +-
 .../selftests/kvm/include/x86_64/processor.h  |  27 ++
 tools/testing/selftests/kvm/lib/kvm_util.c    |  17 ++
 .../selftests/kvm/lib/kvm_util_internal.h     |   2 +
 .../selftests/kvm/lib/x86_64/handlers.S       |  83 ++++++
 .../selftests/kvm/lib/x86_64/processor.c      | 168 ++++++++++-
 .../testing/selftests/kvm/lib/x86_64/ucall.c  |   3 +
 .../selftests/kvm/x86_64/userspace_msr_exit.c | 271 ++++++++++++++++++
 9 files changed, 582 insertions(+), 10 deletions(-)
 create mode 100644 tools/testing/selftests/kvm/lib/x86_64/handlers.S
 create mode 100644 tools/testing/selftests/kvm/x86_64/userspace_msr_exit.c

diff --git a/tools/testing/selftests/kvm/.gitignore b/tools/testing/selftests/kvm/.gitignore
index 452787152748..33619f915857 100644
--- a/tools/testing/selftests/kvm/.gitignore
+++ b/tools/testing/selftests/kvm/.gitignore
@@ -14,6 +14,7 @@
 /x86_64/vmx_preemption_timer_test
 /x86_64/svm_vmcall_test
 /x86_64/sync_regs_test
+/x86_64/userspace_msr_exit
 /x86_64/vmx_close_while_nested_test
 /x86_64/vmx_dirty_log_test
 /x86_64/vmx_set_nested_state_test
diff --git a/tools/testing/selftests/kvm/Makefile b/tools/testing/selftests/kvm/Makefile
index 4a166588d99f..66a6652ca305 100644
--- a/tools/testing/selftests/kvm/Makefile
+++ b/tools/testing/selftests/kvm/Makefile
@@ -34,7 +34,7 @@ ifeq ($(ARCH),s390)
 endif
 
 LIBKVM = lib/assert.c lib/elf.c lib/io.c lib/kvm_util.c lib/sparsebit.c lib/test_util.c
-LIBKVM_x86_64 = lib/x86_64/processor.c lib/x86_64/vmx.c lib/x86_64/svm.c lib/x86_64/ucall.c
+LIBKVM_x86_64 = lib/x86_64/processor.c lib/x86_64/vmx.c lib/x86_64/svm.c lib/x86_64/ucall.c lib/x86_64/handlers.S
 LIBKVM_aarch64 = lib/aarch64/processor.c lib/aarch64/ucall.c
 LIBKVM_s390x = lib/s390x/processor.c lib/s390x/ucall.c
 
@@ -49,6 +49,7 @@ TEST_GEN_PROGS_x86_64 += x86_64/state_test
 TEST_GEN_PROGS_x86_64 += x86_64/vmx_preemption_timer_test
 TEST_GEN_PROGS_x86_64 += x86_64/svm_vmcall_test
 TEST_GEN_PROGS_x86_64 += x86_64/sync_regs_test
+TEST_GEN_PROGS_x86_64 += x86_64/userspace_msr_exit
 TEST_GEN_PROGS_x86_64 += x86_64/vmx_close_while_nested_test
 TEST_GEN_PROGS_x86_64 += x86_64/vmx_dirty_log_test
 TEST_GEN_PROGS_x86_64 += x86_64/vmx_set_nested_state_test
@@ -108,14 +109,21 @@ LDFLAGS += -pthread $(no-pie-option) $(pgste-option)
 include ../lib.mk
 
 STATIC_LIBS := $(OUTPUT)/libkvm.a
-LIBKVM_OBJ := $(patsubst %.c, $(OUTPUT)/%.o, $(LIBKVM))
-EXTRA_CLEAN += $(LIBKVM_OBJ) $(STATIC_LIBS) cscope.*
+LIBKVM_C := $(filter %.c,$(LIBKVM))
+LIBKVM_S := $(filter %.S,$(LIBKVM))
+LIBKVM_C_OBJ := $(patsubst %.c, $(OUTPUT)/%.o, $(LIBKVM_C))
+LIBKVM_S_OBJ := $(patsubst %.S, $(OUTPUT)/%.o, $(LIBKVM_S))
+EXTRA_CLEAN += $(LIBKVM_C_OBJ) $(LIBKVM_S_OBJ) $(STATIC_LIBS) cscope.*
+
+x := $(shell mkdir -p $(sort $(dir $(LIBKVM_C_OBJ) $(LIBKVM_S_OBJ))))
+$(LIBKVM_C_OBJ): $(OUTPUT)/%.o: %.c
+	$(CC) $(CFLAGS) $(CPPFLAGS) $(TARGET_ARCH) -c $< -o $@
 
-x := $(shell mkdir -p $(sort $(dir $(LIBKVM_OBJ))))
-$(LIBKVM_OBJ): $(OUTPUT)/%.o: %.c
+$(LIBKVM_S_OBJ): $(OUTPUT)/%.o: %.S
 	$(CC) $(CFLAGS) $(CPPFLAGS) $(TARGET_ARCH) -c $< -o $@
 
-$(OUTPUT)/libkvm.a: $(LIBKVM_OBJ)
+LIBKVM_OBJS = $(LIBKVM_C_OBJ) $(LIBKVM_S_OBJ)
+$(OUTPUT)/libkvm.a: $(LIBKVM_OBJS)
 	$(AR) crs $@ $^
 
 x := $(shell mkdir -p $(sort $(dir $(TEST_GEN_PROGS))))
@@ -36,6 +36,8 @@
 #define X86_CR4_SMAP		(1ul << 21)
 #define X86_CR4_PKE		(1ul << 22)
 
+#define UNEXPECTED_VECTOR_PORT 0xfff0u
+
 /* General Registers in 64-Bit Mode */
 struct gpr64_regs {
 	u64 rax;
@@ -239,6 +241,11 @@ static inline struct desc_ptr get_idt(void)
 	return idt;
 }
 
+static inline void outl(uint16_t port, uint32_t value)
+{
+	__asm__ __volatile__("outl %%eax, %%dx" : : "d"(port), "a"(value));
+}
+
 #define SET_XMM(__var, __xmm) \
 	asm volatile("movq %0, %%"#__xmm : : "r"(__var) : #__xmm)
 
@@ -334,10 +341,30 @@ int _vcpu_set_msr(struct kvm_vm *vm, uint32_t vcpuid, uint64_t msr_index,
 void vcpu_set_msr(struct kvm_vm *vm, uint32_t vcpuid, uint64_t msr_index,
 		  uint64_t msr_value);
 
+void kvm_set_exit_msrs(struct kvm_vm *vm, uint32_t nmsrs,
+		       uint32_t msr_indices[]);
+
 uint32_t kvm_get_cpuid_max_basic(void);
 uint32_t kvm_get_cpuid_max_extended(void);
 void kvm_get_cpu_address_width(unsigned int *pa_bits, unsigned int *va_bits);
 
+struct ex_regs {
+	uint64_t rax, rcx, rdx, rbx;
+	uint64_t dummy, rbp, rsi, rdi;
+	uint64_t r8, r9, r10, r11;
+	uint64_t r12, r13, r14, r15;
+	uint64_t vector;
+	uint64_t error_code;
+	uint64_t rip;
+	uint64_t cs;
+	uint64_t rflags;
+};
+
+void vm_init_descriptor_tables(struct kvm_vm *vm);
+void vcpu_init_descriptor_tables(struct kvm_vm *vm, uint32_t vcpuid);
+void vm_handle_exception(struct kvm_vm *vm, int vector,
+			 void (*handler)(struct ex_regs *));
+
 /*
  * Basic CPU control in CR0
  */
diff --git a/tools/testing/selftests/kvm/lib/kvm_util.c b/tools/testing/selftests/kvm/lib/kvm_util.c
index 74776ee228f2..f8dde1cdbef0 100644
--- a/tools/testing/selftests/kvm/lib/kvm_util.c
+++ b/tools/testing/selftests/kvm/lib/kvm_util.c
@@ -1195,6 +1195,21 @@ int _vcpu_run(struct kvm_vm *vm, uint32_t vcpuid)
 	do {
 		rc = ioctl(vcpu->fd, KVM_RUN, NULL);
 	} while (rc == -1 && errno == EINTR);
+
+#ifdef __x86_64__
+	if (vcpu_state(vm, vcpuid)->exit_reason == KVM_EXIT_IO
+	    && vcpu_state(vm, vcpuid)->io.port == UNEXPECTED_VECTOR_PORT
+	    && vcpu_state(vm, vcpuid)->io.size == 4) {
+		/* Grab pointer to io data */
+		uint32_t *data = (void *)vcpu_state(vm, vcpuid)
+				 + vcpu_state(vm, vcpuid)->io.data_offset;
+
+		TEST_ASSERT(false,
+			    "Unexpected vectored event in guest (vector:0x%x)",
+			    *data);
+	}
+#endif
+
 	return rc;
 }
 
@@ -1590,6 +1605,8 @@ static struct exit_reason {
 	{KVM_EXIT_INTERNAL_ERROR, "INTERNAL_ERROR"},
 	{KVM_EXIT_OSI, "OSI"},
 	{KVM_EXIT_PAPR_HCALL, "PAPR_HCALL"},
+	{KVM_EXIT_X86_RDMSR, "RDMSR"},
+	{KVM_EXIT_X86_WRMSR, "WRMSR"},
 #ifdef KVM_EXIT_MEMORY_NOT_PRESENT
 	{KVM_EXIT_MEMORY_NOT_PRESENT, "MEMORY_NOT_PRESENT"},
 #endif
diff --git a/tools/testing/selftests/kvm/lib/kvm_util_internal.h b/tools/testing/selftests/kvm/lib/kvm_util_internal.h
index 2ef446520748..f07d383d03a1 100644
--- a/tools/testing/selftests/kvm/lib/kvm_util_internal.h
+++ b/tools/testing/selftests/kvm/lib/kvm_util_internal.h
@@ -50,6 +50,8 @@ struct kvm_vm {
 	vm_paddr_t pgd;
 	vm_vaddr_t gdt;
 	vm_vaddr_t tss;
+	vm_vaddr_t idt;
+	vm_vaddr_t handlers;
 };
 
 struct vcpu *vcpu_find(struct kvm_vm *vm, uint32_t vcpuid);
diff --git a/tools/testing/selftests/kvm/lib/x86_64/handlers.S b/tools/testing/selftests/kvm/lib/x86_64/handlers.S
new file mode 100644
index 000000000000..783d2c0fc741
--- /dev/null
+++ b/tools/testing/selftests/kvm/lib/x86_64/handlers.S
@@ -0,0 +1,83 @@
+handle_exception:
+	push %r15
+	push %r14
+	push %r13
+	push %r12
+	push %r11
+	push %r10
+	push %r9
+	push %r8
+
+	push %rdi
+	push %rsi
+	push %rbp
+	sub $8, %rsp
+	push %rbx
+	push %rdx
+	push %rcx
+	push %rax
+
+	mov %rsp, %rdi
+
+	call route_exception
+
+	pop %rax
+	pop %rcx
+	pop %rdx
+	pop %rbx
+	add $8, %rsp
+	pop %rbp
+	pop %rsi
+	pop %rdi
+	pop %r8
+	pop %r9
+	pop %r10
+	pop %r11
+	pop %r12
+	pop %r13
+	pop %r14
+	pop %r15
+
+	/* Discard vector and error code. */
+	add $16, %rsp
+	iretq
+
+/*
+ * Build the handle_exception wrappers which push the vector/error code on the
+ * stack and an array of pointers to those wrappers.
+ */
+.pushsection .rodata
+.globl idt_handlers
+idt_handlers:
+.popsection
+
+.macro HANDLERS has_error from to
+	vector = \from
+	.rept \to - \from + 1
+	.align 8
+
+	/* Fetch current address and append it to idt_handlers. */
+	current_handler = .
+.pushsection .rodata
+.quad current_handler
+.popsection
+
+	.if ! \has_error
+	pushq $0
+	.endif
+	pushq $vector
+	jmp handle_exception
+	vector = vector + 1
+	.endr
+.endm
+
+.global idt_handler_code
+idt_handler_code:
+	HANDLERS has_error=0 from=0 to=7
+	HANDLERS has_error=1 from=8 to=8
+	HANDLERS has_error=0 from=9 to=9
+	HANDLERS has_error=1 from=10 to=14
+	HANDLERS has_error=0 from=15 to=16
+	HANDLERS has_error=1 from=17 to=17
+	HANDLERS has_error=0 from=18 to=255
+
+.section .note.GNU-stack, "", %progbits
diff --git a/tools/testing/selftests/kvm/lib/x86_64/processor.c b/tools/testing/selftests/kvm/lib/x86_64/processor.c
index f6eb34eaa0d2..ff56753f205f 100644
--- a/tools/testing/selftests/kvm/lib/x86_64/processor.c
+++ b/tools/testing/selftests/kvm/lib/x86_64/processor.c
@@ -12,6 +12,13 @@
 #include "../kvm_util_internal.h"
 #include "processor.h"
 
+#ifndef NUM_INTERRUPTS
+#define NUM_INTERRUPTS 256
+#endif
+
+#define DEFAULT_CODE_SELECTOR 0x8
+#define DEFAULT_DATA_SELECTOR 0x10
+
 /* Minimum physical address used for virtual translation tables. */
 #define KVM_GUEST_PAGE_TABLE_MIN_PADDR 0x180000
 
@@ -392,11 +399,12 @@ static void kvm_seg_fill_gdt_64bit(struct kvm_vm *vm, struct kvm_segment *segp)
 	desc->limit0 = segp->limit & 0xFFFF;
 	desc->base0 = segp->base & 0xFFFF;
 	desc->base1 = segp->base >> 16;
-	desc->s = segp->s;
 	desc->type = segp->type;
+	desc->s = segp->s;
 	desc->dpl = segp->dpl;
 	desc->p = segp->present;
 	desc->limit1 = segp->limit >> 16;
+	desc->avl = segp->avl;
 	desc->l = segp->l;
 	desc->db = segp->db;
 	desc->g = segp->g;
@@ -556,9 +564,9 @@ static void vcpu_setup(struct kvm_vm *vm, int vcpuid, int pgd_memslot, int gdt_memslot)
 		sregs.efer |= (EFER_LME | EFER_LMA | EFER_NX);
 
 		kvm_seg_set_unusable(&sregs.ldt);
-		kvm_seg_set_kernel_code_64bit(vm, 0x8, &sregs.cs);
-		kvm_seg_set_kernel_data_64bit(vm, 0x10, &sregs.ds);
-		kvm_seg_set_kernel_data_64bit(vm, 0x10, &sregs.es);
+		kvm_seg_set_kernel_code_64bit(vm, DEFAULT_CODE_SELECTOR, &sregs.cs);
+		kvm_seg_set_kernel_data_64bit(vm, DEFAULT_DATA_SELECTOR, &sregs.ds);
+		kvm_seg_set_kernel_data_64bit(vm, DEFAULT_DATA_SELECTOR, &sregs.es);
 		kvm_setup_tss_64bit(vm, &sregs.tr, 0x18, gdt_memslot, pgd_memslot);
 		break;
 
@@ -843,6 +851,71 @@ void vcpu_set_msr(struct kvm_vm *vm, uint32_t vcpuid, uint64_t msr_index,
 		"	rc: %i errno: %i", r, errno);
 }
 
+/*
+ * _KVM Set Exit MSR
+ *
+ * Input Args:
+ *   vm - Virtual Machine
+ *   nmsrs - Number of msrs in msr_indices
+ *   msr_indices[] - List of msrs.
+ *
+ * Output Args: None
+ *
+ * Return: The result of KVM_SET_EXIT_MSRS.
+ *
+ * Sets a list of MSRs that will force an exit to userspace when
+ * any of them are read from or written to by the guest.
+ */
+int _kvm_set_exit_msrs(struct kvm_vm *vm, uint32_t nmsrs,
+		       uint32_t msr_indices[])
+{
+	const uint32_t max_nmsrs = 256;
+	struct kvm_msr_list *msr_list;
+	uint32_t i;
+	int r;
+
+	TEST_ASSERT(nmsrs <= max_nmsrs,
+		    "'nmsrs' is too large. Max is %u, currently %u.\n",
+		    max_nmsrs, nmsrs);
+	uint32_t msr_list_byte_size = sizeof(struct kvm_msr_list) +
+				      (sizeof(msr_list->indices[0]) * nmsrs);
+	msr_list = alloca(msr_list_byte_size);
+	memset(msr_list, 0, msr_list_byte_size);
+
+	msr_list->nmsrs = nmsrs;
+	for (i = 0; i < nmsrs; i++)
+		msr_list->indices[i] = msr_indices[i];
+
+	r = ioctl(vm->fd, KVM_SET_EXIT_MSRS, msr_list);
+
+	return r;
+}
+
+/*
+ * KVM Set Exit MSR
+ *
+ * Input Args:
+ *   vm - Virtual Machine
+ *   nmsrs - Number of msrs in msr_indices
+ *   msr_indices[] - List of msrs.
+ *
+ * Output Args: None
+ *
+ * Return: None
+ *
+ * Sets a list of MSRs that will force an exit to userspace when
+ * any of them are read from or written to by the guest.
+ */
+void kvm_set_exit_msrs(struct kvm_vm *vm, uint32_t nmsrs,
+		       uint32_t msr_indices[])
+{
+	int r;
+
+	r = _kvm_set_exit_msrs(vm, nmsrs, msr_indices);
+	TEST_ASSERT(r == 0, "KVM_SET_EXIT_MSRS IOCTL failed,\n"
+		    "	rc: %i errno: %i", r, errno);
+}
+
 void vcpu_args_set(struct kvm_vm *vm, uint32_t vcpuid, unsigned int num, ...)
 {
 	va_list ap;
@@ -1118,3 +1191,90 @@ void kvm_get_cpu_address_width(unsigned int *pa_bits, unsigned int *va_bits)
 		*va_bits = (entry->eax >> 8) & 0xff;
 	}
 }
+
+struct idt_entry {
+	uint16_t offset0;
+	uint16_t selector;
+	uint16_t ist : 3;
+	uint16_t : 5;
+	uint16_t type : 4;
+	uint16_t : 1;
+	uint16_t dpl : 2;
+	uint16_t p : 1;
+	uint16_t offset1;
+	uint32_t offset2;
+	uint32_t reserved;
+};
+
+static void set_idt_entry(struct kvm_vm *vm, int vector, unsigned long addr,
+			  int dpl, unsigned short selector)
+{
+	struct idt_entry *base =
+		(struct idt_entry *)addr_gva2hva(vm, vm->idt);
+	struct idt_entry *e = &base[vector];
+
+	memset(e, 0, sizeof(*e));
+	e->offset0 = addr;
+	e->selector = selector;
+	e->ist = 0;
+	e->type = 14;
+	e->dpl = dpl;
+	e->p = 1;
+	e->offset1 = addr >> 16;
+	e->offset2 = addr >> 32;
+}
+
+void kvm_exit_unexpected_vector(uint32_t value)
+{
+	outl(UNEXPECTED_VECTOR_PORT, value);
+}
+
+void route_exception(struct ex_regs *regs)
+{
+	typedef void(*handler)(struct ex_regs *);
+	handler *handlers;
+
+	handlers = (handler *)rdmsr(MSR_GS_BASE);
+
+	if (handlers[regs->vector]) {
+		handlers[regs->vector](regs);
+		return;
+	}
+
+	kvm_exit_unexpected_vector(regs->vector);
+}
+
+void vm_init_descriptor_tables(struct kvm_vm *vm)
+{
+	extern void *idt_handlers;
+	int i;
+
+	vm->idt = vm_vaddr_alloc(vm, getpagesize(), 0x2000, 0, 0);
+	vm->handlers = vm_vaddr_alloc(vm, 256 * sizeof(void *), 0x2000, 0, 0);
+	/* Handlers have the same address in both address spaces. */
+	for (i = 0; i < NUM_INTERRUPTS; i++)
+		set_idt_entry(vm, i, (unsigned long)(&idt_handlers)[i], 0,
+			      DEFAULT_CODE_SELECTOR);
+}
+
+void vcpu_init_descriptor_tables(struct kvm_vm *vm, uint32_t vcpuid)
+{
+	struct kvm_sregs sregs;
+
+	vcpu_sregs_get(vm, vcpuid, &sregs);
+	sregs.idt.base = vm->idt;
+	sregs.idt.limit = NUM_INTERRUPTS * sizeof(struct idt_entry) - 1;
+	sregs.gdt.base = vm->gdt;
+	sregs.gdt.limit = getpagesize() - 1;
+	/* Use GS Base to pass the pointer to the handlers to the guest. */
+	kvm_seg_set_kernel_data_64bit(NULL, DEFAULT_DATA_SELECTOR, &sregs.gs);
+	sregs.gs.base = (unsigned long) vm->handlers;
+	vcpu_sregs_set(vm, vcpuid, &sregs);
+}
+
+void vm_handle_exception(struct kvm_vm *vm, int vector,
+			 void (*handler)(struct ex_regs *))
+{
+	vm_vaddr_t *handlers = (vm_vaddr_t *)addr_gva2hva(vm, vm->handlers);
+
+	handlers[vector] = (vm_vaddr_t)handler;
+}
diff --git a/tools/testing/selftests/kvm/lib/x86_64/ucall.c b/tools/testing/selftests/kvm/lib/x86_64/ucall.c
index da4d89ad5419..a3489973e290 100644
--- a/tools/testing/selftests/kvm/lib/x86_64/ucall.c
+++ b/tools/testing/selftests/kvm/lib/x86_64/ucall.c
@@ -40,6 +40,9 @@ uint64_t get_ucall(struct kvm_vm *vm, uint32_t vcpu_id, struct ucall *uc)
 	struct kvm_run *run = vcpu_state(vm, vcpu_id);
 	struct ucall ucall = {};
 
+	if (uc)
+		memset(uc, 0, sizeof(*uc));
+
 	if (run->exit_reason == KVM_EXIT_IO && run->io.port == UCALL_PIO_PORT) {
 		struct kvm_regs regs;
 
diff --git a/tools/testing/selftests/kvm/x86_64/userspace_msr_exit.c b/tools/testing/selftests/kvm/x86_64/userspace_msr_exit.c
new file mode 100644
index 000000000000..6c7868dbce08
--- /dev/null
+++ b/tools/testing/selftests/kvm/x86_64/userspace_msr_exit.c
@@ -0,0 +1,271 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (C) 2020, Google LLC.
+ *
+ * Tests for exiting into userspace on registered MSRs
+ */
+
+#define _GNU_SOURCE /* for program_invocation_short_name */
+#include <sys/ioctl.h>
+
+#include "test_util.h"
+#include "kvm_util.h"
+#include "vmx.h"
+
+#define VCPU_ID 1
+
+#define MSR_NON_EXISTENT 0x474f4f00
+
+uint32_t msrs[] = {
+	/* Test an MSR the kernel knows about. */
+	MSR_IA32_XSS,
+	/* Test an MSR the kernel doesn't know about. */
+	MSR_IA32_FLUSH_CMD,
+	/* Test a fabricated MSR that no one knows about. */
+	MSR_NON_EXISTENT,
+};
+uint32_t nmsrs = ARRAY_SIZE(msrs);
+
+uint64_t msr_non_existent_data;
+int guest_exception_count;
+
+/*
+ * Note: Force test_rdmsr() to not be inlined to prevent the labels,
+ * rdmsr_start and rdmsr_end, from being defined multiple times.
+ */
+static noinline uint64_t test_rdmsr(uint32_t msr)
+{
+	uint32_t a, d;
+
+	guest_exception_count = 0;
+
+	__asm__ __volatile__("rdmsr_start: rdmsr; rdmsr_end:" :
+			"=a"(a), "=d"(d) : "c"(msr) : "memory");
+
+	return a | ((uint64_t) d << 32);
+}
+
+/*
+ * Note: Force test_wrmsr() to not be inlined to prevent the labels,
+ * wrmsr_start and wrmsr_end, from being defined multiple times.
+ */
+static noinline void test_wrmsr(uint32_t msr, uint64_t value)
+{
+	uint32_t a = value;
+	uint32_t d = value >> 32;
+
+	guest_exception_count = 0;
+
+	__asm__ __volatile__("wrmsr_start: wrmsr; wrmsr_end:" ::
+			"a"(a), "d"(d), "c"(msr) : "memory");
+}
+
+extern char rdmsr_start, rdmsr_end;
+extern char wrmsr_start, wrmsr_end;
+
+static void guest_code(void)
+{
+	uint64_t data;
+
+	/*
+	 * Test userspace intercepting rdmsr / wrmsr for MSR_IA32_XSS.
+	 *
+	 * A #GP is thrown if anything other than 0 is written to
+	 * MSR_IA32_XSS.
+	 */
+	data = test_rdmsr(MSR_IA32_XSS);
+	GUEST_ASSERT(data == 0);
+	GUEST_ASSERT(guest_exception_count == 0);
+
+	test_wrmsr(MSR_IA32_XSS, 0);
+	GUEST_ASSERT(guest_exception_count == 0);
+
+	test_wrmsr(MSR_IA32_XSS, 1);
+	GUEST_ASSERT(guest_exception_count == 1);
+
+	/*
+	 * Test userspace intercepting rdmsr / wrmsr for MSR_IA32_FLUSH_CMD.
+	 *
+	 * A #GP is thrown if MSR_IA32_FLUSH_CMD is read from, or if a
+	 * value other than 1 is written to it.
+	 */
+	test_rdmsr(MSR_IA32_FLUSH_CMD);
+	GUEST_ASSERT(guest_exception_count == 1);
+
+	test_wrmsr(MSR_IA32_FLUSH_CMD, 0);
+	GUEST_ASSERT(guest_exception_count == 1);
+
+	test_wrmsr(MSR_IA32_FLUSH_CMD, 1);
+	GUEST_ASSERT(guest_exception_count == 0);
+
+	/*
+	 * Test userspace intercepting rdmsr / wrmsr for MSR_NON_EXISTENT.
+	 *
+	 * Test that a fabricated MSR can pass through the kernel and be
+	 * handled in userspace.
+	 */
+	test_wrmsr(MSR_NON_EXISTENT, 2);
+	GUEST_ASSERT(guest_exception_count == 0);
+
+	data = test_rdmsr(MSR_NON_EXISTENT);
+	GUEST_ASSERT(data == 2);
+	GUEST_ASSERT(guest_exception_count == 0);
+
+	GUEST_DONE();
+}
+
+static void guest_gp_handler(struct ex_regs *regs)
+{
+	if (regs->rip == (uintptr_t)&rdmsr_start) {
+		regs->rip = (uintptr_t)&rdmsr_end;
+		regs->rax = 0;
+		regs->rdx = 0;
+	} else if (regs->rip == (uintptr_t)&wrmsr_start) {
+		regs->rip = (uintptr_t)&wrmsr_end;
+	} else {
+		GUEST_ASSERT(!"RIP is at an unknown location!");
+	}
+
+	++guest_exception_count;
+}
+
+static void run_guest(struct kvm_vm *vm)
+{
+	int rc;
+
+	rc = _vcpu_run(vm, VCPU_ID);
+	TEST_ASSERT(rc == 0, "vcpu_run failed: %d\n", rc);
+}
+
+static void check_for_guest_assert(struct kvm_vm *vm)
+{
+	struct kvm_run *run = vcpu_state(vm, VCPU_ID);
+	struct ucall uc;
+
+	if (run->exit_reason == KVM_EXIT_IO &&
+	    get_ucall(vm, VCPU_ID, &uc) == UCALL_ABORT) {
+		TEST_FAIL("%s at %s:%ld", (const char *)uc.args[0],
+			  __FILE__, uc.args[1]);
+	}
+}
+
+static void process_rdmsr(struct kvm_vm *vm, uint32_t msr_index)
+{
+	struct kvm_run *run = vcpu_state(vm, VCPU_ID);
+
+	check_for_guest_assert(vm);
+
+	TEST_ASSERT(run->exit_reason == KVM_EXIT_X86_RDMSR,
+		    "Unexpected exit reason: %u (%s),\n",
+		    run->exit_reason,
+		    exit_reason_str(run->exit_reason));
+	TEST_ASSERT(run->msr.index == msr_index,
+		    "Unexpected msr (0x%04x), expected 0x%04x",
+		    run->msr.index, msr_index);
+
+	switch (run->msr.index) {
+	case MSR_IA32_XSS:
+		run->msr.data = 0;
+		break;
+	case MSR_IA32_FLUSH_CMD:
+		run->msr.inject_gp = 1;
+		break;
+	case MSR_NON_EXISTENT:
+		run->msr.data = msr_non_existent_data;
+		break;
+	default:
+		TEST_ASSERT(false, "Unexpected MSR: 0x%04x", run->msr.index);
+	}
+}
+
+static void process_wrmsr(struct kvm_vm *vm, uint32_t msr_index)
+{
+	struct kvm_run *run = vcpu_state(vm, VCPU_ID);
+
+	check_for_guest_assert(vm);
+
+	TEST_ASSERT(run->exit_reason == KVM_EXIT_X86_WRMSR,
+		    "Unexpected exit reason: %u (%s),\n",
+		    run->exit_reason,
+		    exit_reason_str(run->exit_reason));
+	TEST_ASSERT(run->msr.index == msr_index,
+		    "Unexpected msr (0x%04x), expected 0x%04x",
+		    run->msr.index, msr_index);
+
+	switch (run->msr.index) {
+	case MSR_IA32_XSS:
+		if (run->msr.data != 0)
+			run->msr.inject_gp = 1;
+		break;
+	case MSR_IA32_FLUSH_CMD:
+		if (run->msr.data != 1)
+			run->msr.inject_gp = 1;
+		break;
+	case MSR_NON_EXISTENT:
+		msr_non_existent_data = run->msr.data;
+		break;
+	default:
+		TEST_ASSERT(false, "Unexpected MSR: 0x%04x", run->msr.index);
+	}
+}
+
+static void process_ucall_done(struct kvm_vm *vm)
+{
+	struct kvm_run *run = vcpu_state(vm, VCPU_ID);
+	struct ucall uc;
+
+	TEST_ASSERT(run->exit_reason == KVM_EXIT_IO,
+		    "Unexpected exit reason: %u (%s)",
+		    run->exit_reason,
+		    exit_reason_str(run->exit_reason));
+
+	TEST_ASSERT(get_ucall(vm, VCPU_ID, &uc) == UCALL_DONE,
+		    "Unexpected ucall command: %lu, expected UCALL_DONE (%d)",
+		    uc.cmd, UCALL_DONE);
+}
+
+int main(int argc, char *argv[])
+{
+	struct kvm_vm *vm;
+
+	vm = vm_create(VM_MODE_DEFAULT, DEFAULT_GUEST_PHY_PAGES, O_RDWR);
+	kvm_vm_elf_load(vm, program_invocation_name, 0, 0);
+	vm_create_irqchip(vm);
+
+	/* Register the MSRs that should exit to userspace. */
+	kvm_set_exit_msrs(vm, nmsrs, msrs);
+
+	vm_vcpu_add_default(vm, VCPU_ID, guest_code);
+
+	vm_init_descriptor_tables(vm);
+	vcpu_init_descriptor_tables(vm, VCPU_ID);
+
+	vm_handle_exception(vm, GP_VECTOR, guest_gp_handler);
+
+	/* Process guest code userspace exits. */
+	run_guest(vm);
+	process_rdmsr(vm, MSR_IA32_XSS);
+	run_guest(vm);
+	process_wrmsr(vm, MSR_IA32_XSS);
+	run_guest(vm);
+	process_wrmsr(vm, MSR_IA32_XSS);
+
+	run_guest(vm);
+	process_rdmsr(vm, MSR_IA32_FLUSH_CMD);
+	run_guest(vm);
+	process_wrmsr(vm, MSR_IA32_FLUSH_CMD);
+	run_guest(vm);
+	process_wrmsr(vm, MSR_IA32_FLUSH_CMD);
+
+	run_guest(vm);
+	process_wrmsr(vm, MSR_NON_EXISTENT);
+	run_guest(vm);
+	process_rdmsr(vm, MSR_NON_EXISTENT);
+
+	run_guest(vm);
+	process_ucall_done(vm);
+
+	kvm_vm_free(vm);
+	return 0;
+}
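
For reviewers: with this series applied, the new test should build and run
like any other KVM selftest, e.g.:

	$ make -C tools/testing/selftests/kvm
	$ ./tools/testing/selftests/kvm/x86_64/userspace_msr_exit

(paths per the Makefile and .gitignore hunks above).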