From patchwork Fri Apr 19 08:59:22 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Paolo Bonzini
X-Patchwork-Id: 13635961
From: Paolo Bonzini
To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org
Cc: isaku.yamahata@intel.com, xiaoyao.li@intel.com, binbin.wu@linux.intel.com, seanjc@google.com, rick.p.edgecombe@intel.com
Subject: [PATCH 1/6] KVM: Document KVM_PRE_FAULT_MEMORY ioctl
Date: Fri, 19 Apr 2024 04:59:22 -0400
Message-ID: <20240419085927.3648704-2-pbonzini@redhat.com>
In-Reply-To: <20240419085927.3648704-1-pbonzini@redhat.com>
References: <20240419085927.3648704-1-pbonzini@redhat.com>

From: Isaku Yamahata

Add documentation for the KVM_PRE_FAULT_MEMORY ioctl. [1]

The ioctl populates guest memory. It does not perform any additional,
technology-specific initialization [2]. For example, CoCo-related
operations won't be performed. Concretely for TDX, this API won't invoke
TDH.MEM.PAGE.ADD() or TDH.MR.EXTEND(). Vendor-specific APIs are required
for such operations.

The key point is to adopt a vcpu ioctl instead of a VM ioctl. First,
populating guest memory requires a vcpu; with a VM ioctl, KVM would need
to pick one vcpu somehow. Second, a vcpu ioctl allows each vcpu to invoke
the ioctl in parallel, which helps it scale with guest memory size, e.g.
hundreds of GB.

[1] https://lore.kernel.org/kvm/Zbrj5WKVgMsUFDtb@google.com/
[2] https://lore.kernel.org/kvm/Ze-TJh0BBOWm9spT@google.com/

Suggested-by: Sean Christopherson
Signed-off-by: Isaku Yamahata
Message-ID: <9a060293c9ad9a78f1d8994cfe1311e818e99257.1712785629.git.isaku.yamahata@intel.com>
Signed-off-by: Paolo Bonzini
---
 Documentation/virt/kvm/api.rst | 50 ++++++++++++++++++++++++++++++++++
 1 file changed, 50 insertions(+)

diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
index f0b76ff5030d..bbcaa5d2b54b 100644
--- a/Documentation/virt/kvm/api.rst
+++ b/Documentation/virt/kvm/api.rst
@@ -6352,6 +6352,56 @@ a single guest_memfd file, but the bound ranges must not overlap).
 
 See KVM_SET_USER_MEMORY_REGION2 for additional details.
 
+4.143 KVM_PRE_FAULT_MEMORY
+--------------------------
+
+:Capability: KVM_CAP_PRE_FAULT_MEMORY
+:Architectures: none
+:Type: vcpu ioctl
+:Parameters: struct kvm_pre_fault_memory (in/out)
+:Returns: 0 on success, < 0 on error
+
+Errors:
+
+  ========== ===============================================================
+  EINVAL     The specified `gpa` and `size` were invalid (e.g. not
+             page aligned).
+  ENOENT     The specified `gpa` is outside defined memslots.
+  EINTR      An unmasked signal is pending and no page was processed.
+  EFAULT     The parameter address was invalid.
+  EOPNOTSUPP Mapping memory for a GPA is unsupported by the
+             hypervisor, and/or for the current vCPU state/mode.
+  ========== ===============================================================
+
+::
+
+  struct kvm_pre_fault_memory {
+	/* in/out */
+	__u64 gpa;
+	__u64 size;
+	/* in */
+	__u64 flags;
+	__u64 padding[5];
+  };
+
+KVM_PRE_FAULT_MEMORY populates KVM's stage-2 page tables used to map memory
+for the current vCPU state.  KVM maps memory as if the vCPU generated a
+stage-2 read page fault, e.g. faults in memory as needed, but doesn't break
+CoW.  However, KVM does not mark any newly created stage-2 PTE as Accessed.
+
+In some cases, multiple vCPUs might share the page tables.  In this
+case, the ioctl can be called in parallel.
+
+Shadow page tables cannot support this ioctl because they
+are indexed by virtual address or nested guest physical address.
+Calling this ioctl when the guest is using shadow page tables (for
+example because it is running a nested guest with nested page tables)
+will fail with `EOPNOTSUPP` even if `KVM_CHECK_EXTENSION` reports
+the capability to be present.
+
+`flags` must currently be zero.
+
+
 5. The kvm_run structure
 ========================

From patchwork Fri Apr 19 08:59:23 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Paolo Bonzini
X-Patchwork-Id: 13635967
From: Paolo Bonzini
To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org
Cc: isaku.yamahata@intel.com, xiaoyao.li@intel.com, binbin.wu@linux.intel.com, seanjc@google.com, rick.p.edgecombe@intel.com
Subject: [PATCH 2/6] KVM: Add KVM_PRE_FAULT_MEMORY vcpu ioctl to pre-populate guest memory
Date: Fri, 19 Apr 2024 04:59:23 -0400
Message-ID: <20240419085927.3648704-3-pbonzini@redhat.com>
In-Reply-To: <20240419085927.3648704-1-pbonzini@redhat.com>
References: <20240419085927.3648704-1-pbonzini@redhat.com>

From: Isaku Yamahata

Add a new ioctl, KVM_PRE_FAULT_MEMORY, in the KVM common code. It
iterates over the memory range and calls the arch-specific function.
The ioctl and the declaration of the arch hook are built only when the
architecture selects KVM_GENERIC_PRE_FAULT_MEMORY.

Suggested-by: Sean Christopherson
Signed-off-by: Isaku Yamahata
Reviewed-by: Rick Edgecombe
Message-ID: <819322b8f25971f2b9933bfa4506e618508ad782.1712785629.git.isaku.yamahata@intel.com>
Signed-off-by: Paolo Bonzini
---
 include/linux/kvm_host.h |  5 ++++
 include/uapi/linux/kvm.h | 10 +++++++
 virt/kvm/Kconfig         |  3 ++
 virt/kvm/kvm_main.c      | 63 ++++++++++++++++++++++++++++++++++++++++
 4 files changed, 81 insertions(+)

diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 8dea11701ab2..9e9943e5e37c 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -2478,4 +2478,9 @@ long kvm_gmem_populate(struct kvm *kvm, gfn_t gfn, void __user *src, long npages
 void kvm_arch_gmem_invalidate(kvm_pfn_t start, kvm_pfn_t end);
 #endif
 
+#ifdef CONFIG_KVM_GENERIC_PRE_FAULT_MEMORY
+long kvm_arch_vcpu_pre_fault_memory(struct kvm_vcpu *vcpu,
+				    struct kvm_pre_fault_memory *range);
+#endif
+
 #endif
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index 2190adbe3002..917d2964947d 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -917,6 +917,7 @@ struct kvm_enable_cap {
 #define KVM_CAP_MEMORY_ATTRIBUTES 233
 #define KVM_CAP_GUEST_MEMFD 234
 #define KVM_CAP_VM_TYPES 235
+#define KVM_CAP_PRE_FAULT_MEMORY 236
 
 struct kvm_irq_routing_irqchip {
 	__u32 irqchip;
@@ -1548,4 +1549,13 @@ struct kvm_create_guest_memfd {
 	__u64 reserved[6];
 };
 
+#define KVM_PRE_FAULT_MEMORY	_IOWR(KVMIO, 0xd5, struct kvm_pre_fault_memory)
+
+struct kvm_pre_fault_memory {
+	__u64 gpa;
+	__u64 size;
+	__u64 flags;
+	__u64 padding[5];
+};
+
 #endif /* __LINUX_KVM_H */
diff --git a/virt/kvm/Kconfig b/virt/kvm/Kconfig
index 754c6c923427..b14e14cdbfb9 100644
--- a/virt/kvm/Kconfig
+++ b/virt/kvm/Kconfig
@@ -67,6 +67,9 @@ config HAVE_KVM_INVALID_WAKEUPS
 config KVM_GENERIC_DIRTYLOG_READ_PROTECT
        bool
 
+config KVM_GENERIC_PRE_FAULT_MEMORY
+       bool
+
 config KVM_COMPAT
        def_bool y
        depends on KVM && COMPAT && !(S390 || ARM64 || RISCV)
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 38b498669ef9..51d8dbe7e93b 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -4379,6 +4379,55 @@ static int kvm_vcpu_ioctl_get_stats_fd(struct kvm_vcpu *vcpu)
 	return fd;
 }
 
+#ifdef CONFIG_KVM_GENERIC_PRE_FAULT_MEMORY
+static int kvm_vcpu_pre_fault_memory(struct kvm_vcpu *vcpu,
+				     struct kvm_pre_fault_memory *range)
+{
+	int idx;
+	long r;
+	u64 full_size;
+
+	if (range->flags)
+		return -EINVAL;
+
+	if (!PAGE_ALIGNED(range->gpa) ||
+	    !PAGE_ALIGNED(range->size) ||
+	    range->gpa + range->size <= range->gpa)
+		return -EINVAL;
+
+	if (!range->size)
+		return 0;
+
+	vcpu_load(vcpu);
+	idx = srcu_read_lock(&vcpu->kvm->srcu);
+
+	full_size = range->size;
+	do {
+		if (signal_pending(current)) {
+			r = -EINTR;
+			break;
+		}
+
+		r = kvm_arch_vcpu_pre_fault_memory(vcpu, range);
+		if (r < 0)
+			break;
+
+		if (WARN_ON_ONCE(r == 0))
+			break;
+
+		range->size -= r;
+		range->gpa += r;
+		cond_resched();
+	} while (range->size);
+
+	srcu_read_unlock(&vcpu->kvm->srcu, idx);
+	vcpu_put(vcpu);
+
+	/* Return success if at least one page was mapped successfully. */
+	return full_size == range->size ? r : 0;
+}
+#endif
+
 static long kvm_vcpu_ioctl(struct file *filp,
 			   unsigned int ioctl, unsigned long arg)
 {
@@ -4580,6 +4629,20 @@ static long kvm_vcpu_ioctl(struct file *filp,
 		r = kvm_vcpu_ioctl_get_stats_fd(vcpu);
 		break;
 	}
+#ifdef CONFIG_KVM_GENERIC_PRE_FAULT_MEMORY
+	case KVM_PRE_FAULT_MEMORY: {
+		struct kvm_pre_fault_memory range;
+
+		r = -EFAULT;
+		if (copy_from_user(&range, argp, sizeof(range)))
+			break;
+		r = kvm_vcpu_pre_fault_memory(vcpu, &range);
+		/* Pass back leftover range. */
+		if (copy_to_user(argp, &range, sizeof(range)))
+			r = -EFAULT;
+		break;
+	}
+#endif
 	default:
 		r = kvm_arch_vcpu_ioctl(filp, ioctl, arg);
 	}

From patchwork Fri Apr 19 08:59:24 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Paolo Bonzini
X-Patchwork-Id: 13635965
From: Paolo Bonzini
To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org
Cc: isaku.yamahata@intel.com, xiaoyao.li@intel.com, binbin.wu@linux.intel.com, seanjc@google.com, rick.p.edgecombe@intel.com
Subject: [PATCH 3/6] KVM: x86/mmu: Extract __kvm_mmu_do_page_fault()
Date: Fri, 19 Apr 2024 04:59:24 -0400
Message-ID: <20240419085927.3648704-4-pbonzini@redhat.com>
In-Reply-To: <20240419085927.3648704-1-pbonzini@redhat.com>
References: <20240419085927.3648704-1-pbonzini@redhat.com>

From: Isaku Yamahata

Extract __kvm_mmu_do_page_fault() from kvm_mmu_do_page_fault(). The
inner function initializes struct kvm_page_fault and calls the fault
handler; the outer function updates stats and converts the return code.
KVM_PRE_FAULT_MEMORY will call the KVM page fault handler through the
inner function.

With this patch, *emulation_type is updated regardless of the return
code. kvm_mmu_page_fault() is the only caller of
kvm_mmu_do_page_fault(), and it references the value only when
RET_PF_EMULATE is returned. Therefore, this adjustment doesn't affect
functionality.

No functional change intended.

Suggested-by: Sean Christopherson
Signed-off-by: Isaku Yamahata
Message-ID:
Signed-off-by: Paolo Bonzini
Signed-off-by: Sean Christopherson
---
 arch/x86/kvm/mmu/mmu_internal.h | 38 +++++++++++++++++++++------------
 1 file changed, 24 insertions(+), 14 deletions(-)

diff --git a/arch/x86/kvm/mmu/mmu_internal.h b/arch/x86/kvm/mmu/mmu_internal.h
index e68a60974cf4..9baae6c223ee 100644
--- a/arch/x86/kvm/mmu/mmu_internal.h
+++ b/arch/x86/kvm/mmu/mmu_internal.h
@@ -287,8 +287,8 @@ static inline void kvm_mmu_prepare_memory_fault_exit(struct kvm_vcpu *vcpu,
 				      fault->is_private);
 }
 
-static inline int kvm_mmu_do_page_fault(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa,
-					u64 err, bool prefetch, int *emulation_type)
+static inline int __kvm_mmu_do_page_fault(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa,
+					  u64 err, bool prefetch, int *emulation_type)
 {
 	struct kvm_page_fault fault = {
 		.addr = cr2_or_gpa,
@@ -318,6 +318,27 @@ static inline int kvm_mmu_do_page_fault(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa,
 		fault.slot = kvm_vcpu_gfn_to_memslot(vcpu, fault.gfn);
 	}
 
+	if (IS_ENABLED(CONFIG_MITIGATION_RETPOLINE) && fault.is_tdp)
+		r = kvm_tdp_page_fault(vcpu, &fault);
+	else
+		r = vcpu->arch.mmu->page_fault(vcpu, &fault);
+
+	if (r == RET_PF_EMULATE && fault.is_private) {
+		kvm_mmu_prepare_memory_fault_exit(vcpu, &fault);
+		r = -EFAULT;
+	}
+
+	if (fault.write_fault_to_shadow_pgtable && emulation_type)
+		*emulation_type |= EMULTYPE_WRITE_PF_TO_SP;
+
+	return r;
+}
+
+static inline int kvm_mmu_do_page_fault(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa,
+					u64 err, bool prefetch, int *emulation_type)
+{
+	int r;
+
 	/*
 	 * Async #PF "faults", a.k.a. prefetch faults, are not faults from the
 	 * guest perspective and have already been counted at the time of the
@@ -326,18 +347,7 @@ static inline int kvm_mmu_do_page_fault(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa,
 	if (!prefetch)
 		vcpu->stat.pf_taken++;
 
-	if (IS_ENABLED(CONFIG_MITIGATION_RETPOLINE) && fault.is_tdp)
-		r = kvm_tdp_page_fault(vcpu, &fault);
-	else
-		r = vcpu->arch.mmu->page_fault(vcpu, &fault);
-
-	if (r == RET_PF_EMULATE && fault.is_private) {
-		kvm_mmu_prepare_memory_fault_exit(vcpu, &fault);
-		return -EFAULT;
-	}
-
-	if (fault.write_fault_to_shadow_pgtable && emulation_type)
-		*emulation_type |= EMULTYPE_WRITE_PF_TO_SP;
+	r = __kvm_mmu_do_page_fault(vcpu, cr2_or_gpa, err, prefetch, emulation_type);
 
 	/*
 	 * Similar to above, prefetch faults aren't truly spurious, and the

From patchwork Fri Apr 19 08:59:25 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Paolo Bonzini
X-Patchwork-Id: 13635962
From: Paolo Bonzini
To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org
Cc: isaku.yamahata@intel.com, xiaoyao.li@intel.com, binbin.wu@linux.intel.com, seanjc@google.com, rick.p.edgecombe@intel.com
Subject: [PATCH 4/6] KVM: x86/mmu: Make __kvm_mmu_do_page_fault() return mapped level
Date: Fri, 19 Apr 2024 04:59:25 -0400
Message-ID: <20240419085927.3648704-5-pbonzini@redhat.com>
In-Reply-To: <20240419085927.3648704-1-pbonzini@redhat.com>
References: <20240419085927.3648704-1-pbonzini@redhat.com>

From: Isaku Yamahata

The guest memory population logic will need to know what page size or
level (4K, 2M, ...) is mapped.
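
To illustrate why the mapped level matters, here is a minimal standalone
sketch of the arithmetic a caller can do with it: how many bytes of a
requested range a single mapping covers. It assumes the x86 layout of
4 KiB base pages and 9 address bits per level; LEVEL_4K, hpage_size()
and bytes_covered() are hypothetical names used only for illustration,
not KVM helpers.

```c
#include <stdint.h>

/* Assumption: x86-style paging, 4 KiB base pages, 9 bits per level. */
#define SKETCH_PAGE_SHIFT 12

enum { LEVEL_4K = 1, LEVEL_2M = 2, LEVEL_1G = 3 };

/* Size in bytes of a page mapped at the given level. */
static uint64_t hpage_size(int level)
{
	return 1ULL << (SKETCH_PAGE_SHIFT + (level - 1) * 9);
}

/*
 * Number of bytes of [gpa, gpa + size) covered by the single mapping
 * at 'level' that contains gpa.  The mapping may start below gpa and
 * may end past gpa + size, so clamp on both sides.
 */
static uint64_t bytes_covered(uint64_t gpa, uint64_t size, int level)
{
	uint64_t page = hpage_size(level);
	uint64_t end = (gpa & ~(page - 1)) + page;	/* end of the mapping */
	uint64_t left = end - gpa;			/* bytes from gpa to that end */

	return size < left ? size : left;
}
```

For example, a request starting 4 KiB into a 2 MiB mapping is covered
only up to the end of that mapping, so the caller advances by the
remainder of the huge page rather than by one 4 KiB page.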

Signed-off-by: Isaku Yamahata
Message-ID:
Signed-off-by: Paolo Bonzini
---
 arch/x86/kvm/mmu/mmu_internal.h | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kvm/mmu/mmu_internal.h b/arch/x86/kvm/mmu/mmu_internal.h
index 9baae6c223ee..b0a10f5a40dd 100644
--- a/arch/x86/kvm/mmu/mmu_internal.h
+++ b/arch/x86/kvm/mmu/mmu_internal.h
@@ -288,7 +288,8 @@ static inline void kvm_mmu_prepare_memory_fault_exit(struct kvm_vcpu *vcpu,
 }
 
 static inline int __kvm_mmu_do_page_fault(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa,
-					  u64 err, bool prefetch, int *emulation_type)
+					  u64 err, bool prefetch,
+					  int *emulation_type, u8 *level)
 {
 	struct kvm_page_fault fault = {
 		.addr = cr2_or_gpa,
@@ -330,6 +331,8 @@ static inline int __kvm_mmu_do_page_fault(struct kvm_vcpu *vcpu, gpa_t cr2_or_gp
 
 	if (fault.write_fault_to_shadow_pgtable && emulation_type)
 		*emulation_type |= EMULTYPE_WRITE_PF_TO_SP;
+	if (level)
+		*level = fault.goal_level;
 
 	return r;
 }
@@ -347,7 +350,8 @@ static inline int kvm_mmu_do_page_fault(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa,
 	if (!prefetch)
 		vcpu->stat.pf_taken++;
 
-	r = __kvm_mmu_do_page_fault(vcpu, cr2_or_gpa, err, prefetch, emulation_type);
+	r = __kvm_mmu_do_page_fault(vcpu, cr2_or_gpa, err, prefetch,
+				    emulation_type, NULL);
 
 	/*
 	 * Similar to above, prefetch faults aren't truly spurious, and the

From patchwork Fri Apr 19 08:59:26 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Paolo Bonzini
X-Patchwork-Id: 13635963
From: Paolo Bonzini
To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org
Cc: isaku.yamahata@intel.com, xiaoyao.li@intel.com, binbin.wu@linux.intel.com, seanjc@google.com, rick.p.edgecombe@intel.com
Subject: [PATCH 5/6] KVM: x86: Implement kvm_arch_vcpu_pre_fault_memory()
Date: Fri, 19 Apr 2024 04:59:26 -0400
Message-ID: <20240419085927.3648704-6-pbonzini@redhat.com>
In-Reply-To: <20240419085927.3648704-1-pbonzini@redhat.com>
References: <20240419085927.3648704-1-pbonzini@redhat.com>

From: Isaku Yamahata

Wire the KVM_PRE_FAULT_MEMORY ioctl to __kvm_mmu_do_page_fault() to
populate guest memory. The ioctl can be called right after
KVM_CREATE_VCPU creates a vCPU, since by that point kvm_mmu_create() and
kvm_init_mmu() have been called and the vCPU is ready to invoke the KVM
page fault handler.

The helper function kvm_tdp_map_page() takes care of processing the
RET_PF_* return values and converting them to success or an errno.
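
From userspace, the ioctl might be driven as in the sketch below. It
assumes the uAPI proposed in this series (struct kvm_pre_fault_memory
and ioctl number 0xd5), which is redefined locally when the installed
<linux/kvm.h> predates it. pre_fault_range() is a hypothetical helper,
and vcpu_fd is assumed to be a descriptor returned by KVM_CREATE_VCPU.
Because the kernel writes the leftover range back and returns 0 once at
least one page was mapped, the caller loops until size reaches zero.

```c
#include <errno.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

/* Fallback for headers that predate this series (values taken from the patch). */
#ifndef KVM_PRE_FAULT_MEMORY
struct kvm_pre_fault_memory {
	__u64 gpa;
	__u64 size;
	__u64 flags;
	__u64 padding[5];
};
#define KVM_PRE_FAULT_MEMORY _IOWR(KVMIO, 0xd5, struct kvm_pre_fault_memory)
#endif

/* Hypothetical helper: returns 0 on success or a negative errno value. */
static int pre_fault_range(int vcpu_fd, uint64_t gpa, uint64_t size)
{
	struct kvm_pre_fault_memory range;

	/* flags must be zero; clear the padding as well. */
	memset(&range, 0, sizeof(range));
	range.gpa = gpa;
	range.size = size;

	while (range.size > 0) {
		uint64_t before = range.size;

		if (ioctl(vcpu_fd, KVM_PRE_FAULT_MEMORY, &range) < 0) {
			if (errno == EINTR)
				continue;	/* signal arrived before any page was processed */
			return -errno;
		}
		/* On success the kernel has advanced gpa and shrunk size. */
		if (range.size == before)
			return -EIO;	/* defensive: success without progress */
	}
	return 0;
}
```

Each vCPU thread could call such a helper on its own slice of guest
memory, which is the parallelism the vcpu-ioctl design is meant to allow.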

Signed-off-by: Isaku Yamahata
Message-ID: <9b866a0ae7147f96571c439e75429a03dcb659b6.1712785629.git.isaku.yamahata@intel.com>
Signed-off-by: Paolo Bonzini
---
 arch/x86/kvm/Kconfig   |  1 +
 arch/x86/kvm/mmu/mmu.c | 72 ++++++++++++++++++++++++++++++++++++++++++
 arch/x86/kvm/x86.c     |  3 ++
 3 files changed, 76 insertions(+)

diff --git a/arch/x86/kvm/Kconfig b/arch/x86/kvm/Kconfig
index 7632fe6e4db9..54c155432793 100644
--- a/arch/x86/kvm/Kconfig
+++ b/arch/x86/kvm/Kconfig
@@ -44,6 +44,7 @@ config KVM
 	select KVM_VFIO
 	select HAVE_KVM_PM_NOTIFIER if PM
 	select KVM_GENERIC_HARDWARE_ENABLING
+	select KVM_GENERIC_PRE_FAULT_MEMORY
 	help
 	  Support hosting fully virtualized guest machines using hardware
 	  virtualization extensions.  You will need a fairly recent
diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 10e90788b263..a045b23964c0 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -4647,6 +4647,78 @@ int kvm_tdp_page_fault(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault)
 	return direct_page_fault(vcpu, fault);
 }
 
+static int kvm_tdp_map_page(struct kvm_vcpu *vcpu, gpa_t gpa, u64 error_code,
+			    u8 *level)
+{
+	int r;
+
+	/* Restrict to TDP page fault. */
+	if (vcpu->arch.mmu->page_fault != kvm_tdp_page_fault)
+		return -EOPNOTSUPP;
+
+retry:
+	r = __kvm_mmu_do_page_fault(vcpu, gpa, error_code, true, NULL, level);
+	if (r < 0)
+		return r;
+
+	switch (r) {
+	case RET_PF_RETRY:
+		if (signal_pending(current))
+			return -EINTR;
+		cond_resched();
+		goto retry;
+
+	case RET_PF_FIXED:
+	case RET_PF_SPURIOUS:
+		break;
+
+	case RET_PF_EMULATE:
+		return -ENOENT;
+
+	case RET_PF_CONTINUE:
+	case RET_PF_INVALID:
+	default:
+		WARN_ON_ONCE(r);
+		return -EIO;
+	}
+
+	return 0;
+}
+
+long kvm_arch_vcpu_pre_fault_memory(struct kvm_vcpu *vcpu,
+				    struct kvm_pre_fault_memory *range)
+{
+	u64 error_code = PFERR_GUEST_FINAL_MASK;
+	u8 level = PG_LEVEL_4K;
+	u64 end;
+	int r;
+
+	/*
+	 * reload is efficient when called repeatedly, so we can do it on
+	 * every iteration.
+	 */
+	kvm_mmu_reload(vcpu);
+
+	if (kvm_arch_has_private_mem(vcpu->kvm) &&
+	    kvm_mem_is_private(vcpu->kvm, gpa_to_gfn(range->gpa)))
+		error_code |= PFERR_PRIVATE_ACCESS;
+
+	/*
+	 * Shadow paging uses GVA for kvm page fault, so restrict to
+	 * two-dimensional paging.
+	 */
+	r = kvm_tdp_map_page(vcpu, range->gpa, error_code, &level);
+	if (r < 0)
+		return r;
+
+	/*
+	 * If the mapping that covers range->gpa can use a huge page, it
+	 * may start below it or end after range->gpa + range->size.
+	 */
+	end = (range->gpa & KVM_HPAGE_MASK(level)) + KVM_HPAGE_SIZE(level);
+	return min(range->size, end - range->gpa);
+}
+
 static void nonpaging_init_context(struct kvm_mmu *context)
 {
 	context->page_fault = nonpaging_page_fault;
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 83b8260443a3..619ad713254e 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -4715,6 +4715,9 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
 	case KVM_CAP_MEMORY_FAULT_INFO:
 		r = 1;
 		break;
+	case KVM_CAP_PRE_FAULT_MEMORY:
+		r = tdp_enabled;
+		break;
 	case KVM_CAP_EXIT_HYPERCALL:
 		r = KVM_EXIT_HYPERCALL_VALID_MASK;
 		break;

From patchwork Fri Apr 19 08:59:27 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Paolo Bonzini
X-Patchwork-Id: 13635966
Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id EF6667D08A for ; Fri, 19 Apr 2024 08:59:36 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=170.10.133.124
ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1713517178; cv=none; b=ZDo7waaQOlGorADBZxVTHCw4FXHVIO117KWiPMTobN935StVyGbFqUBcDDdgBRC6MP5RrraD7NwkPycf7pyiBE3pYAjPhgqCHLqBuwF9V5eZLp1LynxPaTPFtsUHKTvtEq3hdgHWMpix7VHxCEnOhAuIEWY8Eehma7UIVbriNPs=
ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1713517178; c=relaxed/simple; bh=2gm7deFzurYTAyrCh7iT1lNdC3GO1PyHSZi0isL2Ru8=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=Fqqr6CpbkaAiSBhQmP7mDPpmOfjp6PS+u6Ey2BU9C8IXfAQpoF9qCHdkqcV0YJ9rDwPkzwwf7owt+rvkYs1N3/8vOeISPZ9CSHLIJz75tJVSiJcAY+W89/avIdw931AJfoIMNvRny6a3GnC6Qg7c+XqvGlGqY2HdWjoolg9voFo= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com; spf=pass smtp.mailfrom=redhat.com; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b=bgnvDkPD; arc=none smtp.client-ip=170.10.133.124 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=redhat.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="bgnvDkPD" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1713517175; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=52y2schAwBprTYBvvMf3LjB6dQ9sxUly4DKTEx6UkQs=; b=bgnvDkPD+CN0IUJ2HT1AdEWaj2ogVj5yOBCLb+CTNDJPcYwHDpRtk8+i6GWL8gqqstQaB0 pKaS7spL7A6TCIuAQ0Hx8fNLd/vRwnl3of+ean4SUIAQvARgKdGEmAo74Okf0dI8Ebntt1 mybolgEGye5f0N6VacwqpYHln2YPB2o= Received: from mimecast-mx02.redhat.com (mx-ext.redhat.com [66.187.233.73]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-650-wGI1dtL0P9GJaIuPnMU1Vg-1; Fri, 19 Apr 2024 04:59:30 -0400 X-MC-Unique: wGI1dtL0P9GJaIuPnMU1Vg-1 Received: from smtp.corp.redhat.com (int-mx04.intmail.prod.int.rdu2.redhat.com [10.11.54.4]) (using TLSv1.3 with cipher 
TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id DDADD3811717; Fri, 19 Apr 2024 08:59:29 +0000 (UTC) Received: from virtlab701.virt.lab.eng.bos.redhat.com (virtlab701.virt.lab.eng.bos.redhat.com [10.19.152.228]) by smtp.corp.redhat.com (Postfix) with ESMTP id ACEB0203689B; Fri, 19 Apr 2024 08:59:29 +0000 (UTC) From: Paolo Bonzini To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org Cc: isaku.yamahata@intel.com, xiaoyao.li@intel.com, binbin.wu@linux.intel.com, seanjc@google.com, rick.p.edgecombe@intel.com Subject: [PATCH 6/6] KVM: selftests: x86: Add test for KVM_PRE_FAULT_MEMORY Date: Fri, 19 Apr 2024 04:59:27 -0400 Message-ID: <20240419085927.3648704-7-pbonzini@redhat.com> In-Reply-To: <20240419085927.3648704-1-pbonzini@redhat.com> References: <20240419085927.3648704-1-pbonzini@redhat.com> Precedence: bulk X-Mailing-List: kvm@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.4.1 on 10.11.54.4
From: Isaku Yamahata

Add a test case that exercises KVM_PRE_FAULT_MEMORY and then runs the guest to access the pre-populated area. It tests the KVM_PRE_FAULT_MEMORY ioctl for KVM_X86_DEFAULT_VM and KVM_X86_SW_PROTECTED_VM.
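The calling pattern the test relies on can be sketched outside the selftest framework roughly as follows. This is an illustration only: it assumes, as the test's assertions do, that a successful call shrinks range.size (and advances range.gpa), and that the caller retries on EINTR. struct pre_fault_range is a local mirror of the three fields added to the uapi header in this patch; pre_fault_fn, pre_fault_all(), struct fake_slot and fake_pre_fault() are hypothetical names, with the callback standing in for ioctl(vcpu_fd, KVM_PRE_FAULT_MEMORY, &range) so the loop can be tried without /dev/kvm.

```c
#include <errno.h>
#include <stdint.h>

/* Local mirror of the gpa/size/flags layout from this patch (illustrative). */
struct pre_fault_range {
	uint64_t gpa;
	uint64_t size;
	uint64_t flags;
};

/*
 * Hypothetical stand-in for the real ioctl call. Expected contract:
 * return 0 after advancing gpa and shrinking size, or -1 with errno set.
 */
typedef int (*pre_fault_fn)(void *ctx, struct pre_fault_range *range);

/* Loop until the range is fully populated; returns 0 or -errno. */
static int pre_fault_all(pre_fault_fn fn, void *ctx, uint64_t gpa, uint64_t size)
{
	struct pre_fault_range range = { .gpa = gpa, .size = size, .flags = 0 };

	while (range.size) {
		if (fn(ctx, &range) < 0) {
			if (errno == EINTR)
				continue;	/* interrupted: retry the same range */
			return -errno;
		}
	}
	return 0;
}

/* Toy backend: maps at most one 4K page per call, ENOENT past slot_end. */
struct fake_slot {
	uint64_t slot_end;
	int calls;
};

static int fake_pre_fault(void *ctx, struct pre_fault_range *range)
{
	struct fake_slot *slot = ctx;
	uint64_t step;

	slot->calls++;
	if (range->gpa >= slot->slot_end) {
		errno = ENOENT;	/* no memslot backs this gpa */
		return -1;
	}

	/* Cover up to the end of the current 4K page, capped at size. */
	step = (range->gpa & ~UINT64_C(0xfff)) + 0x1000 - range->gpa;
	if (step > range->size)
		step = range->size;
	range->gpa += step;
	range->size -= step;
	return 0;
}
```

Injecting the callback keeps the loop independent of the exact uapi struct layout, which may differ between revisions of the series.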
Signed-off-by: Isaku Yamahata Message-ID: <32427791ef42e5efaafb05d2ac37fa4372715f47.1712785629.git.isaku.yamahata@intel.com> Signed-off-by: Paolo Bonzini --- tools/include/uapi/linux/kvm.h | 8 + tools/testing/selftests/kvm/Makefile | 1 + .../selftests/kvm/pre_fault_memory_test.c | 146 ++++++++++++++++++ 3 files changed, 155 insertions(+) create mode 100644 tools/testing/selftests/kvm/pre_fault_memory_test.c diff --git a/tools/include/uapi/linux/kvm.h b/tools/include/uapi/linux/kvm.h index c3308536482b..4d66d8afdcd1 100644 --- a/tools/include/uapi/linux/kvm.h +++ b/tools/include/uapi/linux/kvm.h @@ -2227,4 +2227,12 @@ struct kvm_create_guest_memfd { __u64 reserved[6]; }; +#define KVM_PRE_FAULT_MEMORY _IOWR(KVMIO, 0xd5, struct kvm_pre_fault_memory) + +struct kvm_pre_fault_memory { + __u64 gpa; + __u64 size; + __u64 flags; +}; + #endif /* __LINUX_KVM_H */ diff --git a/tools/testing/selftests/kvm/Makefile b/tools/testing/selftests/kvm/Makefile index 871e2de3eb05..61d581a4bab4 100644 --- a/tools/testing/selftests/kvm/Makefile +++ b/tools/testing/selftests/kvm/Makefile @@ -144,6 +144,7 @@ TEST_GEN_PROGS_x86_64 += set_memory_region_test TEST_GEN_PROGS_x86_64 += steal_time TEST_GEN_PROGS_x86_64 += kvm_binary_stats_test TEST_GEN_PROGS_x86_64 += system_counter_offset_test +TEST_GEN_PROGS_x86_64 += pre_fault_memory_test # Compiled outputs used by test targets TEST_GEN_PROGS_EXTENDED_x86_64 += x86_64/nx_huge_pages_test diff --git a/tools/testing/selftests/kvm/pre_fault_memory_test.c b/tools/testing/selftests/kvm/pre_fault_memory_test.c new file mode 100644 index 000000000000..e56eed2c1f05 --- /dev/null +++ b/tools/testing/selftests/kvm/pre_fault_memory_test.c @@ -0,0 +1,146 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Copyright (C) 2024, Intel, Inc + * + * Author: + * Isaku Yamahata + */ +#include + +#include +#include +#include + +/* Arbitrarily chosen values */ +#define TEST_SIZE (SZ_2M + PAGE_SIZE) +#define TEST_NPAGES (TEST_SIZE / PAGE_SIZE) +#define TEST_SLOT 10 + 
+static void guest_code(uint64_t base_gpa) +{ + volatile uint64_t val __used; + int i; + + for (i = 0; i < TEST_NPAGES; i++) { + uint64_t *src = (uint64_t *)(base_gpa + i * PAGE_SIZE); + + val = *src; + } + + GUEST_DONE(); +} + +static void pre_fault_memory(struct kvm_vcpu *vcpu, u64 gpa, u64 size, + u64 left) +{ + struct kvm_pre_fault_memory range = { + .gpa = gpa, + .size = size, + .flags = 0, + }; + u64 prev; + int ret, save_errno; + + do { + prev = range.size; + ret = __vcpu_ioctl(vcpu, KVM_PRE_FAULT_MEMORY, &range); + save_errno = errno; + TEST_ASSERT((range.size < prev) ^ (ret < 0), + "%sexpecting range.size to change on %s", + ret < 0 ? "not " : "", + ret < 0 ? "failure" : "success"); + } while (ret >= 0 ? range.size : save_errno == EINTR); + + TEST_ASSERT(range.size == left, + "Completed with %lld bytes left, expected %" PRId64, + range.size, left); + + if (left == 0) + __TEST_ASSERT_VM_VCPU_IOCTL(!ret, "KVM_PRE_FAULT_MEMORY", ret, vcpu->vm); + else + /* No memory slot causes RET_PF_EMULATE, which results in -ENOENT.
*/ + __TEST_ASSERT_VM_VCPU_IOCTL(ret && save_errno == ENOENT, + "KVM_PRE_FAULT_MEMORY", ret, vcpu->vm); +} + +static void __test_pre_fault_memory(unsigned long vm_type, bool private) +{ + const struct vm_shape shape = { + .mode = VM_MODE_DEFAULT, + .type = vm_type, + }; + struct kvm_vcpu *vcpu; + struct kvm_run *run; + struct kvm_vm *vm; + struct ucall uc; + + uint64_t guest_test_phys_mem; + uint64_t guest_test_virt_mem; + uint64_t alignment, guest_page_size; + + vm = vm_create_shape_with_one_vcpu(shape, &vcpu, guest_code); + + alignment = guest_page_size = vm_guest_mode_params[VM_MODE_DEFAULT].page_size; + guest_test_phys_mem = (vm->max_gfn - TEST_NPAGES) * guest_page_size; +#ifdef __s390x__ + alignment = max(0x100000UL, guest_page_size); +#else + alignment = SZ_2M; +#endif + guest_test_phys_mem = align_down(guest_test_phys_mem, alignment); + guest_test_virt_mem = guest_test_phys_mem; + + vm_userspace_mem_region_add(vm, VM_MEM_SRC_ANONYMOUS, + guest_test_phys_mem, TEST_SLOT, TEST_NPAGES, + private ? 
KVM_MEM_GUEST_MEMFD : 0); + virt_map(vm, guest_test_virt_mem, guest_test_phys_mem, TEST_NPAGES); + + if (private) + vm_mem_set_private(vm, guest_test_phys_mem, TEST_SIZE); + pre_fault_memory(vcpu, guest_test_phys_mem, SZ_2M, 0); + pre_fault_memory(vcpu, guest_test_phys_mem + SZ_2M, PAGE_SIZE * 2, PAGE_SIZE); + pre_fault_memory(vcpu, guest_test_phys_mem + TEST_SIZE, PAGE_SIZE, PAGE_SIZE); + + vcpu_args_set(vcpu, 1, guest_test_virt_mem); + vcpu_run(vcpu); + + run = vcpu->run; + TEST_ASSERT(run->exit_reason == KVM_EXIT_IO, + "Wanted KVM_EXIT_IO, got exit reason: %u (%s)", + run->exit_reason, exit_reason_str(run->exit_reason)); + + switch (get_ucall(vcpu, &uc)) { + case UCALL_ABORT: + REPORT_GUEST_ASSERT(uc); + break; + case UCALL_DONE: + break; + default: + TEST_FAIL("Unknown ucall 0x%lx.", uc.cmd); + break; + } + + kvm_vm_free(vm); +} + +static void test_pre_fault_memory(unsigned long vm_type, bool private) +{ + if (vm_type && !(kvm_check_cap(KVM_CAP_VM_TYPES) & BIT(vm_type))) { + pr_info("Skipping tests for vm_type 0x%lx\n", vm_type); + return; + } + + __test_pre_fault_memory(vm_type, private); +} + +int main(int argc, char *argv[]) +{ + TEST_REQUIRE(kvm_check_cap(KVM_CAP_PRE_FAULT_MEMORY)); + + test_pre_fault_memory(0, false); +#ifdef __x86_64__ + test_pre_fault_memory(KVM_X86_SW_PROTECTED_VM, false); + test_pre_fault_memory(KVM_X86_SW_PROTECTED_VM, true); +#endif + return 0; +}