From patchwork Tue Jul 18 23:45:03 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Sean Christopherson
X-Patchwork-Id: 13317948
Reply-To: Sean Christopherson
Date: Tue, 18 Jul 2023 16:45:03 -0700
In-Reply-To: <20230718234512.1690985-1-seanjc@google.com>
References: <20230718234512.1690985-1-seanjc@google.com>
X-Mailer: git-send-email 2.41.0.255.g8b1d071c50-goog
Message-ID: <20230718234512.1690985-21-seanjc@google.com>
Subject: [RFC PATCH v11 20/29] KVM: selftests: Add support for creating private memslots
From: Sean Christopherson
To: Paolo Bonzini, Marc Zyngier, Oliver Upton, Huacai Chen, Michael Ellerman,
 Anup Patel, Paul Walmsley, Palmer Dabbelt, Albert Ou, Sean Christopherson,
 "Matthew Wilcox (Oracle)", Andrew Morton, Paul Moore, James Morris, "Serge E.
 Hallyn"
Cc: kvm@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
 kvmarm@lists.linux.dev, linux-mips@vger.kernel.org,
 linuxppc-dev@lists.ozlabs.org, kvm-riscv@lists.infradead.org,
 linux-riscv@lists.infradead.org, linux-fsdevel@vger.kernel.org,
 linux-mm@kvack.org, linux-security-module@vger.kernel.org,
 linux-kernel@vger.kernel.org, Chao Peng, Fuad Tabba, Jarkko Sakkinen,
 Yu Zhang, Vishal Annapurve, Ackerley Tng, Maciej Szmigiero, Vlastimil Babka,
 David Hildenbrand, Quentin Perret, Michael Roth, Wang, Liam Merwick,
 Isaku Yamahata, "Kirill A . Shutemov"
Precedence: bulk
List-ID:
X-Mailing-List: linux-fsdevel@vger.kernel.org

Signed-off-by: Sean Christopherson
---
 .../selftests/kvm/include/kvm_util_base.h     | 16 ++++
 .../testing/selftests/kvm/include/test_util.h |  5 ++
 tools/testing/selftests/kvm/lib/kvm_util.c    | 85 ++++++++++++-------
 3 files changed, 75 insertions(+), 31 deletions(-)

diff --git a/tools/testing/selftests/kvm/include/kvm_util_base.h b/tools/testing/selftests/kvm/include/kvm_util_base.h
index d4a9925d6815..f1de6a279561 100644
--- a/tools/testing/selftests/kvm/include/kvm_util_base.h
+++ b/tools/testing/selftests/kvm/include/kvm_util_base.h
@@ -407,6 +407,19 @@ static inline uint64_t vm_get_stat(struct kvm_vm *vm, const char *stat_name)
 }
 
 void vm_create_irqchip(struct kvm_vm *vm);
+static inline int vm_create_guest_memfd(struct kvm_vm *vm, uint64_t size,
+					uint64_t flags)
+{
+	struct kvm_create_guest_memfd gmem = {
+		.size = size,
+		.flags = flags,
+	};
+
+	int fd = __vm_ioctl(vm, KVM_CREATE_GUEST_MEMFD, &gmem);
+
+	TEST_ASSERT(fd >= 0, KVM_IOCTL_ERROR(KVM_CREATE_GUEST_MEMFD, fd));
+	return fd;
+}
 
 void vm_set_user_memory_region(struct kvm_vm *vm, uint32_t slot, uint32_t flags,
 			       uint64_t gpa, uint64_t size, void *hva);
@@ -416,6 +429,9 @@ void vm_userspace_mem_region_add(struct kvm_vm *vm,
 	enum vm_mem_backing_src_type src_type,
 	uint64_t guest_paddr, uint32_t slot, uint64_t npages,
 	uint32_t flags);
+void vm_mem_add(struct kvm_vm *vm, enum vm_mem_backing_src_type src_type,
+		uint64_t guest_paddr, uint32_t slot, uint64_t npages,
+		uint32_t flags, int gmem_fd, uint64_t gmem_offset);
 
 void vm_mem_region_set_flags(struct kvm_vm *vm, uint32_t slot, uint32_t flags);
 void vm_mem_region_move(struct kvm_vm *vm, uint32_t slot, uint64_t new_gpa);
diff --git a/tools/testing/selftests/kvm/include/test_util.h b/tools/testing/selftests/kvm/include/test_util.h
index a6e9f215ce70..f3088d27f3ce 100644
--- a/tools/testing/selftests/kvm/include/test_util.h
+++ b/tools/testing/selftests/kvm/include/test_util.h
@@ -143,6 +143,11 @@ static inline bool backing_src_is_shared(enum vm_mem_backing_src_type t)
 	return vm_mem_backing_src_alias(t)->flag & MAP_SHARED;
 }
 
+static inline bool backing_src_can_be_huge(enum vm_mem_backing_src_type t)
+{
+	return t != VM_MEM_SRC_ANONYMOUS && t != VM_MEM_SRC_SHMEM;
+}
+
 /* Aligns x up to the next multiple of size. Size must be a power of 2. */
 static inline uint64_t align_up(uint64_t x, uint64_t size)
 {
diff --git a/tools/testing/selftests/kvm/lib/kvm_util.c b/tools/testing/selftests/kvm/lib/kvm_util.c
index c1e4de53d082..b93717e62325 100644
--- a/tools/testing/selftests/kvm/lib/kvm_util.c
+++ b/tools/testing/selftests/kvm/lib/kvm_util.c
@@ -664,6 +664,8 @@ static void __vm_mem_region_delete(struct kvm_vm *vm,
 		TEST_ASSERT(!ret, __KVM_SYSCALL_ERROR("munmap()", ret));
 		close(region->fd);
 	}
+	if (region->region.gmem_fd >= 0)
+		close(region->region.gmem_fd);
 
 	free(region);
 }
@@ -865,36 +867,15 @@ void vm_set_user_memory_region(struct kvm_vm *vm, uint32_t slot, uint32_t flags,
 		    errno, strerror(errno));
 }
 
-/*
- * VM Userspace Memory Region Add
- *
- * Input Args:
- *   vm - Virtual Machine
- *   src_type - Storage source for this region.
- *              NULL to use anonymous memory.
- *   guest_paddr - Starting guest physical address
- *   slot - KVM region slot
- *   npages - Number of physical pages
- *   flags - KVM memory region flags (e.g. KVM_MEM_LOG_DIRTY_PAGES)
- *
- * Output Args: None
- *
- * Return: None
- *
- * Allocates a memory area of the number of pages specified by npages
- * and maps it to the VM specified by vm, at a starting physical address
- * given by guest_paddr. The region is created with a KVM region slot
- * given by slot, which must be unique and < KVM_MEM_SLOTS_NUM. The
- * region is created with the flags given by flags.
- */
-void vm_userspace_mem_region_add(struct kvm_vm *vm,
-	enum vm_mem_backing_src_type src_type,
-	uint64_t guest_paddr, uint32_t slot, uint64_t npages,
-	uint32_t flags)
+/* FIXME: This thing needs to be ripped apart and rewritten. */
+void vm_mem_add(struct kvm_vm *vm, enum vm_mem_backing_src_type src_type,
+		uint64_t guest_paddr, uint32_t slot, uint64_t npages,
+		uint32_t flags, int gmem_fd, uint64_t gmem_offset)
 {
 	int ret;
 	struct userspace_mem_region *region;
 	size_t backing_src_pagesz = get_backing_src_pagesz(src_type);
+	size_t mem_size = npages * vm->page_size;
 	size_t alignment;
 
 	TEST_ASSERT(vm_adjust_num_guest_pages(vm->mode, npages) == npages,
@@ -947,7 +928,7 @@ void vm_userspace_mem_region_add(struct kvm_vm *vm,
 	/* Allocate and initialize new mem region structure. */
 	region = calloc(1, sizeof(*region));
 	TEST_ASSERT(region != NULL, "Insufficient Memory");
-	region->mmap_size = npages * vm->page_size;
+	region->mmap_size = mem_size;
 
 #ifdef __s390x__
 	/* On s390x, the host address must be aligned to 1M (due to PGSTEs) */
@@ -994,14 +975,47 @@ void vm_userspace_mem_region_add(struct kvm_vm *vm,
 	/* As needed perform madvise */
 	if ((src_type == VM_MEM_SRC_ANONYMOUS ||
 	     src_type == VM_MEM_SRC_ANONYMOUS_THP) && thp_configured()) {
-		ret = madvise(region->host_mem, npages * vm->page_size,
+		ret = madvise(region->host_mem, mem_size,
 			      src_type == VM_MEM_SRC_ANONYMOUS ? MADV_NOHUGEPAGE : MADV_HUGEPAGE);
 		TEST_ASSERT(ret == 0, "madvise failed, addr: %p length: 0x%lx src_type: %s",
-			    region->host_mem, npages * vm->page_size,
+			    region->host_mem, mem_size,
 			    vm_mem_backing_src_alias(src_type)->name);
 	}
 
 	region->backing_src_type = src_type;
+
+	if (flags & KVM_MEM_PRIVATE) {
+		if (gmem_fd < 0) {
+			uint32_t gmem_flags = 0;
+
+			/*
+			 * Allow hugepages for the guest memfd backing if the
+			 * "normal" backing is allowed/required to be huge.
+			 */
+			if (src_type != VM_MEM_SRC_ANONYMOUS &&
+			    src_type != VM_MEM_SRC_SHMEM)
+				gmem_flags |= KVM_GUEST_MEMFD_ALLOW_HUGEPAGE;
+
+			TEST_ASSERT(!gmem_offset,
+				    "Offset must be zero when creating new guest_memfd");
+			gmem_fd = vm_create_guest_memfd(vm, mem_size, gmem_flags);
+		} else {
+			/*
+			 * Install a unique fd for each memslot so that the fd
+			 * can be closed when the region is deleted without
+			 * needing to track if the fd is owned by the framework
+			 * or by the caller.
+			 */
+			gmem_fd = dup(gmem_fd);
+			TEST_ASSERT(gmem_fd >= 0, __KVM_SYSCALL_ERROR("dup()", gmem_fd));
+		}
+
+		region->region.gmem_fd = gmem_fd;
+		region->region.gmem_offset = gmem_offset;
+	} else {
+		region->region.gmem_fd = -1;
+	}
+
 	region->unused_phy_pages = sparsebit_alloc();
 	sparsebit_set_num(region->unused_phy_pages,
 		guest_paddr >> vm->page_shift, npages);
@@ -1014,9 +1028,10 @@ void vm_userspace_mem_region_add(struct kvm_vm *vm,
 	TEST_ASSERT(ret == 0, "KVM_SET_USER_MEMORY_REGION2 IOCTL failed,\n"
 		" rc: %i errno: %i\n"
 		" slot: %u flags: 0x%x\n"
-		" guest_phys_addr: 0x%lx size: 0x%lx",
+		" guest_phys_addr: 0x%lx size: 0x%lx guest_memfd: %d\n",
 		ret, errno, slot, flags,
-		guest_paddr, (uint64_t) region->region.memory_size);
+		guest_paddr, (uint64_t) region->region.memory_size,
+		region->region.gmem_fd);
 
 	/* Add to quick lookup data structures */
 	vm_userspace_mem_region_gpa_insert(&vm->regions.gpa_tree, region);
@@ -1037,6 +1052,14 @@ void vm_userspace_mem_region_add(struct kvm_vm *vm,
 	}
 }
 
+void vm_userspace_mem_region_add(struct kvm_vm *vm,
+				 enum vm_mem_backing_src_type src_type,
+				 uint64_t guest_paddr, uint32_t slot,
+				 uint64_t npages, uint32_t flags)
+{
+	vm_mem_add(vm, src_type, guest_paddr, slot, npages, flags, -1, 0);
+}
+
 /*
  * Memslot to region
  *