From patchwork Fri Oct 27 18:22:08 2023
X-Patchwork-Submitter: Sean Christopherson
X-Patchwork-Id: 13438953
Reply-To: Sean Christopherson
Date: Fri, 27 Oct 2023 11:22:08 -0700
In-Reply-To: <20231027182217.3615211-1-seanjc@google.com>
References: <20231027182217.3615211-1-seanjc@google.com>
Message-ID: <20231027182217.3615211-27-seanjc@google.com>
Subject: [PATCH v13 26/35] KVM: selftests: Add support for creating private memslots
From: Sean Christopherson
To: Paolo Bonzini, Marc Zyngier, Oliver Upton, Huacai Chen, Michael Ellerman,
    Anup Patel, Paul Walmsley, Palmer Dabbelt, Albert Ou, Sean Christopherson,
    Alexander Viro, Christian Brauner, "Matthew Wilcox (Oracle)", Andrew Morton
Cc: kvm@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
    kvmarm@lists.linux.dev, linux-mips@vger.kernel.org,
    linuxppc-dev@lists.ozlabs.org, kvm-riscv@lists.infradead.org,
    linux-riscv@lists.infradead.org,
    linux-fsdevel@vger.kernel.org, linux-mm@kvack.org,
    linux-kernel@vger.kernel.org, Xiaoyao Li, Xu Yilun, Chao Peng, Fuad Tabba,
    Jarkko Sakkinen, Anish Moorthy, David Matlack, Yu Zhang, Isaku Yamahata,
    Mickaël Salaün, Vlastimil Babka, Vishal Annapurve, Ackerley Tng,
    Maciej Szmigiero, David Hildenbrand, Quentin Perret, Michael Roth, Wang,
    Liam Merwick, Isaku Yamahata, "Kirill A . Shutemov"
Add support for creating "private" memslots via KVM_CREATE_GUEST_MEMFD and
KVM_SET_USER_MEMORY_REGION2.  Make vm_userspace_mem_region_add() a wrapper
to its effective replacement, vm_mem_add(), so that private memslots are
fully opt-in, i.e. don't require updating all tests that add memory
regions.

Pivot on the KVM_MEM_PRIVATE flag instead of the validity of the "gmem"
file descriptor so that simple tests can let vm_mem_add() do the heavy
lifting of creating the guest memfd, but also allow the caller to pass in
an explicit fd+offset so that fancier tests can do things like back
multiple memslots with a single file.

If the caller passes in an fd, dup() the fd so that (a)
__vm_mem_region_delete() can close the fd associated with the memory
region without needing yet another flag, and (b) the caller can safely
close its copy of the fd without having to first destroy memslots.
Co-developed-by: Ackerley Tng
Signed-off-by: Ackerley Tng
Signed-off-by: Sean Christopherson
---
 .../selftests/kvm/include/kvm_util_base.h     | 23 +++++
 .../testing/selftests/kvm/include/test_util.h |  5 ++
 tools/testing/selftests/kvm/lib/kvm_util.c    | 85 ++++++++++++-------
 3 files changed, 82 insertions(+), 31 deletions(-)

diff --git a/tools/testing/selftests/kvm/include/kvm_util_base.h b/tools/testing/selftests/kvm/include/kvm_util_base.h
index 9f144841c2ee..9f861182c02a 100644
--- a/tools/testing/selftests/kvm/include/kvm_util_base.h
+++ b/tools/testing/selftests/kvm/include/kvm_util_base.h
@@ -431,6 +431,26 @@ static inline uint64_t vm_get_stat(struct kvm_vm *vm, const char *stat_name)
 
 void vm_create_irqchip(struct kvm_vm *vm);
 
+static inline int __vm_create_guest_memfd(struct kvm_vm *vm, uint64_t size,
+					  uint64_t flags)
+{
+	struct kvm_create_guest_memfd guest_memfd = {
+		.size = size,
+		.flags = flags,
+	};
+
+	return __vm_ioctl(vm, KVM_CREATE_GUEST_MEMFD, &guest_memfd);
+}
+
+static inline int vm_create_guest_memfd(struct kvm_vm *vm, uint64_t size,
+					uint64_t flags)
+{
+	int fd = __vm_create_guest_memfd(vm, size, flags);
+
+	TEST_ASSERT(fd >= 0, KVM_IOCTL_ERROR(KVM_CREATE_GUEST_MEMFD, fd));
+	return fd;
+}
+
 void vm_set_user_memory_region(struct kvm_vm *vm, uint32_t slot, uint32_t flags,
 			       uint64_t gpa, uint64_t size, void *hva);
 int __vm_set_user_memory_region(struct kvm_vm *vm, uint32_t slot, uint32_t flags,
@@ -439,6 +459,9 @@ void vm_userspace_mem_region_add(struct kvm_vm *vm,
 	enum vm_mem_backing_src_type src_type,
 	uint64_t guest_paddr, uint32_t slot, uint64_t npages,
 	uint32_t flags);
+void vm_mem_add(struct kvm_vm *vm, enum vm_mem_backing_src_type src_type,
+		uint64_t guest_paddr, uint32_t slot, uint64_t npages,
+		uint32_t flags, int guest_memfd_fd, uint64_t guest_memfd_offset);
 
 void vm_mem_region_set_flags(struct kvm_vm *vm, uint32_t slot, uint32_t flags);
 void vm_mem_region_move(struct kvm_vm *vm, uint32_t slot, uint64_t new_gpa);
diff --git a/tools/testing/selftests/kvm/include/test_util.h b/tools/testing/selftests/kvm/include/test_util.h
index 7e614adc6cf4..7257f2243ab9 100644
--- a/tools/testing/selftests/kvm/include/test_util.h
+++ b/tools/testing/selftests/kvm/include/test_util.h
@@ -142,6 +142,11 @@ static inline bool backing_src_is_shared(enum vm_mem_backing_src_type t)
 	return vm_mem_backing_src_alias(t)->flag & MAP_SHARED;
 }
 
+static inline bool backing_src_can_be_huge(enum vm_mem_backing_src_type t)
+{
+	return t != VM_MEM_SRC_ANONYMOUS && t != VM_MEM_SRC_SHMEM;
+}
+
 /* Aligns x up to the next multiple of size. Size must be a power of 2. */
 static inline uint64_t align_up(uint64_t x, uint64_t size)
 {
diff --git a/tools/testing/selftests/kvm/lib/kvm_util.c b/tools/testing/selftests/kvm/lib/kvm_util.c
index 3676b37bea38..45050f54701a 100644
--- a/tools/testing/selftests/kvm/lib/kvm_util.c
+++ b/tools/testing/selftests/kvm/lib/kvm_util.c
@@ -669,6 +669,8 @@ static void __vm_mem_region_delete(struct kvm_vm *vm,
 		TEST_ASSERT(!ret, __KVM_SYSCALL_ERROR("munmap()", ret));
 		close(region->fd);
 	}
+	if (region->region.guest_memfd >= 0)
+		close(region->region.guest_memfd);
 
 	free(region);
 }
@@ -870,36 +872,15 @@ void vm_set_user_memory_region(struct kvm_vm *vm, uint32_t slot, uint32_t flags,
 		errno, strerror(errno));
 }
 
-/*
- * VM Userspace Memory Region Add
- *
- * Input Args:
- *   vm - Virtual Machine
- *   src_type - Storage source for this region.
- *              NULL to use anonymous memory.
- *   guest_paddr - Starting guest physical address
- *   slot - KVM region slot
- *   npages - Number of physical pages
- *   flags - KVM memory region flags (e.g. KVM_MEM_LOG_DIRTY_PAGES)
- *
- * Output Args: None
- *
- * Return: None
- *
- * Allocates a memory area of the number of pages specified by npages
- * and maps it to the VM specified by vm, at a starting physical address
- * given by guest_paddr.  The region is created with a KVM region slot
- * given by slot, which must be unique and < KVM_MEM_SLOTS_NUM.  The
- * region is created with the flags given by flags.
- */
-void vm_userspace_mem_region_add(struct kvm_vm *vm,
-	enum vm_mem_backing_src_type src_type,
-	uint64_t guest_paddr, uint32_t slot, uint64_t npages,
-	uint32_t flags)
+/* FIXME: This thing needs to be ripped apart and rewritten. */
+void vm_mem_add(struct kvm_vm *vm, enum vm_mem_backing_src_type src_type,
+		uint64_t guest_paddr, uint32_t slot, uint64_t npages,
+		uint32_t flags, int guest_memfd, uint64_t guest_memfd_offset)
 {
 	int ret;
 	struct userspace_mem_region *region;
 	size_t backing_src_pagesz = get_backing_src_pagesz(src_type);
+	size_t mem_size = npages * vm->page_size;
 	size_t alignment;
 
 	TEST_ASSERT(vm_adjust_num_guest_pages(vm->mode, npages) == npages,
@@ -952,7 +933,7 @@ void vm_userspace_mem_region_add(struct kvm_vm *vm,
 	/* Allocate and initialize new mem region structure. */
 	region = calloc(1, sizeof(*region));
 	TEST_ASSERT(region != NULL, "Insufficient Memory");
-	region->mmap_size = npages * vm->page_size;
+	region->mmap_size = mem_size;
 
 #ifdef __s390x__
 	/* On s390x, the host address must be aligned to 1M (due to PGSTEs) */
@@ -999,14 +980,47 @@ void vm_userspace_mem_region_add(struct kvm_vm *vm,
 	/* As needed perform madvise */
 	if ((src_type == VM_MEM_SRC_ANONYMOUS ||
 	     src_type == VM_MEM_SRC_ANONYMOUS_THP) && thp_configured()) {
-		ret = madvise(region->host_mem, npages * vm->page_size,
+		ret = madvise(region->host_mem, mem_size,
 			      src_type == VM_MEM_SRC_ANONYMOUS ?
 			      MADV_NOHUGEPAGE : MADV_HUGEPAGE);
 		TEST_ASSERT(ret == 0, "madvise failed, addr: %p length: 0x%lx src_type: %s",
-			    region->host_mem, npages * vm->page_size,
+			    region->host_mem, mem_size,
 			    vm_mem_backing_src_alias(src_type)->name);
 	}
 
 	region->backing_src_type = src_type;
+
+	if (flags & KVM_MEM_PRIVATE) {
+		if (guest_memfd < 0) {
+			uint32_t guest_memfd_flags = 0;
+
+			/*
+			 * Allow hugepages for the guest memfd backing if the
+			 * "normal" backing is allowed/required to be huge.
+			 */
+			if (src_type != VM_MEM_SRC_ANONYMOUS &&
+			    src_type != VM_MEM_SRC_SHMEM)
+				guest_memfd_flags |= KVM_GUEST_MEMFD_ALLOW_HUGEPAGE;
+
+			TEST_ASSERT(!guest_memfd_offset,
+				    "Offset must be zero when creating new guest_memfd");
+			guest_memfd = vm_create_guest_memfd(vm, mem_size, guest_memfd_flags);
+		} else {
+			/*
+			 * Install a unique fd for each memslot so that the fd
+			 * can be closed when the region is deleted without
+			 * needing to track if the fd is owned by the framework
+			 * or by the caller.
+			 */
+			guest_memfd = dup(guest_memfd);
+			TEST_ASSERT(guest_memfd >= 0, __KVM_SYSCALL_ERROR("dup()", guest_memfd));
+		}
+
+		region->region.guest_memfd = guest_memfd;
+		region->region.guest_memfd_offset = guest_memfd_offset;
+	} else {
+		region->region.guest_memfd = -1;
+	}
+
 	region->unused_phy_pages = sparsebit_alloc();
 	sparsebit_set_num(region->unused_phy_pages,
 			  guest_paddr >> vm->page_shift, npages);
@@ -1019,9 +1033,10 @@ void vm_userspace_mem_region_add(struct kvm_vm *vm,
 	TEST_ASSERT(ret == 0, "KVM_SET_USER_MEMORY_REGION2 IOCTL failed,\n"
 		    "  rc: %i errno: %i\n"
 		    "  slot: %u flags: 0x%x\n"
-		    "  guest_phys_addr: 0x%lx size: 0x%lx",
+		    "  guest_phys_addr: 0x%lx size: 0x%lx guest_memfd: %d\n",
 		    ret, errno, slot, flags,
-		    guest_paddr, (uint64_t) region->region.memory_size);
+		    guest_paddr, (uint64_t) region->region.memory_size,
+		    region->region.guest_memfd);
 
 	/* Add to quick lookup data structures */
 	vm_userspace_mem_region_gpa_insert(&vm->regions.gpa_tree, region);
@@ -1042,6 +1057,14 @@ void vm_userspace_mem_region_add(struct kvm_vm *vm,
 	}
 }
 
+void vm_userspace_mem_region_add(struct kvm_vm *vm,
+				 enum vm_mem_backing_src_type src_type,
+				 uint64_t guest_paddr, uint32_t slot,
+				 uint64_t npages, uint32_t flags)
+{
+	vm_mem_add(vm, src_type, guest_paddr, slot, npages, flags, -1, 0);
+}
+
 /*
  * Memslot to region
  *
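As context for how the "fancier tests" mentioned in the changelog might consume the new API, a caller could back two private memslots with a single guest_memfd. This sketch uses only the helpers introduced by this patch (vm_create_guest_memfd(), vm_mem_add()); the slot numbers, GPAs, and page counts are made up, and it is not compilable outside the selftests tree:

```c
/*
 * Illustrative fragment only: slots, GPAs, and sizes are hypothetical,
 * and the selftests framework headers (kvm_util_base.h) are assumed.
 */
size_t size = 2 * 512 * vm->page_size;	/* covers both memslots */
int fd = vm_create_guest_memfd(vm, size, 0);

/* Two private memslots sharing one guest_memfd at different offsets. */
vm_mem_add(vm, VM_MEM_SRC_ANONYMOUS, 0x10000000, 1, 512,
	   KVM_MEM_PRIVATE, fd, 0);
vm_mem_add(vm, VM_MEM_SRC_ANONYMOUS, 0x20000000, 2, 512,
	   KVM_MEM_PRIVATE, fd, 512 * vm->page_size);

/* Safe: vm_mem_add() dup()'d the fd for each memslot. */
close(fd);
```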