From patchwork Thu Oct 10 08:59:20 2024
X-Patchwork-Submitter: Fuad Tabba <tabba@google.com>
X-Patchwork-Id: 13829781
Date: Thu, 10 Oct 2024 09:59:20 +0100
In-Reply-To: <20241010085930.1546800-1-tabba@google.com>
References: <20241010085930.1546800-1-tabba@google.com>
Message-ID: <20241010085930.1546800-2-tabba@google.com>
Subject: [PATCH v3 01/11] KVM: guest_memfd: Make guest mem use guest mem
 inodes instead of anonymous inodes
From: Fuad Tabba <tabba@google.com>
To: kvm@vger.kernel.org, linux-arm-msm@vger.kernel.org, linux-mm@kvack.org
Cc: pbonzini@redhat.com, chenhuacai@kernel.org, mpe@ellerman.id.au,
 anup@brainfault.org, paul.walmsley@sifive.com, palmer@dabbelt.com,
 aou@eecs.berkeley.edu, seanjc@google.com, viro@zeniv.linux.org.uk,
 brauner@kernel.org, willy@infradead.org, akpm@linux-foundation.org,
 xiaoyao.li@intel.com, yilun.xu@intel.com, chao.p.peng@linux.intel.com,
 jarkko@kernel.org, amoorthy@google.com, dmatlack@google.com,
 yu.c.zhang@linux.intel.com, isaku.yamahata@intel.com, mic@digikod.net,
 vbabka@suse.cz, vannapurve@google.com, ackerleytng@google.com,
 mail@maciej.szmigiero.name, david@redhat.com, michael.roth@amd.com,
 wei.w.wang@intel.com, liam.merwick@oracle.com, isaku.yamahata@gmail.com,
 kirill.shutemov@linux.intel.com, suzuki.poulose@arm.com,
 steven.price@arm.com, quic_eberman@quicinc.com, quic_mnalajal@quicinc.com,
 quic_tsoni@quicinc.com, quic_svaddagi@quicinc.com, quic_cvanscha@quicinc.com,
 quic_pderrin@quicinc.com, quic_pheragu@quicinc.com, catalin.marinas@arm.com,
 james.morse@arm.com, yuzenghui@huawei.com, oliver.upton@linux.dev,
 maz@kernel.org, will@kernel.org, qperret@google.com, keirf@google.com,
 roypat@amazon.co.uk, shuah@kernel.org, hch@infradead.org, jgg@nvidia.com,
 rientjes@google.com, jhubbard@nvidia.com, fvdl@google.com, hughd@google.com,
 jthoughton@google.com, tabba@google.com

From: Ackerley Tng <ackerleytng@google.com>

Using guest mem inodes allows us to store metadata for the backing
memory on the inode. Metadata will be added in a later patch to support
HugeTLB pages.

Metadata about backing memory should not be stored on the file, since
the file represents a guest_memfd's binding with a struct kvm, and
metadata about backing memory is not unique to a specific binding and
struct kvm.
Signed-off-by: Ackerley Tng <ackerleytng@google.com>
Signed-off-by: Fuad Tabba <tabba@google.com>
---
 include/uapi/linux/magic.h |   1 +
 virt/kvm/guest_memfd.c     | 119 ++++++++++++++++++++++++++++++-------
 2 files changed, 100 insertions(+), 20 deletions(-)

diff --git a/include/uapi/linux/magic.h b/include/uapi/linux/magic.h
index bb575f3ab45e..169dba2a6920 100644
--- a/include/uapi/linux/magic.h
+++ b/include/uapi/linux/magic.h
@@ -103,5 +103,6 @@
 #define DEVMEM_MAGIC		0x454d444d	/* "DMEM" */
 #define SECRETMEM_MAGIC		0x5345434d	/* "SECM" */
 #define PID_FS_MAGIC		0x50494446	/* "PIDF" */
+#define GUEST_MEMORY_MAGIC	0x474d454d	/* "GMEM" */
 
 #endif /* __LINUX_MAGIC_H__ */
diff --git a/virt/kvm/guest_memfd.c b/virt/kvm/guest_memfd.c
index 8f079a61a56d..5d7fd1f708a6 100644
--- a/virt/kvm/guest_memfd.c
+++ b/virt/kvm/guest_memfd.c
@@ -1,12 +1,17 @@
 // SPDX-License-Identifier: GPL-2.0
+#include <linux/fs.h>
+#include <linux/mount.h>
 #include <linux/backing-dev.h>
 #include <linux/falloc.h>
 #include <linux/kvm_host.h>
+#include <linux/pseudo_fs.h>
 #include <linux/pagemap.h>
 #include <linux/anon_inodes.h>
 
 #include "kvm_mm.h"
 
+static struct vfsmount *kvm_gmem_mnt;
+
 struct kvm_gmem {
 	struct kvm *kvm;
 	struct xarray bindings;
@@ -302,6 +307,38 @@ static inline struct file *kvm_gmem_get_file(struct kvm_memory_slot *slot)
 	return get_file_active(&slot->gmem.file);
 }
 
+static const struct super_operations kvm_gmem_super_operations = {
+	.statfs		= simple_statfs,
+};
+
+static int kvm_gmem_init_fs_context(struct fs_context *fc)
+{
+	struct pseudo_fs_context *ctx;
+
+	if (!init_pseudo(fc, GUEST_MEMORY_MAGIC))
+		return -ENOMEM;
+
+	ctx = fc->fs_private;
+	ctx->ops = &kvm_gmem_super_operations;
+
+	return 0;
+}
+
+static struct file_system_type kvm_gmem_fs = {
+	.name		= "kvm_guest_memory",
+	.init_fs_context = kvm_gmem_init_fs_context,
+	.kill_sb	= kill_anon_super,
+};
+
+static void kvm_gmem_init_mount(void)
+{
+	kvm_gmem_mnt = kern_mount(&kvm_gmem_fs);
+	BUG_ON(IS_ERR(kvm_gmem_mnt));
+
+	/* For giggles. Userspace can never map this anyways. */
+	kvm_gmem_mnt->mnt_flags |= MNT_NOEXEC;
+}
+
 static struct file_operations kvm_gmem_fops = {
 	.open		= generic_file_open,
 	.release	= kvm_gmem_release,
@@ -311,6 +348,8 @@ static struct file_operations kvm_gmem_fops = {
 void kvm_gmem_init(struct module *module)
 {
 	kvm_gmem_fops.owner = module;
+
+	kvm_gmem_init_mount();
 }
 
 static int kvm_gmem_migrate_folio(struct address_space *mapping,
@@ -392,11 +431,67 @@ static const struct inode_operations kvm_gmem_iops = {
 	.setattr	= kvm_gmem_setattr,
 };
 
+static struct inode *kvm_gmem_inode_make_secure_inode(const char *name,
+						      loff_t size, u64 flags)
+{
+	const struct qstr qname = QSTR_INIT(name, strlen(name));
+	struct inode *inode;
+	int err;
+
+	inode = alloc_anon_inode(kvm_gmem_mnt->mnt_sb);
+	if (IS_ERR(inode))
+		return inode;
+
+	err = security_inode_init_security_anon(inode, &qname, NULL);
+	if (err) {
+		iput(inode);
+		return ERR_PTR(err);
+	}
+
+	inode->i_private = (void *)(unsigned long)flags;
+	inode->i_op = &kvm_gmem_iops;
+	inode->i_mapping->a_ops = &kvm_gmem_aops;
+	inode->i_mode |= S_IFREG;
+	inode->i_size = size;
+	mapping_set_gfp_mask(inode->i_mapping, GFP_HIGHUSER);
+	mapping_set_inaccessible(inode->i_mapping);
+	/* Unmovable mappings are supposed to be marked unevictable as well. */
+	WARN_ON_ONCE(!mapping_unevictable(inode->i_mapping));
+
+	return inode;
+}
+
+static struct file *kvm_gmem_inode_create_getfile(void *priv, loff_t size,
+						  u64 flags)
+{
+	static const char *name = "[kvm-gmem]";
+	struct inode *inode;
+	struct file *file;
+
+	if (kvm_gmem_fops.owner && !try_module_get(kvm_gmem_fops.owner))
+		return ERR_PTR(-ENOENT);
+
+	inode = kvm_gmem_inode_make_secure_inode(name, size, flags);
+	if (IS_ERR(inode))
+		return ERR_CAST(inode);
+
+	file = alloc_file_pseudo(inode, kvm_gmem_mnt, name, O_RDWR,
+				 &kvm_gmem_fops);
+	if (IS_ERR(file)) {
+		iput(inode);
+		return file;
+	}
+
+	file->f_mapping = inode->i_mapping;
+	file->f_flags |= O_LARGEFILE;
+	file->private_data = priv;
+
+	return file;
+}
+
 static int __kvm_gmem_create(struct kvm *kvm, loff_t size, u64 flags)
 {
-	const char *anon_name = "[kvm-gmem]";
 	struct kvm_gmem *gmem;
-	struct inode *inode;
 	struct file *file;
 	int fd, err;
 
@@ -410,32 +505,16 @@ static int __kvm_gmem_create(struct kvm *kvm, loff_t size, u64 flags)
 		goto err_fd;
 	}
 
-	file = anon_inode_create_getfile(anon_name, &kvm_gmem_fops, gmem,
-					 O_RDWR, NULL);
+	file = kvm_gmem_inode_create_getfile(gmem, size, flags);
 	if (IS_ERR(file)) {
 		err = PTR_ERR(file);
 		goto err_gmem;
 	}
 
-	file->f_flags |= O_LARGEFILE;
-
-	inode = file->f_inode;
-	WARN_ON(file->f_mapping != inode->i_mapping);
-
-	inode->i_private = (void *)(unsigned long)flags;
-	inode->i_op = &kvm_gmem_iops;
-	inode->i_mapping->a_ops = &kvm_gmem_aops;
-	inode->i_mode |= S_IFREG;
-	inode->i_size = size;
-	mapping_set_gfp_mask(inode->i_mapping, GFP_HIGHUSER);
-	mapping_set_inaccessible(inode->i_mapping);
-	/* Unmovable mappings are supposed to be marked unevictable as well. */
-	WARN_ON_ONCE(!mapping_unevictable(inode->i_mapping));
-
 	kvm_get_kvm(kvm);
 	gmem->kvm = kvm;
 	xa_init(&gmem->bindings);
-	list_add(&gmem->entry, &inode->i_mapping->i_private_list);
+	list_add(&gmem->entry, &file_inode(file)->i_mapping->i_private_list);
 
 	fd_install(fd, file);
 	return fd;
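
As a rough userspace-visible illustration of the change (not part of the
patch; names and error handling below are purely illustrative), a
guest_memfd created through the existing KVM_CREATE_GUEST_MEMFD ioctl
should now report the new GUEST_MEMORY_MAGIC via fstatfs(), since the
file lives on the kvm_guest_memory mount and .statfs is wired up to
simple_statfs:

#include <stdio.h>
#include <sys/ioctl.h>
#include <sys/vfs.h>
#include <linux/kvm.h>

/* Illustrative only: create a guest_memfd and print its filesystem magic. */
static int demo_create_gmem_fd(int vm_fd, unsigned long long size)
{
	struct kvm_create_guest_memfd gmem = {
		.size	= size,		/* must be a multiple of the page size */
		.flags	= 0,
	};
	struct statfs sfs;
	int fd;

	fd = ioctl(vm_fd, KVM_CREATE_GUEST_MEMFD, &gmem);
	if (fd >= 0 && fstatfs(fd, &sfs) == 0)
		printf("f_type = %#lx\n", (unsigned long)sfs.f_type);

	return fd;
}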
From patchwork Thu Oct 10 08:59:21 2024
X-Patchwork-Submitter: Fuad Tabba <tabba@google.com>
X-Patchwork-Id: 13829782
Date: Thu, 10 Oct 2024 09:59:21 +0100
In-Reply-To: <20241010085930.1546800-1-tabba@google.com>
References: <20241010085930.1546800-1-tabba@google.com>
Message-ID: <20241010085930.1546800-3-tabba@google.com>
Subject: [PATCH v3 02/11] KVM: guest_memfd: Track mappability within a
 struct kvm_gmem_private
From: Fuad Tabba <tabba@google.com>
To: kvm@vger.kernel.org, linux-arm-msm@vger.kernel.org, linux-mm@kvack.org
From: Ackerley Tng <ackerleytng@google.com>

Track whether guest_memfd memory can be mapped within the inode, since
this is a property of the guest_memfd's memory contents.

The guest_memfd PRIVATE memory attribute is not used for two reasons.
First, it reflects the userspace expectation for that memory location,
and therefore can be toggled by userspace. Second, although each
guest_memfd file has a 1:1 binding with a KVM instance, the plan is to
allow multiple files per inode, e.g. to allow intra-host migration to a
new KVM instance without destroying the guest_memfd.

Signed-off-by: Ackerley Tng <ackerleytng@google.com>
Co-developed-by: Vishal Annapurve <vannapurve@google.com>
Signed-off-by: Vishal Annapurve <vannapurve@google.com>
Co-developed-by: Fuad Tabba <tabba@google.com>
Signed-off-by: Fuad Tabba <tabba@google.com>
---
 virt/kvm/guest_memfd.c | 56 ++++++++++++++++++++++++++++++++++++++----
 1 file changed, 51 insertions(+), 5 deletions(-)

diff --git a/virt/kvm/guest_memfd.c b/virt/kvm/guest_memfd.c
index 5d7fd1f708a6..4d3ba346c415 100644
--- a/virt/kvm/guest_memfd.c
+++ b/virt/kvm/guest_memfd.c
@@ -18,6 +18,17 @@ struct kvm_gmem {
 	struct list_head entry;
 };
 
+struct kvm_gmem_inode_private {
+#ifdef CONFIG_KVM_GMEM_MAPPABLE
+	struct xarray mappable_offsets;
+#endif
+};
+
+static struct kvm_gmem_inode_private *kvm_gmem_private(struct inode *inode)
+{
+	return inode->i_mapping->i_private_data;
+}
+
 /**
  * folio_file_pfn - like folio_file_page, but return a pfn.
  * @folio: The folio which contains this index.
@@ -307,8 +318,28 @@ static inline struct file *kvm_gmem_get_file(struct kvm_memory_slot *slot)
 	return get_file_active(&slot->gmem.file);
 }
 
+static void kvm_gmem_evict_inode(struct inode *inode)
+{
+	struct kvm_gmem_inode_private *private = kvm_gmem_private(inode);
+
+#ifdef CONFIG_KVM_GMEM_MAPPABLE
+	/*
+	 * .free_inode can be called before private data is set up if there are
+	 * issues during inode creation.
+	 */
+	if (private)
+		xa_destroy(&private->mappable_offsets);
+#endif
+
+	truncate_inode_pages_final(inode->i_mapping);
+
+	kfree(private);
+	clear_inode(inode);
+}
+
 static const struct super_operations kvm_gmem_super_operations = {
-	.statfs		= simple_statfs,
+	.statfs		= simple_statfs,
+	.evict_inode	= kvm_gmem_evict_inode,
 };
 
 static int kvm_gmem_init_fs_context(struct fs_context *fc)
@@ -435,6 +466,7 @@ static struct inode *kvm_gmem_inode_make_secure_inode(const char *name,
 						      loff_t size, u64 flags)
 {
 	const struct qstr qname = QSTR_INIT(name, strlen(name));
+	struct kvm_gmem_inode_private *private;
 	struct inode *inode;
 	int err;
 
@@ -443,10 +475,19 @@ static struct inode *kvm_gmem_inode_make_secure_inode(const char *name,
 		return inode;
 
 	err = security_inode_init_security_anon(inode, &qname, NULL);
-	if (err) {
-		iput(inode);
-		return ERR_PTR(err);
-	}
+	if (err)
+		goto out;
+
+	err = -ENOMEM;
+	private = kzalloc(sizeof(*private), GFP_KERNEL);
+	if (!private)
+		goto out;
+
+#ifdef CONFIG_KVM_GMEM_MAPPABLE
+	xa_init(&private->mappable_offsets);
+#endif
+
+	inode->i_mapping->i_private_data = private;
 
 	inode->i_private = (void *)(unsigned long)flags;
 	inode->i_op = &kvm_gmem_iops;
@@ -459,6 +500,11 @@ static struct inode *kvm_gmem_inode_make_secure_inode(const char *name,
 	WARN_ON_ONCE(!mapping_unevictable(inode->i_mapping));
 
 	return inode;
+
+out:
+	iput(inode);
+
+	return ERR_PTR(err);
 }
 
 static struct file *kvm_gmem_inode_create_getfile(void *priv, loff_t size,
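
The mappable_offsets xarray introduced here is consumed by later patches
in the series. For orientation, here is a minimal, self-contained sketch
of the intended pattern (illustrative names only, not from the patch):
an offset is mappable iff a value entry is present at its index.

#include <linux/xarray.h>

static DEFINE_XARRAY(demo_mappable_offsets);

/* Mark a page offset as mappable: presence in the xarray is the flag. */
static int demo_set_mappable(pgoff_t index)
{
	return xa_err(xa_store(&demo_mappable_offsets, index,
			       xa_mk_value(true), GFP_KERNEL));
}

/* An offset is mappable iff an entry exists at its index. */
static bool demo_is_mappable(pgoff_t index)
{
	unsigned long i = index;

	return xa_find(&demo_mappable_offsets, &i, i, XA_PRESENT);
}

/* Clearing mappability is simply erasing the entry. */
static void demo_clear_mappable(pgoff_t index)
{
	xa_erase(&demo_mappable_offsets, index);
}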
From patchwork Thu Oct 10 08:59:22 2024
X-Patchwork-Submitter: Fuad Tabba <tabba@google.com>
X-Patchwork-Id: 13829783
Date: Thu, 10 Oct 2024 09:59:22 +0100
In-Reply-To: <20241010085930.1546800-1-tabba@google.com>
References: <20241010085930.1546800-1-tabba@google.com>
Message-ID: <20241010085930.1546800-4-tabba@google.com>
Subject: [PATCH v3 03/11] KVM: guest_memfd: Introduce
 kvm_gmem_get_pfn_locked(), which retains the folio lock
From: Fuad Tabba <tabba@google.com>
To: kvm@vger.kernel.org, linux-arm-msm@vger.kernel.org, linux-mm@kvack.org
Create a new variant of kvm_gmem_get_pfn(), which retains the folio lock
if it returns successfully. This is needed in subsequent patches in
order to protect against races when checking whether a folio can be
mapped by the host.

Signed-off-by: Fuad Tabba <tabba@google.com>
---
 include/linux/kvm_host.h | 11 ++++++++++
 virt/kvm/guest_memfd.c   | 45 +++++++++++++++++++++++++++++++---------
 2 files changed, 46 insertions(+), 10 deletions(-)

diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index db567d26f7b9..acf85995b582 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -2464,6 +2464,9 @@ static inline bool kvm_mem_is_private(struct kvm *kvm, gfn_t gfn)
 #ifdef CONFIG_KVM_PRIVATE_MEM
 int kvm_gmem_get_pfn(struct kvm *kvm, struct kvm_memory_slot *slot,
 		     gfn_t gfn, kvm_pfn_t *pfn, int *max_order);
+int kvm_gmem_get_pfn_locked(struct kvm *kvm, struct kvm_memory_slot *slot,
+			    gfn_t gfn, kvm_pfn_t *pfn, int *max_order);
+
 #else
 static inline int kvm_gmem_get_pfn(struct kvm *kvm,
 				   struct kvm_memory_slot *slot, gfn_t gfn,
@@ -2472,6 +2475,14 @@ static inline int kvm_gmem_get_pfn(struct kvm *kvm,
 	KVM_BUG_ON(1, kvm);
 	return -EIO;
 }
+static inline int kvm_gmem_get_pfn_locked(struct kvm *kvm,
+					  struct kvm_memory_slot *slot,
+					  gfn_t gfn,
+					  kvm_pfn_t *pfn, int *max_order)
+{
+	KVM_BUG_ON(1, kvm);
+	return -EIO;
+}
 #endif /* CONFIG_KVM_PRIVATE_MEM */
 
 #ifdef CONFIG_HAVE_KVM_ARCH_GMEM_PREPARE
diff --git a/virt/kvm/guest_memfd.c b/virt/kvm/guest_memfd.c
index 4d3ba346c415..f414646c475b 100644
--- a/virt/kvm/guest_memfd.c
+++ b/virt/kvm/guest_memfd.c
@@ -714,34 +714,59 @@ __kvm_gmem_get_pfn(struct file *file, struct kvm_memory_slot *slot,
 	return folio;
 }
 
-int kvm_gmem_get_pfn(struct kvm *kvm, struct kvm_memory_slot *slot,
-		     gfn_t gfn, kvm_pfn_t *pfn, int *max_order)
+static int
+kvm_gmem_get_pfn_folio_locked(struct kvm *kvm, struct kvm_memory_slot *slot,
+			      gfn_t gfn, kvm_pfn_t *pfn, int *max_order,
+			      struct folio **folio)
 {
 	struct file *file = kvm_gmem_get_file(slot);
-	struct folio *folio;
 	bool is_prepared = false;
 	int r = 0;
 
 	if (!file)
 		return -EFAULT;
 
-	folio = __kvm_gmem_get_pfn(file, slot, gfn, pfn, &is_prepared, max_order);
-	if (IS_ERR(folio)) {
-		r = PTR_ERR(folio);
+	*folio = __kvm_gmem_get_pfn(file, slot, gfn, pfn, &is_prepared, max_order);
+	if (IS_ERR(*folio)) {
+		r = PTR_ERR(*folio);
 		goto out;
 	}
 
 	if (!is_prepared)
-		r = kvm_gmem_prepare_folio(kvm, slot, gfn, folio);
+		r = kvm_gmem_prepare_folio(kvm, slot, gfn, *folio);
 
-	folio_unlock(folio);
-	if (r < 0)
-		folio_put(folio);
+	if (r) {
+		folio_unlock(*folio);
+		folio_put(*folio);
+	}
 
 out:
 	fput(file);
 	return r;
 }
+
+int kvm_gmem_get_pfn_locked(struct kvm *kvm, struct kvm_memory_slot *slot,
+			    gfn_t gfn, kvm_pfn_t *pfn, int *max_order)
+{
+	struct folio *folio;
+
+	return kvm_gmem_get_pfn_folio_locked(kvm, slot, gfn, pfn, max_order, &folio);
+
+}
+EXPORT_SYMBOL_GPL(kvm_gmem_get_pfn_locked);
+
+int kvm_gmem_get_pfn(struct kvm *kvm, struct kvm_memory_slot *slot,
+		     gfn_t gfn, kvm_pfn_t *pfn, int *max_order)
+{
+	struct folio *folio;
+	int r;
+
+	r = kvm_gmem_get_pfn_folio_locked(kvm, slot, gfn, pfn, max_order, &folio);
+	if (!r)
+		folio_unlock(folio);
+
+	return r;
+}
 EXPORT_SYMBOL_GPL(kvm_gmem_get_pfn);
 
 #ifdef CONFIG_KVM_GENERIC_PRIVATE_MEM
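
A sketch of the intended calling convention for the new variant
(illustrative, not from the patch): on success the folio comes back
locked and with a reference taken, so the caller can perform its check
race-free and must then drop both, mirroring how later patches in the
series use it.

#include <linux/kvm_host.h>
#include <linux/pagemap.h>

/* Illustrative caller: inspect a gmem pfn while its folio stays locked. */
static int demo_with_locked_pfn(struct kvm *kvm, struct kvm_memory_slot *slot,
				gfn_t gfn)
{
	struct page *page;
	kvm_pfn_t pfn;
	int r;

	r = kvm_gmem_get_pfn_locked(kvm, slot, gfn, &pfn, NULL);
	if (r)
		return r;

	page = pfn_to_page(pfn);

	/* ... check refcount/state here; the folio cannot change under us ... */

	put_page(page);
	unlock_page(page);

	return 0;
}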
From patchwork Thu Oct 10 08:59:23 2024
X-Patchwork-Submitter: Fuad Tabba <tabba@google.com>
X-Patchwork-Id: 13829784
Date: Thu, 10 Oct 2024 09:59:23 +0100
In-Reply-To: <20241010085930.1546800-1-tabba@google.com>
References: <20241010085930.1546800-1-tabba@google.com>
Message-ID: <20241010085930.1546800-5-tabba@google.com>
Subject: [PATCH v3 04/11] KVM: guest_memfd: Allow host to mmap guest_memfd()
 pages when shared
From: Fuad Tabba <tabba@google.com>
To: kvm@vger.kernel.org, linux-arm-msm@vger.kernel.org, linux-mm@kvack.org

Add support for mmap() and fault() for guest_memfd in the host. The
ability to fault in a guest page is contingent on that page being
shared with the host.

The guest_memfd PRIVATE memory attribute is not used for two reasons.
First, it reflects the userspace expectation for that memory location,
and therefore can be toggled by userspace. Second, although each
guest_memfd file has a 1:1 binding with a KVM instance, the plan is to
allow multiple files per inode, e.g. to allow intra-host migration to a
new KVM instance without destroying the guest_memfd.

The mapping is restricted to only memory explicitly shared with the
host. KVM checks that the host doesn't have any mappings for private
memory via the folio's refcount. To avoid races between paths that
check mappability and paths that check whether the host has any
mappings (via the refcount), the folio lock is held while either check
is performed.

This new feature is gated with a new configuration option,
CONFIG_KVM_GMEM_MAPPABLE.
Co-developed-by: Ackerley Tng <ackerleytng@google.com>
Signed-off-by: Ackerley Tng <ackerleytng@google.com>
Co-developed-by: Elliot Berman <quic_eberman@quicinc.com>
Signed-off-by: Elliot Berman <quic_eberman@quicinc.com>
Signed-off-by: Fuad Tabba <tabba@google.com>
---

Note that the functions kvm_gmem_is_mapped(), kvm_gmem_set_mappable(),
and kvm_gmem_clear_mappable() are not used in this patch series. They
are intended to be used in future patches [*], which check and toggle
mappability when the guest shares/unshares pages with the host.

[*] https://android-kvm.googlesource.com/linux/+/refs/heads/tabba/guestmem-6.12-v3-pkvm

---
 include/linux/kvm_host.h |  52 +++++++++++
 virt/kvm/Kconfig         |   4 +
 virt/kvm/guest_memfd.c   | 185 +++++++++++++++++++++++++++++++++++++++
 virt/kvm/kvm_main.c      | 138 +++++++++++++++++++++++++++++
 4 files changed, 379 insertions(+)

diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index acf85995b582..bda7fda9945e 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -2527,4 +2527,56 @@ long kvm_arch_vcpu_pre_fault_memory(struct kvm_vcpu *vcpu,
 				    struct kvm_pre_fault_memory *range);
 #endif
 
+#ifdef CONFIG_KVM_GMEM_MAPPABLE
+bool kvm_gmem_is_mappable(struct kvm *kvm, gfn_t gfn, gfn_t end);
+bool kvm_gmem_is_mapped(struct kvm *kvm, gfn_t start, gfn_t end);
+int kvm_gmem_set_mappable(struct kvm *kvm, gfn_t start, gfn_t end);
+int kvm_gmem_clear_mappable(struct kvm *kvm, gfn_t start, gfn_t end);
+int kvm_slot_gmem_set_mappable(struct kvm_memory_slot *slot, gfn_t start,
+			       gfn_t end);
+int kvm_slot_gmem_clear_mappable(struct kvm_memory_slot *slot, gfn_t start,
+				 gfn_t end);
+bool kvm_slot_gmem_is_mappable(struct kvm_memory_slot *slot, gfn_t gfn);
+#else
+static inline bool kvm_gmem_is_mappable(struct kvm *kvm, gfn_t gfn, gfn_t end)
+{
+	WARN_ON_ONCE(1);
+	return false;
+}
+static inline bool kvm_gmem_is_mapped(struct kvm *kvm, gfn_t start, gfn_t end)
+{
+	WARN_ON_ONCE(1);
+	return false;
+}
+static inline int kvm_gmem_set_mappable(struct kvm *kvm, gfn_t start, gfn_t end)
+{
+	WARN_ON_ONCE(1);
+	return -EINVAL;
+}
+static inline int kvm_gmem_clear_mappable(struct kvm *kvm, gfn_t start,
+					  gfn_t end)
+{
+	WARN_ON_ONCE(1);
+	return -EINVAL;
+}
+static inline int kvm_slot_gmem_set_mappable(struct kvm_memory_slot *slot,
+					     gfn_t start, gfn_t end)
+{
+	WARN_ON_ONCE(1);
+	return -EINVAL;
+}
+static inline int kvm_slot_gmem_clear_mappable(struct kvm_memory_slot *slot,
+					       gfn_t start, gfn_t end)
+{
+	WARN_ON_ONCE(1);
+	return -EINVAL;
+}
+static inline bool kvm_slot_gmem_is_mappable(struct kvm_memory_slot *slot,
+					     gfn_t gfn)
+{
+	WARN_ON_ONCE(1);
+	return false;
+}
+#endif /* CONFIG_KVM_GMEM_MAPPABLE */
+
 #endif
diff --git a/virt/kvm/Kconfig b/virt/kvm/Kconfig
index fd6a3010afa8..2cfcb0848e37 100644
--- a/virt/kvm/Kconfig
+++ b/virt/kvm/Kconfig
@@ -120,3 +120,7 @@ config HAVE_KVM_ARCH_GMEM_PREPARE
 config HAVE_KVM_ARCH_GMEM_INVALIDATE
 	bool
 	depends on KVM_PRIVATE_MEM
+
+config KVM_GMEM_MAPPABLE
+	select KVM_PRIVATE_MEM
+	bool
diff --git a/virt/kvm/guest_memfd.c b/virt/kvm/guest_memfd.c
index f414646c475b..df3a6f05a16e 100644
--- a/virt/kvm/guest_memfd.c
+++ b/virt/kvm/guest_memfd.c
@@ -370,7 +370,184 @@ static void kvm_gmem_init_mount(void)
 	kvm_gmem_mnt->mnt_flags |= MNT_NOEXEC;
 }
 
+#ifdef CONFIG_KVM_GMEM_MAPPABLE
+static struct folio *
+__kvm_gmem_get_pfn(struct file *file, struct kvm_memory_slot *slot,
+		   gfn_t gfn, kvm_pfn_t *pfn, bool *is_prepared,
+		   int *max_order);
+
+static int gmem_set_mappable(struct inode *inode, pgoff_t start, pgoff_t end)
+{
+	struct xarray *mappable_offsets = &kvm_gmem_private(inode)->mappable_offsets;
+	void *xval = xa_mk_value(true);
+	pgoff_t i;
+	bool r;
+
+	filemap_invalidate_lock(inode->i_mapping);
+	for (i = start; i < end; i++) {
+		r = xa_err(xa_store(mappable_offsets, i, xval, GFP_KERNEL));
+		if (r)
+			break;
+	}
+	filemap_invalidate_unlock(inode->i_mapping);
+
+	return r;
+}
+
+static int gmem_clear_mappable(struct inode *inode, pgoff_t start, pgoff_t end)
+{
+	struct xarray *mappable_offsets = &kvm_gmem_private(inode)->mappable_offsets;
+	pgoff_t i;
+	int r = 0;
+
+	filemap_invalidate_lock(inode->i_mapping);
+	for (i = start; i < end; i++) {
+		struct folio *folio;
+
+		/*
+		 * Holds the folio lock until after checking its refcount,
+		 * to avoid races with paths that fault in the folio.
+		 */
+		folio = kvm_gmem_get_folio(inode, i);
+		if (WARN_ON_ONCE(IS_ERR(folio)))
+			continue;
+
+		/*
+		 * Check that the host doesn't have any mappings on clearing
+		 * the mappable flag, because clearing the flag implies that the
+		 * memory will be unshared from the host. Therefore, to maintain
+		 * the invariant that the host cannot access private memory, we
+		 * need to check that it doesn't have any mappings to that
+		 * memory before making it private.
+		 *
+		 * Two references are expected because of kvm_gmem_get_folio().
+		 */
+		if (folio_ref_count(folio) > 2)
+			r = -EPERM;
+		else
+			xa_erase(mappable_offsets, i);
+
+		folio_put(folio);
+		folio_unlock(folio);
+
+		if (r)
+			break;
+	}
+	filemap_invalidate_unlock(inode->i_mapping);
+
+	return r;
+}
+
+static bool gmem_is_mappable(struct inode *inode, pgoff_t pgoff)
+{
+	struct xarray *mappable_offsets = &kvm_gmem_private(inode)->mappable_offsets;
+	bool r;
+
+	filemap_invalidate_lock_shared(inode->i_mapping);
+	r = xa_find(mappable_offsets, &pgoff, pgoff, XA_PRESENT);
+	filemap_invalidate_unlock_shared(inode->i_mapping);
+
+	return r;
+}
+
+int kvm_slot_gmem_set_mappable(struct kvm_memory_slot *slot, gfn_t start, gfn_t end)
+{
+	struct inode *inode = file_inode(slot->gmem.file);
+	pgoff_t start_off = slot->gmem.pgoff + start - slot->base_gfn;
+	pgoff_t end_off = start_off + end - start;
+
+	return gmem_set_mappable(inode, start_off, end_off);
+}
+
+int kvm_slot_gmem_clear_mappable(struct kvm_memory_slot *slot, gfn_t start, gfn_t end)
+{
+	struct inode *inode = file_inode(slot->gmem.file);
+	pgoff_t start_off = slot->gmem.pgoff + start - slot->base_gfn;
+	pgoff_t end_off = start_off + end - start;
+
+	return gmem_clear_mappable(inode, start_off, end_off);
+}
+
+bool kvm_slot_gmem_is_mappable(struct kvm_memory_slot *slot, gfn_t gfn)
+{
+	struct inode *inode = file_inode(slot->gmem.file);
+	unsigned long pgoff = slot->gmem.pgoff + gfn - slot->base_gfn;
+
+	return gmem_is_mappable(inode, pgoff);
+}
+
+static vm_fault_t kvm_gmem_fault(struct vm_fault *vmf)
+{
+	struct inode *inode = file_inode(vmf->vma->vm_file);
+	struct folio *folio;
+	vm_fault_t ret = VM_FAULT_LOCKED;
+
+	/*
+	 * Holds the folio lock until after checking whether it can be faulted
+	 * in, to avoid races with paths that change a folio's mappability.
+	 */
+	folio = kvm_gmem_get_folio(inode, vmf->pgoff);
+	if (!folio)
+		return VM_FAULT_SIGBUS;
+
+	if (folio_test_hwpoison(folio)) {
+		ret = VM_FAULT_HWPOISON;
+		goto out;
+	}
+
+	if (!gmem_is_mappable(inode, vmf->pgoff)) {
+		ret = VM_FAULT_SIGBUS;
+		goto out;
+	}
+
+	if (!folio_test_uptodate(folio)) {
+		unsigned long nr_pages = folio_nr_pages(folio);
+		unsigned long i;
+
+		for (i = 0; i < nr_pages; i++)
+			clear_highpage(folio_page(folio, i));
+
+		folio_mark_uptodate(folio);
+	}
+
+	vmf->page = folio_file_page(folio, vmf->pgoff);
+out:
+	if (ret != VM_FAULT_LOCKED) {
+		folio_put(folio);
+		folio_unlock(folio);
+	}
+
+	return ret;
+}
+
+static const struct vm_operations_struct kvm_gmem_vm_ops = {
+	.fault = kvm_gmem_fault,
+};
+
+static int kvm_gmem_mmap(struct file *file, struct vm_area_struct *vma)
+{
+	if ((vma->vm_flags & (VM_SHARED | VM_MAYSHARE)) !=
+	    (VM_SHARED | VM_MAYSHARE)) {
+		return -EINVAL;
+	}
+
+	file_accessed(file);
+	vm_flags_set(vma, VM_DONTDUMP);
+	vma->vm_ops = &kvm_gmem_vm_ops;
+
+	return 0;
+}
+#else
+static int gmem_set_mappable(struct inode *inode, pgoff_t start, pgoff_t end)
+{
+	WARN_ON_ONCE(1);
+	return -EINVAL;
+}
+#define kvm_gmem_mmap NULL
+#endif /* CONFIG_KVM_GMEM_MAPPABLE */
+
 static struct file_operations kvm_gmem_fops = {
+	.mmap		= kvm_gmem_mmap,
 	.open		= generic_file_open,
 	.release	= kvm_gmem_release,
 	.fallocate	= kvm_gmem_fallocate,
@@ -557,6 +734,14 @@ static int __kvm_gmem_create(struct kvm *kvm, loff_t size, u64 flags)
 		goto err_gmem;
 	}
 
+	if (IS_ENABLED(CONFIG_KVM_GMEM_MAPPABLE)) {
+		err = gmem_set_mappable(file_inode(file), 0, size >> PAGE_SHIFT);
+		if (err) {
+			fput(file);
+			goto err_gmem;
+		}
+	}
+
 	kvm_get_kvm(kvm);
 	gmem->kvm = kvm;
 	xa_init(&gmem->bindings);
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 05cbb2548d99..aed9cf2f1685 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -3263,6 +3263,144 @@ static int next_segment(unsigned long len, int offset)
 	return len;
 }
 
+#ifdef CONFIG_KVM_GMEM_MAPPABLE
+static bool __kvm_gmem_is_mappable(struct kvm *kvm, gfn_t start, gfn_t end)
+{
+	struct kvm_memslot_iter iter;
+
+	lockdep_assert_held(&kvm->slots_lock);
+
+	kvm_for_each_memslot_in_gfn_range(&iter, kvm_memslots(kvm), start, end) {
+		struct kvm_memory_slot *memslot = iter.slot;
+		gfn_t gfn_start, gfn_end, i;
+
+		gfn_start = max(start, memslot->base_gfn);
+		gfn_end = min(end, memslot->base_gfn + memslot->npages);
+		if (WARN_ON_ONCE(gfn_start >= gfn_end))
+			continue;
+
+		for (i = gfn_start; i < gfn_end; i++) {
+			if (!kvm_slot_gmem_is_mappable(memslot, i))
+				return false;
+		}
+	}
+
+	return true;
+}
+
+bool kvm_gmem_is_mappable(struct kvm *kvm, gfn_t start, gfn_t end)
+{
+	bool r;
+
+	mutex_lock(&kvm->slots_lock);
+	r = __kvm_gmem_is_mappable(kvm, start, end);
+	mutex_unlock(&kvm->slots_lock);
+
+	return r;
+}
+
+static bool kvm_gmem_is_pfn_mapped(struct kvm *kvm, struct kvm_memory_slot *memslot, gfn_t gfn_idx)
+{
+	struct page *page;
+	bool is_mapped;
+	kvm_pfn_t pfn;
+
+	/*
+	 * Holds the folio lock until after checking its refcount,
+	 * to avoid races with paths that fault in the folio.
+	 */
+	if (WARN_ON_ONCE(kvm_gmem_get_pfn_locked(kvm, memslot, gfn_idx, &pfn, NULL)))
+		return false;
+
+	page = pfn_to_page(pfn);
+
+	/* Two references are expected because of kvm_gmem_get_pfn_locked(). */
+	is_mapped = page_ref_count(page) > 2;
+
+	put_page(page);
+	unlock_page(page);
+
+	return is_mapped;
+}
+
+static bool __kvm_gmem_is_mapped(struct kvm *kvm, gfn_t start, gfn_t end)
+{
+	struct kvm_memslot_iter iter;
+
+	lockdep_assert_held(&kvm->slots_lock);
+
+	kvm_for_each_memslot_in_gfn_range(&iter, kvm_memslots(kvm), start, end) {
+		struct kvm_memory_slot *memslot = iter.slot;
+		gfn_t gfn_start, gfn_end, i;
+
+		gfn_start = max(start, memslot->base_gfn);
+		gfn_end = min(end, memslot->base_gfn + memslot->npages);
+		if (WARN_ON_ONCE(gfn_start >= gfn_end))
+			continue;
+
+		for (i = gfn_start; i < gfn_end; i++) {
+			if (kvm_gmem_is_pfn_mapped(kvm, memslot, i))
+				return true;
+		}
+	}
+
+	return false;
+}
+
+bool kvm_gmem_is_mapped(struct kvm *kvm, gfn_t start, gfn_t end)
+{
+	bool r;
+
+	mutex_lock(&kvm->slots_lock);
+	r = __kvm_gmem_is_mapped(kvm, start, end);
+	mutex_unlock(&kvm->slots_lock);
+
+	return r;
+}
+
+static int kvm_gmem_toggle_mappable(struct kvm *kvm, gfn_t start, gfn_t end,
+				    bool is_mappable)
+{
+	struct kvm_memslot_iter iter;
+	int r = 0;
+
+	mutex_lock(&kvm->slots_lock);
+
+	kvm_for_each_memslot_in_gfn_range(&iter, kvm_memslots(kvm), start, end) {
+		struct kvm_memory_slot *memslot = iter.slot;
+		gfn_t gfn_start, gfn_end;
+
+		gfn_start = max(start, memslot->base_gfn);
+		gfn_end = min(end, memslot->base_gfn + memslot->npages);
+		if (WARN_ON_ONCE(start >= end))
+			continue;
+
+		if (is_mappable)
+			r = kvm_slot_gmem_set_mappable(memslot, gfn_start, gfn_end);
+		else
+			r = kvm_slot_gmem_clear_mappable(memslot, gfn_start, gfn_end);
+
+		if (WARN_ON_ONCE(r))
+			break;
+	}
+
+	mutex_unlock(&kvm->slots_lock);
+
+	return r;
+}
+
+int kvm_gmem_set_mappable(struct kvm *kvm, gfn_t start, gfn_t end)
+{
+	return kvm_gmem_toggle_mappable(kvm, start, end, true);
+}
+
+int kvm_gmem_clear_mappable(struct kvm *kvm, gfn_t start, gfn_t end)
+{
+	return kvm_gmem_toggle_mappable(kvm, start, end, false);
+}
+
+#endif /* CONFIG_KVM_GMEM_MAPPABLE */
+
 /* Copy @len bytes from guest memory at '(@gfn * PAGE_SIZE) + @offset' to @data */
 static int __kvm_read_guest_page(struct kvm_memory_slot *slot, gfn_t gfn,
 				 void *data, int offset, int len)
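
From the userspace side (illustrative sketch only, not part of the
series; error handling elided), mapping a guest_memfd now works as long
as the mapping is shared and the offsets being touched are mappable,
which they are by default at creation time:

#include <string.h>
#include <sys/mman.h>

/* Illustrative only: map a guest_memfd fd and touch its pages. */
static void *demo_map_gmem(int gmem_fd, size_t size)
{
	/* kvm_gmem_mmap() rejects anything that is not a shared mapping. */
	void *mem = mmap(NULL, size, PROT_READ | PROT_WRITE,
			 MAP_SHARED, gmem_fd, 0);

	if (mem != MAP_FAILED)
		memset(mem, 0, size);	/* faults in zeroed, uptodate folios */

	return mem;
}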
From patchwork Thu Oct 10 08:59:24 2024
X-Patchwork-Submitter: Fuad Tabba <tabba@google.com>
X-Patchwork-Id: 13829785
Date: Thu, 10 Oct 2024 09:59:24 +0100
In-Reply-To: <20241010085930.1546800-1-tabba@google.com>
References: <20241010085930.1546800-1-tabba@google.com>
Message-ID: <20241010085930.1546800-6-tabba@google.com>
Subject: [PATCH v3 05/11] KVM: guest_memfd: Add guest_memfd support to
 kvm_{read,write}_guest_page()
From: Fuad Tabba <tabba@google.com>
To: kvm@vger.kernel.org, linux-arm-msm@vger.kernel.org, linux-mm@kvack.org
Make kvm_{read,write}_guest_page() capable of accessing guest memory
for slots that don't have a userspace address, but only if the memory
is mappable, which also indicates that it is accessible by the host.

Signed-off-by: Fuad Tabba <tabba@google.com>
---
 virt/kvm/kvm_main.c | 137 ++++++++++++++++++++++++++++++++++++++------
 1 file changed, 118 insertions(+), 19 deletions(-)

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index aed9cf2f1685..77e6412034b9 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -3399,23 +3399,114 @@ int kvm_gmem_clear_mappable(struct kvm *kvm, gfn_t start, gfn_t end)
 	return kvm_gmem_toggle_mappable(kvm, start, end, false);
 }
 
+static int __kvm_read_guest_memfd_page(struct kvm *kvm,
+				       struct kvm_memory_slot *slot,
+				       gfn_t gfn, void *data, int offset,
+				       int len)
+{
+	struct page *page;
+	u64 pfn;
+	int r;
+
+	/*
+	 * Holds the folio lock until after checking whether it can be faulted
+	 * in, to avoid races with paths that change a folio's mappability.
+	 */
+	r = kvm_gmem_get_pfn_locked(kvm, slot, gfn, &pfn, NULL);
+	if (r)
+		return r;
+
+	page = pfn_to_page(pfn);
+
+	if (!kvm_gmem_is_mappable(kvm, gfn, gfn + 1)) {
+		r = -EPERM;
+		goto unlock;
+	}
+	memcpy(data, page_address(page) + offset, len);
+unlock:
+	if (r)
+		put_page(page);
+	else
+		kvm_release_pfn_clean(pfn);
+	unlock_page(page);
+
+	return r;
+}
+
+static int __kvm_write_guest_memfd_page(struct kvm *kvm,
+					struct kvm_memory_slot *slot,
+					gfn_t gfn, const void *data,
+					int offset, int len)
+{
+	struct page *page;
+	u64 pfn;
+	int r;
+
+	/*
+	 * Holds the folio lock until after checking whether it can be faulted
+	 * in, to avoid races with paths that change a folio's mappability.
+	 */
+	r = kvm_gmem_get_pfn_locked(kvm, slot, gfn, &pfn, NULL);
+	if (r)
+		return r;
+
+	page = pfn_to_page(pfn);
+
+	if (!kvm_gmem_is_mappable(kvm, gfn, gfn + 1)) {
+		r = -EPERM;
+		goto unlock;
+	}
+	memcpy(page_address(page) + offset, data, len);
+unlock:
+	if (r)
+		put_page(page);
+	else
+		kvm_release_pfn_dirty(pfn);
+	unlock_page(page);
+
+	return r;
+}
+#else
+static int __kvm_read_guest_memfd_page(struct kvm *kvm,
+				       struct kvm_memory_slot *slot,
+				       gfn_t gfn, void *data, int offset,
+				       int len)
+{
+	WARN_ON_ONCE(1);
+	return -EIO;
+}
+
+static int __kvm_write_guest_memfd_page(struct kvm *kvm,
+					struct kvm_memory_slot *slot,
+					gfn_t gfn, const void *data,
+					int offset, int len)
+{
+	WARN_ON_ONCE(1);
+	return -EIO;
+}
 #endif /* CONFIG_KVM_GMEM_MAPPABLE */
 
 /* Copy @len bytes from guest memory at '(@gfn * PAGE_SIZE) + @offset' to @data */
-static int __kvm_read_guest_page(struct kvm_memory_slot *slot, gfn_t gfn,
-				 void *data, int offset, int len)
+
+static int __kvm_read_guest_page(struct kvm *kvm, struct kvm_memory_slot *slot,
+				 gfn_t gfn, void *data, int offset, int len)
 {
-	int r;
 	unsigned long addr;
 
 	if (WARN_ON_ONCE(offset + len > PAGE_SIZE))
 		return -EFAULT;
 
+	if (IS_ENABLED(CONFIG_KVM_GMEM_MAPPABLE) &&
+	    kvm_slot_can_be_private(slot) &&
+	    !slot->userspace_addr) {
+		return __kvm_read_guest_memfd_page(kvm, slot, gfn, data,
+						   offset, len);
+	}
+
 	addr = gfn_to_hva_memslot_prot(slot, gfn, NULL);
 	if (kvm_is_error_hva(addr))
 		return -EFAULT;
-	r = __copy_from_user(data, (void __user *)addr + offset, len);
-	if (r)
+	if (__copy_from_user(data, (void __user *)addr + offset, len))
 		return -EFAULT;
 	return 0;
 }
@@ -3425,7 +3516,7 @@ int kvm_read_guest_page(struct kvm *kvm, gfn_t gfn, void *data, int offset,
 {
 	struct kvm_memory_slot *slot = gfn_to_memslot(kvm, gfn);
 
-	return __kvm_read_guest_page(slot, gfn, data, offset, len);
+	return __kvm_read_guest_page(kvm, slot, gfn, data, offset, len);
 }
 EXPORT_SYMBOL_GPL(kvm_read_guest_page);
 
@@ -3434,7 +3525,7 @@ int kvm_vcpu_read_guest_page(struct kvm_vcpu *vcpu, gfn_t gfn, void *data,
 {
 	struct kvm_memory_slot *slot = kvm_vcpu_gfn_to_memslot(vcpu, gfn);
 
-	return __kvm_read_guest_page(slot, gfn, data, offset, len);
+	return __kvm_read_guest_page(vcpu->kvm, slot, gfn, data, offset, len);
 }
 EXPORT_SYMBOL_GPL(kvm_vcpu_read_guest_page);
 
@@ -3511,22 +3602,30 @@ EXPORT_SYMBOL_GPL(kvm_vcpu_read_guest_atomic);
 /* Copy @len bytes from @data into guest memory at '(@gfn * PAGE_SIZE) + @offset' */
 static int __kvm_write_guest_page(struct kvm *kvm,
-				  struct kvm_memory_slot *memslot, gfn_t gfn,
-				  const void *data, int offset, int len)
+				  struct kvm_memory_slot *slot, gfn_t gfn,
+				  const void *data, int offset, int len)
 {
-	int r;
-	unsigned long addr;
-
 	if (WARN_ON_ONCE(offset + len > PAGE_SIZE))
 		return -EFAULT;
 
-	addr = gfn_to_hva_memslot(memslot, gfn);
-	if (kvm_is_error_hva(addr))
-		return -EFAULT;
-	r = __copy_to_user((void __user *)addr + offset, data, len);
-	if (r)
-		return -EFAULT;
-	mark_page_dirty_in_slot(kvm, memslot, gfn);
+	if (IS_ENABLED(CONFIG_KVM_GMEM_MAPPABLE) &&
+	    kvm_slot_can_be_private(slot) &&
+	    !slot->userspace_addr) {
+		int r = __kvm_write_guest_memfd_page(kvm, slot, gfn, data,
+						     offset, len);
+
+		if (r)
+			return r;
+	} else {
+		unsigned long addr = gfn_to_hva_memslot(slot, gfn);
+
+		if (kvm_is_error_hva(addr))
+			return -EFAULT;
+		if (__copy_to_user((void __user *)addr + offset, data, len))
+			return -EFAULT;
+	}
+
+	mark_page_dirty_in_slot(kvm, slot, gfn);
 	return 0;
 }
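
The slots this patch targets are guest_memfd-backed slots with no
userspace alias. As an illustrative sketch (not part of the patch;
assumes the existing KVM_SET_USER_MEMORY_REGION2 ABI), such a slot is
one created with a guest_memfd but a zero userspace_addr:

#include <sys/ioctl.h>
#include <linux/kvm.h>

/* Illustrative only: bind a guest_memfd to a slot with no host alias. */
static int demo_set_gmem_only_slot(int vm_fd, int gmem_fd,
				   __u64 gpa, __u64 size)
{
	struct kvm_userspace_memory_region2 region = {
		.slot			= 0,
		.flags			= KVM_MEM_GUEST_MEMFD,
		.guest_phys_addr	= gpa,
		.memory_size		= size,
		.userspace_addr		= 0,	/* gmem-only: no HVA */
		.guest_memfd		= gmem_fd,
		.guest_memfd_offset	= 0,
	};

	return ioctl(vm_fd, KVM_SET_USER_MEMORY_REGION2, &region);
}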
From patchwork Thu Oct 10 08:59:25 2024
X-Patchwork-Submitter: Fuad Tabba
X-Patchwork-Id: 13829786
Date: Thu, 10 Oct 2024 09:59:25 +0100
In-Reply-To: <20241010085930.1546800-1-tabba@google.com>
References: <20241010085930.1546800-1-tabba@google.com>
Message-ID: <20241010085930.1546800-7-tabba@google.com>
Subject: [PATCH v3 06/11] KVM: guest_memfd: Add KVM capability to check if
 guest_memfd is host mappable
From: Fuad Tabba
To: kvm@vger.kernel.org, linux-arm-msm@vger.kernel.org, linux-mm@kvack.org
Cc: (recipient list unchanged from the rest of the series)

Add the KVM capability KVM_CAP_GUEST_MEMFD_MAPPABLE, which reports whether
the host supports mapping guest memory.
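As a usage sketch (illustrative only; vm_fd is an assumed, already-created
VM file descriptor), userspace can probe the capability with the existing
KVM_CHECK_EXTENSION ioctl before relying on host mappings:

	int mappable = ioctl(vm_fd, KVM_CHECK_EXTENSION,
			     KVM_CAP_GUEST_MEMFD_MAPPABLE);
	if (mappable <= 0) {
		/* Host cannot map guest_memfd memory; use a private-only setup. */
	}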
Signed-off-by: Fuad Tabba
---
 include/uapi/linux/kvm.h | 1 +
 virt/kvm/kvm_main.c      | 4 ++++
 2 files changed, 5 insertions(+)

diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index 637efc055145..2c6057bab71c 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -933,6 +933,7 @@ struct kvm_enable_cap {
 #define KVM_CAP_PRE_FAULT_MEMORY 236
 #define KVM_CAP_X86_APIC_BUS_CYCLES_NS 237
 #define KVM_CAP_X86_GUEST_MODE 238
+#define KVM_CAP_GUEST_MEMFD_MAPPABLE 239
 
 struct kvm_irq_routing_irqchip {
 	__u32 irqchip;
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 77e6412034b9..c2ff09197795 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -5176,6 +5176,10 @@ static int kvm_vm_ioctl_check_extension_generic(struct kvm *kvm, long arg)
 #ifdef CONFIG_KVM_PRIVATE_MEM
 	case KVM_CAP_GUEST_MEMFD:
 		return !kvm || kvm_arch_has_private_mem(kvm);
+#endif
+#ifdef CONFIG_KVM_GMEM_MAPPABLE
+	case KVM_CAP_GUEST_MEMFD_MAPPABLE:
+		return !kvm || kvm_arch_has_private_mem(kvm);
 #endif
 	default:
 		break;

From patchwork Thu Oct 10 08:59:26 2024
X-Patchwork-Submitter: Fuad Tabba
X-Patchwork-Id: 13829787
Date: Thu, 10 Oct 2024 09:59:26 +0100
In-Reply-To: <20241010085930.1546800-1-tabba@google.com>
References: <20241010085930.1546800-1-tabba@google.com>
Message-ID: <20241010085930.1546800-8-tabba@google.com>
Subject: [PATCH v3 07/11] KVM: guest_memfd: Add a guest_memfd() flag to
 initialize it as mappable
From: Fuad Tabba
To: kvm@vger.kernel.org, linux-arm-msm@vger.kernel.org, linux-mm@kvack.org
Cc: (recipient list unchanged from the rest of the series)

Not all use cases require guest_memfd() to be mappable by the host when
first created.
Add a new flag, GUEST_MEMFD_FLAG_INIT_MAPPABLE, which, when set on
KVM_CREATE_GUEST_MEMFD, initializes the memory as mappable by the host.
Otherwise, the memory is private until the guest shares it with the host.

Signed-off-by: Fuad Tabba
---
 Documentation/virt/kvm/api.rst | 4 ++++
 include/uapi/linux/kvm.h       | 1 +
 virt/kvm/guest_memfd.c         | 6 +++++-
 3 files changed, 10 insertions(+), 1 deletion(-)

diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
index e32471977d0a..c503f9443335 100644
--- a/Documentation/virt/kvm/api.rst
+++ b/Documentation/virt/kvm/api.rst
@@ -6380,6 +6380,10 @@ most one mapping per page, i.e. binding multiple memory regions to a single
 guest_memfd range is not allowed (any number of memory regions can be bound to
 a single guest_memfd file, but the bound ranges must not overlap).
 
+If the capability KVM_CAP_GUEST_MEMFD_MAPPABLE is supported, then the flags
+field supports GUEST_MEMFD_FLAG_INIT_MAPPABLE, which initializes the memory
+as mappable by the host.
+
 See KVM_SET_USER_MEMORY_REGION2 for additional details.
 
 4.143 KVM_PRE_FAULT_MEMORY
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index 2c6057bab71c..751f167d0f33 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -1558,6 +1558,7 @@ struct kvm_memory_attributes {
 #define KVM_MEMORY_ATTRIBUTE_PRIVATE           (1ULL << 3)
 
 #define KVM_CREATE_GUEST_MEMFD	_IOWR(KVMIO,  0xd4, struct kvm_create_guest_memfd)
+#define GUEST_MEMFD_FLAG_INIT_MAPPABLE	BIT(0)
 
 struct kvm_create_guest_memfd {
 	__u64 size;
diff --git a/virt/kvm/guest_memfd.c b/virt/kvm/guest_memfd.c
index df3a6f05a16e..9080fa29cd8c 100644
--- a/virt/kvm/guest_memfd.c
+++ b/virt/kvm/guest_memfd.c
@@ -734,7 +734,8 @@ static int __kvm_gmem_create(struct kvm *kvm, loff_t size, u64 flags)
 		goto err_gmem;
 	}
 
-	if (IS_ENABLED(CONFIG_KVM_GMEM_MAPPABLE)) {
+	if (IS_ENABLED(CONFIG_KVM_GMEM_MAPPABLE) &&
+	    (flags & GUEST_MEMFD_FLAG_INIT_MAPPABLE)) {
 		err = gmem_set_mappable(file_inode(file), 0, size >> PAGE_SHIFT);
 		if (err) {
 			fput(file);
@@ -763,6 +764,9 @@ int kvm_gmem_create(struct kvm *kvm, struct kvm_create_guest_memfd *args)
 	u64 flags = args->flags;
 	u64 valid_flags = 0;
 
+	if (IS_ENABLED(CONFIG_KVM_GMEM_MAPPABLE))
+		valid_flags |= GUEST_MEMFD_FLAG_INIT_MAPPABLE;
+
 	if (flags & ~valid_flags)
 		return -EINVAL;
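A minimal usage sketch for the new flag (illustrative only; vm_fd and
mem_size are assumed, and error handling is omitted):

	struct kvm_create_guest_memfd gmem = {
		.size  = mem_size,
		.flags = GUEST_MEMFD_FLAG_INIT_MAPPABLE,
	};
	int gmem_fd = ioctl(vm_fd, KVM_CREATE_GUEST_MEMFD, &gmem);
	/*
	 * The memory backing gmem_fd starts out mappable by the host;
	 * without the flag it would stay private until the guest shares it.
	 */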
From patchwork Thu Oct 10 08:59:27 2024
X-Patchwork-Submitter: Fuad Tabba
X-Patchwork-Id: 13829788
Date: Thu, 10 Oct 2024 09:59:27 +0100
In-Reply-To: <20241010085930.1546800-1-tabba@google.com>
References: <20241010085930.1546800-1-tabba@google.com>
Message-ID: <20241010085930.1546800-9-tabba@google.com>
Subject: [PATCH v3 08/11] KVM: guest_memfd: selftests: guest_memfd mmap()
 test when mapping is allowed
From: Fuad Tabba
To: kvm@vger.kernel.org, linux-arm-msm@vger.kernel.org, linux-mm@kvack.org
Cc: (recipient list unchanged from the rest of the series)
Expand the guest_memfd selftests to test mapping guest memory when the
capability is supported, and to verify that memory is still not mappable
when the capability isn't supported.

Also, build the guest_memfd selftest for aarch64.

Signed-off-by: Fuad Tabba
---
 tools/testing/selftests/kvm/Makefile          |  1 +
 .../testing/selftests/kvm/guest_memfd_test.c  | 57 +++++++++++++++++--
 2 files changed, 53 insertions(+), 5 deletions(-)

diff --git a/tools/testing/selftests/kvm/Makefile b/tools/testing/selftests/kvm/Makefile
index 960cf6a77198..c4937b6c2a97 100644
--- a/tools/testing/selftests/kvm/Makefile
+++ b/tools/testing/selftests/kvm/Makefile
@@ -172,6 +172,7 @@ TEST_GEN_PROGS_aarch64 += coalesced_io_test
 TEST_GEN_PROGS_aarch64 += demand_paging_test
 TEST_GEN_PROGS_aarch64 += dirty_log_test
 TEST_GEN_PROGS_aarch64 += dirty_log_perf_test
+TEST_GEN_PROGS_aarch64 += guest_memfd_test
 TEST_GEN_PROGS_aarch64 += guest_print_test
 TEST_GEN_PROGS_aarch64 += get-reg-list
 TEST_GEN_PROGS_aarch64 += kvm_create_max_vcpus
diff --git a/tools/testing/selftests/kvm/guest_memfd_test.c b/tools/testing/selftests/kvm/guest_memfd_test.c
index ba0c8e996035..ae64027d5bd8 100644
--- a/tools/testing/selftests/kvm/guest_memfd_test.c
+++ b/tools/testing/selftests/kvm/guest_memfd_test.c
@@ -34,12 +34,55 @@ static void test_file_read_write(int fd)
 		    "pwrite on a guest_mem fd should fail");
 }
 
-static void test_mmap(int fd, size_t page_size)
+static void test_mmap_allowed(int fd, size_t total_size)
 {
+	size_t page_size = getpagesize();
+	char *mem;
+	int ret;
+	int i;
+
+	mem = mmap(NULL, total_size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
+	TEST_ASSERT(mem != MAP_FAILED, "mmaping() guest memory should pass.");
+
+	memset(mem, 0xaa, total_size);
+	for (i = 0; i < total_size; i++)
+		TEST_ASSERT_EQ(mem[i], 0xaa);
+
+	ret = fallocate(fd, FALLOC_FL_KEEP_SIZE | FALLOC_FL_PUNCH_HOLE, 0,
+			page_size);
+	TEST_ASSERT(!ret, "fallocate the first page should succeed");
+
+	for (i = 0; i < page_size; i++)
+		TEST_ASSERT_EQ(mem[i], 0x00);
+	for (; i < total_size; i++)
+		TEST_ASSERT_EQ(mem[i], 0xaa);
+
+	memset(mem, 0xaa, total_size);
+	for (i = 0; i < total_size; i++)
+		TEST_ASSERT_EQ(mem[i], 0xaa);
+
+	ret = munmap(mem, total_size);
+	TEST_ASSERT(!ret, "munmap should succeed");
+}
+
+static void test_mmap_denied(int fd, size_t total_size)
+{
+	size_t page_size = getpagesize();
 	char *mem;
 
 	mem = mmap(NULL, page_size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
 	TEST_ASSERT_EQ(mem, MAP_FAILED);
+
+	mem = mmap(NULL, total_size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
+	TEST_ASSERT_EQ(mem, MAP_FAILED);
+}
+
+static void test_mmap(int fd, size_t total_size)
+{
+	if (kvm_has_cap(KVM_CAP_GUEST_MEMFD_MAPPABLE))
+		test_mmap_allowed(fd, total_size);
+	else
+		test_mmap_denied(fd, total_size);
 }
 
 static void test_file_size(int fd, size_t page_size, size_t total_size)
@@ -172,13 +215,17 @@ static void test_create_guest_memfd_multiple(struct kvm_vm *vm)
 
 int main(int argc, char *argv[])
 {
-	size_t page_size;
+	uint64_t flags = 0;
+	struct kvm_vm *vm;
 	size_t total_size;
+	size_t page_size;
 	int fd;
-	struct kvm_vm *vm;
 
 	TEST_REQUIRE(kvm_has_cap(KVM_CAP_GUEST_MEMFD));
 
+	if (kvm_has_cap(KVM_CAP_GUEST_MEMFD_MAPPABLE))
+		flags |= GUEST_MEMFD_FLAG_INIT_MAPPABLE;
+
 	page_size = getpagesize();
 	total_size = page_size * 4;
 
@@ -187,10 +234,10 @@ int main(int argc, char *argv[])
 	test_create_guest_memfd_invalid(vm);
 	test_create_guest_memfd_multiple(vm);
 
-	fd = vm_create_guest_memfd(vm, total_size, 0);
+	fd = vm_create_guest_memfd(vm, total_size, flags);
 
 	test_file_read_write(fd);
-	test_mmap(fd, page_size);
+	test_mmap(fd, total_size);
 	test_file_size(fd, page_size, total_size);
 	test_fallocate(fd, page_size, total_size);
 	test_invalid_punch_hole(fd, page_size, total_size);
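As a usage note (standard kselftest invocation, assumed rather than stated
in this posting), the expanded test can be built and run from the top of the
kernel tree with:

	$ make -C tools/testing/selftests TARGETS=kvm
	$ ./tools/testing/selftests/kvm/guest_memfd_test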
From patchwork Thu Oct 10 08:59:28 2024
X-Patchwork-Submitter: Fuad Tabba
X-Patchwork-Id: 13829789
Date: Thu, 10 Oct 2024 09:59:28 +0100
In-Reply-To: <20241010085930.1546800-1-tabba@google.com>
References: <20241010085930.1546800-1-tabba@google.com>
Message-ID: <20241010085930.1546800-10-tabba@google.com>
Subject: [PATCH v3 09/11] KVM: arm64: Skip VMA checks for slots without
 userspace address
From: Fuad Tabba
To: kvm@vger.kernel.org, linux-arm-msm@vger.kernel.org, linux-mm@kvack.org
Cc: (recipient list unchanged from the rest of the series)
Memory slots backed by guest memory might be created with no intention of
being mapped by the host. These are recognized by not having a userspace
address in the memory slot.

VMA checks are neither possible nor necessary for this kind of slot, so
skip them.

Signed-off-by: Fuad Tabba
---
 arch/arm64/kvm/mmu.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
index a509b63bd4dd..71ceea661701 100644
--- a/arch/arm64/kvm/mmu.c
+++ b/arch/arm64/kvm/mmu.c
@@ -987,6 +987,10 @@ static void stage2_unmap_memslot(struct kvm *kvm,
 	phys_addr_t size = PAGE_SIZE * memslot->npages;
 	hva_t reg_end = hva + size;
 
+	/* Host will not map this private memory without a userspace address. */
+	if (kvm_slot_can_be_private(memslot) && !hva)
+		return;
+
 	/*
 	 * A memory region could potentially cover multiple VMAs, and any holes
 	 * between them, so iterate over all of them to find out if we should
@@ -2126,6 +2130,10 @@ int kvm_arch_prepare_memory_region(struct kvm *kvm,
 	hva = new->userspace_addr;
 	reg_end = hva + (new->npages << PAGE_SHIFT);
 
+	/* Host will not map this private memory without a userspace address. */
+	if ((kvm_slot_can_be_private(new)) && !hva)
+		return 0;
+
 	mmap_read_lock(current->mm);
 	/*
 	 * A memory region could potentially cover multiple VMAs, and any holes
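For illustration, a slot of this kind would be registered roughly as follows
(a sketch assuming the upstream KVM_SET_USER_MEMORY_REGION2 uAPI; gpa, size,
gmem_fd and vm_fd are placeholders):

	struct kvm_userspace_memory_region2 region = {
		.slot            = 0,
		.flags           = KVM_MEM_GUEST_MEMFD,
		.guest_phys_addr = gpa,
		.memory_size     = size,
		.userspace_addr  = 0,	/* no host mapping, so VMA checks are skipped */
		.guest_memfd     = gmem_fd,
	};
	ioctl(vm_fd, KVM_SET_USER_MEMORY_REGION2, &region);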
From patchwork Thu Oct 10 08:59:29 2024
X-Patchwork-Submitter: Fuad Tabba
X-Patchwork-Id: 13829790
Date: Thu, 10 Oct 2024 09:59:29 +0100
In-Reply-To: <20241010085930.1546800-1-tabba@google.com>
References: <20241010085930.1546800-1-tabba@google.com>
Message-ID: <20241010085930.1546800-11-tabba@google.com>
Subject: [PATCH v3 10/11] KVM: arm64: Handle guest_memfd()-backed guest page
 faults
From: Fuad Tabba
To: kvm@vger.kernel.org, linux-arm-msm@vger.kernel.org, linux-mm@kvack.org
Cc: (recipient list unchanged from the rest of the series)
Add arm64 support for resolving guest page faults on guest_memfd()-backed
memslots. This support is not contingent on pKVM, or other confidential
computing support, and works in both VHE and nVHE modes.

Without confidential computing, this support is useful for testing and
debugging. In the future, it might also be useful should a user want to use
guest_memfd() for all code, whether it's for a protected guest or not.

For now, the fault granule is restricted to PAGE_SIZE.

Signed-off-by: Fuad Tabba
---
 arch/arm64/kvm/mmu.c | 112 ++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 110 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
index 71ceea661701..250c59f0ca5b 100644
--- a/arch/arm64/kvm/mmu.c
+++ b/arch/arm64/kvm/mmu.c
@@ -1422,6 +1422,108 @@ static bool kvm_vma_mte_allowed(struct vm_area_struct *vma)
 	return vma->vm_flags & VM_MTE_ALLOWED;
 }
 
+static int guest_memfd_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
+			     struct kvm_memory_slot *memslot, bool fault_is_perm)
+{
+	struct kvm_mmu_memory_cache *memcache = &vcpu->arch.mmu_page_cache;
+	bool exec_fault = kvm_vcpu_trap_is_exec_fault(vcpu);
+	bool logging_active = memslot_is_logging(memslot);
+	struct kvm_pgtable *pgt = vcpu->arch.hw_mmu->pgt;
+	enum kvm_pgtable_prot prot = KVM_PGTABLE_PROT_R;
+	bool write_fault = kvm_is_write_fault(vcpu);
+	struct mm_struct *mm = current->mm;
+	gfn_t gfn = gpa_to_gfn(fault_ipa);
+	struct kvm *kvm = vcpu->kvm;
+	struct page *page;
+	kvm_pfn_t pfn;
+	int ret;
+
+	/* For now, guest_memfd() only supports PAGE_SIZE granules. */
+	if (WARN_ON_ONCE(fault_is_perm &&
+			 kvm_vcpu_trap_get_perm_fault_granule(vcpu) != PAGE_SIZE)) {
+		return -EFAULT;
+	}
+
+	VM_BUG_ON(write_fault && exec_fault);
+
+	if (fault_is_perm && !write_fault && !exec_fault) {
+		kvm_err("Unexpected L2 read permission error\n");
+		return -EFAULT;
+	}
+
+	/*
+	 * Permission faults just need to update the existing leaf entry,
+	 * and so normally don't require allocations from the memcache. The
+	 * only exception to this is when dirty logging is enabled at runtime
+	 * and a write fault needs to collapse a block entry into a table.
+	 */
+	if (!fault_is_perm || (logging_active && write_fault)) {
+		ret = kvm_mmu_topup_memory_cache(memcache,
+						 kvm_mmu_cache_min_pages(vcpu->arch.hw_mmu));
+		if (ret)
+			return ret;
+	}
+
+	/*
+	 * Holds the folio lock until mapped in the guest and its refcount is
+	 * stable, to avoid races with paths that check if the folio is mapped
+	 * by the host.
+	 */
+	ret = kvm_gmem_get_pfn_locked(kvm, memslot, gfn, &pfn, NULL);
+	if (ret)
+		return ret;
+
+	page = pfn_to_page(pfn);
+
+	/*
+	 * Once it's faulted in, a guest_memfd() page will stay in memory.
+	 * Therefore, count it as locked.
+	 */
+	if (!fault_is_perm) {
+		ret = account_locked_vm(mm, 1, true);
+		if (ret)
+			goto unlock_page;
+	}
+
+	read_lock(&kvm->mmu_lock);
+	if (write_fault)
+		prot |= KVM_PGTABLE_PROT_W;
+
+	if (exec_fault)
+		prot |= KVM_PGTABLE_PROT_X;
+
+	if (cpus_have_final_cap(ARM64_HAS_CACHE_DIC))
+		prot |= KVM_PGTABLE_PROT_X;
+
+	/*
+	 * Under the premise of getting a FSC_PERM fault, we just need to relax
+	 * permissions.
+	 */
+	if (fault_is_perm)
+		ret = kvm_pgtable_stage2_relax_perms(pgt, fault_ipa, prot);
+	else
+		ret = kvm_pgtable_stage2_map(pgt, fault_ipa, PAGE_SIZE,
+					     __pfn_to_phys(pfn), prot,
+					     memcache,
+					     KVM_PGTABLE_WALK_HANDLE_FAULT |
+					     KVM_PGTABLE_WALK_SHARED);
+
+	/* Mark the page dirty only if the fault is handled successfully */
+	if (write_fault && !ret) {
+		kvm_set_pfn_dirty(pfn);
+		mark_page_dirty_in_slot(kvm, memslot, gfn);
+	}
+	read_unlock(&kvm->mmu_lock);
+
+	if (ret && !fault_is_perm)
+		account_locked_vm(mm, 1, false);
+unlock_page:
+	put_page(page);
+	unlock_page(page);
+
+	return ret != -EAGAIN ? ret : 0;
+}
+
 static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
 			  struct kvm_s2_trans *nested,
 			  struct kvm_memory_slot *memslot, unsigned long hva,
@@ -1893,8 +1995,14 @@ int kvm_handle_guest_abort(struct kvm_vcpu *vcpu)
 		goto out_unlock;
 	}
 
-	ret = user_mem_abort(vcpu, fault_ipa, nested, memslot, hva,
-			     esr_fsc_is_permission_fault(esr));
+	if (kvm_slot_can_be_private(memslot)) {
+		ret = guest_memfd_abort(vcpu, fault_ipa, memslot,
+					esr_fsc_is_permission_fault(esr));
+	} else {
+		ret = user_mem_abort(vcpu, fault_ipa, nested, memslot, hva,
+				     esr_fsc_is_permission_fault(esr));
+	}
+
 	if (ret == 0)
 		ret = 1;
 out:

From patchwork Thu Oct 10 08:59:30 2024
X-Patchwork-Submitter: Fuad Tabba
X-Patchwork-Id: 13829791
Date: Thu, 10 Oct 2024 09:59:30 +0100
In-Reply-To: <20241010085930.1546800-1-tabba@google.com>
References: <20241010085930.1546800-1-tabba@google.com>
Message-ID: <20241010085930.1546800-12-tabba@google.com>
Subject: [PATCH v3 11/11] KVM: arm64: Enable guest_memfd private memory when
 pKVM is enabled
From: Fuad Tabba
To: kvm@vger.kernel.org, linux-arm-msm@vger.kernel.org, linux-mm@kvack.org
Cc: (recipient list unchanged from the rest of the series)
Implement kvm_arch_has_private_mem() in arm64 when pKVM is enabled, and make
it dependent on the configuration option.

Also, now that the infrastructure is in place for arm64 to support guest
private memory, enable it in the arm64 kernel configuration.

Signed-off-by: Fuad Tabba
---
 arch/arm64/include/asm/kvm_host.h | 3 +++
 arch/arm64/kvm/Kconfig            | 1 +
 2 files changed, 4 insertions(+)

diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index 94cff508874b..eec32e537097 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -1496,4 +1496,7 @@ void kvm_set_vm_id_reg(struct kvm *kvm, u32 reg, u64 val);
 	(system_supports_fpmr() &&					\
 	 kvm_has_feat((k), ID_AA64PFR2_EL1, FPMR, IMP))
 
+#define kvm_arch_has_private_mem(kvm)					\
+	(IS_ENABLED(CONFIG_KVM_PRIVATE_MEM) && is_protected_kvm_enabled())
+
 #endif /* __ARM64_KVM_HOST_H__ */
diff --git a/arch/arm64/kvm/Kconfig b/arch/arm64/kvm/Kconfig
index ead632ad01b4..fe3451f244b5 100644
--- a/arch/arm64/kvm/Kconfig
+++ b/arch/arm64/kvm/Kconfig
@@ -38,6 +38,7 @@ menuconfig KVM
 	select HAVE_KVM_VCPU_RUN_PID_CHANGE
 	select SCHED_INFO
 	select GUEST_PERF_EVENTS if PERF_EVENTS
+	select KVM_GMEM_MAPPABLE
 	help
 	  Support hosting virtualized guest machines.
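Taken together, the series permits a flow like the following sketch
(illustrative only; error handling is omitted, payload/payload_len are
placeholders, and the memslot registration mirrors the region2 example
earlier in the series):

	if (ioctl(vm_fd, KVM_CHECK_EXTENSION, KVM_CAP_GUEST_MEMFD_MAPPABLE) > 0) {
		struct kvm_create_guest_memfd gmem = {
			.size  = mem_size,
			.flags = GUEST_MEMFD_FLAG_INIT_MAPPABLE,
		};
		int gmem_fd = ioctl(vm_fd, KVM_CREATE_GUEST_MEMFD, &gmem);

		/* The host can populate guest memory directly while it is shared. */
		void *mem = mmap(NULL, mem_size, PROT_READ | PROT_WRITE,
				 MAP_SHARED, gmem_fd, 0);
		memcpy(mem, payload, payload_len);
	}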