From: Fuad Tabba
Date: Thu, 1 Aug 2024 10:01:07 +0100
Message-ID: <20240801090117.3841080-1-tabba@google.com>
Subject: [RFC PATCH v2 00/10] KVM: Restricted mapping of guest_memfd at the
 host and pKVM/arm64 support
To: kvm@vger.kernel.org, linux-arm-msm@vger.kernel.org, linux-mm@kvack.org
Cc: pbonzini@redhat.com, chenhuacai@kernel.org, mpe@ellerman.id.au,
 anup@brainfault.org, paul.walmsley@sifive.com, palmer@dabbelt.com,
 aou@eecs.berkeley.edu, seanjc@google.com, viro@zeniv.linux.org.uk,
 brauner@kernel.org, willy@infradead.org, akpm@linux-foundation.org,
 xiaoyao.li@intel.com, yilun.xu@intel.com, chao.p.peng@linux.intel.com,
 jarkko@kernel.org, amoorthy@google.com, dmatlack@google.com,
 yu.c.zhang@linux.intel.com, isaku.yamahata@intel.com, mic@digikod.net,
 vbabka@suse.cz, vannapurve@google.com, ackerleytng@google.com,
 mail@maciej.szmigiero.name, david@redhat.com, michael.roth@amd.com,
 wei.w.wang@intel.com, liam.merwick@oracle.com, isaku.yamahata@gmail.com,
 kirill.shutemov@linux.intel.com, suzuki.poulose@arm.com,
 steven.price@arm.com, quic_eberman@quicinc.com, quic_mnalajal@quicinc.com,
 quic_tsoni@quicinc.com, quic_svaddagi@quicinc.com,
 quic_cvanscha@quicinc.com, quic_pderrin@quicinc.com,
 quic_pheragu@quicinc.com, catalin.marinas@arm.com, james.morse@arm.com,
 yuzenghui@huawei.com, oliver.upton@linux.dev, maz@kernel.org,
 will@kernel.org, qperret@google.com, keirf@google.com, roypat@amazon.co.uk,
 shuah@kernel.org, hch@infradead.org, jgg@nvidia.com, rientjes@google.com,
 jhubbard@nvidia.com, fvdl@google.com, hughd@google.com, tabba@google.com

This series adds restricted mmap() support to guest_memfd, as well as
support for guest_memfd on pKVM/arm64. It is based on Linux 6.10.

Main changes since V1 [1]:

- Decoupled whether guest memory is mappable from KVM memory
  attributes (SeanC)

  Mappability is now tracked in the guest_mem object, orthogonal to
  whether userspace wants the memory to be private or shared.
  Moreover, the memory attributes capability (i.e.,
  KVM_CAP_MEMORY_ATTRIBUTES) is not enabled for pKVM, since for
  software-based hypervisors such as pKVM and Gunyah, userspace is
  informed of the state of the memory via hypervisor exits if needed.

  Even if attributes are enabled, this patch series would still work
  (modulo bugs), without compromising guest memory or crashing the
  system.

- Use page_mapped() instead of page_mapcount() to check if page is
  mapped (DavidH)

- Add a new capability, KVM_CAP_GUEST_MEMFD_MAPPABLE, to query whether
  guest private memory can be mapped (with aforementioned restrictions)

- Add a selftest to check whether memory is mappable when the
  capability is enabled, and not mappable otherwise. Also, test the
  effect of punching holes in mapped memory. (DavidH)

By design, guest_memfd cannot be mapped, read, or written by the host.
In pKVM, memory shared between a protected guest and the host is shared
in-place, unlike the other confidential computing solutions that
guest_memfd was originally envisaged for (e.g., TDX). When initializing
a guest, as well as when accessing memory shared by the guest with the
host, it would be useful to support mapping that memory at the host to
avoid copying its contents.

One of the benefits of guest_memfd is that it prevents a misbehaving
host process from crashing the system when attempting to access
(deliberately or accidentally) protected guest memory, since this
memory isn't mapped to begin with. Without guest_memfd, the hypervisor
would still prevent such accesses, but in certain cases the host kernel
wouldn't be able to recover, causing the system to crash.

Support for mmap() in this patch series maintains the invariant that
only memory shared with the host, either explicitly by the guest or
implicitly before the guest has started running (in order to populate
its memory), is allowed to have a valid mapping at the host. At no time
should private (as viewed by the hypervisor) guest memory be mapped at
the host.

This patch series is divided into two parts:

The first part applies to the KVM core code. It adds opt-in support for
mapping guest memory only as long as it is shared. For that, the host
needs to know the mappability status of guest memory. Therefore, the
series adds a structure to track whether memory is mappable. This new
structure is associated with each guest_memfd object.

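To make the tracking idea concrete, here is a minimal userspace sketch
of a per-object mappability tracker. It is not the structure used in
the patches: the names gmem_mappability, gmem_init(), gmem_set_range()
and gmem_is_mappable(), the fixed page count, and the choice of a
bitmap are all assumptions made for illustration only.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical: number of pages backing one guest_memfd object. */
#define GMEM_NR_PAGES 64

#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)
#define NR_WORDS ((GMEM_NR_PAGES + BITS_PER_WORD - 1) / BITS_PER_WORD)

/* Hypothetical tracker: one bit per page; a set bit means the host may map it. */
struct gmem_mappability {
	unsigned long bits[NR_WORDS];
};

/* Set the initial state of every page; the default policy is the caller's choice. */
static void gmem_init(struct gmem_mappability *m, bool mappable)
{
	memset(m->bits, mappable ? 0xff : 0x00, sizeof(m->bits));
}

/* Mark pages [start, end) as mappable (shared) or not mappable (private). */
static void gmem_set_range(struct gmem_mappability *m, size_t start,
			   size_t end, bool mappable)
{
	assert(start <= end && end <= GMEM_NR_PAGES);

	for (size_t i = start; i < end; i++) {
		unsigned long mask = 1UL << (i % BITS_PER_WORD);

		if (mappable)
			m->bits[i / BITS_PER_WORD] |= mask;
		else
			m->bits[i / BITS_PER_WORD] &= ~mask;
	}
}

/* Query used before installing a host mapping for the page at pgoff. */
static bool gmem_is_mappable(const struct gmem_mappability *m, size_t pgoff)
{
	assert(pgoff < GMEM_NR_PAGES);

	return m->bits[pgoff / BITS_PER_WORD] & (1UL << (pgoff % BITS_PER_WORD));
}
```

In this model, a transition of pages from shared to private would call
gmem_set_range(..., false), and any path that installs a host mapping
would consult gmem_is_mappable() first. A kernel implementation would
also need locking and a sparse representation, which this sketch
leaves out.
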
The second part of the series adds guest_memfd support for pKVM/arm64.

We don't enforce the invariant that only memory shared with the host
can be mapped by the host userspace in file_operations::mmap(), but we
enforce it in vm_operations_struct::fault(). On
vm_operations_struct::fault(), we check whether the page is allowed to
be mapped. If not, we deliver a SIGBUS to the current task, as
discussed in the Linux MM Alignment Session on this topic [2].

Currently there's no support for huge pages, which is something we hope
to add in the future, and seems to be a hot topic for the upcoming LPC
2024 [3].

Cheers,
/fuad

[1] https://lore.kernel.org/all/20240222161047.402609-1-tabba@google.com/

[2] https://lore.kernel.org/all/20240712232937.2861788-1-ackerleytng@google.com/

[3] https://lpc.events/event/18/sessions/183/#20240919

Fuad Tabba (10):
  KVM: Introduce kvm_gmem_get_pfn_locked(), which retains the folio lock
  KVM: Add restricted support for mapping guestmem by the host
  KVM: Implement kvm_(read|/write)_guest_page for private memory slots
  KVM: Add KVM capability to check if guest_memfd can be mapped by the
    host
  KVM: selftests: guest_memfd mmap() test when mapping is allowed
  KVM: arm64: Skip VMA checks for slots without userspace address
  KVM: arm64: Do not allow changes to private memory slots
  KVM: arm64: Handle guest_memfd()-backed guest page faults
  KVM: arm64: arm64 has private memory support when config is enabled
  KVM: arm64: Enable private memory kconfig for arm64

 arch/arm64/include/asm/kvm_host.h             |   3 +
 arch/arm64/kvm/Kconfig                        |   1 +
 arch/arm64/kvm/mmu.c                          | 139 +++++++++-
 include/linux/kvm_host.h                      |  72 +++++
 include/uapi/linux/kvm.h                      |   3 +-
 tools/testing/selftests/kvm/Makefile          |   1 +
 .../testing/selftests/kvm/guest_memfd_test.c  |  47 +++-
 virt/kvm/Kconfig                              |   4 +
 virt/kvm/guest_memfd.c                        | 129 ++++++++-
 virt/kvm/kvm_main.c                           | 253 ++++++++++++++++--
 10 files changed, 628 insertions(+), 24 deletions(-)

base-commit: 0c3836482481200ead7b416ca80c68a29cfdaabd