From patchwork Thu Oct 10 08:59:19 2024
X-Patchwork-Submitter: Fuad Tabba
X-Patchwork-Id: 13829771
Date: Thu, 10 Oct 2024 09:59:19 +0100
Message-ID: <20241010085930.1546800-1-tabba@google.com>
Subject: [PATCH v3 00/11] KVM: Restricted mapping of guest_memfd at the host and arm64 support
From: Fuad Tabba
To: kvm@vger.kernel.org, linux-arm-msm@vger.kernel.org, linux-mm@kvack.org
Cc: pbonzini@redhat.com, chenhuacai@kernel.org, mpe@ellerman.id.au,
 anup@brainfault.org, paul.walmsley@sifive.com, palmer@dabbelt.com,
 aou@eecs.berkeley.edu, seanjc@google.com, viro@zeniv.linux.org.uk,
 brauner@kernel.org, willy@infradead.org, akpm@linux-foundation.org,
 xiaoyao.li@intel.com, yilun.xu@intel.com, chao.p.peng@linux.intel.com,
 jarkko@kernel.org, amoorthy@google.com, dmatlack@google.com,
 yu.c.zhang@linux.intel.com, isaku.yamahata@intel.com, mic@digikod.net,
 vbabka@suse.cz, vannapurve@google.com, ackerleytng@google.com,
 mail@maciej.szmigiero.name, david@redhat.com, michael.roth@amd.com,
 wei.w.wang@intel.com, liam.merwick@oracle.com, isaku.yamahata@gmail.com,
 kirill.shutemov@linux.intel.com, suzuki.poulose@arm.com,
 steven.price@arm.com, quic_eberman@quicinc.com, quic_mnalajal@quicinc.com,
 quic_tsoni@quicinc.com, quic_svaddagi@quicinc.com,
 quic_cvanscha@quicinc.com, quic_pderrin@quicinc.com,
 quic_pheragu@quicinc.com, catalin.marinas@arm.com, james.morse@arm.com,
 yuzenghui@huawei.com, oliver.upton@linux.dev, maz@kernel.org,
 will@kernel.org, qperret@google.com, keirf@google.com,
 roypat@amazon.co.uk, shuah@kernel.org, hch@infradead.org, jgg@nvidia.com,
 rientjes@google.com, jhubbard@nvidia.com, fvdl@google.com,
 hughd@google.com, jthoughton@google.com, tabba@google.com

This series adds restricted mmap() support to guest_memfd, as well as
support for guest_memfd on arm64. It is based on Linux 6.12-rc2.

Changes since V2 [1]:
- Use refcount to determine whether a page/folio is mapped by the
  host rather than folio_mapcount()+folio_maybe_dma_pinned() (DavidH)
- Track mappability of guest memory at the host in the guest_memfd
  inode (Ackerley)
- Refactoring and tidying up (Sean, Ackerley)

By design, guest_memfd cannot be mapped, read, or written by the host.
In pKVM, memory shared between a protected guest and the host is
shared in-place, unlike other confidential computing solutions that
guest_memfd was originally envisaged for (e.g., TDX). When initializing
a guest, as well as when accessing memory shared by a protected guest
with the host, it would be useful to support mapping guest memory at
the host to avoid copying its contents.

One of the benefits of guest_memfd is that it prevents a misbehaving
host from crashing the system when attempting to access private guest
memory (deliberately or accidentally), since this memory isn't mapped
to begin with. Without guest_memfd, the hypervisor would still prevent
such accesses, but in certain cases the host kernel wouldn't be able
to recover, causing the system to crash.

Support for mmap() in this patch series maintains the invariant that
only memory shared with the host, either explicitly by the guest or
implicitly before the guest has started running (in order to populate
its memory), is allowed to have a valid mapping at the host. At no
point should _private_ guest memory have any mappings at the host.

This patch series is divided into two parts:

The first part applies to the KVM core code. It adds opt-in support
for mapping guest memory only as long as it is shared, or optionally
when it is first created. For that, the host needs to know the
mappability status of guest memory. Therefore, the series adds a
structure to track whether memory is mappable. This new structure is
associated with each guest_memfd inode object.

The second part of the series adds guest_memfd support for arm64.

The patch series enforces the invariant that only memory shared with
the host can be mapped by the host userspace in
vm_operations_struct:fault(), instead of file_operations:mmap(). On a
fault, we check whether the page is allowed to be mapped. If not, we
deliver a SIGBUS to the current task, as discussed in the Linux MM
Alignment Session and LPC 2024 on this topic [2,3].

Currently, there's no support for huge pages, which is something we
hope to support in the near future [4].

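To make the fault-time check easier to picture, here is a rough
userspace model of it. This is not code from the series: every name
below (toy_gmem_inode, toy_set_mappable(), toy_fault(), TOY_NR_PAGES)
is hypothetical, and the sketch assumes a fixed-size, per-page state
array standing in for the per-inode tracking structure. It does not
model the refcount check or any locking.

```c
#include <assert.h>
#include <signal.h>   /* SIGBUS */
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy model; none of these names exist in the kernel. */
#define TOY_NR_PAGES 8

enum toy_mappability {
	TOY_NOT_MAPPABLE = 0,	/* private to the guest: no host mapping */
	TOY_MAPPABLE,		/* shared with the host: faults may succeed */
};

/* Stands in for the mappability tracking attached to each inode. */
struct toy_gmem_inode {
	enum toy_mappability state[TOY_NR_PAGES];
};

/*
 * Mark pages [start, start + nr) as shared (mappable) or private.
 * Returns 0 on success, -1 if the range does not fit in the file.
 */
int toy_set_mappable(struct toy_gmem_inode *g, size_t start, size_t nr,
		     bool mappable)
{
	assert(g != NULL);

	if (start > TOY_NR_PAGES || nr > TOY_NR_PAGES - start)
		return -1;

	for (size_t i = start; i < start + nr; i++)
		g->state[i] = mappable ? TOY_MAPPABLE : TOY_NOT_MAPPABLE;

	return 0;
}

/*
 * Model of the check made at fault time rather than at mmap() time.
 * Returns 0 if the page may be mapped, otherwise the signal that
 * would be delivered to the faulting task.
 */
int toy_fault(const struct toy_gmem_inode *g, size_t pgoff)
{
	assert(g != NULL);

	if (pgoff >= TOY_NR_PAGES || g->state[pgoff] != TOY_MAPPABLE)
		return SIGBUS;

	return 0;
}
```

In this model a zero-initialized toy_gmem_inode starts with every page
private, so toy_fault() returns SIGBUS until toy_set_mappable() marks a
range as shared. Doing the check per fault rather than once at mmap()
time is what lets the answer change as pages move between shared and
private.
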
Cheers,
/fuad

[1] https://lore.kernel.org/all/20240801090117.3841080-1-tabba@google.com/
[2] https://lore.kernel.org/all/20240712232937.2861788-1-ackerleytng@google.com/
[3] https://lpc.events/event/18/sessions/183/#20240919
[4] https://lore.kernel.org/all/cover.1726009989.git.ackerleytng@google.com/

Ackerley Tng (2):
  KVM: guest_memfd: Make guest mem use guest mem inodes instead of anonymous inodes
  KVM: guest_memfd: Track mappability within a struct kvm_gmem_private

Fuad Tabba (9):
  KVM: guest_memfd: Introduce kvm_gmem_get_pfn_locked(), which retains the folio lock
  KVM: guest_memfd: Allow host to mmap guest_memfd() pages when shared
  KVM: guest_memfd: Add guest_memfd support to kvm_(read|/write)_guest_page()
  KVM: guest_memfd: Add KVM capability to check if guest_memfd is host mappable
  KVM: guest_memfd: Add a guest_memfd() flag to initialize it as mappable
  KVM: guest_memfd: selftests: guest_memfd mmap() test when mapping is allowed
  KVM: arm64: Skip VMA checks for slots without userspace address
  KVM: arm64: Handle guest_memfd()-backed guest page faults
  KVM: arm64: Enable guest_memfd private memory when pKVM is enabled

 Documentation/virt/kvm/api.rst                |   4 +
 arch/arm64/include/asm/kvm_host.h             |   3 +
 arch/arm64/kvm/Kconfig                        |   1 +
 arch/arm64/kvm/mmu.c                          | 120 +++++-
 include/linux/kvm_host.h                      |  63 +++
 include/uapi/linux/kvm.h                      |   2 +
 include/uapi/linux/magic.h                    |   1 +
 tools/testing/selftests/kvm/Makefile          |   1 +
 .../testing/selftests/kvm/guest_memfd_test.c  |  57 ++-
 virt/kvm/Kconfig                              |   4 +
 virt/kvm/guest_memfd.c                        | 397 ++++++++++++++++--
 virt/kvm/kvm_main.c                           | 279 +++++++++++-
 12 files changed, 877 insertions(+), 55 deletions(-)

base-commit: 8cf0b93919e13d1e8d4466eb4080a4c4d9d66d7b