From patchwork Wed Apr 2 16:07:16 2025
X-Patchwork-Submitter: Nikita Kalyazin
X-Patchwork-Id: 14036199
From: Nikita Kalyazin
Subject: [PATCH v2 0/5] KVM: guest_memfd: support for uffd minor
Date: Wed, 2 Apr 2025 16:07:16 +0000
Message-ID: <20250402160721.97596-1-kalyazin@amazon.com>
X-Mailer: git-send-email 2.47.1
Sender: owner-linux-mm@kvack.org

This series is built on top of Fuad's v7 "mapping guest_memfd backed
memory at the host" [1].

With James's KVM userfault [2], it is possible to handle stage-2 faults
in guest_memfd in userspace.  However, KVM itself also triggers faults
in guest_memfd in some cases, for example: PV interfaces like kvmclock,
PV EOI and page table walking code when fetching the MMIO instruction
on x86.  It was agreed in the guest_memfd upstream call on 23 Jan 2025
[3] that KVM would be accessing those pages via userspace page tables.
In order for such faults to be handled in userspace, guest_memfd needs
to support userfaultfd.

Changes since v1 [4]:
 - James, Peter: implement a full minor trap instead of a hybrid
   missing/minor trap
 - James, Peter: to avoid shmem- and guest_memfd-specific code in the
   UFFDIO_CONTINUE implementation make it generic by calling
   vm_ops->fault()

While generalising the UFFDIO_CONTINUE implementation helped avoid
guest_memfd-specific code in mm/userfaultfd, userfaultfd still needs
access to KVM code to be able to verify the VMA type when handling
UFFDIO_REGISTER_MODE_MINOR, so I used a similar approach to what Fuad
did for now [5].

In v1, Peter mentioned a potential for eliminating taking a folio
lock [6].  I did not implement that, but according to my testing, the
performance of shmem minor fault handling stayed the same after the
migration to calling vm_ops->fault() (tested on an x86).

Before:

./demand_paging_test -u MINOR -s shmem
Random seed: 0x6b8b4567
Testing guest mode: PA-bits:ANY, VA-bits:48, 4K pages
guest physical test memory: [0x3fffbffff000, 0x3ffffffff000)
Finished creating vCPUs and starting uffd threads
Started all vCPUs
All vCPU threads joined
Total guest execution time: 10.979277020s
Per-vcpu demand paging rate: 23876.253375 pgs/sec/vcpu
Overall demand paging rate: 23876.253375 pgs/sec

After:

./demand_paging_test -u MINOR -s shmem
Random seed: 0x6b8b4567
Testing guest mode: PA-bits:ANY, VA-bits:48, 4K pages
guest physical test memory: [0x3fffbffff000, 0x3ffffffff000)
Finished creating vCPUs and starting uffd threads
Started all vCPUs
All vCPU threads joined
Total guest execution time: 10.978893504s
Per-vcpu demand paging rate: 23877.087423 pgs/sec/vcpu
Overall demand paging rate: 23877.087423 pgs/sec

Nikita

[1] https://lore.kernel.org/kvm/20250318161823.4005529-1-tabba@google.com/T/
[2] https://lore.kernel.org/kvm/20250109204929.1106563-1-jthoughton@google.com/T/
[3] https://docs.google.com/document/d/1M6766BzdY1Lhk7LiR5IqVR8B8mG3cr-cxTxOrAosPOk/edit?tab=t.0#heading=h.w1126rgli5e3
[4] https://lore.kernel.org/kvm/20250303133011.44095-1-kalyazin@amazon.com/T/
[5] https://lore.kernel.org/kvm/20250318161823.4005529-1-tabba@google.com/T/#Z2e.:..:20250318161823.4005529-3-tabba::40google.com:1mm:swap.c
[6] https://lore.kernel.org/kvm/20250303133011.44095-1-kalyazin@amazon.com/T/#m8695dc24d2cc633a6a486a8990e3f7d50d4efb79

Nikita Kalyazin (5):
  mm: userfaultfd: generic continue for non hugetlbfs
  KVM: guest_memfd: add kvm_gmem_vma_is_gmem
  mm: userfaultfd: allow to register continue for guest_memfd
  KVM: guest_memfd: add support for userfaultfd minor
  KVM: selftests: test userfaultfd minor for guest_memfd

 include/linux/mm_types.h                      |  3 +
 include/linux/userfaultfd_k.h                 | 13 ++-
 mm/hugetlb.c                                  |  2 +-
 mm/shmem.c                                    |  3 +-
 mm/userfaultfd.c                              | 25 +++--
 .../testing/selftests/kvm/guest_memfd_test.c  | 94 +++++++++++++++++++
 virt/kvm/guest_memfd.c                        | 15 +++
 virt/kvm/kvm_mm.h                             |  1 +
 8 files changed, 146 insertions(+), 10 deletions(-)

base-commit: 3cc51efc17a2c41a480eed36b31c1773936717e0
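
As a cross-check of the "Before" and "After" demand_paging_test figures
quoted in the cover letter, the sketch below recomputes the number of
pages faulted and the relative change in paging rate.  It is not part
of the series.  The helper names (pages_faulted, relative_change_pct)
and the macro names are hypothetical, and it assumes that the reported
rate multiplied by the reported execution time gives the page count.

```c
#include <assert.h>

/* Figures copied from the two runs quoted in the cover letter. */
#define BEFORE_SECONDS          10.979277020
#define BEFORE_RATE_PGS_PER_SEC 23876.253375
#define AFTER_SECONDS           10.978893504
#define AFTER_RATE_PGS_PER_SEC  23877.087423

/*
 * Hypothetical helper: pages faulted during a run, assuming
 * pages = rate * elapsed time.
 */
double pages_faulted(double rate_pgs_per_sec, double seconds)
{
	return rate_pgs_per_sec * seconds;
}

/*
 * Hypothetical helper: change of `after` relative to `before`,
 * expressed in percent.  A positive value means `after` is larger.
 */
double relative_change_pct(double before, double after)
{
	assert(before != 0.0);
	return (after - before) / before * 100.0;
}
```

With the figures above, both runs work out to roughly 262144 pages
(1 GiB of 4K pages), and the paging rate differs by less than 0.004%,
which is consistent with the statement that shmem minor fault
performance stayed the same.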