From patchwork Tue Sep 10 23:43:32 2024
X-Patchwork-Submitter: Ackerley Tng
X-Patchwork-Id: 13799474
Date: Tue, 10 Sep 2024 23:43:32 +0000
X-Mailer: git-send-email 2.46.0.598.g6f2099f65c-goog
Subject: [RFC PATCH 01/39] mm: hugetlb: Simplify logic in dequeue_hugetlb_folio_vma()
From: Ackerley Tng
To: tabba@google.com, quic_eberman@quicinc.com, roypat@amazon.co.uk, jgg@nvidia.com, peterx@redhat.com, david@redhat.com, rientjes@google.com, fvdl@google.com, jthoughton@google.com, seanjc@google.com, pbonzini@redhat.com, zhiquan1.li@intel.com, fan.du@intel.com, jun.miao@intel.com, isaku.yamahata@intel.com, muchun.song@linux.dev, mike.kravetz@oracle.com
Cc: erdemaktas@google.com, vannapurve@google.com, ackerleytng@google.com, qperret@google.com, jhubbard@nvidia.com, willy@infradead.org, shuah@kernel.org, brauner@kernel.org, bfoster@redhat.com, kent.overstreet@linux.dev, pvorel@suse.cz, rppt@kernel.org, richard.weiyang@gmail.com, anup@brainfault.org, haibo1.xu@intel.com, ajones@ventanamicro.com, vkuznets@redhat.com, maciej.wieczor-retman@intel.com, pgonda@google.com, oliver.upton@linux.dev, linux-kernel@vger.kernel.org, linux-mm@kvack.org, kvm@vger.kernel.org, linux-kselftest@vger.kernel.org, linux-fsdevel@kvack.org
Replace the two arguments avoid_reserve and chg in dequeue_hugetlb_folio_vma() with a single argument so that dequeue_hugetlb_folio_vma() is easier to understand.

The new argument, use_hstate_resv, indicates whether the folio to be dequeued should be taken from the reservations in hstate. If use_hstate_resv is true, the folio is taken from the hstate reservations: h->resv_huge_pages is decremented and the folio is marked so that the reservation can be restored. If use_hstate_resv is false, a folio has to be taken from the pool, so available_huge_pages(h) must be non-zero, failing which, goto err.

The bool use_hstate_resv can be reused within dequeue_hugetlb_folio_vma()'s caller, alloc_hugetlb_folio().

No functional changes are intended. As proof, the original two if conditions

    !vma_has_reserves(vma, chg) && !available_huge_pages(h)

and

    avoid_reserve && !available_huge_pages(h)

can be combined into

    (avoid_reserve || !vma_has_reserves(vma, chg)) && !available_huge_pages(h)

Negating avoid_reserve || !vma_has_reserves(vma, chg) using De Morgan's theorem yields !avoid_reserve && vma_has_reserves(vma, chg), which is exactly the new use_hstate_resv, hence the simplification is correct.

Signed-off-by: Ackerley Tng
---
 mm/hugetlb.c | 33 +++++++++++----------------------
 1 file changed, 11 insertions(+), 22 deletions(-)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index aaf508be0a2b..af5c6bbc9ff0 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -1280,8 +1280,9 @@ static bool vma_has_reserves(struct vm_area_struct *vma, long chg)
 	}
 
 	/*
-	 * Only the process that called mmap() has reserves for
-	 * private mappings.
+	 * Only the process that called mmap() has reserves for private
+	 * mappings. A child process with MAP_PRIVATE mappings created by their
+	 * parent have no page reserves.
 	 */
 	if (is_vma_resv_set(vma, HPAGE_RESV_OWNER)) {
 		/*
@@ -1393,8 +1394,7 @@ static unsigned long available_huge_pages(struct hstate *h)
 
 static struct folio *dequeue_hugetlb_folio_vma(struct hstate *h,
 				struct vm_area_struct *vma,
-				unsigned long address, int avoid_reserve,
-				long chg)
+				unsigned long address, bool use_hstate_resv)
 {
 	struct folio *folio = NULL;
 	struct mempolicy *mpol;
@@ -1402,16 +1402,7 @@ static struct folio *dequeue_hugetlb_folio_vma(struct hstate *h,
 	nodemask_t *nodemask;
 	int nid;
 
-	/*
-	 * A child process with MAP_PRIVATE mappings created by their parent
-	 * have no page reserves. This check ensures that reservations are
-	 * not "stolen". The child may still get SIGKILLed
-	 */
-	if (!vma_has_reserves(vma, chg) && !available_huge_pages(h))
-		goto err;
-
-	/* If reserves cannot be used, ensure enough pages are in the pool */
-	if (avoid_reserve && !available_huge_pages(h))
+	if (!use_hstate_resv && !available_huge_pages(h))
 		goto err;
 
 	gfp_mask = htlb_alloc_mask(h);
@@ -1429,7 +1420,7 @@ static struct folio *dequeue_hugetlb_folio_vma(struct hstate *h,
 
 	folio = dequeue_hugetlb_folio_nodemask(h, gfp_mask, nid, nodemask);
 
-	if (folio && !avoid_reserve && vma_has_reserves(vma, chg)) {
+	if (folio && use_hstate_resv) {
 		folio_set_hugetlb_restore_reserve(folio);
 		h->resv_huge_pages--;
 	}
@@ -3130,6 +3121,7 @@ struct folio *alloc_hugetlb_folio(struct vm_area_struct *vma,
 	struct mem_cgroup *memcg;
 	bool deferred_reserve;
 	gfp_t gfp = htlb_alloc_mask(h) | __GFP_RETRY_MAYFAIL;
+	bool use_hstate_resv;
 
 	memcg = get_mem_cgroup_from_current();
 	memcg_charge_ret = mem_cgroup_hugetlb_try_charge(memcg, gfp, nr_pages);
@@ -3190,20 +3182,17 @@ struct folio *alloc_hugetlb_folio(struct vm_area_struct *vma,
 	if (ret)
 		goto out_uncharge_cgroup_reservation;
 
+	use_hstate_resv = !avoid_reserve && vma_has_reserves(vma, gbl_chg);
+
 	spin_lock_irq(&hugetlb_lock);
-	/*
-	 * glb_chg is passed to indicate whether or not a page must be taken
-	 * from the global free pool (global change). gbl_chg == 0 indicates
-	 * a reservation exists for the allocation.
-	 */
-	folio = dequeue_hugetlb_folio_vma(h, vma, addr, avoid_reserve, gbl_chg);
+	folio = dequeue_hugetlb_folio_vma(h, vma, addr, use_hstate_resv);
 	if (!folio) {
 		spin_unlock_irq(&hugetlb_lock);
 		folio = alloc_buddy_hugetlb_folio_with_mpol(h, vma, addr);
 		if (!folio)
 			goto out_uncharge_cgroup;
 		spin_lock_irq(&hugetlb_lock);
-		if (!avoid_reserve && vma_has_reserves(vma, gbl_chg)) {
+		if (use_hstate_resv) {
 			folio_set_hugetlb_restore_reserve(folio);
 			h->resv_huge_pages--;
 		}
From patchwork Tue Sep 10 23:43:33 2024
X-Patchwork-Submitter: Ackerley Tng
X-Patchwork-Id: 13799475
Date: Tue, 10 Sep 2024 23:43:33 +0000
Message-ID: <416274da1bb0f07db37944578f9e7d96dac3873c.1726009989.git.ackerleytng@google.com>
Subject: [RFC PATCH 02/39] mm: hugetlb: Refactor vma_has_reserves() to should_use_hstate_resv()
From: Ackerley Tng

With the addition of the chg parameter, vma_has_reserves() no longer just determines whether
the vma has reserves. The comment in the vma->vm_flags & VM_NORESERVE block indicates that this function actually computes whether or not the reserved count should be decremented.

This refactoring also takes the allocation's avoid_reserve request parameter into account, which helps to further simplify the calling function, alloc_hugetlb_folio().

Signed-off-by: Ackerley Tng
---
 mm/hugetlb.c | 16 +++++++++++++---
 1 file changed, 13 insertions(+), 3 deletions(-)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index af5c6bbc9ff0..597102ed224b 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -1245,9 +1245,19 @@ void clear_vma_resv_huge_pages(struct vm_area_struct *vma)
 	hugetlb_dup_vma_private(vma);
 }
 
-/* Returns true if the VMA has associated reserve pages */
-static bool vma_has_reserves(struct vm_area_struct *vma, long chg)
+/*
+ * Returns true if this allocation should use (debit) hstate reservations, based on
+ *
+ * @vma: VMA config
+ * @chg: Whether the page requirement can be satisfied using subpool reservations
+ * @avoid_reserve: Whether allocation was requested to avoid using reservations
+ */
+static bool should_use_hstate_resv(struct vm_area_struct *vma, long chg,
+				   bool avoid_reserve)
 {
+	if (avoid_reserve)
+		return false;
+
 	if (vma->vm_flags & VM_NORESERVE) {
 		/*
 		 * This address is already reserved by other process(chg == 0),
@@ -3182,7 +3192,7 @@ struct folio *alloc_hugetlb_folio(struct vm_area_struct *vma,
 	if (ret)
 		goto out_uncharge_cgroup_reservation;
 
-	use_hstate_resv = !avoid_reserve && vma_has_reserves(vma, gbl_chg);
+	use_hstate_resv = should_use_hstate_resv(vma, gbl_chg, avoid_reserve);
 
 	spin_lock_irq(&hugetlb_lock);
 	folio = dequeue_hugetlb_folio_vma(h, vma, addr, use_hstate_resv);
From patchwork Tue Sep 10 23:43:34 2024
X-Patchwork-Submitter: Ackerley Tng
X-Patchwork-Id: 13799476
Date: Tue, 10 Sep 2024 23:43:34 +0000
Message-ID: <5a5e998e8f154c28a28dcdab73fb563f658f2f51.1726009989.git.ackerleytng@google.com>
Subject: [RFC PATCH 03/39] mm: hugetlb: Remove unnecessary check for avoid_reserve
From: Ackerley Tng
If avoid_reserve is true, gbl_chg is not used anyway, so there is no point in setting gbl_chg.

Signed-off-by: Ackerley Tng
---
 mm/hugetlb.c | 10 ----------
 1 file changed, 10 deletions(-)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 597102ed224b..5cf7fb117e9d 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -3166,16 +3166,6 @@ struct folio *alloc_hugetlb_folio(struct vm_area_struct *vma,
 		if (gbl_chg < 0)
 			goto out_end_reservation;
 
-		/*
-		 * Even though there was no reservation in the region/reserve
-		 * map, there could be reservations associated with the
-		 * subpool that can be used. This would be indicated if the
-		 * return value of hugepage_subpool_get_pages() is zero.
-		 * However, if avoid_reserve is specified we still avoid even
-		 * the subpool reservations.
-		 */
-		if (avoid_reserve)
-			gbl_chg = 1;
 	}
 
 	/* If this allocation is not consuming a reservation, charge it now.
From patchwork Tue Sep 10 23:43:35 2024
X-Patchwork-Submitter: Ackerley Tng
X-Patchwork-Id: 13799477
Date: Tue, 10 Sep 2024 23:43:35 +0000
Message-ID: <9831cfcc77e325e48ec3674c3a518bda76e78df5.1726009989.git.ackerleytng@google.com>
Subject: [RFC PATCH 04/39] mm: mempolicy: Refactor out policy_node_nodemask()
From: Ackerley Tng
To: tabba@google.com, quic_eberman@quicinc.com, roypat@amazon.co.uk, jgg@nvidia.com, peterx@redhat.com, david@redhat.com, rientjes@google.com, fvdl@google.com, jthoughton@google.com, seanjc@google.com, pbonzini@redhat.com, zhiquan1.li@intel.com, fan.du@intel.com, jun.miao@intel.com, isaku.yamahata@intel.com, muchun.song@linux.dev, mike.kravetz@oracle.com
Cc: erdemaktas@google.com, vannapurve@google.com, ackerleytng@google.com, qperret@google.com, jhubbard@nvidia.com, willy@infradead.org, shuah@kernel.org, brauner@kernel.org, bfoster@redhat.com, kent.overstreet@linux.dev, pvorel@suse.cz, rppt@kernel.org, richard.weiyang@gmail.com, anup@brainfault.org, haibo1.xu@intel.com, ajones@ventanamicro.com, vkuznets@redhat.com, maciej.wieczor-retman@intel.com, pgonda@google.com, oliver.upton@linux.dev, linux-kernel@vger.kernel.org, linux-mm@kvack.org, kvm@vger.kernel.org, linux-kselftest@vger.kernel.org, linux-fsdevel@kvack.org
This was refactored out of huge_node(). huge_node()'s interpretation of vma
for order assumes the hugetlb-specific storage of the hstate information in
the inode. policy_node_nodemask() does not assume that, and can be used more
generically.

This refactoring also enforces that nid defaults to the current node id,
which was not previously enforced.

alloc_pages_mpol_noprof() is the last remaining direct user of
policy_nodemask(). All its callers begin with nid being the current node id
as well. More refactoring is required to simplify that.
Signed-off-by: Ackerley Tng
Reviewed-by: Gregory Price
---
 include/linux/mempolicy.h |  2 ++
 mm/mempolicy.c            | 36 ++++++++++++++++++++++++++----------
 2 files changed, 28 insertions(+), 10 deletions(-)

diff --git a/include/linux/mempolicy.h b/include/linux/mempolicy.h
index 1add16f21612..a49631e47421 100644
--- a/include/linux/mempolicy.h
+++ b/include/linux/mempolicy.h
@@ -138,6 +138,8 @@ extern void numa_policy_init(void);
 extern void mpol_rebind_task(struct task_struct *tsk, const nodemask_t *new);
 extern void mpol_rebind_mm(struct mm_struct *mm, nodemask_t *new);
 
+extern int policy_node_nodemask(struct mempolicy *mpol, gfp_t gfp_flags,
+				pgoff_t ilx, nodemask_t **nodemask);
 extern int huge_node(struct vm_area_struct *vma, unsigned long addr,
 				gfp_t gfp_flags, struct mempolicy **mpol,
 				nodemask_t **nodemask);
diff --git a/mm/mempolicy.c b/mm/mempolicy.c
index b858e22b259d..f3e572e17775 100644
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -1212,7 +1212,6 @@ static struct folio *alloc_migration_target_by_mpol(struct folio *src,
 	struct mempolicy *pol = mmpol->pol;
 	pgoff_t ilx = mmpol->ilx;
 	unsigned int order;
-	int nid = numa_node_id();
 	gfp_t gfp;
 
 	order = folio_order(src);
@@ -1221,10 +1220,11 @@ static struct folio *alloc_migration_target_by_mpol(struct folio *src,
 	if (folio_test_hugetlb(src)) {
 		nodemask_t *nodemask;
 		struct hstate *h;
+		int nid;
 
 		h = folio_hstate(src);
 		gfp = htlb_alloc_mask(h);
-		nodemask = policy_nodemask(gfp, pol, ilx, &nid);
+		nid = policy_node_nodemask(pol, gfp, ilx, &nodemask);
 		return alloc_hugetlb_folio_nodemask(h, nid, nodemask, gfp,
 				htlb_allow_alloc_fallback(MR_MEMPOLICY_MBIND));
 	}
@@ -1234,7 +1234,7 @@ static struct folio *alloc_migration_target_by_mpol(struct folio *src,
 	else
 		gfp = GFP_HIGHUSER_MOVABLE | __GFP_RETRY_MAYFAIL | __GFP_COMP;
 
-	return folio_alloc_mpol(gfp, order, pol, ilx, nid);
+	return folio_alloc_mpol(gfp, order, pol, ilx, numa_node_id());
 }
 #else
@@ -2084,6 +2084,27 @@ static nodemask_t *policy_nodemask(gfp_t gfp, struct mempolicy *pol,
 	return nodemask;
 }
 
+/**
+ * policy_node_nodemask(@mpol, @gfp_flags, @ilx, @nodemask)
+ * @mpol: the memory policy to interpret. Reference must be taken.
+ * @gfp_flags: for this request
+ * @ilx: interleave index, for use only when MPOL_INTERLEAVE or
+ *       MPOL_WEIGHTED_INTERLEAVE
+ * @nodemask: (output) pointer to nodemask pointer for 'bind' and 'prefer-many'
+ *            policy
+ *
+ * Returns a nid suitable for a page allocation and a pointer. If the effective
+ * policy is 'bind' or 'prefer-many', returns a pointer to the mempolicy's
+ * @nodemask for filtering the zonelist.
+ */
+int policy_node_nodemask(struct mempolicy *mpol, gfp_t gfp_flags,
+			 pgoff_t ilx, nodemask_t **nodemask)
+{
+	int nid = numa_node_id();
+	*nodemask = policy_nodemask(gfp_flags, mpol, ilx, &nid);
+	return nid;
+}
+
 #ifdef CONFIG_HUGETLBFS
 /*
  * huge_node(@vma, @addr, @gfp_flags, @mpol)
@@ -2102,12 +2123,8 @@ int huge_node(struct vm_area_struct *vma, unsigned long addr, gfp_t gfp_flags,
 		struct mempolicy **mpol, nodemask_t **nodemask)
 {
 	pgoff_t ilx;
-	int nid;
-
-	nid = numa_node_id();
 	*mpol = get_vma_policy(vma, addr, hstate_vma(vma)->order, &ilx);
-	*nodemask = policy_nodemask(gfp_flags, *mpol, ilx, &nid);
-	return nid;
+	return policy_node_nodemask(*mpol, gfp_flags, ilx, nodemask);
 }
 
 /*
@@ -2549,8 +2566,7 @@ unsigned long alloc_pages_bulk_array_mempolicy_noprof(gfp_t gfp,
 		return alloc_pages_bulk_array_preferred_many(gfp,
 				numa_node_id(), pol, nr_pages, page_array);
 
-	nid = numa_node_id();
-	nodemask = policy_nodemask(gfp, pol, NO_INTERLEAVE_INDEX, &nid);
+	nid = policy_node_nodemask(pol, gfp, NO_INTERLEAVE_INDEX, &nodemask);
 
 	return alloc_pages_bulk_noprof(gfp, nid, nodemask,
 				       nr_pages, NULL, page_array);
 }

From patchwork Tue Sep 10 23:43:36 2024
X-Patchwork-Submitter: Ackerley Tng
X-Patchwork-Id: 13799478
Date: Tue, 10 Sep 2024 23:43:36 +0000
Message-ID: <1778a7324a1242fa907981576ebd69716a94d778.1726009989.git.ackerleytng@google.com>
Subject: [RFC PATCH 05/39] mm: hugetlb: Refactor alloc_buddy_hugetlb_folio_with_mpol() to interpret mempolicy instead of vma
From: Ackerley Tng
To: tabba@google.com, quic_eberman@quicinc.com, roypat@amazon.co.uk, jgg@nvidia.com, peterx@redhat.com, david@redhat.com, rientjes@google.com, fvdl@google.com, jthoughton@google.com, seanjc@google.com, pbonzini@redhat.com, zhiquan1.li@intel.com, fan.du@intel.com, jun.miao@intel.com, isaku.yamahata@intel.com, muchun.song@linux.dev, mike.kravetz@oracle.com
Cc: erdemaktas@google.com, vannapurve@google.com, ackerleytng@google.com, qperret@google.com, jhubbard@nvidia.com, willy@infradead.org, shuah@kernel.org, brauner@kernel.org, bfoster@redhat.com, kent.overstreet@linux.dev, pvorel@suse.cz, rppt@kernel.org, richard.weiyang@gmail.com, anup@brainfault.org, haibo1.xu@intel.com, ajones@ventanamicro.com, vkuznets@redhat.com, maciej.wieczor-retman@intel.com, pgonda@google.com, oliver.upton@linux.dev, linux-kernel@vger.kernel.org, linux-mm@kvack.org, kvm@vger.kernel.org, linux-kselftest@vger.kernel.org, linux-fsdevel@kvack.org
Reducing dependence on vma avoids the hugetlb-specific assumption of where
the mempolicy is stored. This will open up other ways of using hugetlb.

Signed-off-by: Ackerley Tng
---
 mm/hugetlb.c | 37 +++++++++++++++++++++++--------------
 1 file changed, 23 insertions(+), 14 deletions(-)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 5cf7fb117e9d..2f2bd2444ae2 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -2536,32 +2536,31 @@ static struct folio *alloc_migrate_hugetlb_folio(struct hstate *h, gfp_t gfp_mas
 }
 
 /*
- * Use the VMA's mpolicy to allocate a huge page from the buddy.
+ * Allocate a huge page from the buddy allocator, given memory policy, node id
+ * and nodemask.
  */
-static
-struct folio *alloc_buddy_hugetlb_folio_with_mpol(struct hstate *h,
-		struct vm_area_struct *vma, unsigned long addr)
+static struct folio *alloc_buddy_hugetlb_folio_from_node(struct hstate *h,
+							 struct mempolicy *mpol,
+							 int nid,
+							 nodemask_t *nodemask)
 {
-	struct folio *folio = NULL;
-	struct mempolicy *mpol;
 	gfp_t gfp_mask = htlb_alloc_mask(h);
-	int nid;
-	nodemask_t *nodemask;
+	struct folio *folio = NULL;
 
-	nid = huge_node(vma, addr, gfp_mask, &mpol, &nodemask);
 	if (mpol_is_preferred_many(mpol)) {
 		gfp_t gfp = gfp_mask | __GFP_NOWARN;
 
 		gfp &= ~(__GFP_DIRECT_RECLAIM | __GFP_NOFAIL);
 		folio = alloc_surplus_hugetlb_folio(h, gfp, nid, nodemask);
+	}
 
-		/* Fallback to all nodes if page==NULL */
+	if (!folio) {
+		/* Fallback to all nodes if earlier allocation failed */
 		nodemask = NULL;
-	}
 
-	if (!folio)
 		folio = alloc_surplus_hugetlb_folio(h, gfp_mask, nid, nodemask);
-	mpol_cond_put(mpol);
+	}
 
 	return folio;
 }
 
@@ -3187,8 +3186,18 @@ struct folio *alloc_hugetlb_folio(struct vm_area_struct *vma,
 	spin_lock_irq(&hugetlb_lock);
 	folio = dequeue_hugetlb_folio_vma(h, vma, addr, use_hstate_resv);
 	if (!folio) {
+		struct mempolicy *mpol;
+		nodemask_t *nodemask;
+		pgoff_t ilx;
+		int nid;
+
 		spin_unlock_irq(&hugetlb_lock);
-		folio = alloc_buddy_hugetlb_folio_with_mpol(h, vma, addr);
+
+		mpol = get_vma_policy(vma, addr, hstate_vma(vma)->order, &ilx);
+		nid = policy_node_nodemask(mpol, htlb_alloc_mask(h), ilx, &nodemask);
+		folio = alloc_buddy_hugetlb_folio_from_node(h, mpol, nid, nodemask);
+		mpol_cond_put(mpol);
+
 		if (!folio)
 			goto out_uncharge_cgroup;
 		spin_lock_irq(&hugetlb_lock);

From patchwork Tue Sep 10 23:43:37 2024
X-Patchwork-Submitter: Ackerley Tng
X-Patchwork-Id: 13799479
Date: Tue, 10 Sep 2024 23:43:37 +0000
Message-ID: <2e9109761869029bf82555e60d98850ac7888ae5.1726009989.git.ackerleytng@google.com>
Subject: [RFC PATCH 06/39] mm: hugetlb: Refactor dequeue_hugetlb_folio_vma() to use mpol
From: Ackerley Tng
To: tabba@google.com, quic_eberman@quicinc.com, roypat@amazon.co.uk, jgg@nvidia.com, peterx@redhat.com, david@redhat.com, rientjes@google.com, fvdl@google.com, jthoughton@google.com, seanjc@google.com, pbonzini@redhat.com, zhiquan1.li@intel.com, fan.du@intel.com, jun.miao@intel.com, isaku.yamahata@intel.com, muchun.song@linux.dev, mike.kravetz@oracle.com
Cc: erdemaktas@google.com, vannapurve@google.com, ackerleytng@google.com, qperret@google.com, jhubbard@nvidia.com, willy@infradead.org, shuah@kernel.org, brauner@kernel.org, bfoster@redhat.com, kent.overstreet@linux.dev, pvorel@suse.cz, rppt@kernel.org, richard.weiyang@gmail.com, anup@brainfault.org, haibo1.xu@intel.com, ajones@ventanamicro.com, vkuznets@redhat.com, maciej.wieczor-retman@intel.com, pgonda@google.com, oliver.upton@linux.dev, linux-kernel@vger.kernel.org, linux-mm@kvack.org, kvm@vger.kernel.org, linux-kselftest@vger.kernel.org, linux-fsdevel@kvack.org
Reduce dependence on vma since the use of huge_node() assumes that the
mempolicy is stored in a specific place in the inode, accessed via the vma.
Signed-off-by: Ackerley Tng
---
 mm/hugetlb.c | 55 ++++++++++++++++++++++------------------------------
 1 file changed, 23 insertions(+), 32 deletions(-)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 2f2bd2444ae2..e341bc0eb49a 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -1402,44 +1402,33 @@ static unsigned long available_huge_pages(struct hstate *h)
 	return h->free_huge_pages - h->resv_huge_pages;
 }
 
-static struct folio *dequeue_hugetlb_folio_vma(struct hstate *h,
-					       struct vm_area_struct *vma,
-					       unsigned long address, bool use_hstate_resv)
+static struct folio *dequeue_hugetlb_folio(struct hstate *h,
+					   struct mempolicy *mpol, int nid,
+					   nodemask_t *nodemask,
+					   bool use_hstate_resv)
 {
 	struct folio *folio = NULL;
-	struct mempolicy *mpol;
 	gfp_t gfp_mask;
-	nodemask_t *nodemask;
-	int nid;
 
 	if (!use_hstate_resv && !available_huge_pages(h))
-		goto err;
+		return NULL;
 
 	gfp_mask = htlb_alloc_mask(h);
-	nid = huge_node(vma, address, gfp_mask, &mpol, &nodemask);
 
-	if (mpol_is_preferred_many(mpol)) {
-		folio = dequeue_hugetlb_folio_nodemask(h, gfp_mask,
-						       nid, nodemask);
+	if (mpol_is_preferred_many(mpol))
+		folio = dequeue_hugetlb_folio_nodemask(h, gfp_mask, nid, nodemask);
 
-		/* Fallback to all nodes if page==NULL */
-		nodemask = NULL;
+	if (!folio) {
+		/* Fallback to all nodes if earlier allocation failed */
+		folio = dequeue_hugetlb_folio_nodemask(h, gfp_mask, nid, NULL);
 	}
 
-	if (!folio)
-		folio = dequeue_hugetlb_folio_nodemask(h, gfp_mask,
-						       nid, nodemask);
-
 	if (folio && use_hstate_resv) {
 		folio_set_hugetlb_restore_reserve(folio);
 		h->resv_huge_pages--;
 	}
 
-	mpol_cond_put(mpol);
 	return folio;
-
-err:
-	return NULL;
 }
 
 /*
@@ -3131,6 +3120,10 @@ struct folio *alloc_hugetlb_folio(struct vm_area_struct *vma,
 	bool deferred_reserve;
 	gfp_t gfp = htlb_alloc_mask(h) | __GFP_RETRY_MAYFAIL;
 	bool use_hstate_resv;
+	struct mempolicy *mpol;
+	nodemask_t *nodemask;
+	pgoff_t ilx;
+	int nid;
 
 	memcg = get_mem_cgroup_from_current();
 	memcg_charge_ret = mem_cgroup_hugetlb_try_charge(memcg, gfp, nr_pages);
@@ -3184,22 +3177,19 @@ struct folio *alloc_hugetlb_folio(struct vm_area_struct *vma,
 	use_hstate_resv = should_use_hstate_resv(vma, gbl_chg, avoid_reserve);
 
 	spin_lock_irq(&hugetlb_lock);
-	folio = dequeue_hugetlb_folio_vma(h, vma, addr, use_hstate_resv);
-	if (!folio) {
-		struct mempolicy *mpol;
-		nodemask_t *nodemask;
-		pgoff_t ilx;
-		int nid;
 
+	mpol = get_vma_policy(vma, addr, hstate_vma(vma)->order, &ilx);
+	nid = policy_node_nodemask(mpol, htlb_alloc_mask(h), ilx, &nodemask);
+	folio = dequeue_hugetlb_folio(h, mpol, nid, nodemask, use_hstate_resv);
+	if (!folio) {
 		spin_unlock_irq(&hugetlb_lock);
 
-		mpol = get_vma_policy(vma, addr, hstate_vma(vma)->order, &ilx);
-		nid = policy_node_nodemask(mpol, htlb_alloc_mask(h), ilx, &nodemask);
 		folio = alloc_buddy_hugetlb_folio_from_node(h, mpol, nid, nodemask);
-		mpol_cond_put(mpol);
-
-		if (!folio)
+		if (!folio) {
+			mpol_cond_put(mpol);
 			goto out_uncharge_cgroup;
+		}
 
 		spin_lock_irq(&hugetlb_lock);
 		if (use_hstate_resv) {
 			folio_set_hugetlb_restore_reserve(folio);
@@ -3209,6 +3199,7 @@ struct folio *alloc_hugetlb_folio(struct vm_area_struct *vma,
 		folio_ref_unfreeze(folio, 1);
 		/* Fall through */
 	}
+	mpol_cond_put(mpol);
 
 	hugetlb_cgroup_commit_charge(idx, pages_per_huge_page(h), h_cg, folio);
 
 	/* If allocation is not consuming a reservation, also store the

From patchwork Tue Sep 10 23:43:38 2024
X-Patchwork-Submitter: Ackerley Tng
X-Patchwork-Id: 13799480
Date: Tue, 10 Sep 2024 23:43:38 +0000
Message-ID: <7348091f4c539ed207d9bb0f3744d0f0efb7f2b3.1726009989.git.ackerleytng@google.com>
Subject: [RFC PATCH 07/39] mm: hugetlb: Refactor out hugetlb_alloc_folio
From: Ackerley Tng

hugetlb_alloc_folio()
allocates a hugetlb folio without handling reservations in the vma and
subpool, since some of those reservation concepts are hugetlbfs-specific.

Signed-off-by: Ackerley Tng
---
 include/linux/hugetlb.h |  12 ++++
 mm/hugetlb.c            | 144 ++++++++++++++++++++++++----------------
 2 files changed, 98 insertions(+), 58 deletions(-)

diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index c9bf68c239a0..e4a05a421623 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -690,6 +690,10 @@ struct huge_bootmem_page {
 };
 
 int isolate_or_dissolve_huge_page(struct page *page, struct list_head *list);
+struct folio *hugetlb_alloc_folio(struct hstate *h, struct mempolicy *mpol,
+				  int nid, nodemask_t *nodemask,
+				  bool charge_cgroup_reservation,
+				  bool use_hstate_resv);
 struct folio *alloc_hugetlb_folio(struct vm_area_struct *vma,
 				  unsigned long addr, int avoid_reserve);
 struct folio *alloc_hugetlb_folio_nodemask(struct hstate *h, int preferred_nid,
@@ -1027,6 +1031,14 @@ static inline int isolate_or_dissolve_huge_page(struct page *page,
 	return -ENOMEM;
 }
 
+static inline struct folio *
+hugetlb_alloc_folio(struct hstate *h, struct mempolicy *mpol, int nid,
+		    nodemask_t *nodemask, bool charge_cgroup_reservation,
+		    bool use_hstate_resv)
+{
+	return NULL;
+}
+
 static inline struct folio *alloc_hugetlb_folio(struct vm_area_struct *vma,
 						unsigned long addr,
 						int avoid_reserve)
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index e341bc0eb49a..7e73ebcc0f26 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -3106,6 +3106,75 @@ int isolate_or_dissolve_huge_page(struct page *page, struct list_head *list)
 	return ret;
 }
 
+/**
+ * Allocates a hugetlb folio either by dequeueing or from buddy allocator.
+ */
+struct folio *hugetlb_alloc_folio(struct hstate *h, struct mempolicy *mpol,
+				  int nid, nodemask_t *nodemask,
+				  bool charge_cgroup_reservation,
+				  bool use_hstate_resv)
+{
+	struct hugetlb_cgroup *h_cg = NULL;
+	struct folio *folio;
+	int ret;
+	int idx;
+
+	idx = hstate_index(h);
+
+	if (charge_cgroup_reservation) {
+		ret = hugetlb_cgroup_charge_cgroup_rsvd(
+			idx, pages_per_huge_page(h), &h_cg);
+		if (ret)
+			return NULL;
+	}
+
+	ret = hugetlb_cgroup_charge_cgroup(idx, pages_per_huge_page(h), &h_cg);
+	if (ret)
+		goto err_uncharge_cgroup_reservation;
+
+	spin_lock_irq(&hugetlb_lock);
+
+	folio = dequeue_hugetlb_folio(h, mpol, nid, nodemask, use_hstate_resv);
+	if (!folio) {
+		spin_unlock_irq(&hugetlb_lock);
+
+		folio = alloc_buddy_hugetlb_folio_from_node(h, mpol, nid, nodemask);
+		if (!folio)
+			goto err_uncharge_cgroup;
+
+		spin_lock_irq(&hugetlb_lock);
+		if (use_hstate_resv) {
+			folio_set_hugetlb_restore_reserve(folio);
+			h->resv_huge_pages--;
+		}
+		list_add(&folio->lru, &h->hugepage_activelist);
+		folio_ref_unfreeze(folio, 1);
+		/* Fall through */
+	}
+
+	hugetlb_cgroup_commit_charge(idx, pages_per_huge_page(h), h_cg, folio);
+
+	if (charge_cgroup_reservation) {
+		hugetlb_cgroup_commit_charge_rsvd(idx, pages_per_huge_page(h),
+						  h_cg, folio);
+	}
+
+	spin_unlock_irq(&hugetlb_lock);
+
+	return folio;
+
+err_uncharge_cgroup:
+	hugetlb_cgroup_uncharge_cgroup(idx, pages_per_huge_page(h), h_cg);
+
+err_uncharge_cgroup_reservation:
+	if (charge_cgroup_reservation) {
+		hugetlb_cgroup_uncharge_cgroup_rsvd(idx, pages_per_huge_page(h),
+						    h_cg);
+	}
+
+	return NULL;
+}
+
 struct folio *alloc_hugetlb_folio(struct vm_area_struct *vma,
 				  unsigned long addr, int avoid_reserve)
 {
@@ -3114,11 +3183,10 @@ struct folio *alloc_hugetlb_folio(struct vm_area_struct *vma,
 	struct folio *folio;
 	long map_chg, map_commit, nr_pages = pages_per_huge_page(h);
 	long gbl_chg;
-	int memcg_charge_ret, ret, idx;
-	struct hugetlb_cgroup *h_cg = NULL;
+	int memcg_charge_ret;
 	struct mem_cgroup *memcg;
-	bool deferred_reserve;
-	gfp_t gfp = htlb_alloc_mask(h) | __GFP_RETRY_MAYFAIL;
+	bool charge_cgroup_reservation;
+	gfp_t gfp = htlb_alloc_mask(h);
 	bool use_hstate_resv;
 	struct mempolicy *mpol;
 	nodemask_t *nodemask;
@@ -3126,13 +3194,14 @@ struct folio *alloc_hugetlb_folio(struct vm_area_struct *vma,
 	int nid;
 
 	memcg = get_mem_cgroup_from_current();
-	memcg_charge_ret = mem_cgroup_hugetlb_try_charge(memcg, gfp, nr_pages);
+	memcg_charge_ret =
+		mem_cgroup_hugetlb_try_charge(memcg, gfp | __GFP_RETRY_MAYFAIL,
+					      nr_pages);
 	if (memcg_charge_ret == -ENOMEM) {
 		mem_cgroup_put(memcg);
 		return ERR_PTR(-ENOMEM);
 	}
 
-	idx = hstate_index(h);
 	/*
 	 * Examine the region/reserve map to determine if the process
 	 * has a reservation for the page to be allocated. A return
@@ -3160,57 +3229,22 @@ struct folio *alloc_hugetlb_folio(struct vm_area_struct *vma,
 	}
 
-	/* If this allocation is not consuming a reservation, charge it now.
-	 */
-	deferred_reserve = map_chg || avoid_reserve;
-	if (deferred_reserve) {
-		ret = hugetlb_cgroup_charge_cgroup_rsvd(
-			idx, pages_per_huge_page(h), &h_cg);
-		if (ret)
-			goto out_subpool_put;
-	}
-
-	ret = hugetlb_cgroup_charge_cgroup(idx, pages_per_huge_page(h), &h_cg);
-	if (ret)
-		goto out_uncharge_cgroup_reservation;
-
 	use_hstate_resv = should_use_hstate_resv(vma, gbl_chg, avoid_reserve);
 
-	spin_lock_irq(&hugetlb_lock);
+	/*
+	 * charge_cgroup_reservation if this allocation is not consuming a
+	 * reservation
+	 */
+	charge_cgroup_reservation = map_chg || avoid_reserve;
 
 	mpol = get_vma_policy(vma, addr, hstate_vma(vma)->order, &ilx);
-	nid = policy_node_nodemask(mpol, htlb_alloc_mask(h), ilx, &nodemask);
-	folio = dequeue_hugetlb_folio(h, mpol, nid, nodemask, use_hstate_resv);
-	if (!folio) {
-		spin_unlock_irq(&hugetlb_lock);
-
-		folio = alloc_buddy_hugetlb_folio_from_node(h, mpol, nid, nodemask);
-		if (!folio) {
-			mpol_cond_put(mpol);
-			goto out_uncharge_cgroup;
-		}
-
-		spin_lock_irq(&hugetlb_lock);
-		if (use_hstate_resv) {
-			folio_set_hugetlb_restore_reserve(folio);
-			h->resv_huge_pages--;
-		}
-		list_add(&folio->lru, &h->hugepage_activelist);
-		folio_ref_unfreeze(folio, 1);
-		/* Fall through */
-	}
+	nid = policy_node_nodemask(mpol, gfp, ilx, &nodemask);
+	folio = hugetlb_alloc_folio(h, mpol, nid, nodemask,
+				    charge_cgroup_reservation, use_hstate_resv);
 	mpol_cond_put(mpol);
 
-	hugetlb_cgroup_commit_charge(idx, pages_per_huge_page(h), h_cg, folio);
-
-	/* If allocation is not consuming a reservation, also store the
-	 * hugetlb_cgroup pointer on the page.
-	 */
-	if (deferred_reserve) {
-		hugetlb_cgroup_commit_charge_rsvd(idx, pages_per_huge_page(h),
-						  h_cg, folio);
-	}
-
-	spin_unlock_irq(&hugetlb_lock);
+	if (!folio)
+		goto out_subpool_put;
 
 	hugetlb_set_folio_subpool(folio, spool);
 
@@ -3229,7 +3263,7 @@ struct folio *alloc_hugetlb_folio(struct vm_area_struct *vma,
 		rsv_adjust = hugepage_subpool_put_pages(spool, 1);
 		hugetlb_acct_memory(h, -rsv_adjust);
 
-		if (deferred_reserve) {
+		if (charge_cgroup_reservation) {
 			spin_lock_irq(&hugetlb_lock);
 			hugetlb_cgroup_uncharge_folio_rsvd(hstate_index(h),
 					pages_per_huge_page(h), folio);
@@ -3243,12 +3277,6 @@ struct folio *alloc_hugetlb_folio(struct vm_area_struct *vma,
 
 	return folio;
 
-out_uncharge_cgroup:
-	hugetlb_cgroup_uncharge_cgroup(idx, pages_per_huge_page(h), h_cg);
-out_uncharge_cgroup_reservation:
-	if (deferred_reserve)
-		hugetlb_cgroup_uncharge_cgroup_rsvd(idx, pages_per_huge_page(h),
-						    h_cg);
 out_subpool_put:
 	if (map_chg || avoid_reserve)
 		hugepage_subpool_put_pages(spool, 1);

From patchwork Tue Sep 10 23:43:39 2024
X-Patchwork-Submitter: Ackerley Tng
X-Patchwork-Id: 13799481
Date: Tue, 10 Sep 2024 23:43:39 +0000
Message-ID: <9f287e19cb80258b406800c8758fc58eff449d56.1726009989.git.ackerleytng@google.com>
Subject: [RFC PATCH 08/39] mm: truncate: Expose preparation steps for truncate_inode_pages_final
From: Ackerley Tng
This will allow the preparation steps of truncate_inode_pages_final()
to be shared.

Signed-off-by: Ackerley Tng
---
 include/linux/mm.h |  1 +
 mm/truncate.c      | 26 ++++++++++++++++----------
 2 files changed, 17 insertions(+), 10 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index c4b238a20b76..ffb4788295b4 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -3442,6 +3442,7 @@ extern unsigned long vm_unmapped_area(struct vm_unmapped_area_info *info);
 extern void truncate_inode_pages(struct address_space *, loff_t);
 extern void truncate_inode_pages_range(struct address_space *,
 				       loff_t lstart, loff_t lend);
+extern void truncate_inode_pages_final_prepare(struct address_space *);
 extern void truncate_inode_pages_final(struct address_space *);
 
 /* generic vm_area_ops exported for stackable file systems */
diff --git a/mm/truncate.c b/mm/truncate.c
index 4d61fbdd4b2f..28cca86424f8 100644
--- a/mm/truncate.c
+++ b/mm/truncate.c
@@ -424,16 +424,7 @@ void truncate_inode_pages(struct address_space *mapping, loff_t lstart)
 }
 EXPORT_SYMBOL(truncate_inode_pages);
 
-/**
- * truncate_inode_pages_final - truncate *all* pages before inode dies
- * @mapping: mapping to truncate
- *
- * Called under (and serialized by) inode->i_rwsem.
- *
- * Filesystems have to use this in the .evict_inode path to inform the
- * VM that this is the final truncate and the inode is going away.
- */
-void truncate_inode_pages_final(struct address_space *mapping)
+void truncate_inode_pages_final_prepare(struct address_space *mapping)
 {
 	/*
 	 * Page reclaim can not participate in regular inode lifetime
@@ -454,6 +445,21 @@ void truncate_inode_pages_final(struct address_space *mapping)
 		xa_lock_irq(&mapping->i_pages);
 		xa_unlock_irq(&mapping->i_pages);
 	}
+}
+EXPORT_SYMBOL(truncate_inode_pages_final_prepare);
+
+/**
+ * truncate_inode_pages_final - truncate *all* pages before inode dies
+ * @mapping: mapping to truncate
+ *
+ * Called under (and serialized by) inode->i_rwsem.
+ *
+ * Filesystems have to use this in the .evict_inode path to inform the
+ * VM that this is the final truncate and the inode is going away.
+ */
+void truncate_inode_pages_final(struct address_space *mapping)
+{
+	truncate_inode_pages_final_prepare(mapping);
 
 	truncate_inode_pages(mapping, 0);
 }

From patchwork Tue Sep 10 23:43:40 2024
X-Patchwork-Submitter: Ackerley Tng
X-Patchwork-Id: 13799482
Date: Tue, 10 Sep 2024 23:43:40 +0000
Subject: [RFC PATCH 09/39] mm: hugetlb: Expose hugetlb_subpool_{get,put}_pages()
From: Ackerley Tng
To: tabba@google.com, quic_eberman@quicinc.com, roypat@amazon.co.uk, jgg@nvidia.com, peterx@redhat.com, david@redhat.com, rientjes@google.com, fvdl@google.com, jthoughton@google.com, seanjc@google.com, pbonzini@redhat.com, zhiquan1.li@intel.com, fan.du@intel.com, jun.miao@intel.com, isaku.yamahata@intel.com, muchun.song@linux.dev, mike.kravetz@oracle.com
Cc: erdemaktas@google.com, vannapurve@google.com, ackerleytng@google.com, qperret@google.com, jhubbard@nvidia.com, willy@infradead.org, shuah@kernel.org, brauner@kernel.org, bfoster@redhat.com, kent.overstreet@linux.dev, pvorel@suse.cz, rppt@kernel.org, richard.weiyang@gmail.com, anup@brainfault.org, haibo1.xu@intel.com, ajones@ventanamicro.com, vkuznets@redhat.com, maciej.wieczor-retman@intel.com, pgonda@google.com, oliver.upton@linux.dev, linux-kernel@vger.kernel.org, linux-mm@kvack.org, kvm@vger.kernel.org, linux-kselftest@vger.kernel.org, linux-fsdevel@kvack.org
This will allow hugetlb subpools to be used by guest_memfd.
Signed-off-by: Ackerley Tng
---
 include/linux/hugetlb.h | 3 +++
 mm/hugetlb.c            | 6 ++----
 2 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index e4a05a421623..907cfbbd9e24 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -119,6 +119,9 @@ struct hugepage_subpool *hugepage_new_subpool(struct hstate *h, long max_hpages,
 						long min_hpages);
 void hugepage_put_subpool(struct hugepage_subpool *spool);
 
+long hugepage_subpool_get_pages(struct hugepage_subpool *spool, long delta);
+long hugepage_subpool_put_pages(struct hugepage_subpool *spool, long delta);
+
 void hugetlb_dup_vma_private(struct vm_area_struct *vma);
 void clear_vma_resv_huge_pages(struct vm_area_struct *vma);
 int move_hugetlb_page_tables(struct vm_area_struct *vma,
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 7e73ebcc0f26..808915108126 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -170,8 +170,7 @@ void hugepage_put_subpool(struct hugepage_subpool *spool)
  * only be different than the passed value (delta) in the case where
  * a subpool minimum size must be maintained.
  */
-static long hugepage_subpool_get_pages(struct hugepage_subpool *spool,
-				       long delta)
+long hugepage_subpool_get_pages(struct hugepage_subpool *spool, long delta)
 {
 	long ret = delta;
 
@@ -215,8 +214,7 @@ static long hugepage_subpool_get_pages(struct hugepage_subpool *spool,
  * The return value may only be different than the passed value (delta)
  * in the case where a subpool minimum size must be maintained.
  */
-static long hugepage_subpool_put_pages(struct hugepage_subpool *spool,
-				       long delta)
+long hugepage_subpool_put_pages(struct hugepage_subpool *spool, long delta)
 {
 	long ret = delta;
 	unsigned long flags;

From patchwork Tue Sep 10 23:43:41 2024
X-Patchwork-Submitter: Ackerley Tng
X-Patchwork-Id: 13799483
Date: Tue, 10 Sep 2024 23:43:41 +0000
Message-ID: <083829f3f633d6d24d64d4639f92d163355b24fd.1726009989.git.ackerleytng@google.com>
Subject: [RFC PATCH 10/39] mm: hugetlb: Add option to create new subpool without using surplus
From: Ackerley Tng
To: tabba@google.com, quic_eberman@quicinc.com, roypat@amazon.co.uk, jgg@nvidia.com, peterx@redhat.com, david@redhat.com, rientjes@google.com, fvdl@google.com, jthoughton@google.com, seanjc@google.com, pbonzini@redhat.com, zhiquan1.li@intel.com, fan.du@intel.com, jun.miao@intel.com, isaku.yamahata@intel.com, muchun.song@linux.dev, mike.kravetz@oracle.com
Cc: erdemaktas@google.com, vannapurve@google.com, ackerleytng@google.com, qperret@google.com, jhubbard@nvidia.com, willy@infradead.org, shuah@kernel.org, brauner@kernel.org, bfoster@redhat.com, kent.overstreet@linux.dev, pvorel@suse.cz, rppt@kernel.org, richard.weiyang@gmail.com, anup@brainfault.org, haibo1.xu@intel.com, ajones@ventanamicro.com, vkuznets@redhat.com, maciej.wieczor-retman@intel.com, pgonda@google.com, oliver.upton@linux.dev, linux-kernel@vger.kernel.org, linux-mm@kvack.org, kvm@vger.kernel.org, linux-kselftest@vger.kernel.org, linux-fsdevel@kvack.org
__hugetlb_acct_memory() today does more than just memory accounting. When there are insufficient HugeTLB pages, __hugetlb_acct_memory() will attempt to get surplus pages. This change adds a flag to disable getting surplus pages if there are insufficient HugeTLB pages.
Signed-off-by: Ackerley Tng
---
 fs/hugetlbfs/inode.c    |  2 +-
 include/linux/hugetlb.h |  2 +-
 mm/hugetlb.c            | 43 ++++++++++++++++++++++++++++++-----------
 3 files changed, 34 insertions(+), 13 deletions(-)

diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c
index 9f6cff356796..300a6ef300c1 100644
--- a/fs/hugetlbfs/inode.c
+++ b/fs/hugetlbfs/inode.c
@@ -1488,7 +1488,7 @@ hugetlbfs_fill_super(struct super_block *sb, struct fs_context *fc)
 	if (ctx->max_hpages != -1 || ctx->min_hpages != -1) {
 		sbinfo->spool = hugepage_new_subpool(ctx->hstate,
 						     ctx->max_hpages,
-						     ctx->min_hpages);
+						     ctx->min_hpages, true);
 		if (!sbinfo->spool)
 			goto out_free;
 	}
diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index 907cfbbd9e24..9ef1adbd3207 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -116,7 +116,7 @@ extern int hugetlb_max_hstate __read_mostly;
 	for ((h) = hstates; (h) < &hstates[hugetlb_max_hstate]; (h)++)
 
 struct hugepage_subpool *hugepage_new_subpool(struct hstate *h, long max_hpages,
-						long min_hpages);
+						long min_hpages, bool use_surplus);
 void hugepage_put_subpool(struct hugepage_subpool *spool);
 
 long hugepage_subpool_get_pages(struct hugepage_subpool *spool, long delta);
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 808915108126..efdb5772b367 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -92,6 +92,7 @@ static int num_fault_mutexes;
 struct mutex *hugetlb_fault_mutex_table ____cacheline_aligned_in_smp;
 
 /* Forward declaration */
+static int __hugetlb_acct_memory(struct hstate *h, long delta, bool use_surplus);
 static int hugetlb_acct_memory(struct hstate *h, long delta);
 static void hugetlb_vma_lock_free(struct vm_area_struct *vma);
 static void hugetlb_vma_lock_alloc(struct vm_area_struct *vma);
@@ -129,7 +130,7 @@ static inline void unlock_or_release_subpool(struct hugepage_subpool *spool,
 }
 
 struct hugepage_subpool *hugepage_new_subpool(struct hstate *h, long max_hpages,
-						long min_hpages)
+						long min_hpages, bool use_surplus)
 {
 	struct hugepage_subpool *spool;
 
@@ -143,7 +144,8 @@ struct hugepage_subpool *hugepage_new_subpool(struct hstate *h, long max_hpages,
 	spool->hstate = h;
 	spool->min_hpages = min_hpages;
 
-	if (min_hpages != -1 && hugetlb_acct_memory(h, min_hpages)) {
+	if (min_hpages != -1 &&
+	    __hugetlb_acct_memory(h, min_hpages, use_surplus)) {
 		kfree(spool);
 		return NULL;
 	}
@@ -2592,6 +2594,21 @@ static nodemask_t *policy_mbind_nodemask(gfp_t gfp)
 	return NULL;
 }
 
+static int hugetlb_hstate_reserve_pages(struct hstate *h,
+					long num_pages_to_reserve)
+	__must_hold(&hugetlb_lock)
+{
+	long needed;
+
+	needed = (h->resv_huge_pages + num_pages_to_reserve) - h->free_huge_pages;
+	if (needed <= 0) {
+		h->resv_huge_pages += num_pages_to_reserve;
+		return 0;
+	}
+
+	return needed;
+}
+
 /*
  * Increase the hugetlb pool such that it can accommodate a reservation
  * of size 'delta'.
@@ -2608,13 +2625,7 @@ static int gather_surplus_pages(struct hstate *h, long delta)
 	int node;
 	nodemask_t *mbind_nodemask = policy_mbind_nodemask(htlb_alloc_mask(h));
 
-	lockdep_assert_held(&hugetlb_lock);
-	needed = (h->resv_huge_pages + delta) - h->free_huge_pages;
-	if (needed <= 0) {
-		h->resv_huge_pages += delta;
-		return 0;
-	}
-
+	needed = delta;
 	allocated = 0;
 	ret = -ENOMEM;
 
@@ -5104,7 +5115,7 @@ unsigned long hugetlb_total_pages(void)
 	return nr_total_pages;
 }
 
-static int hugetlb_acct_memory(struct hstate *h, long delta)
+static int __hugetlb_acct_memory(struct hstate *h, long delta, bool use_surplus)
 {
 	int ret = -ENOMEM;
 
@@ -5136,7 +5147,12 @@ static int hugetlb_acct_memory(struct hstate *h, long delta)
 	 * above.
 	 */
 	if (delta > 0) {
-		if (gather_surplus_pages(h, delta) < 0)
+		long required_surplus = hugetlb_hstate_reserve_pages(h, delta);
+
+		if (!use_surplus && required_surplus > 0)
+			goto out;
+
+		if (gather_surplus_pages(h, required_surplus) < 0)
 			goto out;
 
 		if (delta > allowed_mems_nr(h)) {
@@ -5154,6 +5170,11 @@ static int hugetlb_acct_memory(struct hstate *h, long delta)
 	return ret;
 }
 
+static int hugetlb_acct_memory(struct hstate *h, long delta)
+{
+	return __hugetlb_acct_memory(h, delta, true);
+}
+
 static void hugetlb_vm_op_open(struct vm_area_struct *vma)
 {
 	struct resv_map *resv = vma_resv_map(vma);

From patchwork Tue Sep 10 23:43:42 2024
X-Patchwork-Submitter: Ackerley Tng
X-Patchwork-Id: 13799484
Date: Tue, 10 Sep 2024 23:43:42 +0000
Message-ID: <3b49aeaa7ec0a91f601cde00b9e183bc75dc37a6.1726009989.git.ackerleytng@google.com>
Subject: [RFC PATCH 11/39] mm: hugetlb: Expose hugetlb_acct_memory()
From: Ackerley Tng
To: tabba@google.com, quic_eberman@quicinc.com, roypat@amazon.co.uk, jgg@nvidia.com, peterx@redhat.com, david@redhat.com, rientjes@google.com, fvdl@google.com, jthoughton@google.com, seanjc@google.com, pbonzini@redhat.com, zhiquan1.li@intel.com, fan.du@intel.com, jun.miao@intel.com, isaku.yamahata@intel.com, muchun.song@linux.dev, mike.kravetz@oracle.com
Cc: erdemaktas@google.com, vannapurve@google.com, ackerleytng@google.com, qperret@google.com, jhubbard@nvidia.com, willy@infradead.org, shuah@kernel.org, brauner@kernel.org, bfoster@redhat.com, kent.overstreet@linux.dev, pvorel@suse.cz, rppt@kernel.org, richard.weiyang@gmail.com, anup@brainfault.org, haibo1.xu@intel.com, ajones@ventanamicro.com, vkuznets@redhat.com, maciej.wieczor-retman@intel.com, pgonda@google.com, oliver.upton@linux.dev, linux-kernel@vger.kernel.org, linux-mm@kvack.org, kvm@vger.kernel.org, linux-kselftest@vger.kernel.org, linux-fsdevel@kvack.org
This will be used by guest_memfd in a later patch.
Signed-off-by: Ackerley Tng
---
 include/linux/hugetlb.h | 2 ++
 mm/hugetlb.c            | 4 ++--
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index 9ef1adbd3207..4d47bf94c211 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -122,6 +122,8 @@ void hugepage_put_subpool(struct hugepage_subpool *spool);
 long hugepage_subpool_get_pages(struct hugepage_subpool *spool, long delta);
 long hugepage_subpool_put_pages(struct hugepage_subpool *spool, long delta);
 
+int hugetlb_acct_memory(struct hstate *h, long delta);
+
 void hugetlb_dup_vma_private(struct vm_area_struct *vma);
 void clear_vma_resv_huge_pages(struct vm_area_struct *vma);
 int move_hugetlb_page_tables(struct vm_area_struct *vma,
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index efdb5772b367..5a37b03e1361 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -93,7 +93,7 @@ struct mutex *hugetlb_fault_mutex_table ____cacheline_aligned_in_smp;
 
 /* Forward declaration */
 static int __hugetlb_acct_memory(struct hstate *h, long delta, bool use_surplus);
-static int hugetlb_acct_memory(struct hstate *h, long delta);
+int hugetlb_acct_memory(struct hstate *h, long delta);
 static void hugetlb_vma_lock_free(struct vm_area_struct *vma);
 static void hugetlb_vma_lock_alloc(struct vm_area_struct *vma);
 static void __hugetlb_vma_unlock_write_free(struct vm_area_struct *vma);
@@ -5170,7 +5170,7 @@ static int __hugetlb_acct_memory(struct hstate *h, long delta, bool use_surplus)
 	return ret;
 }
 
-static int hugetlb_acct_memory(struct hstate *h, long delta)
+int hugetlb_acct_memory(struct hstate *h, long delta)
 {
 	return __hugetlb_acct_memory(h, delta, true);
 }

From patchwork Tue Sep 10 23:43:43 2024
X-Patchwork-Submitter: Ackerley Tng
X-Patchwork-Id: 13799485
Date: Tue, 10 Sep 2024 23:43:43 +0000
Message-ID: <315ae41e7a53edab139c0323fa96892f2b647450.1726009989.git.ackerleytng@google.com>
Subject: [RFC PATCH 12/39] mm: hugetlb: Move and expose hugetlb_zero_partial_page()
From: Ackerley Tng <ackerleytng@google.com>
To: tabba@google.com, quic_eberman@quicinc.com, roypat@amazon.co.uk, jgg@nvidia.com, peterx@redhat.com, david@redhat.com, rientjes@google.com, fvdl@google.com, jthoughton@google.com, seanjc@google.com, pbonzini@redhat.com, zhiquan1.li@intel.com, fan.du@intel.com, jun.miao@intel.com, isaku.yamahata@intel.com, muchun.song@linux.dev, mike.kravetz@oracle.com
Cc: erdemaktas@google.com, vannapurve@google.com, ackerleytng@google.com, qperret@google.com, jhubbard@nvidia.com, willy@infradead.org, shuah@kernel.org, brauner@kernel.org, bfoster@redhat.com, kent.overstreet@linux.dev, pvorel@suse.cz, rppt@kernel.org, richard.weiyang@gmail.com, anup@brainfault.org, haibo1.xu@intel.com, ajones@ventanamicro.com, vkuznets@redhat.com, maciej.wieczor-retman@intel.com, pgonda@google.com, oliver.upton@linux.dev, linux-kernel@vger.kernel.org, linux-mm@kvack.org, kvm@vger.kernel.org, linux-kselftest@vger.kernel.org, linux-fsdevel@kvack.org
This will be used by guest_memfd in a later patch.

Signed-off-by: Ackerley Tng <ackerleytng@google.com>
---
 fs/hugetlbfs/inode.c    | 33 +++++----------------------------
 include/linux/hugetlb.h |  3 +++
 mm/hugetlb.c            | 21 +++++++++++++++++++++
 3 files changed, 29 insertions(+), 28 deletions(-)

diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c
index 300a6ef300c1..f76001418672 100644
--- a/fs/hugetlbfs/inode.c
+++ b/fs/hugetlbfs/inode.c
@@ -720,29 +720,6 @@ static void hugetlb_vmtruncate(struct inode *inode, loff_t offset)
 	remove_inode_hugepages(inode, offset, LLONG_MAX);
 }
 
-static void hugetlbfs_zero_partial_page(struct hstate *h,
-					struct address_space *mapping,
-					loff_t start,
-					loff_t end)
-{
-	pgoff_t idx = start >> huge_page_shift(h);
-	struct folio *folio;
-
-	folio = filemap_lock_hugetlb_folio(h, mapping, idx);
-	if (IS_ERR(folio))
-		return;
-
-	start = start & ~huge_page_mask(h);
-	end = end & ~huge_page_mask(h);
-	if (!end)
-		end = huge_page_size(h);
-
-	folio_zero_segment(folio, (size_t)start, (size_t)end);
-
-	folio_unlock(folio);
-	folio_put(folio);
-}
-
 static long
hugetlbfs_punch_hole(struct inode *inode, loff_t offset, loff_t len)
 {
 	struct hugetlbfs_inode_info *info = HUGETLBFS_I(inode);
@@ -768,9 +745,10 @@ static long hugetlbfs_punch_hole(struct inode *inode, loff_t offset, loff_t len)
 	i_mmap_lock_write(mapping);
 
 	/* If range starts before first full page, zero partial page. */
-	if (offset < hole_start)
-		hugetlbfs_zero_partial_page(h, mapping,
-				offset, min(offset + len, hole_start));
+	if (offset < hole_start) {
+		hugetlb_zero_partial_page(h, mapping, offset,
+					  min(offset + len, hole_start));
+	}
 
 	/* Unmap users of full pages in the hole. */
 	if (hole_end > hole_start) {
@@ -782,8 +760,7 @@ static long hugetlbfs_punch_hole(struct inode *inode, loff_t offset, loff_t len)
 
 	/* If range extends beyond last full page, zero partial page. */
 	if ((offset + len) > hole_end && (offset + len) > hole_start)
-		hugetlbfs_zero_partial_page(h, mapping,
-				hole_end, offset + len);
+		hugetlb_zero_partial_page(h, mapping, hole_end, offset + len);
 
 	i_mmap_unlock_write(mapping);

diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index 4d47bf94c211..752062044b0b 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -124,6 +124,9 @@ long hugepage_subpool_put_pages(struct hugepage_subpool *spool, long delta);
 
 int hugetlb_acct_memory(struct hstate *h, long delta);
 
+void hugetlb_zero_partial_page(struct hstate *h, struct address_space *mapping,
+			       loff_t start, loff_t end);
+
 void hugetlb_dup_vma_private(struct vm_area_struct *vma);
 void clear_vma_resv_huge_pages(struct vm_area_struct *vma);
 int move_hugetlb_page_tables(struct vm_area_struct *vma,

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 5a37b03e1361..372d8294fb2f 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -1989,6 +1989,27 @@ void free_huge_folio(struct folio *folio)
 	}
 }
 
+void hugetlb_zero_partial_page(struct hstate *h, struct address_space *mapping,
+			       loff_t start, loff_t end)
+{
+	pgoff_t idx = start >> huge_page_shift(h);
+	struct folio *folio;
+
+	folio =
filemap_lock_hugetlb_folio(h, mapping, idx);
+	if (IS_ERR(folio))
+		return;
+
+	start = start & ~huge_page_mask(h);
+	end = end & ~huge_page_mask(h);
+	if (!end)
+		end = huge_page_size(h);
+
+	folio_zero_segment(folio, (size_t)start, (size_t)end);
+
+	folio_unlock(folio);
+	folio_put(folio);
+}
+
 /*
  * Must be called with the hugetlb lock held
  */
From patchwork Tue Sep 10 23:43:44 2024
X-Patchwork-Submitter: Ackerley Tng
X-Patchwork-Id: 13799486
Date: Tue, 10 Sep 2024 23:43:44 +0000
Subject: [RFC PATCH 13/39] KVM: guest_memfd: Make guest mem use guest mem inodes instead of anonymous inodes
From: Ackerley Tng <ackerleytng@google.com>
Using guest mem inodes allows us to store metadata for the backing memory on the inode.
Metadata will be added in a later patch to support HugeTLB pages.

Metadata about backing memory should not be stored on the file, since the file represents a guest_memfd's binding with a struct kvm, and metadata about backing memory is not unique to a specific binding and struct kvm.

Signed-off-by: Ackerley Tng <ackerleytng@google.com>
---
 include/uapi/linux/magic.h |   1 +
 virt/kvm/guest_memfd.c     | 119 ++++++++++++++++++++++++++++++-------
 2 files changed, 100 insertions(+), 20 deletions(-)

diff --git a/include/uapi/linux/magic.h b/include/uapi/linux/magic.h
index bb575f3ab45e..169dba2a6920 100644
--- a/include/uapi/linux/magic.h
+++ b/include/uapi/linux/magic.h
@@ -103,5 +103,6 @@
 #define DEVMEM_MAGIC		0x454d444d	/* "DMEM" */
 #define SECRETMEM_MAGIC		0x5345434d	/* "SECM" */
 #define PID_FS_MAGIC		0x50494446	/* "PIDF" */
+#define GUEST_MEMORY_MAGIC	0x474d454d	/* "GMEM" */
 
 #endif /* __LINUX_MAGIC_H__ */

diff --git a/virt/kvm/guest_memfd.c b/virt/kvm/guest_memfd.c
index 8f079a61a56d..5d7fd1f708a6 100644
--- a/virt/kvm/guest_memfd.c
+++ b/virt/kvm/guest_memfd.c
@@ -1,12 +1,17 @@
 // SPDX-License-Identifier: GPL-2.0
+#include
+#include
 #include
 #include
 #include
+#include
 #include
 #include
 
 #include "kvm_mm.h"
 
+static struct vfsmount *kvm_gmem_mnt;
+
 struct kvm_gmem {
 	struct kvm *kvm;
 	struct xarray bindings;
@@ -302,6 +307,38 @@ static inline struct file *kvm_gmem_get_file(struct kvm_memory_slot *slot)
 	return get_file_active(&slot->gmem.file);
 }
 
+static const struct super_operations kvm_gmem_super_operations = {
+	.statfs		= simple_statfs,
+};
+
+static int kvm_gmem_init_fs_context(struct fs_context *fc)
+{
+	struct pseudo_fs_context *ctx;
+
+	if (!init_pseudo(fc, GUEST_MEMORY_MAGIC))
+		return -ENOMEM;
+
+	ctx = fc->fs_private;
+	ctx->ops = &kvm_gmem_super_operations;
+
+	return 0;
+}
+
+static struct file_system_type kvm_gmem_fs = {
+	.name		= "kvm_guest_memory",
+	.init_fs_context = kvm_gmem_init_fs_context,
+	.kill_sb	= kill_anon_super,
+};
+
+static void kvm_gmem_init_mount(void)
+{
+	kvm_gmem_mnt =
kern_mount(&kvm_gmem_fs);
+	BUG_ON(IS_ERR(kvm_gmem_mnt));
+
+	/* For giggles. Userspace can never map this anyways. */
+	kvm_gmem_mnt->mnt_flags |= MNT_NOEXEC;
+}
+
 static struct file_operations kvm_gmem_fops = {
 	.open		= generic_file_open,
 	.release	= kvm_gmem_release,
@@ -311,6 +348,8 @@ static struct file_operations kvm_gmem_fops = {
 void kvm_gmem_init(struct module *module)
 {
 	kvm_gmem_fops.owner = module;
+
+	kvm_gmem_init_mount();
 }
 
 static int kvm_gmem_migrate_folio(struct address_space *mapping,
@@ -392,11 +431,67 @@ static const struct inode_operations kvm_gmem_iops = {
 	.setattr	= kvm_gmem_setattr,
 };
 
+static struct inode *kvm_gmem_inode_make_secure_inode(const char *name,
+						      loff_t size, u64 flags)
+{
+	const struct qstr qname = QSTR_INIT(name, strlen(name));
+	struct inode *inode;
+	int err;
+
+	inode = alloc_anon_inode(kvm_gmem_mnt->mnt_sb);
+	if (IS_ERR(inode))
+		return inode;
+
+	err = security_inode_init_security_anon(inode, &qname, NULL);
+	if (err) {
+		iput(inode);
+		return ERR_PTR(err);
+	}
+
+	inode->i_private = (void *)(unsigned long)flags;
+	inode->i_op = &kvm_gmem_iops;
+	inode->i_mapping->a_ops = &kvm_gmem_aops;
+	inode->i_mode |= S_IFREG;
+	inode->i_size = size;
+	mapping_set_gfp_mask(inode->i_mapping, GFP_HIGHUSER);
+	mapping_set_inaccessible(inode->i_mapping);
+	/* Unmovable mappings are supposed to be marked unevictable as well.
 */
+	WARN_ON_ONCE(!mapping_unevictable(inode->i_mapping));
+
+	return inode;
+}
+
+static struct file *kvm_gmem_inode_create_getfile(void *priv, loff_t size,
+						  u64 flags)
+{
+	static const char *name = "[kvm-gmem]";
+	struct inode *inode;
+	struct file *file;
+
+	if (kvm_gmem_fops.owner && !try_module_get(kvm_gmem_fops.owner))
+		return ERR_PTR(-ENOENT);
+
+	inode = kvm_gmem_inode_make_secure_inode(name, size, flags);
+	if (IS_ERR(inode))
+		return ERR_CAST(inode);
+
+	file = alloc_file_pseudo(inode, kvm_gmem_mnt, name, O_RDWR,
+				 &kvm_gmem_fops);
+	if (IS_ERR(file)) {
+		iput(inode);
+		return file;
+	}
+
+	file->f_mapping = inode->i_mapping;
+	file->f_flags |= O_LARGEFILE;
+	file->private_data = priv;
+
+	return file;
+}
+
 static int __kvm_gmem_create(struct kvm *kvm, loff_t size, u64 flags)
 {
-	const char *anon_name = "[kvm-gmem]";
 	struct kvm_gmem *gmem;
-	struct inode *inode;
 	struct file *file;
 	int fd, err;
@@ -410,32 +505,16 @@ static int __kvm_gmem_create(struct kvm *kvm, loff_t size, u64 flags)
 		goto err_fd;
 	}
 
-	file = anon_inode_create_getfile(anon_name, &kvm_gmem_fops, gmem,
-					 O_RDWR, NULL);
+	file = kvm_gmem_inode_create_getfile(gmem, size, flags);
 	if (IS_ERR(file)) {
 		err = PTR_ERR(file);
 		goto err_gmem;
 	}
 
-	file->f_flags |= O_LARGEFILE;
-
-	inode = file->f_inode;
-	WARN_ON(file->f_mapping != inode->i_mapping);
-
-	inode->i_private = (void *)(unsigned long)flags;
-	inode->i_op = &kvm_gmem_iops;
-	inode->i_mapping->a_ops = &kvm_gmem_aops;
-	inode->i_mode |= S_IFREG;
-	inode->i_size = size;
-	mapping_set_gfp_mask(inode->i_mapping, GFP_HIGHUSER);
-	mapping_set_inaccessible(inode->i_mapping);
-	/* Unmovable mappings are supposed to be marked unevictable as well.
 */
-	WARN_ON_ONCE(!mapping_unevictable(inode->i_mapping));
-
 	kvm_get_kvm(kvm);
 	gmem->kvm = kvm;
 	xa_init(&gmem->bindings);
-	list_add(&gmem->entry, &inode->i_mapping->i_private_list);
+	list_add(&gmem->entry, &file_inode(file)->i_mapping->i_private_list);
 
 	fd_install(fd, file);
 	return fd;

From patchwork Tue Sep 10 23:43:45 2024
X-Patchwork-Submitter: Ackerley Tng
X-Patchwork-Id: 13799487
Date: Tue, 10 Sep 2024 23:43:45 +0000
Message-ID: <3fec11d8a007505405eadcf2b3e10ec9051cf6bf.1726009989.git.ackerleytng@google.com>
Subject: [RFC PATCH 14/39] KVM: guest_memfd: hugetlb: initialization and cleanup
From: Ackerley Tng <ackerleytng@google.com>
First stage of hugetlb support: add initialization and cleanup routines.
After guest_mem was massaged to use guest_mem inodes instead of anonymous inodes in an earlier patch, the .evict_inode handler can now be overridden to do hugetlb metadata cleanup. Signed-off-by: Ackerley Tng --- include/uapi/linux/kvm.h | 26 ++++++ virt/kvm/guest_memfd.c | 177 +++++++++++++++++++++++++++++++++++++-- 2 files changed, 197 insertions(+), 6 deletions(-) diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h index 637efc055145..77de7c4432f6 100644 --- a/include/uapi/linux/kvm.h +++ b/include/uapi/linux/kvm.h @@ -13,6 +13,7 @@ #include #include #include +#include #define KVM_API_VERSION 12 @@ -1558,6 +1559,31 @@ struct kvm_memory_attributes { #define KVM_CREATE_GUEST_MEMFD _IOWR(KVMIO, 0xd4, struct kvm_create_guest_memfd) +#define KVM_GUEST_MEMFD_HUGETLB (1ULL << 1) + +/* + * Huge page size encoding when KVM_GUEST_MEMFD_HUGETLB is specified, and a huge + * page size other than the default is desired. See hugetlb_encode.h. All + * known huge page size encodings are provided here. It is the responsibility + * of the application to know which sizes are supported on the running system. + * See mmap(2) man page for details. 
+ */ +#define KVM_GUEST_MEMFD_HUGE_SHIFT HUGETLB_FLAG_ENCODE_SHIFT +#define KVM_GUEST_MEMFD_HUGE_MASK HUGETLB_FLAG_ENCODE_MASK + +#define KVM_GUEST_MEMFD_HUGE_64KB HUGETLB_FLAG_ENCODE_64KB +#define KVM_GUEST_MEMFD_HUGE_512KB HUGETLB_FLAG_ENCODE_512KB +#define KVM_GUEST_MEMFD_HUGE_1MB HUGETLB_FLAG_ENCODE_1MB +#define KVM_GUEST_MEMFD_HUGE_2MB HUGETLB_FLAG_ENCODE_2MB +#define KVM_GUEST_MEMFD_HUGE_8MB HUGETLB_FLAG_ENCODE_8MB +#define KVM_GUEST_MEMFD_HUGE_16MB HUGETLB_FLAG_ENCODE_16MB +#define KVM_GUEST_MEMFD_HUGE_32MB HUGETLB_FLAG_ENCODE_32MB +#define KVM_GUEST_MEMFD_HUGE_256MB HUGETLB_FLAG_ENCODE_256MB +#define KVM_GUEST_MEMFD_HUGE_512MB HUGETLB_FLAG_ENCODE_512MB +#define KVM_GUEST_MEMFD_HUGE_1GB HUGETLB_FLAG_ENCODE_1GB +#define KVM_GUEST_MEMFD_HUGE_2GB HUGETLB_FLAG_ENCODE_2GB +#define KVM_GUEST_MEMFD_HUGE_16GB HUGETLB_FLAG_ENCODE_16GB + struct kvm_create_guest_memfd { __u64 size; __u64 flags; diff --git a/virt/kvm/guest_memfd.c b/virt/kvm/guest_memfd.c index 5d7fd1f708a6..31e1115273e1 100644 --- a/virt/kvm/guest_memfd.c +++ b/virt/kvm/guest_memfd.c @@ -3,6 +3,7 @@ #include #include #include +#include #include #include #include @@ -18,6 +19,16 @@ struct kvm_gmem { struct list_head entry; }; +struct kvm_gmem_hugetlb { + struct hstate *h; + struct hugepage_subpool *spool; +}; + +static struct kvm_gmem_hugetlb *kvm_gmem_hgmem(struct inode *inode) +{ + return inode->i_mapping->i_private_data; +} + /** * folio_file_pfn - like folio_file_page, but return a pfn. * @folio: The folio which contains this index. @@ -154,6 +165,82 @@ static void kvm_gmem_invalidate_end(struct kvm_gmem *gmem, pgoff_t start, } } +static inline void kvm_gmem_hugetlb_filemap_remove_folio(struct folio *folio) +{ + folio_lock(folio); + + folio_clear_dirty(folio); + folio_clear_uptodate(folio); + filemap_remove_folio(folio); + + folio_unlock(folio); +} + +/** + * Removes folios in range [@lstart, @lend) from page cache/filemap (@mapping), + * returning the number of pages freed. 
+ */ +static int kvm_gmem_hugetlb_filemap_remove_folios(struct address_space *mapping, + struct hstate *h, + loff_t lstart, loff_t lend) +{ + const pgoff_t end = lend >> PAGE_SHIFT; + pgoff_t next = lstart >> PAGE_SHIFT; + struct folio_batch fbatch; + int num_freed = 0; + + folio_batch_init(&fbatch); + while (filemap_get_folios(mapping, &next, end - 1, &fbatch)) { + int i; + for (i = 0; i < folio_batch_count(&fbatch); ++i) { + struct folio *folio; + pgoff_t hindex; + u32 hash; + + folio = fbatch.folios[i]; + hindex = folio->index >> huge_page_order(h); + hash = hugetlb_fault_mutex_hash(mapping, hindex); + + mutex_lock(&hugetlb_fault_mutex_table[hash]); + kvm_gmem_hugetlb_filemap_remove_folio(folio); + mutex_unlock(&hugetlb_fault_mutex_table[hash]); + + num_freed++; + } + folio_batch_release(&fbatch); + cond_resched(); + } + + return num_freed; +} + +/** + * Removes folios in range [@lstart, @lend) from page cache of inode, updates + * inode metadata and hugetlb reservations. + */ +static void kvm_gmem_hugetlb_truncate_folios_range(struct inode *inode, + loff_t lstart, loff_t lend) +{ + struct kvm_gmem_hugetlb *hgmem; + struct hstate *h; + int gbl_reserve; + int num_freed; + + hgmem = kvm_gmem_hgmem(inode); + h = hgmem->h; + + num_freed = kvm_gmem_hugetlb_filemap_remove_folios(inode->i_mapping, + h, lstart, lend); + + gbl_reserve = hugepage_subpool_put_pages(hgmem->spool, num_freed); + hugetlb_acct_memory(h, -gbl_reserve); + + spin_lock(&inode->i_lock); + inode->i_blocks -= blocks_per_huge_page(h) * num_freed; + spin_unlock(&inode->i_lock); +} + + static long kvm_gmem_punch_hole(struct inode *inode, loff_t offset, loff_t len) { struct list_head *gmem_list = &inode->i_mapping->i_private_list; @@ -307,8 +394,33 @@ static inline struct file *kvm_gmem_get_file(struct kvm_memory_slot *slot) return get_file_active(&slot->gmem.file); } +static void kvm_gmem_hugetlb_teardown(struct inode *inode) +{ + struct kvm_gmem_hugetlb *hgmem; + + 
truncate_inode_pages_final_prepare(inode->i_mapping); + kvm_gmem_hugetlb_truncate_folios_range(inode, 0, LLONG_MAX); + + hgmem = kvm_gmem_hgmem(inode); + hugepage_put_subpool(hgmem->spool); + kfree(hgmem); +} + +static void kvm_gmem_evict_inode(struct inode *inode) +{ + u64 flags = (u64)inode->i_private; + + if (flags & KVM_GUEST_MEMFD_HUGETLB) + kvm_gmem_hugetlb_teardown(inode); + else + truncate_inode_pages_final(inode->i_mapping); + + clear_inode(inode); +} + static const struct super_operations kvm_gmem_super_operations = { .statfs = simple_statfs, + .evict_inode = kvm_gmem_evict_inode, }; static int kvm_gmem_init_fs_context(struct fs_context *fc) @@ -431,6 +543,42 @@ static const struct inode_operations kvm_gmem_iops = { .setattr = kvm_gmem_setattr, }; +static int kvm_gmem_hugetlb_setup(struct inode *inode, loff_t size, u64 flags) +{ + struct kvm_gmem_hugetlb *hgmem; + struct hugepage_subpool *spool; + int page_size_log; + struct hstate *h; + long hpages; + + page_size_log = (flags >> KVM_GUEST_MEMFD_HUGE_SHIFT) & KVM_GUEST_MEMFD_HUGE_MASK; + h = hstate_sizelog(page_size_log); + + /* Round up to accommodate size requests that don't align with huge pages */ + hpages = round_up(size, huge_page_size(h)) >> huge_page_shift(h); + + spool = hugepage_new_subpool(h, hpages, hpages, false); + if (!spool) + goto err; + + hgmem = kzalloc(sizeof(*hgmem), GFP_KERNEL); + if (!hgmem) + goto err_subpool; + + inode->i_blkbits = huge_page_shift(h); + + hgmem->h = h; + hgmem->spool = spool; + inode->i_mapping->i_private_data = hgmem; + + return 0; + +err_subpool: + kfree(spool); +err: + return -ENOMEM; +} + static struct inode *kvm_gmem_inode_make_secure_inode(const char *name, loff_t size, u64 flags) { @@ -443,9 +591,13 @@ static struct inode *kvm_gmem_inode_make_secure_inode(const char *name, return inode; err = security_inode_init_security_anon(inode, &qname, NULL); - if (err) { - iput(inode); - return ERR_PTR(err); + if (err) + goto out; + + if (flags & 
KVM_GUEST_MEMFD_HUGETLB) {
+		err = kvm_gmem_hugetlb_setup(inode, size, flags);
+		if (err)
+			goto out;
 	}
 
 	inode->i_private = (void *)(unsigned long)flags;
@@ -459,6 +611,11 @@ static struct inode *kvm_gmem_inode_make_secure_inode(const char *name,
 	WARN_ON_ONCE(!mapping_unevictable(inode->i_mapping));
 
 	return inode;
+
+out:
+	iput(inode);
+
+	return ERR_PTR(err);
 }
 
 static struct file *kvm_gmem_inode_create_getfile(void *priv, loff_t size,
@@ -526,14 +683,22 @@ static int __kvm_gmem_create(struct kvm *kvm, loff_t size, u64 flags)
 	return err;
 }
 
+#define KVM_GUEST_MEMFD_ALL_FLAGS KVM_GUEST_MEMFD_HUGETLB
+
 int kvm_gmem_create(struct kvm *kvm, struct kvm_create_guest_memfd *args)
 {
 	loff_t size = args->size;
 	u64 flags = args->flags;
-	u64 valid_flags = 0;
 
-	if (flags & ~valid_flags)
-		return -EINVAL;
+	if (flags & KVM_GUEST_MEMFD_HUGETLB) {
+		/* Allow huge page size encoding in flags */
+		if (flags & ~(KVM_GUEST_MEMFD_ALL_FLAGS |
+			      (KVM_GUEST_MEMFD_HUGE_MASK << KVM_GUEST_MEMFD_HUGE_SHIFT)))
+			return -EINVAL;
+	} else {
+		if (flags & ~KVM_GUEST_MEMFD_ALL_FLAGS)
+			return -EINVAL;
+	}
 
 	if (size <= 0 || !PAGE_ALIGNED(size))
 		return -EINVAL;
Date: Tue, 10 Sep 2024 23:43:46 +0000
Message-ID: <768488c67540aa18c200d7ee16e75a3a087022d4.1726009989.git.ackerleytng@google.com>
Subject: [RFC PATCH 15/39] KVM: guest_memfd: hugetlb: allocate and truncate from hugetlb
From: Ackerley Tng
If HugeTLB is requested at guest_memfd creation time, HugeTLB pages will be used to back
guest_memfd. Signed-off-by: Ackerley Tng --- virt/kvm/guest_memfd.c | 252 ++++++++++++++++++++++++++++++++++++++--- 1 file changed, 239 insertions(+), 13 deletions(-) diff --git a/virt/kvm/guest_memfd.c b/virt/kvm/guest_memfd.c index 31e1115273e1..2e6f12e2bac8 100644 --- a/virt/kvm/guest_memfd.c +++ b/virt/kvm/guest_memfd.c @@ -8,6 +8,8 @@ #include #include #include +#include +#include #include "kvm_mm.h" @@ -29,6 +31,13 @@ static struct kvm_gmem_hugetlb *kvm_gmem_hgmem(struct inode *inode) return inode->i_mapping->i_private_data; } +static bool is_kvm_gmem_hugetlb(struct inode *inode) +{ + u64 flags = (u64)inode->i_private; + + return flags & KVM_GUEST_MEMFD_HUGETLB; +} + /** * folio_file_pfn - like folio_file_page, but return a pfn. * @folio: The folio which contains this index. @@ -58,6 +67,9 @@ static int __kvm_gmem_prepare_folio(struct kvm *kvm, struct kvm_memory_slot *slo return 0; } +/** + * Use the uptodate flag to indicate that the folio is prepared for KVM's usage. + */ static inline void kvm_gmem_mark_prepared(struct folio *folio) { folio_mark_uptodate(folio); @@ -72,13 +84,18 @@ static inline void kvm_gmem_mark_prepared(struct folio *folio) static int kvm_gmem_prepare_folio(struct kvm *kvm, struct kvm_memory_slot *slot, gfn_t gfn, struct folio *folio) { - unsigned long nr_pages, i; pgoff_t index; int r; - nr_pages = folio_nr_pages(folio); - for (i = 0; i < nr_pages; i++) - clear_highpage(folio_page(folio, i)); + if (folio_test_hugetlb(folio)) { + folio_zero_user(folio, folio->index << PAGE_SHIFT); + } else { + unsigned long nr_pages, i; + + nr_pages = folio_nr_pages(folio); + for (i = 0; i < nr_pages; i++) + clear_highpage(folio_page(folio, i)); + } /* * Preparing huge folios should always be safe, since it should @@ -103,6 +120,174 @@ static int kvm_gmem_prepare_folio(struct kvm *kvm, struct kvm_memory_slot *slot, return r; } +static int kvm_gmem_get_mpol_node_nodemask(gfp_t gfp_mask, + struct mempolicy **mpol, + nodemask_t **nodemask) +{ + /* + * 
TODO: mempolicy would probably have to be stored on the inode, use + * task policy for now. + */ + *mpol = get_task_policy(current); + + /* TODO: ignore interleaving (set ilx to 0) for now. */ + return policy_node_nodemask(*mpol, gfp_mask, 0, nodemask); +} + +static struct folio *kvm_gmem_hugetlb_alloc_folio(struct hstate *h, + struct hugepage_subpool *spool) +{ + bool memcg_charge_was_prepared; + struct mem_cgroup *memcg; + struct mempolicy *mpol; + nodemask_t *nodemask; + struct folio *folio; + gfp_t gfp_mask; + int ret; + int nid; + + gfp_mask = htlb_alloc_mask(h); + + memcg = get_mem_cgroup_from_current(); + ret = mem_cgroup_hugetlb_try_charge(memcg, + gfp_mask | __GFP_RETRY_MAYFAIL, + pages_per_huge_page(h)); + if (ret == -ENOMEM) + goto err; + + memcg_charge_was_prepared = ret != -EOPNOTSUPP; + + /* Pages are only to be taken from guest_memfd subpool and nowhere else. */ + if (hugepage_subpool_get_pages(spool, 1)) + goto err_cancel_charge; + + nid = kvm_gmem_get_mpol_node_nodemask(htlb_alloc_mask(h), &mpol, + &nodemask); + /* + * charge_cgroup_reservation is false because we didn't make any cgroup + * reservations when creating the guest_memfd subpool. + * + * use_hstate_resv is true because we reserved from global hstate when + * creating the guest_memfd subpool. 
+ */ + folio = hugetlb_alloc_folio(h, mpol, nid, nodemask, false, true); + mpol_cond_put(mpol); + + if (!folio) + goto err_put_pages; + + hugetlb_set_folio_subpool(folio, spool); + + if (memcg_charge_was_prepared) + mem_cgroup_commit_charge(folio, memcg); + +out: + mem_cgroup_put(memcg); + + return folio; + +err_put_pages: + hugepage_subpool_put_pages(spool, 1); + +err_cancel_charge: + if (memcg_charge_was_prepared) + mem_cgroup_cancel_charge(memcg, pages_per_huge_page(h)); + +err: + folio = ERR_PTR(-ENOMEM); + goto out; +} + +static int kvm_gmem_hugetlb_filemap_add_folio(struct address_space *mapping, + struct folio *folio, pgoff_t index, + gfp_t gfp) +{ + int ret; + + __folio_set_locked(folio); + ret = __filemap_add_folio(mapping, folio, index, gfp, NULL); + if (unlikely(ret)) { + __folio_clear_locked(folio); + return ret; + } + + /* + * In hugetlb_add_to_page_cache(), there is a call to + * folio_clear_hugetlb_restore_reserve(). This is handled when the pages + * are removed from the page cache in unmap_hugepage_range() -> + * __unmap_hugepage_range() by conditionally calling + * folio_set_hugetlb_restore_reserve(). In kvm_gmem_hugetlb's usage of + * hugetlb, there are no VMAs involved, and pages are never taken from + * the surplus, so when pages are freed, the hstate reserve must be + * restored. Hence, this function makes no call to + * folio_clear_hugetlb_restore_reserve(). + */ + + /* mark folio dirty so that it will not be removed from cache/inode */ + folio_mark_dirty(folio); + + return 0; +} + +static struct folio *kvm_gmem_hugetlb_alloc_and_cache_folio(struct inode *inode, + pgoff_t index) +{ + struct kvm_gmem_hugetlb *hgmem; + struct folio *folio; + int ret; + + hgmem = kvm_gmem_hgmem(inode); + folio = kvm_gmem_hugetlb_alloc_folio(hgmem->h, hgmem->spool); + if (IS_ERR(folio)) + return folio; + + /* TODO: Fix index here to be aligned to huge page size. 
*/ + ret = kvm_gmem_hugetlb_filemap_add_folio( + inode->i_mapping, folio, index, htlb_alloc_mask(hgmem->h)); + if (ret) { + folio_put(folio); + return ERR_PTR(ret); + } + + spin_lock(&inode->i_lock); + inode->i_blocks += blocks_per_huge_page(hgmem->h); + spin_unlock(&inode->i_lock); + + return folio; +} + +static struct folio *kvm_gmem_get_hugetlb_folio(struct inode *inode, + pgoff_t index) +{ + struct address_space *mapping; + struct folio *folio; + struct hstate *h; + pgoff_t hindex; + u32 hash; + + h = kvm_gmem_hgmem(inode)->h; + hindex = index >> huge_page_order(h); + mapping = inode->i_mapping; + + /* To lock, we calculate the hash using the hindex and not index. */ + hash = hugetlb_fault_mutex_hash(mapping, hindex); + mutex_lock(&hugetlb_fault_mutex_table[hash]); + + /* + * The filemap is indexed with index and not hindex. Taking lock on + * folio to align with kvm_gmem_get_regular_folio() + */ + folio = filemap_lock_folio(mapping, index); + if (!IS_ERR(folio)) + goto out; + + folio = kvm_gmem_hugetlb_alloc_and_cache_folio(inode, index); +out: + mutex_unlock(&hugetlb_fault_mutex_table[hash]); + + return folio; +} + /* * Returns a locked folio on success. The caller is responsible for * setting the up-to-date flag before the memory is mapped into the guest. @@ -114,8 +299,10 @@ static int kvm_gmem_prepare_folio(struct kvm *kvm, struct kvm_memory_slot *slot, */ static struct folio *kvm_gmem_get_folio(struct inode *inode, pgoff_t index) { - /* TODO: Support huge pages. 
*/ - return filemap_grab_folio(inode->i_mapping, index); + if (is_kvm_gmem_hugetlb(inode)) + return kvm_gmem_get_hugetlb_folio(inode, index); + else + return filemap_grab_folio(inode->i_mapping, index); } static void kvm_gmem_invalidate_begin(struct kvm_gmem *gmem, pgoff_t start, @@ -240,6 +427,35 @@ static void kvm_gmem_hugetlb_truncate_folios_range(struct inode *inode, spin_unlock(&inode->i_lock); } +static void kvm_gmem_hugetlb_truncate_range(struct inode *inode, loff_t lstart, + loff_t lend) +{ + loff_t full_hpage_start; + loff_t full_hpage_end; + unsigned long hsize; + struct hstate *h; + + h = kvm_gmem_hgmem(inode)->h; + hsize = huge_page_size(h); + + full_hpage_start = round_up(lstart, hsize); + full_hpage_end = round_down(lend, hsize); + + if (lstart < full_hpage_start) { + hugetlb_zero_partial_page(h, inode->i_mapping, lstart, + full_hpage_start); + } + + if (full_hpage_end > full_hpage_start) { + kvm_gmem_hugetlb_truncate_folios_range(inode, full_hpage_start, + full_hpage_end); + } + + if (lend > full_hpage_end) { + hugetlb_zero_partial_page(h, inode->i_mapping, full_hpage_end, + lend); + } +} static long kvm_gmem_punch_hole(struct inode *inode, loff_t offset, loff_t len) { @@ -257,7 +473,12 @@ static long kvm_gmem_punch_hole(struct inode *inode, loff_t offset, loff_t len) list_for_each_entry(gmem, gmem_list, entry) kvm_gmem_invalidate_begin(gmem, start, end); - truncate_inode_pages_range(inode->i_mapping, offset, offset + len - 1); + if (is_kvm_gmem_hugetlb(inode)) { + kvm_gmem_hugetlb_truncate_range(inode, offset, offset + len); + } else { + truncate_inode_pages_range(inode->i_mapping, offset, + offset + len - 1); + } list_for_each_entry(gmem, gmem_list, entry) kvm_gmem_invalidate_end(gmem, start, end); @@ -279,8 +500,15 @@ static long kvm_gmem_allocate(struct inode *inode, loff_t offset, loff_t len) filemap_invalidate_lock_shared(mapping); - start = offset >> PAGE_SHIFT; - end = (offset + len) >> PAGE_SHIFT; + if (is_kvm_gmem_hugetlb(inode)) { + 
unsigned long hsize = huge_page_size(kvm_gmem_hgmem(inode)->h);
+
+		start = round_down(offset, hsize) >> PAGE_SHIFT;
+		end = round_down(offset + len, hsize) >> PAGE_SHIFT;
+	} else {
+		start = offset >> PAGE_SHIFT;
+		end = (offset + len) >> PAGE_SHIFT;
+	}
 
 	r = 0;
 	for (index = start; index < end; ) {
@@ -408,9 +636,7 @@ static void kvm_gmem_hugetlb_teardown(struct inode *inode)
 
 static void kvm_gmem_evict_inode(struct inode *inode)
 {
-	u64 flags = (u64)inode->i_private;
-
-	if (flags & KVM_GUEST_MEMFD_HUGETLB)
+	if (is_kvm_gmem_hugetlb(inode))
 		kvm_gmem_hugetlb_teardown(inode);
 	else
 		truncate_inode_pages_final(inode->i_mapping);
@@ -827,7 +1053,7 @@ __kvm_gmem_get_pfn(struct file *file, struct kvm_memory_slot *slot,
 	*pfn = folio_file_pfn(folio, index);
 
 	if (max_order)
-		*max_order = 0;
+		*max_order = folio_order(folio);
 
 	*is_prepared = folio_test_uptodate(folio);
 	return folio;
Date: Tue, 10 Sep 2024 23:43:47 +0000
Message-ID: <6f6b891d693ea0733f4b2737858af914bd70a8b6.1726009989.git.ackerleytng@google.com>
Subject: [RFC PATCH 16/39] KVM: guest_memfd: Add page alignment check for hugetlb guest_memfd
From: Ackerley Tng
When a hugetlb guest_memfd is requested, the requested size should be
aligned to the size of the hugetlb page requested.

Signed-off-by: Ackerley Tng <ackerleytng@google.com>
---
 virt/kvm/guest_memfd.c | 15 ++++++++++++++-
 1 file changed, 14 insertions(+), 1 deletion(-)

diff --git a/virt/kvm/guest_memfd.c b/virt/kvm/guest_memfd.c
index 2e6f12e2bac8..eacbfdb950d1 100644
--- a/virt/kvm/guest_memfd.c
+++ b/virt/kvm/guest_memfd.c
@@ -909,6 +909,13 @@ static int __kvm_gmem_create(struct kvm *kvm, loff_t size, u64 flags)
 	return err;
 }
 
+static inline bool kvm_gmem_hugetlb_page_aligned(u32 flags, u64 value)
+{
+	int page_size_log = (flags >> KVM_GUEST_MEMFD_HUGE_SHIFT) & KVM_GUEST_MEMFD_HUGE_MASK;
+	u64 page_size = 1ULL << page_size_log;
+	return IS_ALIGNED(value, page_size);
+}
+
 #define KVM_GUEST_MEMFD_ALL_FLAGS KVM_GUEST_MEMFD_HUGETLB
 
 int kvm_gmem_create(struct kvm *kvm, struct kvm_create_guest_memfd *args)
@@ -921,12 +928,18 @@ int kvm_gmem_create(struct kvm *kvm, struct kvm_create_guest_memfd *args)
 		if (flags & ~(KVM_GUEST_MEMFD_ALL_FLAGS |
 			      (KVM_GUEST_MEMFD_HUGE_MASK << KVM_GUEST_MEMFD_HUGE_SHIFT)))
 			return -EINVAL;
+
+		if (!kvm_gmem_hugetlb_page_aligned(flags, size))
+			return -EINVAL;
 	} else {
 		if (flags & ~KVM_GUEST_MEMFD_ALL_FLAGS)
 			return -EINVAL;
+
+		if (!PAGE_ALIGNED(size))
+			return -EINVAL;
 	}
 
-	if (size <= 0 || !PAGE_ALIGNED(size))
+	if (size <= 0)
 		return -EINVAL;
 
 	return __kvm_gmem_create(kvm, size, flags);
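For reference, the alignment check this patch adds can be modelled in plain userspace C. The KVM_GUEST_MEMFD_HUGE_SHIFT/MASK values below are assumptions chosen to mirror mmap(2)'s MAP_HUGE_SHIFT/MAP_HUGE_MASK encoding; the real definitions come from the uapi headers introduced earlier in this series.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Userspace model of kvm_gmem_hugetlb_page_aligned(). The shift/mask
 * values are assumptions for illustration, following the mmap(2)
 * MAP_HUGE_SHIFT/MAP_HUGE_MASK convention.
 */
#define KVM_GUEST_MEMFD_HUGE_SHIFT 26
#define KVM_GUEST_MEMFD_HUGE_MASK  0x3f

static bool gmem_hugetlb_page_aligned(uint32_t flags, uint64_t value)
{
	/* The flag word encodes log2(page size) in its upper bits. */
	int page_size_log = (flags >> KVM_GUEST_MEMFD_HUGE_SHIFT) &
			    KVM_GUEST_MEMFD_HUGE_MASK;
	uint64_t page_size = 1ULL << page_size_log;

	/* Equivalent of the kernel's IS_ALIGNED(value, page_size). */
	return (value & (page_size - 1)) == 0;
}
```

With this encoding, a 2 MB hugetlb guest_memfd (log2 = 21 in the upper bits) accepts sizes that are multiples of 2 MB and rejects anything smaller-grained, which is exactly the EINVAL path the patch adds.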
From patchwork Tue Sep 10 23:43:48 2024
X-Patchwork-Submitter: Ackerley Tng
X-Patchwork-Id: 13799490
Date: Tue, 10 Sep 2024 23:43:48 +0000
Message-ID: <2f0572464beebbcd2166fe9d709d0ce33a0cee78.1726009989.git.ackerleytng@google.com>
Subject: [RFC PATCH 17/39] KVM: selftests: Add basic selftests for hugetlb-backed guest_memfd
From: Ackerley Tng <ackerleytng@google.com>

Add tests for 2MB and 1GB page sizes, and update the invalid flags test
for the new KVM_GUEST_MEMFD_HUGETLB flag.

Signed-off-by: Ackerley Tng <ackerleytng@google.com>
---
 .../testing/selftests/kvm/guest_memfd_test.c  | 45 ++++++++++++++-----
 1 file changed, 35 insertions(+), 10 deletions(-)

diff --git a/tools/testing/selftests/kvm/guest_memfd_test.c b/tools/testing/selftests/kvm/guest_memfd_test.c
index ba0c8e996035..3618ce06663e 100644
--- a/tools/testing/selftests/kvm/guest_memfd_test.c
+++ b/tools/testing/selftests/kvm/guest_memfd_test.c
@@ -13,6 +13,7 @@
 #include
 #include
+#include
 #include
 #include
 #include
@@ -122,6 +123,7 @@ static void test_invalid_punch_hole(int fd, size_t page_size, size_t total_size)
 
 static void test_create_guest_memfd_invalid(struct kvm_vm *vm)
 {
+	uint64_t valid_flags = KVM_GUEST_MEMFD_HUGETLB;
 	size_t page_size = getpagesize();
 	uint64_t flag;
 	size_t size;
@@ -135,6 +137,9 @@ static void test_create_guest_memfd_invalid(struct kvm_vm *vm)
 	}
 
 	for (flag = 0; flag; flag <<= 1) {
+		if (flag & valid_flags)
+			continue;
+
 		fd = __vm_create_guest_memfd(vm, page_size, flag);
 		TEST_ASSERT(fd == -1 && errno == EINVAL,
 			    "guest_memfd() with flag '0x%lx' should fail with EINVAL",
@@ -170,24 +175,16 @@ static void test_create_guest_memfd_multiple(struct kvm_vm *vm)
 	close(fd1);
 }
 
-int main(int argc, char *argv[])
+static void test_guest_memfd(struct kvm_vm *vm, uint32_t flags, size_t page_size)
 {
-	size_t page_size;
 	size_t total_size;
 	int fd;
-	struct kvm_vm *vm;
 
 	TEST_REQUIRE(kvm_has_cap(KVM_CAP_GUEST_MEMFD));
 
-	page_size = getpagesize();
 	total_size = page_size * 4;
 
-	vm = vm_create_barebones();
-
-	test_create_guest_memfd_invalid(vm);
-	test_create_guest_memfd_multiple(vm);
-
-	fd = vm_create_guest_memfd(vm, total_size, 0);
+	fd = vm_create_guest_memfd(vm, total_size, flags);
 
 	test_file_read_write(fd);
 	test_mmap(fd, page_size);
@@ -197,3 +194,31 @@
 
 	close(fd);
 }
+
+int main(int argc, char *argv[])
+{
+	struct kvm_vm *vm;
+
+	TEST_REQUIRE(kvm_has_cap(KVM_CAP_GUEST_MEMFD));
+
+	vm = vm_create_barebones();
+
+	test_create_guest_memfd_invalid(vm);
+	test_create_guest_memfd_multiple(vm);
+
+	printf("Test guest_memfd with 4K pages\n");
+	test_guest_memfd(vm, 0, getpagesize());
+	printf("\tPASSED\n");
+
+	printf("Test guest_memfd with 2M pages\n");
+	test_guest_memfd(vm, KVM_GUEST_MEMFD_HUGETLB | KVM_GUEST_MEMFD_HUGE_2MB,
+			 2UL << 20);
+	printf("\tPASSED\n");
+
+	printf("Test guest_memfd with 1G pages\n");
+	test_guest_memfd(vm, KVM_GUEST_MEMFD_HUGETLB | KVM_GUEST_MEMFD_HUGE_1GB,
+			 1UL << 30);
+	printf("\tPASSED\n");
+
+	return 0;
+}
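The updated invalid-flags test walks every single-bit flag value and skips any bit in valid_flags, so only flags the kernel should reject are asserted to fail with EINVAL. A hypothetical stand-alone model of just that skip logic (not the selftest itself):

```c
#include <assert.h>
#include <stdint.h>

/*
 * Model of the invalid-flags scan: walk each single-bit flag value and
 * skip the ones the kernel accepts. valid_flags stands in for
 * KVM_GUEST_MEMFD_HUGETLB (a hypothetical bit 0 here).
 */
static int count_flags_expected_to_fail(uint64_t valid_flags)
{
	uint64_t flag;
	int n = 0;

	for (flag = 1; flag; flag <<= 1) {
		if (flag & valid_flags)
			continue;	/* accepted flag: not expected to fail */
		n++;			/* would call __vm_create_guest_memfd() */
	}
	return n;
}
```

Of 64 possible single-bit values, each bit added to valid_flags removes exactly one case from the set expected to return EINVAL.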
From patchwork Tue Sep 10 23:43:49 2024
X-Patchwork-Submitter: Ackerley Tng
X-Patchwork-Id: 13799491
Date: Tue, 10 Sep 2024 23:43:49 +0000
Message-ID: <327dadf390b5e397074e5bcc9f85468b1467f9a6.1726009989.git.ackerleytng@google.com>
Subject: [RFC PATCH 18/39] KVM: selftests: Support various types of backing sources for private memory
From: Ackerley Tng <ackerleytng@google.com>

Add support for various types of backing sources for private memory (in
the sense of confidential computing), similar to the backing sources
available for shared memory.

Signed-off-by: Ackerley Tng <ackerleytng@google.com>
---
 .../testing/selftests/kvm/include/test_util.h | 16 ++++
 tools/testing/selftests/kvm/lib/test_util.c   | 74 +++++++++++++++++++
 2 files changed, 90 insertions(+)

diff --git a/tools/testing/selftests/kvm/include/test_util.h b/tools/testing/selftests/kvm/include/test_util.h
index 3e473058849f..011e757d4e2c 100644
--- a/tools/testing/selftests/kvm/include/test_util.h
+++ b/tools/testing/selftests/kvm/include/test_util.h
@@ -142,6 +142,16 @@ struct vm_mem_backing_src_alias {
 	uint32_t flag;
 };
 
+enum vm_private_mem_backing_src_type {
+	VM_PRIVATE_MEM_SRC_GUEST_MEM, /* Use default page size */
+	VM_PRIVATE_MEM_SRC_HUGETLB, /* Use kernel default page size for hugetlb pages */
+	VM_PRIVATE_MEM_SRC_HUGETLB_2MB,
+	VM_PRIVATE_MEM_SRC_HUGETLB_1GB,
+	NUM_PRIVATE_MEM_SRC_TYPES,
+};
+
+#define DEFAULT_VM_PRIVATE_MEM_SRC VM_PRIVATE_MEM_SRC_GUEST_MEM
+
 #define MIN_RUN_DELAY_NS 200000UL
 
 bool thp_configured(void);
@@ -152,6 +162,12 @@ size_t get_backing_src_pagesz(uint32_t i);
 bool is_backing_src_hugetlb(uint32_t i);
 void backing_src_help(const char *flag);
 enum vm_mem_backing_src_type parse_backing_src_type(const char *type_name);
+
+void private_mem_backing_src_help(const char *flag);
+enum vm_private_mem_backing_src_type parse_private_mem_backing_src_type(const char *type_name);
+const struct vm_mem_backing_src_alias *vm_private_mem_backing_src_alias(uint32_t i);
+size_t get_private_mem_backing_src_pagesz(uint32_t i);
+
 long get_run_delay(void);
 
 /*
diff --git a/tools/testing/selftests/kvm/lib/test_util.c b/tools/testing/selftests/kvm/lib/test_util.c
index 8ed0b74ae837..d0a9b5ee0c01 100644
--- a/tools/testing/selftests/kvm/lib/test_util.c
+++ b/tools/testing/selftests/kvm/lib/test_util.c
@@ -15,6 +15,7 @@
 #include
 #include
 #include "linux/kernel.h"
+#include
 
 #include "test_util.h"
 
@@ -288,6 +289,34 @@ const struct vm_mem_backing_src_alias *vm_mem_backing_src_alias(uint32_t i)
 	return &aliases[i];
 }
 
+const struct vm_mem_backing_src_alias *vm_private_mem_backing_src_alias(uint32_t i)
+{
+	static const struct vm_mem_backing_src_alias aliases[] = {
+		[VM_PRIVATE_MEM_SRC_GUEST_MEM] = {
+			.name = "private_mem_guest_mem",
+			.flag = 0,
+		},
+		[VM_PRIVATE_MEM_SRC_HUGETLB] = {
+			.name = "private_mem_hugetlb",
+			.flag = KVM_GUEST_MEMFD_HUGETLB,
+		},
+		[VM_PRIVATE_MEM_SRC_HUGETLB_2MB] = {
+			.name = "private_mem_hugetlb_2mb",
+			.flag = KVM_GUEST_MEMFD_HUGETLB | KVM_GUEST_MEMFD_HUGE_2MB,
+		},
+		[VM_PRIVATE_MEM_SRC_HUGETLB_1GB] = {
+			.name = "private_mem_hugetlb_1gb",
+			.flag = KVM_GUEST_MEMFD_HUGETLB | KVM_GUEST_MEMFD_HUGE_1GB,
+		},
+	};
+	_Static_assert(ARRAY_SIZE(aliases) == NUM_PRIVATE_MEM_SRC_TYPES,
+		       "Missing new backing private mem src types?");
+
+	TEST_ASSERT(i < NUM_PRIVATE_MEM_SRC_TYPES, "Private mem backing src type ID %d too big", i);
+
+	return &aliases[i];
+}
+
 #define MAP_HUGE_PAGE_SIZE(x) (1ULL << ((x >> MAP_HUGE_SHIFT) & MAP_HUGE_MASK))
 
 size_t get_backing_src_pagesz(uint32_t i)
@@ -308,6 +337,20 @@ size_t get_backing_src_pagesz(uint32_t i)
 	}
 }
 
+size_t get_private_mem_backing_src_pagesz(uint32_t i)
+{
+	uint32_t flag = vm_private_mem_backing_src_alias(i)->flag;
+
+	switch (i) {
+	case VM_PRIVATE_MEM_SRC_GUEST_MEM:
+		return getpagesize();
+	case VM_PRIVATE_MEM_SRC_HUGETLB:
+		return get_def_hugetlb_pagesz();
+	default:
+		return MAP_HUGE_PAGE_SIZE(flag);
+	}
+}
+
 bool is_backing_src_hugetlb(uint32_t i)
 {
 	return !!(vm_mem_backing_src_alias(i)->flag & MAP_HUGETLB);
@@ -344,6 +387,37 @@ enum vm_mem_backing_src_type parse_backing_src_type(const char *type_name)
 	return -1;
 }
 
+static void print_available_private_mem_backing_src_types(const char *prefix)
+{
+	int i;
+
+	printf("%sAvailable private mem backing src types:\n", prefix);
+
+	for (i = 0; i < NUM_PRIVATE_MEM_SRC_TYPES; i++)
+		printf("%s %s\n", prefix, vm_private_mem_backing_src_alias(i)->name);
+}
+
+void private_mem_backing_src_help(const char *flag)
+{
+	printf(" %s: specify the type of memory that should be used to\n"
+	       " back guest private memory. (default: %s)\n",
+	       flag, vm_private_mem_backing_src_alias(DEFAULT_VM_PRIVATE_MEM_SRC)->name);
+	print_available_private_mem_backing_src_types(" ");
+}
+
+enum vm_private_mem_backing_src_type parse_private_mem_backing_src_type(const char *type_name)
+{
+	int i;
+
+	for (i = 0; i < NUM_PRIVATE_MEM_SRC_TYPES; i++)
+		if (!strcmp(type_name, vm_private_mem_backing_src_alias(i)->name))
+			return i;
+
+	print_available_private_mem_backing_src_types("");
+	TEST_FAIL("Unknown private mem backing src type: %s", type_name);
+	return -1;
+}
+
 long get_run_delay(void)
 {
 	char path[64];
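The name-to-type parsing added in lib/test_util.c is a linear scan over the alias table. A reduced stand-alone model of that lookup, with the alias names taken from the patch, the struct trimmed to just what the scan needs, and error handling simplified to returning -1 instead of calling TEST_FAIL():

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Alias names as defined by the patch's vm_private_mem_backing_src_alias(). */
static const char *const private_mem_src_names[] = {
	"private_mem_guest_mem",
	"private_mem_hugetlb",
	"private_mem_hugetlb_2mb",
	"private_mem_hugetlb_1gb",
};

/* Map a backing-source name to its enum index, or -1 if unknown. */
static int parse_private_mem_src_type(const char *type_name)
{
	size_t i;

	for (i = 0; i < sizeof(private_mem_src_names) /
		    sizeof(private_mem_src_names[0]); i++)
		if (!strcmp(type_name, private_mem_src_names[i]))
			return (int)i;
	return -1;
}
```

The index returned corresponds to the enum order (VM_PRIVATE_MEM_SRC_GUEST_MEM first), which is what lets the selftests pass the parsed value straight back into the alias and page-size helpers.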
with ESMTP id 0D0FC4000A for ; Tue, 10 Sep 2024 23:45:05 +0000 (UTC) Authentication-Results: imf27.hostedemail.com; dkim=pass header.d=google.com header.s=20230601 header.b=azY92Ktd; spf=pass (imf27.hostedemail.com: domain of 3ANrgZgsKCHUTVdXkeXrmgZZhhZeX.Vhfebgnq-ffdoTVd.hkZ@flex--ackerleytng.bounces.google.com designates 209.85.215.202 as permitted sender) smtp.mailfrom=3ANrgZgsKCHUTVdXkeXrmgZZhhZeX.Vhfebgnq-ffdoTVd.hkZ@flex--ackerleytng.bounces.google.com; dmarc=pass (policy=reject) header.from=google.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1726011853; a=rsa-sha256; cv=none; b=FO4fCzFNzsx1ZLRBKfYcXrm8JOiejYlZ9wj6fH4C3yGt8czLaXqfV3M3c92I9a5Wy2YB61 IAqsCtc+hE0g7pzClezZNkHHg2W9C9D5t3AsR983zldyubctyWsG0p3oNXH6eNaLpqTZIZ t0RkTmNWAJiunAvm2wNDpIgCiBDuqw0= ARC-Authentication-Results: i=1; imf27.hostedemail.com; dkim=pass header.d=google.com header.s=20230601 header.b=azY92Ktd; spf=pass (imf27.hostedemail.com: domain of 3ANrgZgsKCHUTVdXkeXrmgZZhhZeX.Vhfebgnq-ffdoTVd.hkZ@flex--ackerleytng.bounces.google.com designates 209.85.215.202 as permitted sender) smtp.mailfrom=3ANrgZgsKCHUTVdXkeXrmgZZhhZeX.Vhfebgnq-ffdoTVd.hkZ@flex--ackerleytng.bounces.google.com; dmarc=pass (policy=reject) header.from=google.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1726011853; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=JNe/WmHLeg6VKg7h5W2f+s+eqgKPFc2fIX3V6u3knDI=; b=MBkB52wz+4k4kQWT6fdLcELrJ+2GBTk/jCKxIKElypXe7Bp1j93Q+vj6eH7gZpX/LdWrZx 0X/fcxhjIj9a0UNOFu/JutbprBotzlTeNTUEudjJsrRVj3hlxHJy+/Wm/ZmezAemASxkp2 t6XvaY4gkNxtLbKAEh0szeuY0cpgmGg= Received: by mail-pg1-f202.google.com with SMTP id 41be03b00d2f7-7db0c10238eso151565a12.1 for ; Tue, 10 Sep 2024 16:45:05 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; 
d=google.com; s=20230601; t=1726011905; x=1726616705; darn=kvack.org; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=JNe/WmHLeg6VKg7h5W2f+s+eqgKPFc2fIX3V6u3knDI=; b=azY92KtdI42cdyODdwOOcFPeaQCZzvVWkyEZ+sYdjt44bwRZpRJNHrneFT+DNosHTc ITE1t9plBeP/fJzbbWogvpGO/ah/7ZRD/hAw3+cA+Zu9fF2dmcZb+7PMvx3LuHai6iZF G9y+C6w3EgzbDkINc9nA5zbFZGm3TDh57f/pbQ5ufGTpX2ZGI1VW+zmmebEWM20tVmty bvjWS0Z2XDCcvqiGaepv5U3kPgc+tfM61iSoEdqN61zdiSfeZ5QN2B53LV1Riz9BnngK 6OFwbgRtaQW+GSn9JqAUJQ5nx4Yp8grwNiGARefY3KlZDJmpTIFYD0PhhAkbV9iNm5xX PVMg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1726011905; x=1726616705; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=JNe/WmHLeg6VKg7h5W2f+s+eqgKPFc2fIX3V6u3knDI=; b=UhQ/N/Oc3H4A/UDhbpwabY1UPZN1rGq8NG4PwwzUkfrwHpsCxDxxEchH6mU0AtIsSl HAe2r2GuqSK1ksOVgBXi0g2XKb4k+9fU5srYpJ2PHHGTDUxrF3Ar5JwMXndu3akuy2Mr LR15E3RtvHWd56oOopoGLUG2hvepGZq1RsWtO7qeX2sM8nPOHvvz6hD/3e7xMR0VmtdT wjH1AF9sAqRbu4/h1Ft90z+NIaMr9ebxf1Hat1bMtpfVlMuXA0TeSfIvaKI6uCXwovCe vbHjTf1VTWv7QupSAMcGpH8HirmEqN3aOtLjHnWUjgjDgkQ2FeNI24lDLvvTYz5Te5yF Mj+w== X-Forwarded-Encrypted: i=1; AJvYcCUlcFWBf6aqbrzheMp3TlbBccWdIzSdtEKeBFOuyQ3ZNWAyWboWzkjEmaRTscYU5cB5w+pzcBu4nw==@kvack.org X-Gm-Message-State: AOJu0YzYWollojQvFWFBX+D9IrBzWI8hyaVHz6Xiqb2G2WkV5xqpK7yD mJ8xHQT3zEj6H8Mz17zbUdIVZDqAkhCsrRBWBDlAnspxiiCLh+CZa5nLt4YODvp2YTjpEw0f/VL FwxClP8/J/o5pLiExdixYeA== X-Google-Smtp-Source: AGHT+IEf/I6TXDj4AnDTGwL9SBffiRIBIsWvgKs+GwMaxvWB3wQdGWq/Gxc5GyUB8zA1nCQX52oCQ5i2eL9ajayvXg== X-Received: from ackerleytng-ctop.c.googlers.com ([fda3:e722:ac3:cc00:7f:e700:c0a8:13f8]) (user=ackerleytng job=sendgmr) by 2002:a63:2506:0:b0:7c9:58ed:7139 with SMTP id 41be03b00d2f7-7db084ae38emr8766a12.2.1726011904575; Tue, 10 Sep 2024 16:45:04 -0700 (PDT) Date: Tue, 10 Sep 2024 23:43:50 +0000 In-Reply-To: Mime-Version: 1.0 
Message-ID: <41d7d714cfa7cec3e7089a184918da39e93008ee.1726009989.git.ackerleytng@google.com>
Subject: [RFC PATCH 19/39] KVM: selftests: Update test for various private memory backing source types
From: Ackerley Tng <ackerleytng@google.com>
To: tabba@google.com, quic_eberman@quicinc.com, roypat@amazon.co.uk, jgg@nvidia.com, peterx@redhat.com, david@redhat.com, rientjes@google.com, fvdl@google.com, jthoughton@google.com, seanjc@google.com, pbonzini@redhat.com, zhiquan1.li@intel.com, fan.du@intel.com, jun.miao@intel.com, isaku.yamahata@intel.com, muchun.song@linux.dev, mike.kravetz@oracle.com
Cc: erdemaktas@google.com, vannapurve@google.com, ackerleytng@google.com, qperret@google.com, jhubbard@nvidia.com, willy@infradead.org, shuah@kernel.org, brauner@kernel.org, bfoster@redhat.com, kent.overstreet@linux.dev, pvorel@suse.cz, rppt@kernel.org, richard.weiyang@gmail.com, anup@brainfault.org, haibo1.xu@intel.com, ajones@ventanamicro.com, vkuznets@redhat.com, maciej.wieczor-retman@intel.com, pgonda@google.com, oliver.upton@linux.dev, linux-kernel@vger.kernel.org, linux-mm@kvack.org, kvm@vger.kernel.org, linux-kselftest@vger.kernel.org, linux-fsdevel@kvack.org
Update private_mem_conversions_test for various private memory backing source types.
Signed-off-by: Ackerley Tng <ackerleytng@google.com>
---
 .../kvm/x86_64/private_mem_conversions_test.c | 28 ++++++++++++++-----
 1 file changed, 21 insertions(+), 7 deletions(-)

diff --git a/tools/testing/selftests/kvm/x86_64/private_mem_conversions_test.c b/tools/testing/selftests/kvm/x86_64/private_mem_conversions_test.c
index 82a8d88b5338..71f480c19f92 100644
--- a/tools/testing/selftests/kvm/x86_64/private_mem_conversions_test.c
+++ b/tools/testing/selftests/kvm/x86_64/private_mem_conversions_test.c
@@ -366,14 +366,20 @@ static void *__test_mem_conversions(void *__vcpu)
 	}
 }
 
-static void test_mem_conversions(enum vm_mem_backing_src_type src_type, uint32_t nr_vcpus,
-				 uint32_t nr_memslots)
+static void
+test_mem_conversions(enum vm_mem_backing_src_type src_type,
+		     enum vm_private_mem_backing_src_type private_mem_src_type,
+		     uint32_t nr_vcpus,
+		     uint32_t nr_memslots)
 {
 	/*
 	 * Allocate enough memory so that each vCPU's chunk of memory can be
 	 * naturally aligned with respect to the size of the backing store.
 	 */
-	const size_t alignment = max_t(size_t, SZ_2M, get_backing_src_pagesz(src_type));
+	const size_t alignment = max_t(size_t, SZ_2M,
+				       max_t(size_t,
+					     get_private_mem_backing_src_pagesz(private_mem_src_type),
+					     get_backing_src_pagesz(src_type)));
 	const size_t per_cpu_size = align_up(PER_CPU_DATA_SIZE, alignment);
 	const size_t memfd_size = per_cpu_size * nr_vcpus;
 	const size_t slot_size = memfd_size / nr_memslots;
@@ -394,7 +400,9 @@ static void test_mem_conversions(enum vm_mem_backing_src_type src_type, uint32_t
 
 	vm_enable_cap(vm, KVM_CAP_EXIT_HYPERCALL, (1 << KVM_HC_MAP_GPA_RANGE));
 
-	memfd = vm_create_guest_memfd(vm, memfd_size, 0);
+	memfd = vm_create_guest_memfd(
+		vm, memfd_size,
+		vm_private_mem_backing_src_alias(private_mem_src_type)->flag);
 
 	for (i = 0; i < nr_memslots; i++)
 		vm_mem_add(vm, src_type, BASE_DATA_GPA + slot_size * i,
@@ -440,10 +448,12 @@ static void usage(const char *cmd)
 {
 	puts("");
-	printf("usage: %s [-h] [-m nr_memslots] [-s mem_type] [-n nr_vcpus]\n", cmd);
+	printf("usage: %s [-h] [-m nr_memslots] [-s mem_type] [-p private_mem_type] [-n nr_vcpus]\n", cmd);
 	puts("");
 	backing_src_help("-s");
 	puts("");
+	private_mem_backing_src_help("-p");
+	puts("");
 	puts(" -n: specify the number of vcpus (default: 1)");
 	puts("");
 	puts(" -m: specify the number of memslots (default: 1)");
@@ -453,17 +463,21 @@ static void usage(const char *cmd)
 int main(int argc, char *argv[])
 {
 	enum vm_mem_backing_src_type src_type = DEFAULT_VM_MEM_SRC;
+	enum vm_private_mem_backing_src_type private_mem_src_type = DEFAULT_VM_PRIVATE_MEM_SRC;
 	uint32_t nr_memslots = 1;
 	uint32_t nr_vcpus = 1;
 	int opt;
 
 	TEST_REQUIRE(kvm_check_cap(KVM_CAP_VM_TYPES) & BIT(KVM_X86_SW_PROTECTED_VM));
 
-	while ((opt = getopt(argc, argv, "hm:s:n:")) != -1) {
+	while ((opt = getopt(argc, argv, "hm:s:p:n:")) != -1) {
 		switch (opt) {
 		case 's':
 			src_type = parse_backing_src_type(optarg);
 			break;
+		case 'p':
+			private_mem_src_type = parse_private_mem_backing_src_type(optarg);
+			break;
 		case 'n':
 			nr_vcpus = atoi_positive("nr_vcpus", optarg);
 			break;
@@ -477,7 +491,7 @@ int main(int argc, char *argv[])
 		}
 	}
 
-	test_mem_conversions(src_type, nr_vcpus, nr_memslots);
+	test_mem_conversions(src_type, private_mem_src_type, nr_vcpus, nr_memslots);
 
 	return 0;
 }

From patchwork Tue Sep 10 23:43:51 2024
Date: Tue, 10 Sep 2024 23:43:51 +0000
Message-ID: <6bfe8c9baabc6ad89ccc2c4481db2b4983cbfd8b.1726009989.git.ackerleytng@google.com>
Subject: [RFC PATCH 20/39] KVM: selftests: Add private_mem_conversions_test.sh
From: Ackerley Tng <ackerleytng@google.com>
Add private_mem_conversions_test.sh to automate testing of different combinations of private_mem_conversions_test.

Signed-off-by: Ackerley Tng <ackerleytng@google.com>
---
 .../x86_64/private_mem_conversions_test.sh | 88 +++++++++++++++++++
 1 file changed, 88 insertions(+)
 create mode 100755 tools/testing/selftests/kvm/x86_64/private_mem_conversions_test.sh

diff --git a/tools/testing/selftests/kvm/x86_64/private_mem_conversions_test.sh b/tools/testing/selftests/kvm/x86_64/private_mem_conversions_test.sh
new file mode 100755
index 000000000000..fb6705fef466
--- /dev/null
+++ b/tools/testing/selftests/kvm/x86_64/private_mem_conversions_test.sh
@@ -0,0 +1,88 @@
+#!/bin/bash
+# SPDX-License-Identifier: GPL-2.0-only
+#
+# Wrapper script which runs different test setups of
+# private_mem_conversions_test.
+#
+# tools/testing/selftests/kvm/private_mem_conversions_test.sh
+# Copyright (C) 2023, Google LLC.
+
+set -e
+
+num_vcpus_to_test=4
+num_memslots_to_test=$num_vcpus_to_test
+
+get_default_hugepage_size_in_kB() {
+	grep "Hugepagesize:" /proc/meminfo | grep -o '[[:digit:]]\+'
+}
+
+# Required pages are based on the test setup (see computation for memfd_size
+# in test_mem_conversions() in private_mem_conversions_test.c)
+
+# These static requirements are set to the maximum required for
+# num_vcpus_to_test, over all the hugetlb-related tests
+required_num_2m_hugepages=$(( 1024 * num_vcpus_to_test ))
+required_num_1g_hugepages=$(( 2 * num_vcpus_to_test ))
+
+# The other hugetlb sizes are not supported on x86_64
+[ "$(cat /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages 2>/dev/null || echo 0)" -ge "$required_num_2m_hugepages" ] && hugepage_2mb_enabled=1
+[ "$(cat /sys/kernel/mm/hugepages/hugepages-1048576kB/nr_hugepages 2>/dev/null || echo 0)" -ge "$required_num_1g_hugepages" ] && hugepage_1gb_enabled=1
+
+case $(get_default_hugepage_size_in_kB) in
+	2048)
+		hugepage_default_enabled=$hugepage_2mb_enabled
+		;;
+	1048576)
+		hugepage_default_enabled=$hugepage_1gb_enabled
+		;;
+	*)
+		hugepage_default_enabled=
+		;;
+esac
+
+backing_src_types=( anonymous )
+backing_src_types+=( anonymous_thp )
+[ -n "$hugepage_default_enabled" ] && \
+	backing_src_types+=( anonymous_hugetlb ) || echo "skipping anonymous_hugetlb backing source type"
+[ -n "$hugepage_2mb_enabled" ] && \
+	backing_src_types+=( anonymous_hugetlb_2mb ) || echo "skipping anonymous_hugetlb_2mb backing source type"
+[ -n "$hugepage_1gb_enabled" ] && \
+	backing_src_types+=( anonymous_hugetlb_1gb ) || echo "skipping anonymous_hugetlb_1gb backing source type"
+backing_src_types+=( shmem )
+[ -n "$hugepage_default_enabled" ] && \
+	backing_src_types+=( shared_hugetlb ) || echo "skipping shared_hugetlb backing source type"
+
+private_mem_backing_src_types=( private_mem_guest_mem )
+[ -n "$hugepage_default_enabled" ] && \
+	private_mem_backing_src_types+=( private_mem_hugetlb ) || echo "skipping private_mem_hugetlb backing source type"
+[ -n "$hugepage_2mb_enabled" ] && \
+	private_mem_backing_src_types+=( private_mem_hugetlb_2mb ) || echo "skipping private_mem_hugetlb_2mb backing source type"
+[ -n "$hugepage_1gb_enabled" ] && \
+	private_mem_backing_src_types+=( private_mem_hugetlb_1gb ) || echo "skipping private_mem_hugetlb_1gb backing source type"
+
+set +e
+
+TEST_EXECUTABLE="$(dirname "$0")/private_mem_conversions_test"
+
+(
+	set -e
+
+	for src_type in "${backing_src_types[@]}"; do
+
+		for private_mem_src_type in "${private_mem_backing_src_types[@]}"; do
+			set -x
+
+			$TEST_EXECUTABLE -s "$src_type" -p "$private_mem_src_type" -n $num_vcpus_to_test
+			$TEST_EXECUTABLE -s "$src_type" -p "$private_mem_src_type" -n $num_vcpus_to_test -m $num_memslots_to_test
+
+			{ set +x; } 2>/dev/null
+
+			echo
+
+		done
+
+	done
+)
+RET=$?
+
+exit $RET

From patchwork Tue Sep 10 23:43:52 2024
Date: Tue, 10 Sep 2024 23:43:52 +0000
Message-ID: <405825c1c3924ca534da3016dda812df17d6c233.1726009989.git.ackerleytng@google.com>
Subject: [RFC PATCH 21/39] KVM: selftests: Test that guest_memfd usage is reported via hugetlb
From: Ackerley Tng <ackerleytng@google.com>

Using HugeTLB as the huge page allocator for guest_memfd allows reuse of HugeTLB's reporting mechanism.
Signed-off-by: Ackerley Tng <ackerleytng@google.com>
---
 tools/testing/selftests/kvm/Makefile          |   1 +
 .../kvm/guest_memfd_hugetlb_reporting_test.c  | 222 ++++++++++++++++++
 2 files changed, 223 insertions(+)
 create mode 100644 tools/testing/selftests/kvm/guest_memfd_hugetlb_reporting_test.c

diff --git a/tools/testing/selftests/kvm/Makefile b/tools/testing/selftests/kvm/Makefile
index 48d32c5aa3eb..b3b7e83f39fc 100644
--- a/tools/testing/selftests/kvm/Makefile
+++ b/tools/testing/selftests/kvm/Makefile
@@ -134,6 +134,7 @@ TEST_GEN_PROGS_x86_64 += demand_paging_test
 TEST_GEN_PROGS_x86_64 += dirty_log_test
 TEST_GEN_PROGS_x86_64 += dirty_log_perf_test
 TEST_GEN_PROGS_x86_64 += guest_memfd_test
+TEST_GEN_PROGS_x86_64 += guest_memfd_hugetlb_reporting_test
 TEST_GEN_PROGS_x86_64 += guest_print_test
 TEST_GEN_PROGS_x86_64 += hardware_disable_test
 TEST_GEN_PROGS_x86_64 += kvm_create_max_vcpus
diff --git a/tools/testing/selftests/kvm/guest_memfd_hugetlb_reporting_test.c b/tools/testing/selftests/kvm/guest_memfd_hugetlb_reporting_test.c
new file mode 100644
index 000000000000..cb9fdf0d4ec8
--- /dev/null
+++ b/tools/testing/selftests/kvm/guest_memfd_hugetlb_reporting_test.c
@@ -0,0 +1,222 @@
+#include <fcntl.h>
+#include <limits.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+#include <unistd.h>
+#include <sys/mman.h>
+
+#include "kvm_util.h"
+#include "test_util.h"
+#include "processor.h"
+
+static int read_int(const char *file_name)
+{
+	FILE *fp;
+	int num;
+
+	fp = fopen(file_name, "r");
+	TEST_ASSERT(fp != NULL, "Error opening file %s!\n", file_name);
+
+	TEST_ASSERT_EQ(fscanf(fp, "%d", &num), 1);
+
+	fclose(fp);
+
+	return num;
+}
+
+enum hugetlb_statistic {
+	FREE_HUGEPAGES,
+	NR_HUGEPAGES,
+	NR_OVERCOMMIT_HUGEPAGES,
+	RESV_HUGEPAGES,
+	SURPLUS_HUGEPAGES,
+	NR_TESTED_HUGETLB_STATISTICS,
+};
+
+static const char *hugetlb_statistics[NR_TESTED_HUGETLB_STATISTICS] = {
+	[FREE_HUGEPAGES] = "free_hugepages",
+	[NR_HUGEPAGES] = "nr_hugepages",
+	[NR_OVERCOMMIT_HUGEPAGES] = "nr_overcommit_hugepages",
+	[RESV_HUGEPAGES] = "resv_hugepages",
+	[SURPLUS_HUGEPAGES] = "surplus_hugepages",
+};
+
+enum test_page_size {
+	TEST_SZ_2M,
+	TEST_SZ_1G,
+	NR_TEST_SIZES,
+};
+
+struct test_param {
+	size_t page_size;
+	int memfd_create_flags;
+	int guest_memfd_flags;
+	char *path_suffix;
+};
+
+const struct test_param *test_params(enum test_page_size size)
+{
+	static const struct test_param params[] = {
+		[TEST_SZ_2M] = {
+			.page_size = PG_SIZE_2M,
+			.memfd_create_flags = MFD_HUGETLB | MFD_HUGE_2MB,
+			.guest_memfd_flags = KVM_GUEST_MEMFD_HUGETLB | KVM_GUEST_MEMFD_HUGE_2MB,
+			.path_suffix = "2048kB",
+		},
+		[TEST_SZ_1G] = {
+			.page_size = PG_SIZE_1G,
+			.memfd_create_flags = MFD_HUGETLB | MFD_HUGE_1GB,
+			.guest_memfd_flags = KVM_GUEST_MEMFD_HUGETLB | KVM_GUEST_MEMFD_HUGE_1GB,
+			.path_suffix = "1048576kB",
+		},
+	};
+
+	return &params[size];
+}
+
+static int read_statistic(enum test_page_size size, enum hugetlb_statistic statistic)
+{
+	char path[PATH_MAX] = "/sys/kernel/mm/hugepages/hugepages-";
+
+	strcat(path, test_params(size)->path_suffix);
+	strcat(path, "/");
+	strcat(path, hugetlb_statistics[statistic]);
+
+	return read_int(path);
+}
+
+static int baseline[NR_TEST_SIZES][NR_TESTED_HUGETLB_STATISTICS];
+
+static void establish_baseline(void)
+{
+	int i, j;
+
+	for (i = 0; i < NR_TEST_SIZES; ++i)
+		for (j = 0; j < NR_TESTED_HUGETLB_STATISTICS; ++j)
+			baseline[i][j] = read_statistic(i, j);
+}
+
+static void assert_stats_at_baseline(void)
+{
+	TEST_ASSERT_EQ(read_statistic(TEST_SZ_2M, FREE_HUGEPAGES),
+		       baseline[TEST_SZ_2M][FREE_HUGEPAGES]);
+	TEST_ASSERT_EQ(read_statistic(TEST_SZ_2M, NR_HUGEPAGES),
+		       baseline[TEST_SZ_2M][NR_HUGEPAGES]);
+	TEST_ASSERT_EQ(read_statistic(TEST_SZ_2M, NR_OVERCOMMIT_HUGEPAGES),
+		       baseline[TEST_SZ_2M][NR_OVERCOMMIT_HUGEPAGES]);
+	TEST_ASSERT_EQ(read_statistic(TEST_SZ_2M, RESV_HUGEPAGES),
+		       baseline[TEST_SZ_2M][RESV_HUGEPAGES]);
+	TEST_ASSERT_EQ(read_statistic(TEST_SZ_2M, SURPLUS_HUGEPAGES),
+		       baseline[TEST_SZ_2M][SURPLUS_HUGEPAGES]);
+
+	TEST_ASSERT_EQ(read_statistic(TEST_SZ_1G, FREE_HUGEPAGES),
+		       baseline[TEST_SZ_1G][FREE_HUGEPAGES]);
+	TEST_ASSERT_EQ(read_statistic(TEST_SZ_1G, NR_HUGEPAGES),
+		       baseline[TEST_SZ_1G][NR_HUGEPAGES]);
+	TEST_ASSERT_EQ(read_statistic(TEST_SZ_1G, NR_OVERCOMMIT_HUGEPAGES),
+		       baseline[TEST_SZ_1G][NR_OVERCOMMIT_HUGEPAGES]);
+	TEST_ASSERT_EQ(read_statistic(TEST_SZ_1G, RESV_HUGEPAGES),
+		       baseline[TEST_SZ_1G][RESV_HUGEPAGES]);
+	TEST_ASSERT_EQ(read_statistic(TEST_SZ_1G, SURPLUS_HUGEPAGES),
+		       baseline[TEST_SZ_1G][SURPLUS_HUGEPAGES]);
+}
+
+static void assert_stats(enum test_page_size size, int num_reserved, int num_faulted)
+{
+	TEST_ASSERT_EQ(read_statistic(size, FREE_HUGEPAGES),
+		       baseline[size][FREE_HUGEPAGES] - num_faulted);
+	TEST_ASSERT_EQ(read_statistic(size, NR_HUGEPAGES),
+		       baseline[size][NR_HUGEPAGES]);
+	TEST_ASSERT_EQ(read_statistic(size, NR_OVERCOMMIT_HUGEPAGES),
+		       baseline[size][NR_OVERCOMMIT_HUGEPAGES]);
+	TEST_ASSERT_EQ(read_statistic(size, RESV_HUGEPAGES),
+		       baseline[size][RESV_HUGEPAGES] + num_reserved - num_faulted);
+	TEST_ASSERT_EQ(read_statistic(size, SURPLUS_HUGEPAGES),
+		       baseline[size][SURPLUS_HUGEPAGES]);
+}
+
+/*
+ * Use hugetlb behavior as a baseline. guest_memfd should have comparable
+ * behavior.
+ */
+static void test_hugetlb_behavior(enum test_page_size test_size)
+{
+	const struct test_param *param;
+	char *mem;
+	int memfd;
+
+	param = test_params(test_size);
+
+	assert_stats_at_baseline();
+
+	memfd = memfd_create("guest_memfd_hugetlb_reporting_test",
+			     param->memfd_create_flags);
+
+	mem = mmap(NULL, param->page_size, PROT_READ | PROT_WRITE,
+		   MAP_SHARED | MAP_HUGETLB, memfd, 0);
+	TEST_ASSERT(mem != MAP_FAILED, "Couldn't mmap()");
+
+	assert_stats(test_size, 1, 0);
+
+	*mem = 'A';
+
+	assert_stats(test_size, 1, 1);
+
+	munmap(mem, param->page_size);
+
+	assert_stats(test_size, 1, 1);
+
+	madvise(mem, param->page_size, MADV_DONTNEED);
+
+	assert_stats(test_size, 1, 1);
+
+	madvise(mem, param->page_size, MADV_REMOVE);
+
+	assert_stats(test_size, 1, 1);
+
+	close(memfd);
+
+	assert_stats_at_baseline();
+}
+
+static void test_guest_memfd_behavior(enum test_page_size test_size)
+{
+	const struct test_param *param;
+	struct kvm_vm *vm;
+	int guest_memfd;
+
+	param = test_params(test_size);
+
+	assert_stats_at_baseline();
+
+	vm = vm_create_barebones_type(KVM_X86_SW_PROTECTED_VM);
+
+	guest_memfd = vm_create_guest_memfd(vm, param->page_size,
+					    param->guest_memfd_flags);
+
+	assert_stats(test_size, 1, 0);
+
+	fallocate(guest_memfd, FALLOC_FL_KEEP_SIZE, 0, param->page_size);
+
+	assert_stats(test_size, 1, 1);
+
+	fallocate(guest_memfd, FALLOC_FL_KEEP_SIZE | FALLOC_FL_PUNCH_HOLE, 0,
+		  param->page_size);
+
+	assert_stats(test_size, 1, 0);
+
+	close(guest_memfd);
+
+	assert_stats_at_baseline();
+
+	kvm_vm_free(vm);
+}
+
+int main(int argc, char *argv[])
+{
+	establish_baseline();
+
+	test_hugetlb_behavior(TEST_SZ_2M);
+	test_hugetlb_behavior(TEST_SZ_1G);
+
+	test_guest_memfd_behavior(TEST_SZ_2M);
+	test_guest_memfd_behavior(TEST_SZ_1G);
+}

From patchwork Tue Sep 10 23:43:53 2024
Date: Tue, 10 Sep 2024 23:43:53 +0000 In-Reply-To: Mime-Version: 1.0 References: X-Mailer: git-send-email 2.46.0.598.g6f2099f65c-goog Message-ID: Subject: [RFC PATCH 22/39] mm: hugetlb: Expose vmemmap optimization functions From: Ackerley Tng To: tabba@google.com, quic_eberman@quicinc.com, roypat@amazon.co.uk, jgg@nvidia.com, peterx@redhat.com, david@redhat.com, rientjes@google.com, fvdl@google.com, jthoughton@google.com, seanjc@google.com, pbonzini@redhat.com, zhiquan1.li@intel.com, fan.du@intel.com, jun.miao@intel.com, isaku.yamahata@intel.com, muchun.song@linux.dev, mike.kravetz@oracle.com Cc: erdemaktas@google.com, vannapurve@google.com, ackerleytng@google.com, qperret@google.com, jhubbard@nvidia.com, willy@infradead.org, shuah@kernel.org, brauner@kernel.org, bfoster@redhat.com, kent.overstreet@linux.dev, pvorel@suse.cz,
rppt@kernel.org, richard.weiyang@gmail.com, anup@brainfault.org, haibo1.xu@intel.com, ajones@ventanamicro.com, vkuznets@redhat.com, maciej.wieczor-retman@intel.com, pgonda@google.com, oliver.upton@linux.dev, linux-kernel@vger.kernel.org, linux-mm@kvack.org, kvm@vger.kernel.org, linux-kselftest@vger.kernel.org, linux-fsdevel@kvack.org
These functions will need to be used by guest_memfd when splitting/reconstructing HugeTLB pages.
Co-developed-by: Ackerley Tng
Signed-off-by: Ackerley Tng
Co-developed-by: Vishal Annapurve
Signed-off-by: Vishal Annapurve
---
 include/linux/hugetlb.h | 14 ++++++++++++++
 mm/hugetlb_vmemmap.h    | 11 -----------
 2 files changed, 14 insertions(+), 11 deletions(-)

diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index 752062044b0b..7ba4ed9e0001 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -284,6 +284,20 @@ bool is_hugetlb_entry_migration(pte_t pte);
 bool is_hugetlb_entry_hwpoisoned(pte_t pte);
 void hugetlb_unshare_all_pmds(struct vm_area_struct *vma);
 
+#ifdef CONFIG_HUGETLB_PAGE_OPTIMIZE_VMEMMAP
+int hugetlb_vmemmap_restore_folio(const struct hstate *h, struct folio *folio);
+void hugetlb_vmemmap_optimize_folio(const struct hstate *h, struct folio *folio);
+#else
+static inline int hugetlb_vmemmap_restore_folio(const struct hstate *h, struct folio *folio)
+{
+	return 0;
+}
+
+static inline void hugetlb_vmemmap_optimize_folio(const struct hstate *h, struct folio *folio)
+{
+}
+#endif
+
 #else /* !CONFIG_HUGETLB_PAGE */
 
 static inline void hugetlb_dup_vma_private(struct vm_area_struct *vma)
diff --git a/mm/hugetlb_vmemmap.h b/mm/hugetlb_vmemmap.h
index 2fcae92d3359..e702ace3b42f 100644
--- a/mm/hugetlb_vmemmap.h
+++ b/mm/hugetlb_vmemmap.h
@@ -18,11 +18,9 @@
 #define HUGETLB_VMEMMAP_RESERVE_PAGES	(HUGETLB_VMEMMAP_RESERVE_SIZE / sizeof(struct page))
 
 #ifdef CONFIG_HUGETLB_PAGE_OPTIMIZE_VMEMMAP
-int hugetlb_vmemmap_restore_folio(const struct hstate *h, struct folio *folio);
 long hugetlb_vmemmap_restore_folios(const struct hstate *h,
 			struct list_head *folio_list,
 			struct list_head *non_hvo_folios);
-void hugetlb_vmemmap_optimize_folio(const struct hstate *h, struct folio *folio);
 void hugetlb_vmemmap_optimize_folios(struct hstate *h, struct list_head *folio_list);
 
 static inline unsigned int hugetlb_vmemmap_size(const struct hstate *h)
@@ -43,11 +41,6 @@ static inline unsigned int hugetlb_vmemmap_optimizable_size(const struct hstate
 	return size > 0 ? size : 0;
 }
 #else
-static inline int hugetlb_vmemmap_restore_folio(const struct hstate *h, struct folio *folio)
-{
-	return 0;
-}
-
 static long hugetlb_vmemmap_restore_folios(const struct hstate *h,
 			struct list_head *folio_list,
 			struct list_head *non_hvo_folios)
@@ -56,10 +49,6 @@ static long hugetlb_vmemmap_restore_folios(const struct hstate *h,
 	return 0;
 }
 
-static inline void hugetlb_vmemmap_optimize_folio(const struct hstate *h, struct folio *folio)
-{
-}
-
 static inline void hugetlb_vmemmap_optimize_folios(struct hstate *h, struct list_head *folio_list)
 {
 }

From patchwork Tue Sep 10 23:43:54 2024
Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ackerley Tng X-Patchwork-Id: 13799495 Return-Path:
Date: Tue, 10 Sep 2024 23:43:54 +0000 In-Reply-To: Mime-Version: 1.0 References: X-Mailer:
git-send-email 2.46.0.598.g6f2099f65c-goog Message-ID: <226a836ca381824cfe17ed42be5cbf9972b09ab1.1726009989.git.ackerleytng@google.com> Subject: [RFC PATCH 23/39] mm: hugetlb: Expose HugeTLB functions for promoting/demoting pages From: Ackerley Tng To: tabba@google.com, quic_eberman@quicinc.com, roypat@amazon.co.uk, jgg@nvidia.com, peterx@redhat.com, david@redhat.com, rientjes@google.com, fvdl@google.com, jthoughton@google.com, seanjc@google.com, pbonzini@redhat.com, zhiquan1.li@intel.com, fan.du@intel.com, jun.miao@intel.com, isaku.yamahata@intel.com, muchun.song@linux.dev, mike.kravetz@oracle.com Cc: erdemaktas@google.com, vannapurve@google.com, ackerleytng@google.com, qperret@google.com, jhubbard@nvidia.com, willy@infradead.org, shuah@kernel.org, brauner@kernel.org, bfoster@redhat.com, kent.overstreet@linux.dev, pvorel@suse.cz, rppt@kernel.org, richard.weiyang@gmail.com, anup@brainfault.org, haibo1.xu@intel.com, ajones@ventanamicro.com, vkuznets@redhat.com, maciej.wieczor-retman@intel.com, pgonda@google.com, oliver.upton@linux.dev, linux-kernel@vger.kernel.org, linux-mm@kvack.org, kvm@vger.kernel.org, linux-kselftest@vger.kernel.org, linux-fsdevel@kvack.org
These functions will be used by guest_memfd to split/reconstruct HugeTLB pages.
Co-developed-by: Ackerley Tng
Signed-off-by: Ackerley Tng
Co-developed-by: Vishal Annapurve
Signed-off-by: Vishal Annapurve
---
 include/linux/hugetlb.h | 15 +++++++++++++++
 mm/hugetlb.c            |  8 ++------
 2 files changed, 17 insertions(+), 6 deletions(-)

diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index 7ba4ed9e0001..ac9d4ada52bd 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -298,6 +298,21 @@ static inline void hugetlb_vmemmap_optimize_folio(const struct hstate *h, struct
 }
 #endif
 
+#ifdef CONFIG_ARCH_HAS_GIGANTIC_PAGE
+bool prep_compound_gigantic_folio(struct folio *folio, unsigned int order);
+void destroy_compound_gigantic_folio(struct folio *folio, unsigned int order);
+#else
+static inline bool prep_compound_gigantic_folio(struct folio *folio, unsigned int order)
+{
+	return false;
+}
+
+static inline void destroy_compound_gigantic_folio(struct folio *folio,
+						   unsigned int order)
+{
+}
+#endif
+
 #else /* !CONFIG_HUGETLB_PAGE */
 
 static inline void hugetlb_dup_vma_private(struct vm_area_struct *vma)
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 372d8294fb2f..8f2b7b411b60 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -1533,8 +1533,7 @@ static void destroy_compound_hugetlb_folio_for_demote(struct folio *folio,
 }
 
 #ifdef CONFIG_ARCH_HAS_GIGANTIC_PAGE
-static void destroy_compound_gigantic_folio(struct folio *folio,
-					unsigned int order)
+void destroy_compound_gigantic_folio(struct folio *folio, unsigned int order)
 {
 	__destroy_compound_gigantic_folio(folio, order, false);
 }
@@ -1609,8 +1608,6 @@ static struct folio *alloc_gigantic_folio(struct hstate *h, gfp_t gfp_mask,
 }
 static inline void free_gigantic_folio(struct folio *folio,
 						unsigned int order) { }
-static inline void destroy_compound_gigantic_folio(struct folio *folio,
-						unsigned int order) { }
 #endif
 
 /*
@@ -2120,8 +2117,7 @@ static bool __prep_compound_gigantic_folio(struct folio *folio,
 	return false;
 }
 
-static bool prep_compound_gigantic_folio(struct folio *folio,
-					unsigned int order)
+bool prep_compound_gigantic_folio(struct folio *folio, unsigned int order)
 {
 	return __prep_compound_gigantic_folio(folio, order, false);
 }

From patchwork Tue Sep 10 23:43:55 2024
Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ackerley Tng X-Patchwork-Id: 13799497 Return-Path:
Date: Tue, 10 Sep 2024 23:43:55 +0000 In-Reply-To: Mime-Version: 1.0 References: X-Mailer: git-send-email 2.46.0.598.g6f2099f65c-goog Message-ID: <55b2d15ddd03b4c7df195cace3dff83ffcbfa71c.1726009989.git.ackerleytng@google.com> Subject: [RFC PATCH 24/39] mm: hugetlb: Add functions to add/move/remove from hugetlb lists From: Ackerley Tng To: tabba@google.com, quic_eberman@quicinc.com, roypat@amazon.co.uk, jgg@nvidia.com, peterx@redhat.com, david@redhat.com, rientjes@google.com, fvdl@google.com,
jthoughton@google.com, seanjc@google.com, pbonzini@redhat.com, zhiquan1.li@intel.com, fan.du@intel.com, jun.miao@intel.com, isaku.yamahata@intel.com, muchun.song@linux.dev, mike.kravetz@oracle.com Cc: erdemaktas@google.com, vannapurve@google.com, ackerleytng@google.com, qperret@google.com, jhubbard@nvidia.com, willy@infradead.org, shuah@kernel.org, brauner@kernel.org, bfoster@redhat.com, kent.overstreet@linux.dev, pvorel@suse.cz, rppt@kernel.org, richard.weiyang@gmail.com, anup@brainfault.org, haibo1.xu@intel.com, ajones@ventanamicro.com, vkuznets@redhat.com, maciej.wieczor-retman@intel.com, pgonda@google.com, oliver.upton@linux.dev, linux-kernel@vger.kernel.org, linux-mm@kvack.org, kvm@vger.kernel.org, linux-kselftest@vger.kernel.org, linux-fsdevel@kvack.org
These functions are introduced in hugetlb.c so the private hugetlb_lock can be accessed. hugetlb_lock is reused for this PoC, but a separate lock should be used in a future revision to avoid interference due to hash collisions with HugeTLB's usage of this lock.
Co-developed-by: Ackerley Tng
Signed-off-by: Ackerley Tng
Co-developed-by: Vishal Annapurve
Signed-off-by: Vishal Annapurve
---
 include/linux/hugetlb.h |  3 +++
 mm/hugetlb.c            | 21 +++++++++++++++++++++
 2 files changed, 24 insertions(+)

diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index ac9d4ada52bd..0f3f920ad608 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -164,6 +164,9 @@ bool hugetlb_reserve_pages(struct inode *inode, long from, long to,
 						vm_flags_t vm_flags);
 long hugetlb_unreserve_pages(struct inode *inode, long start, long end,
 						long freed);
+void hugetlb_folio_list_add(struct folio *folio, struct list_head *list);
+void hugetlb_folio_list_move(struct folio *folio, struct list_head *list);
+void hugetlb_folio_list_del(struct folio *folio);
 bool isolate_hugetlb(struct folio *folio, struct list_head *list);
 int get_hwpoison_hugetlb_folio(struct folio *folio, bool *hugetlb, bool unpoison);
 int get_huge_page_for_hwpoison(unsigned long pfn, int flags,
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 8f2b7b411b60..60e72214d5bf 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -7264,6 +7264,27 @@ long hugetlb_unreserve_pages(struct inode *inode, long start, long end,
 	return 0;
 }
 
+void hugetlb_folio_list_add(struct folio *folio, struct list_head *list)
+{
+	spin_lock_irq(&hugetlb_lock);
+	list_add(&folio->lru, list);
+	spin_unlock_irq(&hugetlb_lock);
+}
+
+void hugetlb_folio_list_move(struct folio *folio, struct list_head *list)
+{
+	spin_lock_irq(&hugetlb_lock);
+	list_move_tail(&folio->lru, list);
+	spin_unlock_irq(&hugetlb_lock);
+}
+
+void hugetlb_folio_list_del(struct folio *folio)
+{
+	spin_lock_irq(&hugetlb_lock);
+	list_del(&folio->lru);
+	spin_unlock_irq(&hugetlb_lock);
+}
+
 #ifdef CONFIG_ARCH_WANT_HUGE_PMD_SHARE
 static unsigned long page_table_shareable(struct vm_area_struct *svma,
 					  struct vm_area_struct *vma,

From patchwork Tue Sep 10 23:43:56 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Ackerley Tng
X-Patchwork-Id: 13799498
Date: Tue, 10 Sep 2024 23:43:56 +0000
Mime-Version: 1.0
X-Mailer: git-send-email 2.46.0.598.g6f2099f65c-goog
Subject: [RFC PATCH 25/39] KVM: guest_memfd: Split HugeTLB pages for guest_memfd use
From: Ackerley Tng
From: Vishal Annapurve

In this patch, newly allocated HugeTLB pages are split to 4K regular pages before providing them to the requester (fallocate() or KVM). The pages are then reconstructed/merged to HugeTLB pages before the HugeTLB pages are returned to HugeTLB.

This is an intermediate step to build page splitting/merging functionality before allowing guest_memfd files to be mmap()ed.
Co-developed-by: Ackerley Tng
Signed-off-by: Ackerley Tng
Co-developed-by: Vishal Annapurve
Signed-off-by: Vishal Annapurve
---
 virt/kvm/guest_memfd.c | 299 ++++++++++++++++++++++++++++++++++++++---
 1 file changed, 281 insertions(+), 18 deletions(-)

diff --git a/virt/kvm/guest_memfd.c b/virt/kvm/guest_memfd.c
index eacbfdb950d1..8151df2c03e5 100644
--- a/virt/kvm/guest_memfd.c
+++ b/virt/kvm/guest_memfd.c
@@ -229,31 +229,206 @@ static int kvm_gmem_hugetlb_filemap_add_folio(struct address_space *mapping,
 	return 0;
 }
 
+struct kvm_gmem_split_stash {
+	struct {
+		unsigned long _flags_2;
+		unsigned long _head_2;
+
+		void *_hugetlb_subpool;
+		void *_hugetlb_cgroup;
+		void *_hugetlb_cgroup_rsvd;
+		void *_hugetlb_hwpoison;
+	};
+	void *hugetlb_private;
+};
+
+static int kvm_gmem_hugetlb_stash_metadata(struct folio *folio)
+{
+	struct kvm_gmem_split_stash *stash;
+
+	stash = kmalloc(sizeof(*stash), GFP_KERNEL);
+	if (!stash)
+		return -ENOMEM;
+
+	stash->_flags_2 = folio->_flags_2;
+	stash->_head_2 = folio->_head_2;
+	stash->_hugetlb_subpool = folio->_hugetlb_subpool;
+	stash->_hugetlb_cgroup = folio->_hugetlb_cgroup;
+	stash->_hugetlb_cgroup_rsvd = folio->_hugetlb_cgroup_rsvd;
+	stash->_hugetlb_hwpoison = folio->_hugetlb_hwpoison;
+	stash->hugetlb_private = folio_get_private(folio);
+
+	folio_change_private(folio, (void *)stash);
+
+	return 0;
+}
+
+static int kvm_gmem_hugetlb_unstash_metadata(struct folio *folio)
+{
+	struct kvm_gmem_split_stash *stash;
+
+	stash = folio_get_private(folio);
+
+	if (!stash)
+		return -EINVAL;
+
+	folio->_flags_2 = stash->_flags_2;
+	folio->_head_2 = stash->_head_2;
+	folio->_hugetlb_subpool = stash->_hugetlb_subpool;
+	folio->_hugetlb_cgroup = stash->_hugetlb_cgroup;
+	folio->_hugetlb_cgroup_rsvd = stash->_hugetlb_cgroup_rsvd;
+	folio->_hugetlb_hwpoison = stash->_hugetlb_hwpoison;
+	folio_change_private(folio, stash->hugetlb_private);
+
+	kfree(stash);
+
+	return 0;
+}
+
+/**
+ * Reconstruct a HugeTLB folio from a contiguous block of folios where the first
+ * of the contiguous folios is @folio.
+ *
+ * The size of the contiguous block is of huge_page_size(@h). All the folios in
+ * the block are checked to have a refcount of 1 before reconstruction. After
+ * reconstruction, the reconstructed folio has a refcount of 1.
+ *
+ * Return 0 on success and negative error otherwise.
+ */
+static int kvm_gmem_hugetlb_reconstruct_folio(struct hstate *h, struct folio *folio)
+{
+	int ret;
+
+	WARN_ON((folio->index & (huge_page_order(h) - 1)) != 0);
+
+	ret = kvm_gmem_hugetlb_unstash_metadata(folio);
+	if (ret)
+		return ret;
+
+	if (!prep_compound_gigantic_folio(folio, huge_page_order(h))) {
+		kvm_gmem_hugetlb_stash_metadata(folio);
+		return -ENOMEM;
+	}
+
+	__folio_set_hugetlb(folio);
+
+	folio_set_count(folio, 1);
+
+	hugetlb_vmemmap_optimize_folio(h, folio);
+
+	return 0;
+}
+
+/* Basically folio_set_order(folio, 1) without the checks. */
+static inline void kvm_gmem_folio_set_order(struct folio *folio, unsigned int order)
+{
+	folio->_flags_1 = (folio->_flags_1 & ~0xffUL) | order;
+#ifdef CONFIG_64BIT
+	folio->_folio_nr_pages = 1U << order;
+#endif
+}
+
+/**
+ * Split a HugeTLB @folio of size huge_page_size(@h).
+ *
+ * After splitting, each split folio has a refcount of 1. There are no checks on
+ * refcounts before splitting.
+ *
+ * Return 0 on success and negative error otherwise.
+ */
+static int kvm_gmem_hugetlb_split_folio(struct hstate *h, struct folio *folio)
+{
+	int ret;
+
+	ret = hugetlb_vmemmap_restore_folio(h, folio);
+	if (ret)
+		return ret;
+
+	ret = kvm_gmem_hugetlb_stash_metadata(folio);
+	if (ret) {
+		hugetlb_vmemmap_optimize_folio(h, folio);
+		return ret;
+	}
+
+	kvm_gmem_folio_set_order(folio, 0);
+
+	destroy_compound_gigantic_folio(folio, huge_page_order(h));
+	__folio_clear_hugetlb(folio);
+
+	/*
+	 * Remove the first folio from h->hugepage_activelist since it is no
+	 * longer a HugeTLB page. The other split pages should not be on any
+	 * lists.
+	 */
+	hugetlb_folio_list_del(folio);
+
+	return 0;
+}
+
 static struct folio *kvm_gmem_hugetlb_alloc_and_cache_folio(struct inode *inode,
 							    pgoff_t index)
 {
+	struct folio *allocated_hugetlb_folio;
+	pgoff_t hugetlb_first_subpage_index;
+	struct page *hugetlb_first_subpage;
 	struct kvm_gmem_hugetlb *hgmem;
-	struct folio *folio;
+	struct page *requested_page;
 	int ret;
+	int i;
 
 	hgmem = kvm_gmem_hgmem(inode);
-	folio = kvm_gmem_hugetlb_alloc_folio(hgmem->h, hgmem->spool);
-	if (IS_ERR(folio))
-		return folio;
+	allocated_hugetlb_folio = kvm_gmem_hugetlb_alloc_folio(hgmem->h, hgmem->spool);
+	if (IS_ERR(allocated_hugetlb_folio))
+		return allocated_hugetlb_folio;
+
+	requested_page = folio_file_page(allocated_hugetlb_folio, index);
+	hugetlb_first_subpage = folio_file_page(allocated_hugetlb_folio, 0);
+	hugetlb_first_subpage_index = index & (huge_page_mask(hgmem->h) >> PAGE_SHIFT);
 
-	/* TODO: Fix index here to be aligned to huge page size. */
-	ret = kvm_gmem_hugetlb_filemap_add_folio(
-		inode->i_mapping, folio, index, htlb_alloc_mask(hgmem->h));
+	ret = kvm_gmem_hugetlb_split_folio(hgmem->h, allocated_hugetlb_folio);
 	if (ret) {
-		folio_put(folio);
+		folio_put(allocated_hugetlb_folio);
 		return ERR_PTR(ret);
 	}
 
+	for (i = 0; i < pages_per_huge_page(hgmem->h); ++i) {
+		struct folio *folio = page_folio(nth_page(hugetlb_first_subpage, i));
+
+		ret = kvm_gmem_hugetlb_filemap_add_folio(inode->i_mapping,
+							 folio,
+							 hugetlb_first_subpage_index + i,
+							 htlb_alloc_mask(hgmem->h));
+		if (ret) {
+			/* TODO: handle cleanup properly. */
+			pr_err("Handle cleanup properly index=%lx, ret=%d\n",
+			       hugetlb_first_subpage_index + i, ret);
+			dump_page(nth_page(hugetlb_first_subpage, i), "check");
+			return ERR_PTR(ret);
+		}
+
+		/*
+		 * Skip unlocking for the requested index since
+		 * kvm_gmem_get_folio() returns a locked folio.
+		 *
+		 * Do folio_put() to drop the refcount that came with the folio,
+		 * from splitting the folio. Splitting the folio has a refcount
+		 * to be in line with hugetlb_alloc_folio(), which returns a
+		 * folio with refcount 1.
+		 *
+		 * Skip folio_put() for requested index since
+		 * kvm_gmem_get_folio() returns a folio with refcount 1.
+		 */
+		if (hugetlb_first_subpage_index + i != index) {
+			folio_unlock(folio);
+			folio_put(folio);
+		}
+	}
+
 	spin_lock(&inode->i_lock);
 	inode->i_blocks += blocks_per_huge_page(hgmem->h);
 	spin_unlock(&inode->i_lock);
 
-	return folio;
+	return page_folio(requested_page);
 }
 
 static struct folio *kvm_gmem_get_hugetlb_folio(struct inode *inode,
@@ -365,7 +540,9 @@ static inline void kvm_gmem_hugetlb_filemap_remove_folio(struct folio *folio)
 
 /**
  * Removes folios in range [@lstart, @lend) from page cache/filemap (@mapping),
- * returning the number of pages freed.
+ * returning the number of HugeTLB pages freed.
+ *
+ * @lend - @lstart must be a multiple of the HugeTLB page size.
  */
 static int kvm_gmem_hugetlb_filemap_remove_folios(struct address_space *mapping,
 						  struct hstate *h,
@@ -373,37 +550,69 @@ static int kvm_gmem_hugetlb_filemap_remove_folios(struct address_space *mapping,
 {
 	const pgoff_t end = lend >> PAGE_SHIFT;
 	pgoff_t next = lstart >> PAGE_SHIFT;
+	LIST_HEAD(folios_to_reconstruct);
 	struct folio_batch fbatch;
+	struct folio *folio, *tmp;
 	int num_freed = 0;
+	int i;
 
+	/*
+	 * TODO: Iterate over huge_page_size(h) blocks to avoid taking and
+	 * releasing hugetlb_fault_mutex_table[hash] lock so often. When
+	 * truncating, lstart and lend should be clipped to the size of this
+	 * guest_memfd file, otherwise there would be too many iterations.
+	 */
 	folio_batch_init(&fbatch);
 	while (filemap_get_folios(mapping, &next, end - 1, &fbatch)) {
-		int i;
 		for (i = 0; i < folio_batch_count(&fbatch); ++i) {
 			struct folio *folio;
 			pgoff_t hindex;
 			u32 hash;
 
 			folio = fbatch.folios[i];
+
 			hindex = folio->index >> huge_page_order(h);
 			hash = hugetlb_fault_mutex_hash(mapping, hindex);
-
 			mutex_lock(&hugetlb_fault_mutex_table[hash]);
+
+			/*
+			 * Collect first pages of HugeTLB folios for
+			 * reconstruction later.
+			 */
+			if ((folio->index & ~(huge_page_mask(h) >> PAGE_SHIFT)) == 0)
+				list_add(&folio->lru, &folios_to_reconstruct);
+
+			/*
+			 * Before removing from filemap, take a reference so
+			 * sub-folios don't get freed. Don't free the sub-folios
+			 * until after reconstruction.
+			 */
+			folio_get(folio);
+
 			kvm_gmem_hugetlb_filemap_remove_folio(folio);
-			mutex_unlock(&hugetlb_fault_mutex_table[hash]);
 
-			num_freed++;
+			mutex_unlock(&hugetlb_fault_mutex_table[hash]);
 		}
 		folio_batch_release(&fbatch);
 		cond_resched();
 	}
 
+	list_for_each_entry_safe(folio, tmp, &folios_to_reconstruct, lru) {
+		kvm_gmem_hugetlb_reconstruct_folio(h, folio);
+		hugetlb_folio_list_move(folio, &h->hugepage_activelist);
+
+		folio_put(folio);
+		num_freed++;
+	}
+
 	return num_freed;
 }
 
 /**
  * Removes folios in range [@lstart, @lend) from page cache of inode, updates
  * inode metadata and hugetlb reservations.
+ *
+ * @lend - @lstart must be a multiple of the HugeTLB page size.
  */
 static void kvm_gmem_hugetlb_truncate_folios_range(struct inode *inode,
 						   loff_t lstart, loff_t lend)
@@ -427,6 +636,56 @@ static void kvm_gmem_hugetlb_truncate_folios_range(struct inode *inode,
 	spin_unlock(&inode->i_lock);
 }
 
+/**
+ * Zeroes offsets [@start, @end) in a folio from @mapping.
+ *
+ * [@start, @end) must be within the same folio.
+ */
+static void kvm_gmem_zero_partial_page(
+	struct address_space *mapping, loff_t start, loff_t end)
+{
+	struct folio *folio;
+	pgoff_t idx = start >> PAGE_SHIFT;
+
+	folio = filemap_lock_folio(mapping, idx);
+	if (IS_ERR(folio))
+		return;
+
+	start = offset_in_folio(folio, start);
+	end = offset_in_folio(folio, end);
+	if (!end)
+		end = folio_size(folio);
+
+	folio_zero_segment(folio, (size_t)start, (size_t)end);
+	folio_unlock(folio);
+	folio_put(folio);
+}
+
+/**
+ * Zeroes all pages in range [@start, @end) in @mapping.
+ *
+ * hugetlb_zero_partial_page() would work if this had been a full page, but is
+ * not suitable since the pages have been split.
+ *
+ * truncate_inode_pages_range() isn't the right function because it removes
+ * pages from the page cache; this function only zeroes the pages.
+ */
+static void kvm_gmem_hugetlb_zero_split_pages(struct address_space *mapping,
+					      loff_t start, loff_t end)
+{
+	loff_t aligned_start;
+	loff_t index;
+
+	aligned_start = round_up(start, PAGE_SIZE);
+
+	kvm_gmem_zero_partial_page(mapping, start, min(aligned_start, end));
+
+	for (index = aligned_start; index < end; index += PAGE_SIZE) {
+		kvm_gmem_zero_partial_page(mapping, index,
+					   min((loff_t)(index + PAGE_SIZE), end));
+	}
+}
+
 static void kvm_gmem_hugetlb_truncate_range(struct inode *inode, loff_t lstart,
 					    loff_t lend)
 {
@@ -442,8 +701,8 @@ static void kvm_gmem_hugetlb_truncate_range(struct inode *inode, loff_t lstart,
 	full_hpage_end = round_down(lend, hsize);
 
 	if (lstart < full_hpage_start) {
-		hugetlb_zero_partial_page(h, inode->i_mapping, lstart,
-					  full_hpage_start);
+		kvm_gmem_hugetlb_zero_split_pages(inode->i_mapping, lstart,
+						  full_hpage_start);
 	}
 
 	if (full_hpage_end > full_hpage_start) {
@@ -452,8 +711,8 @@ static void kvm_gmem_hugetlb_truncate_range(struct inode *inode, loff_t lstart,
 	}
 
 	if (lend > full_hpage_end) {
-		hugetlb_zero_partial_page(h, inode->i_mapping, full_hpage_end,
-					  lend);
+		kvm_gmem_hugetlb_zero_split_pages(inode->i_mapping, full_hpage_end,
+						  lend);
 	}
 }
 
@@ -1060,6 +1319,10 @@ __kvm_gmem_get_pfn(struct file *file, struct kvm_memory_slot *slot,
 
 	if (folio_test_hwpoison(folio)) {
 		folio_unlock(folio);
+		/*
+		 * TODO: this folio may be part of a HugeTLB folio. Perhaps
+		 * reconstruct and then free page?
+		 */
 		folio_put(folio);
 		return ERR_PTR(-EHWPOISON);
 	}

From patchwork Tue Sep 10 23:43:57 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Ackerley Tng
X-Patchwork-Id: 13799499
Date: Tue, 10 Sep 2024 23:43:57 +0000
Mime-Version: 1.0
X-Mailer: git-send-email 2.46.0.598.g6f2099f65c-goog
Subject: [RFC PATCH 26/39] KVM: guest_memfd: Track faultability within a struct kvm_gmem_private
From: Ackerley Tng
The faultability xarray is stored on the inode since faultability is a property of the guest_memfd's memory contents.
In this RFC, presence of an entry in the xarray indicates faultable, but this could be flipped so that presence indicates unfaultable. For flexibility, a special value "FAULT" is used instead of a simple boolean. However, at some stages of a VM's lifecycle there could be more private pages, and at other stages there could be more shared pages.

This is likely to be replaced by a better data structure in a future revision to better support ranges.

Also, struct kvm_gmem_hugetlb is now stored as a pointer within the new struct kvm_gmem_inode_private, which is what inode->i_mapping->i_private_data points to.

Co-developed-by: Fuad Tabba
Signed-off-by: Fuad Tabba
Co-developed-by: Ackerley Tng
Signed-off-by: Ackerley Tng
Co-developed-by: Vishal Annapurve
Signed-off-by: Vishal Annapurve
---
 virt/kvm/guest_memfd.c | 105 ++++++++++++++++++++++++++++++++++++-----
 1 file changed, 94 insertions(+), 11 deletions(-)

diff --git a/virt/kvm/guest_memfd.c b/virt/kvm/guest_memfd.c
index 8151df2c03e5..b603518f7b62 100644
--- a/virt/kvm/guest_memfd.c
+++ b/virt/kvm/guest_memfd.c
@@ -26,11 +26,21 @@ struct kvm_gmem_hugetlb {
 	struct hugepage_subpool *spool;
 };
 
-static struct kvm_gmem_hugetlb *kvm_gmem_hgmem(struct inode *inode)
+struct kvm_gmem_inode_private {
+	struct xarray faultability;
+	struct kvm_gmem_hugetlb *hgmem;
+};
+
+static struct kvm_gmem_inode_private *kvm_gmem_private(struct inode *inode)
 {
 	return inode->i_mapping->i_private_data;
 }
 
+static struct kvm_gmem_hugetlb *kvm_gmem_hgmem(struct inode *inode)
+{
+	return kvm_gmem_private(inode)->hgmem;
+}
+
 static bool is_kvm_gmem_hugetlb(struct inode *inode)
 {
 	u64 flags = (u64)inode->i_private;
@@ -38,6 +48,57 @@ static bool is_kvm_gmem_hugetlb(struct inode *inode)
 	return flags & KVM_GUEST_MEMFD_HUGETLB;
 }
 
+#define KVM_GMEM_FAULTABILITY_VALUE 0x4641554c54 /* FAULT */
+
+/**
+ * Set faultability of given range of inode indices [@start, @end) to
+ * @faultable. Return 0 if attributes were successfully updated or negative
+ * errno on error.
+ */ +static int kvm_gmem_set_faultable(struct inode *inode, pgoff_t start, pgoff_t end, + bool faultable) +{ + struct xarray *faultability; + void *val; + pgoff_t i; + + /* + * The expectation is that fewer pages are faultable, hence save memory + * entries are created for faultable pages as opposed to creating + * entries for non-faultable pages. + */ + val = faultable ? xa_mk_value(KVM_GMEM_FAULTABILITY_VALUE) : NULL; + faultability = &kvm_gmem_private(inode)->faultability; + + /* + * TODO replace this with something else (maybe interval + * tree?). store_range doesn't quite do what we expect if overlapping + * ranges are specified: if we store_range(5, 10, val) and then + * store_range(7, 12, NULL), the entire range [5, 12] will be NULL. For + * now, use the slower xa_store() to store individual entries on indices + * to avoid this. + */ + for (i = start; i < end; i++) { + int r; + + r = xa_err(xa_store(faultability, i, val, GFP_KERNEL_ACCOUNT)); + if (r) + return r; + } + + return 0; +} + +/** + * Return true if the page at @index is allowed to be faulted in. + */ +static bool kvm_gmem_is_faultable(struct inode *inode, pgoff_t index) +{ + struct xarray *faultability = &kvm_gmem_private(inode)->faultability; + + return xa_to_value(xa_load(faultability, index)) == KVM_GMEM_FAULTABILITY_VALUE; +} + /** * folio_file_pfn - like folio_file_page, but return a pfn. * @folio: The folio which contains this index. @@ -895,11 +956,21 @@ static void kvm_gmem_hugetlb_teardown(struct inode *inode) static void kvm_gmem_evict_inode(struct inode *inode) { + struct kvm_gmem_inode_private *private = kvm_gmem_private(inode); + + /* + * .evict_inode can be called before faultability is set up if there are + * issues during inode creation. 
+ */ + if (private) + xa_destroy(&private->faultability); + if (is_kvm_gmem_hugetlb(inode)) kvm_gmem_hugetlb_teardown(inode); else truncate_inode_pages_final(inode->i_mapping); + kfree(private); clear_inode(inode); } @@ -1028,7 +1099,9 @@ static const struct inode_operations kvm_gmem_iops = { .setattr = kvm_gmem_setattr, }; -static int kvm_gmem_hugetlb_setup(struct inode *inode, loff_t size, u64 flags) +static int kvm_gmem_hugetlb_setup(struct inode *inode, + struct kvm_gmem_inode_private *private, + loff_t size, u64 flags) { struct kvm_gmem_hugetlb *hgmem; struct hugepage_subpool *spool; @@ -1036,6 +1109,10 @@ static int kvm_gmem_hugetlb_setup(struct inode *inode, loff_t size, u64 flags) struct hstate *h; long hpages; + hgmem = kzalloc(sizeof(*hgmem), GFP_KERNEL); + if (!hgmem) + return -ENOMEM; + page_size_log = (flags >> KVM_GUEST_MEMFD_HUGE_SHIFT) & KVM_GUEST_MEMFD_HUGE_MASK; h = hstate_sizelog(page_size_log); @@ -1046,21 +1123,16 @@ static int kvm_gmem_hugetlb_setup(struct inode *inode, loff_t size, u64 flags) if (!spool) goto err; - hgmem = kzalloc(sizeof(*hgmem), GFP_KERNEL); - if (!hgmem) - goto err_subpool; - inode->i_blkbits = huge_page_shift(h); hgmem->h = h; hgmem->spool = spool; - inode->i_mapping->i_private_data = hgmem; + private->hgmem = hgmem; return 0; -err_subpool: - kfree(spool); err: + kfree(hgmem); return -ENOMEM; } @@ -1068,6 +1140,7 @@ static struct inode *kvm_gmem_inode_make_secure_inode(const char *name, loff_t size, u64 flags) { const struct qstr qname = QSTR_INIT(name, strlen(name)); + struct kvm_gmem_inode_private *private; struct inode *inode; int err; @@ -1079,12 +1152,20 @@ static struct inode *kvm_gmem_inode_make_secure_inode(const char *name, if (err) goto out; + err = -ENOMEM; + private = kzalloc(sizeof(*private), GFP_KERNEL); + if (!private) + goto out; + if (flags & KVM_GUEST_MEMFD_HUGETLB) { - err = kvm_gmem_hugetlb_setup(inode, size, flags); + err = kvm_gmem_hugetlb_setup(inode, private, size, flags); if (err) - goto out; + 
goto free_private; } + xa_init(&private->faultability); + inode->i_mapping->i_private_data = private; + inode->i_private = (void *)(unsigned long)flags; inode->i_op = &kvm_gmem_iops; inode->i_mapping->a_ops = &kvm_gmem_aops; @@ -1097,6 +1178,8 @@ static struct inode *kvm_gmem_inode_make_secure_inode(const char *name, return inode; +free_private: + kfree(private); out: iput(inode); From patchwork Tue Sep 10 23:43:58 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ackerley Tng X-Patchwork-Id: 13799500 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 92EC9EE01F1 for ; Tue, 10 Sep 2024 23:45:59 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id B9BB68D00E6; Tue, 10 Sep 2024 19:45:23 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id AFB458D00E2; Tue, 10 Sep 2024 19:45:23 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 92A218D00E6; Tue, 10 Sep 2024 19:45:23 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0012.hostedemail.com [216.40.44.12]) by kanga.kvack.org (Postfix) with ESMTP id 6FEF68D00E2 for ; Tue, 10 Sep 2024 19:45:23 -0400 (EDT) Received: from smtpin14.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay07.hostedemail.com (Postfix) with ESMTP id 1EADA160C63 for ; Tue, 10 Sep 2024 23:45:23 +0000 (UTC) X-FDA: 82550462526.14.43076D7 Received: from mail-pg1-f202.google.com (mail-pg1-f202.google.com [209.85.215.202]) by imf21.hostedemail.com (Postfix) with ESMTP id 5CB511C0011 for ; Tue, 10 Sep 2024 23:45:21 +0000 (UTC) Authentication-Results: imf21.hostedemail.com; dkim=pass header.d=google.com header.s=20230601 header.b=KVMrRjcC; dmarc=pass (policy=reject) 
Date: Tue, 10 Sep 2024 23:43:58 +0000
Message-ID: <5a05eb947cf7aa21f00b94171ca818cc3d5bdfee.1726009989.git.ackerleytng@google.com>
Subject: [RFC PATCH 27/39] KVM: guest_memfd: Allow mmapping guest_memfd files
From: Ackerley Tng
To: tabba@google.com, quic_eberman@quicinc.com, roypat@amazon.co.uk, jgg@nvidia.com, peterx@redhat.com, david@redhat.com, rientjes@google.com, fvdl@google.com, jthoughton@google.com, seanjc@google.com, pbonzini@redhat.com, zhiquan1.li@intel.com, fan.du@intel.com, jun.miao@intel.com, isaku.yamahata@intel.com, muchun.song@linux.dev, mike.kravetz@oracle.com
Cc: erdemaktas@google.com, vannapurve@google.com, ackerleytng@google.com, qperret@google.com, jhubbard@nvidia.com, willy@infradead.org, shuah@kernel.org, brauner@kernel.org, bfoster@redhat.com, kent.overstreet@linux.dev, pvorel@suse.cz, rppt@kernel.org, richard.weiyang@gmail.com, anup@brainfault.org, haibo1.xu@intel.com, ajones@ventanamicro.com, vkuznets@redhat.com, maciej.wieczor-retman@intel.com, pgonda@google.com, oliver.upton@linux.dev, linux-kernel@vger.kernel.org, linux-mm@kvack.org, kvm@vger.kernel.org, linux-kselftest@vger.kernel.org, linux-fsdevel@kvack.org

guest_memfd files can always be mmap()ed to userspace, but faultability is controlled by an
attribute on the inode.

Co-developed-by: Fuad Tabba
Signed-off-by: Fuad Tabba
Co-developed-by: Ackerley Tng
Signed-off-by: Ackerley Tng
---
 virt/kvm/guest_memfd.c | 46 ++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 44 insertions(+), 2 deletions(-)

diff --git a/virt/kvm/guest_memfd.c b/virt/kvm/guest_memfd.c
index b603518f7b62..fc2483e35876 100644
--- a/virt/kvm/guest_memfd.c
+++ b/virt/kvm/guest_memfd.c
@@ -781,7 +781,8 @@ static long kvm_gmem_punch_hole(struct inode *inode, loff_t offset, loff_t len)
 {
 	struct list_head *gmem_list = &inode->i_mapping->i_private_list;
 	pgoff_t start = offset >> PAGE_SHIFT;
-	pgoff_t end = (offset + len) >> PAGE_SHIFT;
+	pgoff_t nr = len >> PAGE_SHIFT;
+	pgoff_t end = start + nr;
 	struct kvm_gmem *gmem;
 
 	/*
@@ -790,6 +791,9 @@ static long kvm_gmem_punch_hole(struct inode *inode, loff_t offset, loff_t len)
 	 */
 	filemap_invalidate_lock(inode->i_mapping);
 
+	/* TODO: Check if even_cows should be 0 or 1 */
+	unmap_mapping_range(inode->i_mapping, start, len, 0);
+
 	list_for_each_entry(gmem, gmem_list, entry)
 		kvm_gmem_invalidate_begin(gmem, start, end);
 
@@ -946,6 +950,9 @@ static void kvm_gmem_hugetlb_teardown(struct inode *inode)
 {
 	struct kvm_gmem_hugetlb *hgmem;
 
+	/* TODO: Check if even_cows should be 0 or 1 */
+	unmap_mapping_range(inode->i_mapping, 0, LLONG_MAX, 0);
+
 	truncate_inode_pages_final_prepare(inode->i_mapping);
 	kvm_gmem_hugetlb_truncate_folios_range(inode, 0, LLONG_MAX);
 
@@ -1003,11 +1010,46 @@ static void kvm_gmem_init_mount(void)
 	kvm_gmem_mnt = kern_mount(&kvm_gmem_fs);
 	BUG_ON(IS_ERR(kvm_gmem_mnt));
 
-	/* For giggles. Userspace can never map this anyways. */
 	kvm_gmem_mnt->mnt_flags |= MNT_NOEXEC;
 }
 
+static vm_fault_t kvm_gmem_fault(struct vm_fault *vmf)
+{
+	struct inode *inode;
+	struct folio *folio;
+
+	inode = file_inode(vmf->vma->vm_file);
+	if (!kvm_gmem_is_faultable(inode, vmf->pgoff))
+		return VM_FAULT_SIGBUS;
+
+	folio = kvm_gmem_get_folio(inode, vmf->pgoff);
+	if (!folio)
+		return VM_FAULT_SIGBUS;
+
+	vmf->page = folio_file_page(folio, vmf->pgoff);
+
+	return VM_FAULT_LOCKED;
+}
+
+static const struct vm_operations_struct kvm_gmem_vm_ops = {
+	.fault = kvm_gmem_fault,
+};
+
+static int kvm_gmem_mmap(struct file *file, struct vm_area_struct *vma)
+{
+	if ((vma->vm_flags & (VM_SHARED | VM_MAYSHARE)) !=
+	    (VM_SHARED | VM_MAYSHARE)) {
+		return -EINVAL;
+	}
+
+	file_accessed(file);
+	vm_flags_set(vma, VM_DONTDUMP);
+	vma->vm_ops = &kvm_gmem_vm_ops;
+
+	return 0;
+}
+
 static struct file_operations kvm_gmem_fops = {
+	.mmap = kvm_gmem_mmap,
 	.open = generic_file_open,
 	.release = kvm_gmem_release,
 	.fallocate = kvm_gmem_fallocate,

From patchwork Tue Sep 10 23:43:59 2024
X-Patchwork-Submitter: Ackerley Tng
X-Patchwork-Id: 13799501
Date: Tue, 10 Sep 2024 23:43:59 +0000
Subject: [RFC PATCH 28/39] KVM: guest_memfd: Use vm_type to determine default faultability
From: Ackerley Tng

Memory of a KVM_X86_SW_PROTECTED_VM defaults to faultable to align with the default in
kvm->mem_attr_array.

For this RFC, determine default faultability when associating a range with a memslot.

Another option is to determine default faultability at guest_memfd creation time. guest_memfd is created for a specific VM, hence we can set default faultability based on the VM type.

In the future, if different struct kvms are bound to the same guest_memfd inode, all the struct kvms must be of the same vm_type.

TODO: Perhaps faultability should be based on kvm->mem_attr_array?

Signed-off-by: Ackerley Tng
---
 virt/kvm/guest_memfd.c | 22 ++++++++++++++++++++++
 1 file changed, 22 insertions(+)

diff --git a/virt/kvm/guest_memfd.c b/virt/kvm/guest_memfd.c
index fc2483e35876..1d4dfe0660ad 100644
--- a/virt/kvm/guest_memfd.c
+++ b/virt/kvm/guest_memfd.c
@@ -1256,6 +1256,23 @@ static struct file *kvm_gmem_inode_create_getfile(void *priv, loff_t size,
 	return file;
 }
 
+static void kvm_gmem_set_default_faultability_by_vm_type(struct inode *inode,
+							 u8 vm_type,
+							 loff_t start, loff_t end)
+{
+	bool faultable;
+
+	switch (vm_type) {
+	case KVM_X86_SW_PROTECTED_VM:
+		faultable = true;
+		break;
+	default:
+		faultable = false;
+	}
+
+	WARN_ON(kvm_gmem_set_faultable(inode, start, end, faultable));
+}
+
 static int __kvm_gmem_create(struct kvm *kvm, loff_t size, u64 flags)
 {
 	struct kvm_gmem *gmem;
@@ -1378,6 +1395,11 @@ int kvm_gmem_bind(struct kvm *kvm, struct kvm_memory_slot *slot,
 	slot->gmem.pgoff = start;
 
 	xa_store_range(&gmem->bindings, start, end - 1, slot, GFP_KERNEL);
+
+	kvm_gmem_set_default_faultability_by_vm_type(file_inode(file),
+						     kvm->arch.vm_type,
+						     start, end);
+
 	filemap_invalidate_unlock(inode->i_mapping);
 
 	/*

From patchwork Tue Sep 10 23:44:00 2024
X-Patchwork-Submitter: Ackerley Tng
X-Patchwork-Id: 13799502
Date: Tue, 10 Sep 2024 23:44:00 +0000
Subject: [RFC PATCH 29/39] KVM: Handle conversions in the SET_MEMORY_ATTRIBUTES ioctl
From: Ackerley Tng
The key steps for a private to shared conversion are:

1. Unmap from guest page tables
2. Set pages associated with requested range in memslot to be faultable
3. Update kvm->mem_attr_array

The key steps for a shared to private conversion are:

1. Check and disallow set_memory_attributes if any page in the range is
   still mapped or pinned, by
   a. Updating guest_memfd's faultability to prevent future faulting
   b. Returning -EINVAL if any pages are still pinned
2. Update kvm->mem_attr_array

The userspace VMM must ensure shared pages are not in use, since any faults
racing with this call will get a SIGBUS.
Co-developed-by: Ackerley Tng <ackerleytng@google.com>
Signed-off-by: Ackerley Tng <ackerleytng@google.com>
Co-developed-by: Vishal Annapurve <vannapurve@google.com>
Signed-off-by: Vishal Annapurve <vannapurve@google.com>
---
 include/linux/kvm_host.h |   1 +
 virt/kvm/guest_memfd.c   | 207 +++++++++++++++++++++++++++++++++++++++
 virt/kvm/kvm_main.c      |  15 +++
 virt/kvm/kvm_mm.h        |   9 ++
 4 files changed, 232 insertions(+)

diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 79a6b1a63027..10993cd33e34 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -2476,6 +2476,7 @@ typedef int (*kvm_gmem_populate_cb)(struct kvm *kvm, gfn_t gfn, kvm_pfn_t pfn,
 long kvm_gmem_populate(struct kvm *kvm, gfn_t gfn, void __user *src, long npages,
 		       kvm_gmem_populate_cb post_populate, void *opaque);
+
 #endif
 
 #ifdef CONFIG_HAVE_KVM_ARCH_GMEM_INVALIDATE
diff --git a/virt/kvm/guest_memfd.c b/virt/kvm/guest_memfd.c
index 1d4dfe0660ad..110c4bbb004b 100644
--- a/virt/kvm/guest_memfd.c
+++ b/virt/kvm/guest_memfd.c
@@ -1592,4 +1592,211 @@ long kvm_gmem_populate(struct kvm *kvm, gfn_t start_gfn, void __user *src, long
 	return ret && !i ? ret : i;
 }
 EXPORT_SYMBOL_GPL(kvm_gmem_populate);
+
+/**
+ * Returns true if pages in range [@start, @end) in inode @inode have no
+ * userspace mappings.
+ */
+static bool kvm_gmem_no_mappings_range(struct inode *inode, pgoff_t start, pgoff_t end)
+{
+	pgoff_t index;
+	bool checked_indices_unmapped;
+
+	filemap_invalidate_lock_shared(inode->i_mapping);
+
+	/* TODO: replace iteration with filemap_get_folios() for efficiency. */
+	checked_indices_unmapped = true;
+	for (index = start; checked_indices_unmapped && index < end;) {
+		struct folio *folio;
+
+		/* Don't use kvm_gmem_get_folio to avoid allocating */
+		folio = filemap_lock_folio(inode->i_mapping, index);
+		if (IS_ERR(folio)) {
+			++index;
+			continue;
+		}
+
+		if (folio_mapped(folio) || folio_maybe_dma_pinned(folio))
+			checked_indices_unmapped = false;
+		else
+			index = folio_next_index(folio);
+
+		folio_unlock(folio);
+		folio_put(folio);
+	}
+
+	filemap_invalidate_unlock_shared(inode->i_mapping);
+	return checked_indices_unmapped;
+}
+
+/**
+ * Returns true if pages in range [@start, @end) in memslot @slot have no
+ * userspace mappings.
+ */
+static bool kvm_gmem_no_mappings_slot(struct kvm_memory_slot *slot,
+				      gfn_t start, gfn_t end)
+{
+	pgoff_t offset_start;
+	pgoff_t offset_end;
+	struct file *file;
+	bool ret;
+
+	offset_start = start - slot->base_gfn + slot->gmem.pgoff;
+	offset_end = end - slot->base_gfn + slot->gmem.pgoff;
+
+	file = kvm_gmem_get_file(slot);
+	if (!file)
+		return false;
+
+	ret = kvm_gmem_no_mappings_range(file_inode(file), offset_start, offset_end);
+
+	fput(file);
+
+	return ret;
+}
+
+/**
+ * Returns true if pages in range [@start, @end) have no host userspace
+ * mappings.
+ */
+static bool kvm_gmem_no_mappings(struct kvm *kvm, gfn_t start, gfn_t end)
+{
+	int i;
+
+	lockdep_assert_held(&kvm->slots_lock);
+
+	for (i = 0; i < kvm_arch_nr_memslot_as_ids(kvm); i++) {
+		struct kvm_memslot_iter iter;
+		struct kvm_memslots *slots;
+
+		slots = __kvm_memslots(kvm, i);
+		kvm_for_each_memslot_in_gfn_range(&iter, slots, start, end) {
+			struct kvm_memory_slot *slot;
+			gfn_t gfn_start;
+			gfn_t gfn_end;
+
+			slot = iter.slot;
+			gfn_start = max(start, slot->base_gfn);
+			gfn_end = min(end, slot->base_gfn + slot->npages);
+
+			if (iter.slot->flags & KVM_MEM_GUEST_MEMFD &&
+			    !kvm_gmem_no_mappings_slot(iter.slot, gfn_start, gfn_end))
+				return false;
+		}
+	}
+
+	return true;
+}
+
+/**
+ * Set faultability of given range of gfns [@start, @end) in memslot @slot to
+ * @faultable.
+ */
+static void kvm_gmem_set_faultable_slot(struct kvm_memory_slot *slot, gfn_t start,
+					gfn_t end, bool faultable)
+{
+	pgoff_t start_offset;
+	pgoff_t end_offset;
+	struct file *file;
+
+	file = kvm_gmem_get_file(slot);
+	if (!file)
+		return;
+
+	start_offset = start - slot->base_gfn + slot->gmem.pgoff;
+	end_offset = end - slot->base_gfn + slot->gmem.pgoff;
+
+	WARN_ON(kvm_gmem_set_faultable(file_inode(file), start_offset, end_offset,
+				       faultable));
+
+	fput(file);
+}
+
+/**
+ * Set faultability of given range of gfns [@start, @end) across all memslots
+ * of @kvm to @faultable.
+ */
+static void kvm_gmem_set_faultable_vm(struct kvm *kvm, gfn_t start, gfn_t end,
+				      bool faultable)
+{
+	int i;
+
+	lockdep_assert_held(&kvm->slots_lock);
+
+	for (i = 0; i < kvm_arch_nr_memslot_as_ids(kvm); i++) {
+		struct kvm_memslot_iter iter;
+		struct kvm_memslots *slots;
+
+		slots = __kvm_memslots(kvm, i);
+		kvm_for_each_memslot_in_gfn_range(&iter, slots, start, end) {
+			struct kvm_memory_slot *slot;
+			gfn_t gfn_start;
+			gfn_t gfn_end;
+
+			slot = iter.slot;
+			gfn_start = max(start, slot->base_gfn);
+			gfn_end = min(end, slot->base_gfn + slot->npages);
+
+			if (iter.slot->flags & KVM_MEM_GUEST_MEMFD) {
+				kvm_gmem_set_faultable_slot(slot, gfn_start,
+							    gfn_end, faultable);
+			}
+		}
+	}
+}
+
+/**
+ * Returns 0 if guest_memfd permits setting range [@start, @end) to PRIVATE,
+ * or -EINVAL otherwise.
+ *
+ * If memory is faulted in to host userspace and a request was made to set the
+ * memory to PRIVATE, the faulted in pages must not be pinned for the request to
+ * be permitted.
+ */
+static int kvm_gmem_should_set_attributes_private(struct kvm *kvm, gfn_t start,
+						  gfn_t end)
+{
+	kvm_gmem_set_faultable_vm(kvm, start, end, false);
+
+	if (kvm_gmem_no_mappings(kvm, start, end))
+		return 0;
+
+	kvm_gmem_set_faultable_vm(kvm, start, end, true);
+	return -EINVAL;
+}
+
+/**
+ * Returns 0 since guest_memfd always permits setting range [@start, @end) to
+ * SHARED.
+ *
+ * Because this allows pages to be faulted in to userspace, this must only be
+ * called after the pages have been invalidated from guest page tables.
+ */
+static int kvm_gmem_should_set_attributes_shared(struct kvm *kvm, gfn_t start,
+						 gfn_t end)
+{
+	/* Always okay to set shared, hence set range faultable here. */
+	kvm_gmem_set_faultable_vm(kvm, start, end, true);
+
+	return 0;
+}
+
+/**
+ * Returns 0 if guest_memfd permits setting attributes @attrs for range [@start,
+ * @end), or a negative error otherwise.
+ *
+ * If memory is faulted in to host userspace and a request was made to set the
+ * memory to PRIVATE, the faulted in pages must not be pinned for the request to
+ * be permitted.
+ *
+ * Because this may allow pages to be faulted in to userspace when requested to
+ * set attributes to shared, this must only be called after the pages have been
+ * invalidated from guest page tables.
+ */
+int kvm_gmem_should_set_attributes(struct kvm *kvm, gfn_t start, gfn_t end,
+				   unsigned long attrs)
+{
+	if (attrs & KVM_MEMORY_ATTRIBUTE_PRIVATE)
+		return kvm_gmem_should_set_attributes_private(kvm, start, end);
+	else
+		return kvm_gmem_should_set_attributes_shared(kvm, start, end);
+}
+
 #endif
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 92901656a0d4..1a7bbcc31b7e 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -2524,6 +2524,13 @@ static int kvm_vm_set_mem_attributes(struct kvm *kvm, gfn_t start, gfn_t end,
 		.on_lock = kvm_mmu_invalidate_end,
 		.may_block = true,
 	};
+	struct kvm_mmu_notifier_range error_set_range = {
+		.start = start,
+		.end = end,
+		.handler = (void *)kvm_null_fn,
+		.on_lock = kvm_mmu_invalidate_end,
+		.may_block = true,
+	};
 	unsigned long i;
 	void *entry;
 	int r = 0;
@@ -2548,6 +2555,10 @@ static int kvm_vm_set_mem_attributes(struct kvm *kvm, gfn_t start, gfn_t end,
 
 	kvm_handle_gfn_range(kvm, &pre_set_range);
 
+	r = kvm_gmem_should_set_attributes(kvm, start, end, attributes);
+	if (r)
+		goto err;
+
 	for (i = start; i < end; i++) {
 		r = xa_err(xa_store(&kvm->mem_attr_array, i, entry,
 				    GFP_KERNEL_ACCOUNT));
@@ -2560,6 +2571,10 @@ static int kvm_vm_set_mem_attributes(struct kvm *kvm, gfn_t start, gfn_t end,
 	mutex_unlock(&kvm->slots_lock);
 
 	return r;
+
+err:
+	kvm_handle_gfn_range(kvm, &error_set_range);
+	goto out_unlock;
 }
 
 static int kvm_vm_ioctl_set_mem_attributes(struct kvm *kvm,
					    struct kvm_memory_attributes *attrs)
diff --git a/virt/kvm/kvm_mm.h b/virt/kvm/kvm_mm.h
index 715f19669d01..d8ff2b380d0e 100644
--- a/virt/kvm/kvm_mm.h
+++ b/virt/kvm/kvm_mm.h
@@ -41,6 +41,8 @@ int kvm_gmem_create(struct kvm *kvm, struct kvm_create_guest_memfd *args);
 int kvm_gmem_bind(struct kvm *kvm, struct kvm_memory_slot *slot,
 		  unsigned int fd, loff_t offset);
 void kvm_gmem_unbind(struct kvm_memory_slot *slot);
+int kvm_gmem_should_set_attributes(struct kvm *kvm, gfn_t start, gfn_t end,
+				   unsigned long attrs);
 #else
 static inline void kvm_gmem_init(struct module *module)
 {
@@ -59,6 +61,13 @@ static inline void kvm_gmem_unbind(struct kvm_memory_slot *slot)
 {
 	WARN_ON_ONCE(1);
 }
+
+static inline int kvm_gmem_should_set_attributes(struct kvm *kvm, gfn_t start,
+						 gfn_t end, unsigned long attrs)
+{
+	return 0;
+}
+
 #endif /* CONFIG_KVM_PRIVATE_MEM */
 
 #endif /* __KVM_MM_H__ */

From patchwork Tue Sep 10 23:44:01 2024
X-Patchwork-Submitter: Ackerley Tng
X-Patchwork-Id: 13799503
Date: Tue, 10 Sep 2024 23:44:01 +0000
Message-ID: <24cf7a9b1ee499c4ca4da76e9945429072014d1e.1726009989.git.ackerleytng@google.com>
Subject: [RFC PATCH 30/39] KVM: guest_memfd: Handle folio preparation for guest_memfd mmap
From: Ackerley Tng <ackerleytng@google.com>
To: tabba@google.com, quic_eberman@quicinc.com, roypat@amazon.co.uk, jgg@nvidia.com, peterx@redhat.com, david@redhat.com, rientjes@google.com, fvdl@google.com, jthoughton@google.com, seanjc@google.com, pbonzini@redhat.com, zhiquan1.li@intel.com, fan.du@intel.com, jun.miao@intel.com, isaku.yamahata@intel.com, muchun.song@linux.dev, mike.kravetz@oracle.com
Cc: erdemaktas@google.com, vannapurve@google.com, ackerleytng@google.com, qperret@google.com, jhubbard@nvidia.com, willy@infradead.org, shuah@kernel.org, brauner@kernel.org, bfoster@redhat.com, kent.overstreet@linux.dev, pvorel@suse.cz, rppt@kernel.org, richard.weiyang@gmail.com, anup@brainfault.org, haibo1.xu@intel.com, ajones@ventanamicro.com, vkuznets@redhat.com, maciej.wieczor-retman@intel.com, pgonda@google.com, oliver.upton@linux.dev, linux-kernel@vger.kernel.org, linux-mm@kvack.org, kvm@vger.kernel.org, linux-kselftest@vger.kernel.org, linux-fsdevel@kvack.org
Since guest_memfd now supports mmap(), folios have to be prepared before they
are faulted into userspace.

When memory attributes are switched between shared and private, the
up-to-date flag will be cleared. Use the folio's up-to-date flag to indicate
that the folio is ready for guest usage; the same flag marks readiness for
either shared or private use.

Signed-off-by: Ackerley Tng <ackerleytng@google.com>
---
 virt/kvm/guest_memfd.c | 131 ++++++++++++++++++++++++++++++++++++++++-
 virt/kvm/kvm_main.c    |   2 +
 virt/kvm/kvm_mm.h      |   7 +++
 3 files changed, 139 insertions(+), 1 deletion(-)

diff --git a/virt/kvm/guest_memfd.c b/virt/kvm/guest_memfd.c
index 110c4bbb004b..fb292e542381 100644
--- a/virt/kvm/guest_memfd.c
+++ b/virt/kvm/guest_memfd.c
@@ -129,13 +129,29 @@ static int __kvm_gmem_prepare_folio(struct kvm *kvm, struct kvm_memory_slot *slo
 }
 
 /**
- * Use the uptodate flag to indicate that the folio is prepared for KVM's usage.
+ * Use folio's up-to-date flag to indicate that this folio is prepared for usage
+ * by the guest.
+ *
+ * This flag can be used whether the folio is prepared for PRIVATE or SHARED
+ * usage.
  */
 static inline void kvm_gmem_mark_prepared(struct folio *folio)
 {
 	folio_mark_uptodate(folio);
 }
 
+/**
+ * Use folio's up-to-date flag to indicate that this folio is not yet prepared
+ * for usage by the guest.
+ *
+ * This flag can be used whether the folio is prepared for PRIVATE or SHARED
+ * usage.
+ */
+static inline void kvm_gmem_clear_prepared(struct folio *folio)
+{
+	folio_clear_uptodate(folio);
+}
+
 /*
  * Process @folio, which contains @gfn, so that the guest can use it.
  * The folio must be locked and the gfn must be contained in @slot.
@@ -148,6 +164,12 @@ static int kvm_gmem_prepare_folio(struct kvm *kvm, struct kvm_memory_slot *slot,
 	pgoff_t index;
 	int r;
 
+	/*
+	 * Defensively zero folio to avoid leaking kernel memory in
+	 * uninitialized pages. This is important since pages can now be mapped
+	 * into userspace, where hardware (e.g. TDX) won't be clearing those
+	 * pages.
+	 */
 	if (folio_test_hugetlb(folio)) {
 		folio_zero_user(folio, folio->index << PAGE_SHIFT);
 	} else {
@@ -1017,6 +1039,7 @@ static vm_fault_t kvm_gmem_fault(struct vm_fault *vmf)
 {
 	struct inode *inode;
 	struct folio *folio;
+	bool is_prepared;
 
 	inode = file_inode(vmf->vma->vm_file);
 	if (!kvm_gmem_is_faultable(inode, vmf->pgoff))
@@ -1026,6 +1049,31 @@ static vm_fault_t kvm_gmem_fault(struct vm_fault *vmf)
 	if (!folio)
 		return VM_FAULT_SIGBUS;
 
+	is_prepared = folio_test_uptodate(folio);
+	if (!is_prepared) {
+		unsigned long nr_pages;
+		unsigned long i;
+
+		if (folio_test_hugetlb(folio)) {
+			folio_zero_user(folio, folio->index << PAGE_SHIFT);
+		} else {
+			/*
+			 * Defensively zero folio to avoid leaking kernel memory in
+			 * uninitialized pages. This is important since pages can now be
+			 * mapped into userspace, where hardware (e.g. TDX) won't be
+			 * clearing those pages.
+			 *
+			 * Will probably need a version of kvm_gmem_prepare_folio() to
+			 * prepare the page for SHARED use.
+			 */
+			nr_pages = folio_nr_pages(folio);
+			for (i = 0; i < nr_pages; i++)
+				clear_highpage(folio_page(folio, i));
+		}
+
+		kvm_gmem_mark_prepared(folio);
+	}
+
 	vmf->page = folio_file_page(folio, vmf->pgoff);
 
 	return VM_FAULT_LOCKED;
 }
@@ -1593,6 +1641,87 @@ long kvm_gmem_populate(struct kvm *kvm, gfn_t start_gfn, void __user *src, long
 }
 EXPORT_SYMBOL_GPL(kvm_gmem_populate);
 
+static void kvm_gmem_clear_prepared_range(struct inode *inode, pgoff_t start,
+					  pgoff_t end)
+{
+	pgoff_t index;
+
+	filemap_invalidate_lock_shared(inode->i_mapping);
+
+	/* TODO: replace iteration with filemap_get_folios() for efficiency. */
+	for (index = start; index < end;) {
+		struct folio *folio;
+
+		/* Don't use kvm_gmem_get_folio to avoid allocating */
+		folio = filemap_lock_folio(inode->i_mapping, index);
+		if (IS_ERR(folio)) {
+			++index;
+			continue;
+		}
+
+		kvm_gmem_clear_prepared(folio);
+
+		index = folio_next_index(folio);
+		folio_unlock(folio);
+		folio_put(folio);
+	}
+
+	filemap_invalidate_unlock_shared(inode->i_mapping);
+}
+
+/**
+ * Clear the prepared flag for all folios in gfn range [@start, @end) in memslot
+ * @slot.
+ */
+static void kvm_gmem_clear_prepared_slot(struct kvm_memory_slot *slot, gfn_t start,
+					 gfn_t end)
+{
+	pgoff_t start_offset;
+	pgoff_t end_offset;
+	struct file *file;
+
+	file = kvm_gmem_get_file(slot);
+	if (!file)
+		return;
+
+	start_offset = start - slot->base_gfn + slot->gmem.pgoff;
+	end_offset = end - slot->base_gfn + slot->gmem.pgoff;
+
+	kvm_gmem_clear_prepared_range(file_inode(file), start_offset, end_offset);
+
+	fput(file);
+}
+
+/**
+ * Clear the prepared flag for all folios for any slot in gfn range
+ * [@start, @end) in @kvm.
+ */
+void kvm_gmem_clear_prepared_vm(struct kvm *kvm, gfn_t start, gfn_t end)
+{
+	int i;
+
+	lockdep_assert_held(&kvm->slots_lock);
+
+	for (i = 0; i < kvm_arch_nr_memslot_as_ids(kvm); i++) {
+		struct kvm_memslot_iter iter;
+		struct kvm_memslots *slots;
+
+		slots = __kvm_memslots(kvm, i);
+		kvm_for_each_memslot_in_gfn_range(&iter, slots, start, end) {
+			struct kvm_memory_slot *slot;
+			gfn_t gfn_start;
+			gfn_t gfn_end;
+
+			slot = iter.slot;
+			gfn_start = max(start, slot->base_gfn);
+			gfn_end = min(end, slot->base_gfn + slot->npages);
+
+			if (iter.slot->flags & KVM_MEM_GUEST_MEMFD)
+				kvm_gmem_clear_prepared_slot(iter.slot, gfn_start, gfn_end);
+		}
+	}
+}
+
 /**
  * Returns true if pages in range [@start, @end) in inode @inode have no
  * userspace mappings.
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 1a7bbcc31b7e..255d27df7f5c 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -2565,6 +2565,8 @@ static int kvm_vm_set_mem_attributes(struct kvm *kvm, gfn_t start, gfn_t end,
 		KVM_BUG_ON(r, kvm);
 	}
 
+	kvm_gmem_clear_prepared_vm(kvm, start, end);
+
 	kvm_handle_gfn_range(kvm, &post_set_range);
 
 out_unlock:
diff --git a/virt/kvm/kvm_mm.h b/virt/kvm/kvm_mm.h
index d8ff2b380d0e..25fd0d9f66cc 100644
--- a/virt/kvm/kvm_mm.h
+++ b/virt/kvm/kvm_mm.h
@@ -43,6 +43,7 @@ int kvm_gmem_bind(struct kvm *kvm, struct kvm_memory_slot *slot,
 void kvm_gmem_unbind(struct kvm_memory_slot *slot);
 int kvm_gmem_should_set_attributes(struct kvm *kvm, gfn_t start, gfn_t end,
 				   unsigned long attrs);
+void kvm_gmem_clear_prepared_vm(struct kvm *kvm, gfn_t start, gfn_t end);
 #else
 static inline void kvm_gmem_init(struct module *module)
 {
@@ -68,6 +69,12 @@ static inline int kvm_gmem_should_set_attributes(struct kvm *kvm, gfn_t start,
 	return 0;
 }
 
+static inline void kvm_gmem_clear_prepared_vm(struct kvm *kvm,
+					      gfn_t start, gfn_t end)
+{
+	WARN_ON_ONCE(1);
+}
+
 #endif /* CONFIG_KVM_PRIVATE_MEM */
 
 #endif /* __KVM_MM_H__ */

From patchwork Tue Sep 10 23:44:02 2024
X-Patchwork-Submitter: Ackerley Tng
X-Patchwork-Id: 13799504
Date: Tue, 10 Sep 2024 23:44:02 +0000
Message-ID: <6faf6d63a98531539b05ea36728e51ff51bb3cde.1726009989.git.ackerleytng@google.com>
Subject: [RFC PATCH 31/39] KVM: selftests: Allow vm_set_memory_attributes to be used without asserting return value of 0
From: Ackerley Tng <ackerleytng@google.com>
To: tabba@google.com, quic_eberman@quicinc.com, roypat@amazon.co.uk, jgg@nvidia.com, peterx@redhat.com, david@redhat.com, rientjes@google.com, fvdl@google.com, jthoughton@google.com, seanjc@google.com, pbonzini@redhat.com, zhiquan1.li@intel.com, fan.du@intel.com, jun.miao@intel.com, isaku.yamahata@intel.com, muchun.song@linux.dev, mike.kravetz@oracle.com
Cc: erdemaktas@google.com, vannapurve@google.com, ackerleytng@google.com, qperret@google.com, jhubbard@nvidia.com, willy@infradead.org, shuah@kernel.org, brauner@kernel.org, bfoster@redhat.com, kent.overstreet@linux.dev, pvorel@suse.cz, rppt@kernel.org, richard.weiyang@gmail.com, anup@brainfault.org, haibo1.xu@intel.com, ajones@ventanamicro.com, vkuznets@redhat.com, maciej.wieczor-retman@intel.com, pgonda@google.com, oliver.upton@linux.dev, linux-kernel@vger.kernel.org, linux-mm@kvack.org, kvm@vger.kernel.org, linux-kselftest@vger.kernel.org, linux-fsdevel@kvack.org
No functional change intended.
Signed-off-by: Ackerley Tng
---
 tools/testing/selftests/kvm/include/kvm_util.h | 14 +++++++++++---
 1 file changed, 11 insertions(+), 3 deletions(-)

diff --git a/tools/testing/selftests/kvm/include/kvm_util.h b/tools/testing/selftests/kvm/include/kvm_util.h
index 63c2aaae51f3..d336cd0c8f19 100644
--- a/tools/testing/selftests/kvm/include/kvm_util.h
+++ b/tools/testing/selftests/kvm/include/kvm_util.h
@@ -374,8 +374,8 @@ static inline void vm_enable_cap(struct kvm_vm *vm, uint32_t cap, uint64_t arg0)
 	vm_ioctl(vm, KVM_ENABLE_CAP, &enable_cap);
 }
 
-static inline void vm_set_memory_attributes(struct kvm_vm *vm, uint64_t gpa,
-					    uint64_t size, uint64_t attributes)
+static inline int __vm_set_memory_attributes(struct kvm_vm *vm, uint64_t gpa,
+					     uint64_t size, uint64_t attributes)
 {
 	struct kvm_memory_attributes attr = {
 		.attributes = attributes,
@@ -391,7 +391,15 @@ static inline void vm_set_memory_attributes(struct kvm_vm *vm, uint64_t gpa,
 	TEST_ASSERT(!attributes || attributes == KVM_MEMORY_ATTRIBUTE_PRIVATE,
 		    "Update me to support multiple attributes!");
 
-	vm_ioctl(vm, KVM_SET_MEMORY_ATTRIBUTES, &attr);
+	return __vm_ioctl(vm, KVM_SET_MEMORY_ATTRIBUTES, &attr);
+}
+
+static inline void vm_set_memory_attributes(struct kvm_vm *vm, uint64_t gpa,
+					    uint64_t size, uint64_t attributes)
+{
+	int ret = __vm_set_memory_attributes(vm, gpa, size, attributes);
+
+	__TEST_ASSERT_VM_VCPU_IOCTL(!ret, "KVM_SET_MEMORY_ATTRIBUTES", ret, vm);
 }

From patchwork Tue Sep 10 23:44:03 2024
X-Patchwork-Submitter: Ackerley Tng
X-Patchwork-Id: 13799505
Date: Tue, 10 Sep 2024 23:44:03 +0000
Subject: [RFC PATCH 32/39] KVM: selftests: Test using guest_memfd memory from userspace
From: Ackerley Tng
Test using guest_memfd from userspace, since guest_memfd now has mmap() support.

Tests:

1. mmap() should now always return a valid address.
2. Test that madvise() doesn't give any issues when pages are not faulted in.
3. Test that pages should not be faultable before association with a memslot, and that faults result in SIGBUS.
4. Test that pages can be faulted if marked faultable, and the flow of setting a memory range as private, which is:
   a. madvise(MADV_DONTNEED) to request kernel to unmap pages
   b. Set memory attributes of VM to private
   Also test that if pages are still mapped, setting memory attributes will fail.
5. Test that madvise(MADV_REMOVE) can be used to remove pages from guest_memfd, forcing zeroing of those pages before the next time the pages are faulted in.

Signed-off-by: Ackerley Tng
---
 .../testing/selftests/kvm/guest_memfd_test.c | 195 +++++++++++++++++-
 1 file changed, 189 insertions(+), 6 deletions(-)

diff --git a/tools/testing/selftests/kvm/guest_memfd_test.c b/tools/testing/selftests/kvm/guest_memfd_test.c
index 3618ce06663e..b6f3c3e6d0dd 100644
--- a/tools/testing/selftests/kvm/guest_memfd_test.c
+++ b/tools/testing/selftests/kvm/guest_memfd_test.c
@@ -6,6 +6,7 @@
  */
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -35,12 +36,192 @@ static void test_file_read_write(int fd)
 		    "pwrite on a guest_mem fd should fail");
 }
 
-static void test_mmap(int fd, size_t page_size)
+static void test_mmap_should_map_pages_into_userspace(int fd, size_t page_size)
 {
 	char *mem;
 
 	mem = mmap(NULL, page_size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
-	TEST_ASSERT_EQ(mem, MAP_FAILED);
+	TEST_ASSERT(mem != MAP_FAILED, "mmap should return valid address");
+
+	TEST_ASSERT_EQ(munmap(mem, page_size), 0);
+}
+
+static void test_madvise_no_error_when_pages_not_faulted(int fd, size_t page_size)
+{
+	char *mem;
+
+	mem = mmap(NULL, page_size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
+	TEST_ASSERT(mem != MAP_FAILED, "mmap should return valid address");
+
+	TEST_ASSERT_EQ(madvise(mem, page_size, MADV_DONTNEED), 0);
+
+	TEST_ASSERT_EQ(munmap(mem, page_size), 0);
+}
+
+static void assert_not_faultable(char *address)
+{
+	pid_t child_pid;
+
+	child_pid = fork();
+	TEST_ASSERT(child_pid != -1, "fork failed");
+
+	if (child_pid == 0) {
+		*address = 'A';
+	} else {
+		int status;
+		waitpid(child_pid, &status, 0);
+
+		TEST_ASSERT(WIFSIGNALED(status),
+			    "Child should have exited with a signal");
+		TEST_ASSERT_EQ(WTERMSIG(status), SIGBUS);
+	}
+}
+
+/*
+ * Pages should not be faultable before association with memslot because pages
+ * (in a KVM_X86_SW_PROTECTED_VM) only default to faultable at memslot
+ * association time.
+ */
+static void test_pages_not_faultable_if_not_associated_with_memslot(int fd,
+								    size_t page_size)
+{
+	char *mem = mmap(NULL, page_size, PROT_READ | PROT_WRITE,
+			 MAP_SHARED, fd, 0);
+	TEST_ASSERT(mem != MAP_FAILED, "mmap should return valid address");
+
+	assert_not_faultable(mem);
+
+	TEST_ASSERT_EQ(munmap(mem, page_size), 0);
+}
+
+static void test_pages_faultable_if_marked_faultable(struct kvm_vm *vm, int fd,
+						     size_t page_size)
+{
+	char *mem;
+	uint64_t gpa = 0;
+	uint64_t guest_memfd_offset = 0;
+
+	/*
+	 * This test uses KVM_X86_SW_PROTECTED_VM, which is required to set
+	 * arch.has_private_mem, to add a memslot with guest_memfd to a VM.
+	 */
+	if (!(kvm_check_cap(KVM_CAP_VM_TYPES) & BIT(KVM_X86_SW_PROTECTED_VM))) {
+		printf("Faultability test skipped since KVM_X86_SW_PROTECTED_VM is not supported.");
+		return;
+	}
+
+	mem = mmap(NULL, page_size, PROT_READ | PROT_WRITE, MAP_SHARED, fd,
+		   guest_memfd_offset);
+	TEST_ASSERT(mem != MAP_FAILED, "mmap should return valid address");
+
+	/*
+	 * Setting up this memslot with a KVM_X86_SW_PROTECTED_VM marks all
+	 * offsets in the file as shared, allowing pages to be faulted in.
+	 */
+	vm_set_user_memory_region2(vm, 0, KVM_MEM_GUEST_MEMFD, gpa, page_size,
+				   mem, fd, guest_memfd_offset);
+
+	*mem = 'A';
+	TEST_ASSERT_EQ(*mem, 'A');
+
+	/* Should fail since the page is still faulted in. */
+	TEST_ASSERT_EQ(__vm_set_memory_attributes(vm, gpa, page_size,
+						  KVM_MEMORY_ATTRIBUTE_PRIVATE),
+		       -1);
+	TEST_ASSERT_EQ(errno, EINVAL);
+
+	/*
+	 * Use madvise() to remove the pages from userspace page tables, then
+	 * test that the page is still faultable, and that page contents remain
+	 * the same.
+	 */
+	madvise(mem, page_size, MADV_DONTNEED);
+	TEST_ASSERT_EQ(*mem, 'A');
+
+	/* Tell kernel to unmap the page from userspace. */
+	madvise(mem, page_size, MADV_DONTNEED);
+
+	/* Now kernel can set this page to private. */
+	vm_mem_set_private(vm, gpa, page_size);
+	assert_not_faultable(mem);
+
+	/*
+	 * Should be able to fault again after setting this back to shared, and
+	 * memory contents should be cleared since pages must be re-prepared for
+	 * SHARED use.
+	 */
+	vm_mem_set_shared(vm, gpa, page_size);
+	TEST_ASSERT_EQ(*mem, 0);
+
+	/* Cleanup */
+	vm_set_user_memory_region2(vm, 0, KVM_MEM_GUEST_MEMFD, gpa, 0, mem, fd,
+				   guest_memfd_offset);
+
+	TEST_ASSERT_EQ(munmap(mem, page_size), 0);
+}
+
+static void test_madvise_remove_releases_pages(struct kvm_vm *vm, int fd,
+					       size_t page_size)
+{
+	char *mem;
+	uint64_t gpa = 0;
+	uint64_t guest_memfd_offset = 0;
+
+	/*
+	 * This test uses KVM_X86_SW_PROTECTED_VM, which is required to set
+	 * arch.has_private_mem, to add a memslot with guest_memfd to a VM.
+	 */
+	if (!(kvm_check_cap(KVM_CAP_VM_TYPES) & BIT(KVM_X86_SW_PROTECTED_VM))) {
+		printf("madvise test skipped since KVM_X86_SW_PROTECTED_VM is not supported.");
+		return;
+	}
+
+	mem = mmap(NULL, page_size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
+	TEST_ASSERT(mem != MAP_FAILED, "mmap should return valid address");
+
+	/*
+	 * Setting up this memslot with a KVM_X86_SW_PROTECTED_VM marks all
+	 * offsets in the file as shared, allowing pages to be faulted in.
+	 */
+	vm_set_user_memory_region2(vm, 0, KVM_MEM_GUEST_MEMFD, gpa, page_size,
+				   mem, fd, guest_memfd_offset);
+
+	*mem = 'A';
+	TEST_ASSERT_EQ(*mem, 'A');
+
+	/*
+	 * MADV_DONTNEED causes pages to be removed from userspace page tables
+	 * but should not release pages, hence page contents are kept.
+	 */
+	TEST_ASSERT_EQ(madvise(mem, page_size, MADV_DONTNEED), 0);
+	TEST_ASSERT_EQ(*mem, 'A');
+
+	/*
+	 * MADV_REMOVE causes pages to be released. Pages are then zeroed when
+	 * prepared for shared use, hence 0 is expected on next fault.
+	 */
+	TEST_ASSERT_EQ(madvise(mem, page_size, MADV_REMOVE), 0);
+	TEST_ASSERT_EQ(*mem, 0);
+
+	TEST_ASSERT_EQ(munmap(mem, page_size), 0);
+
+	/* Cleanup */
+	vm_set_user_memory_region2(vm, 0, KVM_MEM_GUEST_MEMFD, gpa, 0, mem, fd,
+				   guest_memfd_offset);
+}
+
+static void test_using_memory_directly_from_userspace(struct kvm_vm *vm,
+						      int fd, size_t page_size)
+{
+	test_mmap_should_map_pages_into_userspace(fd, page_size);
+
+	test_madvise_no_error_when_pages_not_faulted(fd, page_size);
+
+	test_pages_not_faultable_if_not_associated_with_memslot(fd, page_size);
+
+	test_pages_faultable_if_marked_faultable(vm, fd, page_size);
+
+	test_madvise_remove_releases_pages(vm, fd, page_size);
 }
 
 static void test_file_size(int fd, size_t page_size, size_t total_size)
@@ -180,18 +361,17 @@ static void test_guest_memfd(struct kvm_vm *vm, uint32_t flags, size_t page_size
 	size_t total_size;
 	int fd;
 
-	TEST_REQUIRE(kvm_has_cap(KVM_CAP_GUEST_MEMFD));
-
 	total_size = page_size * 4;
 	fd = vm_create_guest_memfd(vm, total_size, flags);
 
 	test_file_read_write(fd);
-	test_mmap(fd, page_size);
 	test_file_size(fd, page_size, total_size);
 	test_fallocate(fd, page_size, total_size);
 	test_invalid_punch_hole(fd, page_size, total_size);
 
+	test_using_memory_directly_from_userspace(vm, fd, page_size);
+
 	close(fd);
 }
 
@@ -201,7 +381,10 @@ int main(int argc, char *argv[])
 
 	TEST_REQUIRE(kvm_has_cap(KVM_CAP_GUEST_MEMFD));
 
-	vm = vm_create_barebones();
+	if ((kvm_check_cap(KVM_CAP_VM_TYPES) & BIT(KVM_X86_SW_PROTECTED_VM)))
+		vm = vm_create_barebones_type(KVM_X86_SW_PROTECTED_VM);
+	else
+		vm = vm_create_barebones();
 
 	test_create_guest_memfd_invalid(vm);
 	test_create_guest_memfd_multiple(vm);

From patchwork Tue Sep 10 23:44:04 2024
X-Patchwork-Submitter: Ackerley Tng
X-Patchwork-Id: 13799506
Date: Tue, 10 Sep 2024 23:44:04 +0000
Message-ID: <19a16094c3e99d83c53931ff5f3147079d03c810.1726009989.git.ackerleytng@google.com>
Subject: [RFC PATCH 33/39] KVM: selftests: Test guest_memfd memory sharing between guest and host
From: Ackerley Tng
U2FsdGVkX1/DGlm5kMeP2+ysM5RlwOZokkYw4ugyFBacI2r1eKXvsh0SwctdDcB128HtlvwWpLEeM+W4TagrN9QhqYwBm+Jwjuq4H5yWE8f+NAbnYBkGSu4/uwNqUkSYQocAkPlLCwDEj6MnySA1R1VQl0d7DDu8HUKBfr327xNW2hjHgkCtT81ffsV0KiWq3fZAXd/TPkm0fDpyomPWqIiWCq8HV05Bhs8Sh24lICfl8xlItZKY7v/W8esjnHyggfSAw6bKXfQxK8lsRSv+LdrW31wKNI+hj4/S4ylX4py8FUFAchIoX6gsIKYsdWr5ev3JeiD5qkGLMM7v/yhkojGYUgLhlPK+xXnbFVQRn+wI0f9cYN7PvauEixf4ry8sQOSrjDsLQcrDriX1hMl6NxEKAvO3yFpCRTOEL3uIbEX7gPbcDO+raarlZHMQgdq7kkJqjxx7v/XXEJwzJ7VgeroNrhJtTbTAHILe9VSTuETqWkNTOK4xMIUxZc3MM1cR2BkZQ+5n3w2Toavm6u3fK876rfqY0li6QBCSr0Kr2eoEts+jZKNu95v7ttLKItyCSl/b8AxQhi/W+sbYoCpWshekareBRZwyGPF1NOf3T3F3OCjEVAf8wehYBZOHvK9LdICuvfaPUolkLsTO1qoW8O6YFYkX+AW9hIJ5JwCjWUxn0tczB1M96+drd/jPTPgWjTg7rb7EyPpIVwna2CJYuxtn2rnqTnSqNLWcEAdzGDiJzKdUIRjxr4/Hdwl3+SIjQImjlnW0QsSRCVvtmJ+otNRlEkAU02RU12xPe+p69Ekswr/T4rS26nu1+lqQr95pxmSJxDRVD9aPM/ZUn/DHJopiqR1DSqtST7C8QfKxBHdKk7FY4w5/8e05xJ77bY4Dmx4OLOZS2GnaIBrTPdwnnvUo8JGlbvC+56WVQ2yGyVPrsZSlCh7Mq1jU9NxfDEeuuCLz8M91pmMx0SFejbY wOZggPWr EejweWF0bJROpsV/a3aoXzm90xEGds9WzHEOK68z0Bbo8KnJK3dad4g1o/ef8P58nVFwCtxsP3oZH8bnMt3wsh6bFC4vbKOjIMSWbgrSmcDtMdUdXXecEOmQRM8wbTUtAGG5ZJ0jCl9QCbvI2Bd+OgpD9YRMDJGh/mkQ09MVVc+FXY9UnaFOhTJlGcT8hc14RZ1HG5JGwpjgZA26QAV7kSqioteVa1RHFSNVipBNvbZf7Scghpy70xQZL5mRG9lFJDw/2CJVNBkTyyd/t6OltggPcnW9LRGuZIuBjGQIESm5Vm+f409el+ORK7ejRKEjuxIbHmxrDdCuXFaOOh/bV2bHCYusQmkCB3eY+uyP3N5ODvxrfu8Xdplrxhhce7ZfeFYnBTZ29cy6ld4MTgoClGwhm7r07uTM1EoUv5sayHK+KJ/SDhPLPGRzluhUNHFRM3aOVgCdGniHYg96XeI4eLnCZxSoJvb2KrDL7nYuH6Xslw+pPULdeI3y2ky2C76WSn4meQWRV+cAdLUUqU8FEcf6FKHgrX9SWAWeOQJlNwYLX/ssBg8Ys6AIJiOj4BlX5OTjJGtnkPt12KOLZ7rZkyokgAf4HdI2mnAqABiiEJDF+NNj4rHVt/EVsE1tDFyBa8pJ7VPCV0Oo/c0z5YFpZwtJu/Wf1+ze6PWOYsg+DoBS6DcDu72Ko9yr4t1Mg62Qdov+MoyzLjalnvo0= X-Bogosity: Ham, tests=bogofilter, spamicity=0.000002, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: Minimal test for guest_memfd to test that when memory is marked shared in a VM, the host can 
read and write to it via an mmap()ed address, and the guest can also read and write to it. Signed-off-by: Ackerley Tng --- tools/testing/selftests/kvm/Makefile | 1 + .../selftests/kvm/guest_memfd_sharing_test.c | 160 ++++++++++++++++++ 2 files changed, 161 insertions(+) create mode 100644 tools/testing/selftests/kvm/guest_memfd_sharing_test.c diff --git a/tools/testing/selftests/kvm/Makefile b/tools/testing/selftests/kvm/Makefile index b3b7e83f39fc..3c1f35456bfc 100644 --- a/tools/testing/selftests/kvm/Makefile +++ b/tools/testing/selftests/kvm/Makefile @@ -135,6 +135,7 @@ TEST_GEN_PROGS_x86_64 += dirty_log_test TEST_GEN_PROGS_x86_64 += dirty_log_perf_test TEST_GEN_PROGS_x86_64 += guest_memfd_test TEST_GEN_PROGS_x86_64 += guest_memfd_hugetlb_reporting_test +TEST_GEN_PROGS_x86_64 += guest_memfd_sharing_test TEST_GEN_PROGS_x86_64 += guest_print_test TEST_GEN_PROGS_x86_64 += hardware_disable_test TEST_GEN_PROGS_x86_64 += kvm_create_max_vcpus diff --git a/tools/testing/selftests/kvm/guest_memfd_sharing_test.c b/tools/testing/selftests/kvm/guest_memfd_sharing_test.c new file mode 100644 index 000000000000..fef5a73e5053 --- /dev/null +++ b/tools/testing/selftests/kvm/guest_memfd_sharing_test.c @@ -0,0 +1,160 @@ +// SPDX-License-Identifier: GPL-2.0-only +/* + * Minimal test for guest_memfd to test that when memory is marked shared in a + * VM, the host can read and write to it via an mmap()ed address, and the guest + * can also read and write to it. + * + * Copyright (c) 2024, Google LLC. 
+ */
+#include
+#include
+#include
+
+#include "test_util.h"
+#include "kvm_util.h"
+#include "ucall_common.h"
+
+#define GUEST_MEMFD_SHARING_TEST_SLOT 10
+#define GUEST_MEMFD_SHARING_TEST_GPA 0x50000000ULL
+#define GUEST_MEMFD_SHARING_TEST_GVA 0x90000000ULL
+#define GUEST_MEMFD_SHARING_TEST_OFFSET 0
+#define GUEST_MEMFD_SHARING_TEST_GUEST_TO_HOST_VALUE 0x11
+#define GUEST_MEMFD_SHARING_TEST_HOST_TO_GUEST_VALUE 0x22
+
+static void guest_code(int page_size)
+{
+	char *mem;
+	int i;
+
+	mem = (char *)GUEST_MEMFD_SHARING_TEST_GVA;
+
+	for (i = 0; i < page_size; ++i) {
+		GUEST_ASSERT_EQ(mem[i], GUEST_MEMFD_SHARING_TEST_HOST_TO_GUEST_VALUE);
+	}
+
+	memset(mem, GUEST_MEMFD_SHARING_TEST_GUEST_TO_HOST_VALUE, page_size);
+
+	GUEST_DONE();
+}
+
+int run_test(struct kvm_vcpu *vcpu, void *hva, int page_size)
+{
+	struct ucall uc;
+	uint64_t uc_cmd;
+
+	memset(hva, GUEST_MEMFD_SHARING_TEST_HOST_TO_GUEST_VALUE, page_size);
+	vcpu_args_set(vcpu, 1, page_size);
+
+	/* Reset vCPU to guest_code every time run_test is called. */
+	vcpu_arch_set_entry_point(vcpu, guest_code);
+
+	vcpu_run(vcpu);
+	uc_cmd = get_ucall(vcpu, &uc);
+
+	if (uc_cmd == UCALL_ABORT) {
+		REPORT_GUEST_ASSERT(uc);
+		return 1;
+	} else if (uc_cmd == UCALL_DONE) {
+		char *mem;
+		int i;
+
+		mem = hva;
+		for (i = 0; i < page_size; ++i)
+			TEST_ASSERT_EQ(mem[i], GUEST_MEMFD_SHARING_TEST_GUEST_TO_HOST_VALUE);
+
+		return 0;
+	} else {
+		TEST_FAIL("Unknown ucall 0x%lx.", uc.cmd);
+		return 1;
+	}
+}
+
+void *add_memslot(struct kvm_vm *vm, int guest_memfd, size_t page_size,
+		  bool back_shared_memory_with_guest_memfd)
+{
+	void *mem;
+
+	if (back_shared_memory_with_guest_memfd) {
+		mem = mmap(NULL, page_size, PROT_READ | PROT_WRITE, MAP_SHARED,
+			   guest_memfd, GUEST_MEMFD_SHARING_TEST_OFFSET);
+	} else {
+		mem = mmap(NULL, page_size, PROT_READ | PROT_WRITE,
+			   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
+	}
+	TEST_ASSERT(mem != MAP_FAILED, "mmap should return valid address");
+
+	/*
+	 * Setting up this memslot with a KVM_X86_SW_PROTECTED_VM marks all
+	 * offsets in the file as shared.
+	 */
+	vm_set_user_memory_region2(vm, GUEST_MEMFD_SHARING_TEST_SLOT,
+				   KVM_MEM_GUEST_MEMFD,
+				   GUEST_MEMFD_SHARING_TEST_GPA, page_size, mem,
+				   guest_memfd, GUEST_MEMFD_SHARING_TEST_OFFSET);
+
+	return mem;
+}
+
+void test_sharing(bool back_shared_memory_with_guest_memfd)
+{
+	const struct vm_shape shape = {
+		.mode = VM_MODE_DEFAULT,
+		.type = KVM_X86_SW_PROTECTED_VM,
+	};
+	struct kvm_vcpu *vcpu;
+	struct kvm_vm *vm;
+	size_t page_size;
+	int guest_memfd;
+	void *mem;
+
+	TEST_REQUIRE(kvm_check_cap(KVM_CAP_VM_TYPES) & BIT(KVM_X86_SW_PROTECTED_VM));
+
+	vm = vm_create_shape_with_one_vcpu(shape, &vcpu, &guest_code);
+
+	page_size = getpagesize();
+
+	guest_memfd = vm_create_guest_memfd(vm, page_size, 0);
+
+	mem = add_memslot(vm, guest_memfd, page_size, back_shared_memory_with_guest_memfd);
+
+	virt_map(vm, GUEST_MEMFD_SHARING_TEST_GVA, GUEST_MEMFD_SHARING_TEST_GPA, 1);
+
+	run_test(vcpu, mem, page_size);
+
+	/* Toggle private flag of memory attributes and run the test again. */
+	if (back_shared_memory_with_guest_memfd) {
+		/*
+		 * Use MADV_REMOVE to release the backing guest_memfd memory
+		 * back to the system before it is used again. Test that this is
+		 * only necessary when guest_memfd is used to back shared
+		 * memory.
+		 */
+		madvise(mem, page_size, MADV_REMOVE);
+	}
+	vm_mem_set_private(vm, GUEST_MEMFD_SHARING_TEST_GPA, page_size);
+	vm_mem_set_shared(vm, GUEST_MEMFD_SHARING_TEST_GPA, page_size);
+
+	run_test(vcpu, mem, page_size);
+
+	kvm_vm_free(vm);
+	munmap(mem, page_size);
+	close(guest_memfd);
+}
+
+int main(int argc, char *argv[])
+{
+	/*
+	 * Confidence check that when guest_memfd is associated with a memslot
+	 * but only anonymous memory is used to back shared memory, sharing
+	 * memory between guest and host works as expected.
+	 */
+	test_sharing(false);
+
+	/*
+	 * Memory sharing should work as expected when shared memory is backed
+	 * with guest_memfd.
+	 */
+	test_sharing(true);
+
+	return 0;
+}

From patchwork Tue Sep 10 23:44:05 2024
X-Patchwork-Submitter: Ackerley Tng
X-Patchwork-Id: 13799507
Date: Tue, 10 Sep 2024 23:44:05 +0000
Message-ID: <0ea30ee1128f7e6d033783034b6bc48dfbabb5db.1726009989.git.ackerleytng@google.com>
Subject: [RFC PATCH 34/39] KVM: selftests: Add notes in private_mem_kvm_exits_test for mmap-able guest_memfd
From: Ackerley Tng
To: tabba@google.com, quic_eberman@quicinc.com, roypat@amazon.co.uk, jgg@nvidia.com, peterx@redhat.com, david@redhat.com, rientjes@google.com, fvdl@google.com, jthoughton@google.com, seanjc@google.com, pbonzini@redhat.com,
zhiquan1.li@intel.com, fan.du@intel.com, jun.miao@intel.com, isaku.yamahata@intel.com, muchun.song@linux.dev, mike.kravetz@oracle.com
Cc: erdemaktas@google.com, vannapurve@google.com, ackerleytng@google.com, qperret@google.com, jhubbard@nvidia.com, willy@infradead.org, shuah@kernel.org, brauner@kernel.org, bfoster@redhat.com, kent.overstreet@linux.dev, pvorel@suse.cz, rppt@kernel.org, richard.weiyang@gmail.com, anup@brainfault.org, haibo1.xu@intel.com, ajones@ventanamicro.com, vkuznets@redhat.com, maciej.wieczor-retman@intel.com, pgonda@google.com, oliver.upton@linux.dev, linux-kernel@vger.kernel.org, linux-mm@kvack.org, kvm@vger.kernel.org, linux-kselftest@vger.kernel.org, linux-fsdevel@kvack.org
Note in comments why madvise() is not needed before setting memory to
private.

Signed-off-by: Ackerley Tng
---
 .../selftests/kvm/x86_64/private_mem_kvm_exits_test.c | 11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/tools/testing/selftests/kvm/x86_64/private_mem_kvm_exits_test.c b/tools/testing/selftests/kvm/x86_64/private_mem_kvm_exits_test.c
index 13e72fcec8dd..f8bcfc897f6a 100644
--- a/tools/testing/selftests/kvm/x86_64/private_mem_kvm_exits_test.c
+++ b/tools/testing/selftests/kvm/x86_64/private_mem_kvm_exits_test.c
@@ -62,7 +62,11 @@ static void test_private_access_memslot_deleted(void)
 	virt_map(vm, EXITS_TEST_GVA, EXITS_TEST_GPA, EXITS_TEST_NPAGES);
 
-	/* Request to access page privately */
+	/*
+	 * Request to access page privately. madvise(MADV_DONTNEED) not required
+	 * since memory was never mmap()-ed from guest_memfd. Anonymous memory
+	 * was used instead for this memslot's userspace_addr.
+	 */
 	vm_mem_set_private(vm, EXITS_TEST_GPA, EXITS_TEST_SIZE);
 
 	pthread_create(&vm_thread, NULL,
@@ -98,7 +102,10 @@ static void test_private_access_memslot_not_private(void)
 	virt_map(vm, EXITS_TEST_GVA, EXITS_TEST_GPA, EXITS_TEST_NPAGES);
 
-	/* Request to access page privately */
+	/*
+	 * Request to access page privately. madvise(MADV_DONTNEED) not required
+	 * since the affected memslot doesn't use guest_memfd.
+	 */
 	vm_mem_set_private(vm, EXITS_TEST_GPA, EXITS_TEST_SIZE);
 
 	exit_reason = run_vcpu_get_exit_reason(vcpu);

From patchwork Tue Sep 10 23:44:06 2024
X-Patchwork-Submitter: Ackerley Tng
X-Patchwork-Id: 13799508
Date: Tue, 10 Sep 2024 23:44:06 +0000
Message-ID: <09892ae14d06596aee8b766b5908c8a7fdda85b4.1726009989.git.ackerleytng@google.com>
Subject: [RFC PATCH 35/39] KVM: selftests: Test that pinned pages block KVM from setting memory attributes to PRIVATE
From: Ackerley Tng
To: tabba@google.com, quic_eberman@quicinc.com, roypat@amazon.co.uk, jgg@nvidia.com, peterx@redhat.com, david@redhat.com, rientjes@google.com, fvdl@google.com, jthoughton@google.com, seanjc@google.com, pbonzini@redhat.com, zhiquan1.li@intel.com, fan.du@intel.com, jun.miao@intel.com, isaku.yamahata@intel.com, muchun.song@linux.dev, mike.kravetz@oracle.com
Cc: erdemaktas@google.com, vannapurve@google.com, ackerleytng@google.com, qperret@google.com, jhubbard@nvidia.com, willy@infradead.org, shuah@kernel.org, brauner@kernel.org, bfoster@redhat.com, kent.overstreet@linux.dev, pvorel@suse.cz, rppt@kernel.org, richard.weiyang@gmail.com, anup@brainfault.org, haibo1.xu@intel.com, ajones@ventanamicro.com, vkuznets@redhat.com, maciej.wieczor-retman@intel.com, pgonda@google.com, oliver.upton@linux.dev, linux-kernel@vger.kernel.org, linux-mm@kvack.org, kvm@vger.kernel.org, linux-kselftest@vger.kernel.org, linux-fsdevel@kvack.org
CONFIG_GUP_TEST provides userspace with an ioctl to invoke
pin_user_pages(), and this test uses the ioctl to pin pages, to check that memory
attributes cannot be set to private if shared pages are pinned.

Signed-off-by: Ackerley Tng
---
 tools/testing/selftests/kvm/Makefile          |   1 +
 .../selftests/kvm/guest_memfd_pin_test.c      | 104 ++++++++++++++++++
 2 files changed, 105 insertions(+)
 create mode 100644 tools/testing/selftests/kvm/guest_memfd_pin_test.c

diff --git a/tools/testing/selftests/kvm/Makefile b/tools/testing/selftests/kvm/Makefile
index 3c1f35456bfc..c5a1c8c7125a 100644
--- a/tools/testing/selftests/kvm/Makefile
+++ b/tools/testing/selftests/kvm/Makefile
@@ -136,6 +136,7 @@ TEST_GEN_PROGS_x86_64 += dirty_log_perf_test
 TEST_GEN_PROGS_x86_64 += guest_memfd_test
 TEST_GEN_PROGS_x86_64 += guest_memfd_hugetlb_reporting_test
 TEST_GEN_PROGS_x86_64 += guest_memfd_sharing_test
+TEST_GEN_PROGS_x86_64 += guest_memfd_pin_test
 TEST_GEN_PROGS_x86_64 += guest_print_test
 TEST_GEN_PROGS_x86_64 += hardware_disable_test
 TEST_GEN_PROGS_x86_64 += kvm_create_max_vcpus

diff --git a/tools/testing/selftests/kvm/guest_memfd_pin_test.c b/tools/testing/selftests/kvm/guest_memfd_pin_test.c
new file mode 100644
index 000000000000..b45fb8024970
--- /dev/null
+++ b/tools/testing/selftests/kvm/guest_memfd_pin_test.c
@@ -0,0 +1,104 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Test that pinned pages block KVM from setting memory attributes to PRIVATE.
+ *
+ * Copyright (c) 2024, Google LLC.
+ */
+#include
+#include
+#include
+
+#include "test_util.h"
+#include "kvm_util.h"
+#include "../../../../mm/gup_test.h"
+
+#define GUEST_MEMFD_PIN_TEST_SLOT 10
+#define GUEST_MEMFD_PIN_TEST_GPA 0x50000000ULL
+#define GUEST_MEMFD_PIN_TEST_OFFSET 0
+
+static int gup_test_fd;
+
+void pin_pages(void *vaddr, uint64_t size)
+{
+	const struct pin_longterm_test args = {
+		.addr = (uint64_t)vaddr,
+		.size = size,
+		.flags = PIN_LONGTERM_TEST_FLAG_USE_WRITE,
+	};
+
+	TEST_ASSERT_EQ(ioctl(gup_test_fd, PIN_LONGTERM_TEST_START, &args), 0);
+}
+
+void unpin_pages(void)
+{
+	TEST_ASSERT_EQ(ioctl(gup_test_fd, PIN_LONGTERM_TEST_STOP), 0);
+}
+
+void run_test(void)
+{
+	struct kvm_vm *vm;
+	size_t page_size;
+	void *mem;
+	int fd;
+
+	vm = vm_create_barebones_type(KVM_X86_SW_PROTECTED_VM);
+
+	page_size = getpagesize();
+	fd = vm_create_guest_memfd(vm, page_size, 0);
+
+	mem = mmap(NULL, page_size, PROT_READ | PROT_WRITE, MAP_SHARED, fd,
+		   GUEST_MEMFD_PIN_TEST_OFFSET);
+	TEST_ASSERT(mem != MAP_FAILED, "mmap should return valid address");
+
+	/*
+	 * Setting up this memslot with a KVM_X86_SW_PROTECTED_VM marks all
+	 * offsets in the file as shared.
+	 */
+	vm_set_user_memory_region2(vm, GUEST_MEMFD_PIN_TEST_SLOT,
+				   KVM_MEM_GUEST_MEMFD,
+				   GUEST_MEMFD_PIN_TEST_GPA, page_size, mem, fd,
+				   GUEST_MEMFD_PIN_TEST_OFFSET);
+
+	/* Before pinning pages, toggling memory attributes should be fine. */
+	vm_mem_set_private(vm, GUEST_MEMFD_PIN_TEST_GPA, page_size);
+	vm_mem_set_shared(vm, GUEST_MEMFD_PIN_TEST_GPA, page_size);
+
+	pin_pages(mem, page_size);
+
+	/*
+	 * Pinning also faults pages in, so remove these pages from userspace
+	 * page tables to properly test that pinning blocks setting memory
+	 * attributes to private.
+	 */
+	TEST_ASSERT_EQ(madvise(mem, page_size, MADV_DONTNEED), 0);
+
+	/* Should fail since the page is still faulted in.
+	 */
+	TEST_ASSERT_EQ(__vm_set_memory_attributes(vm, GUEST_MEMFD_PIN_TEST_GPA,
+						  page_size,
+						  KVM_MEMORY_ATTRIBUTE_PRIVATE),
+		       -1);
+	TEST_ASSERT_EQ(errno, EINVAL);
+
+	unpin_pages();
+
+	/* With the pages unpinned, kvm can set this page to private. */
+	vm_mem_set_private(vm, GUEST_MEMFD_PIN_TEST_GPA, page_size);
+
+	kvm_vm_free(vm);
+	close(fd);
+}
+
+int main(int argc, char *argv[])
+{
+	gup_test_fd = open("/sys/kernel/debug/gup_test", O_RDWR);
+	/*
+	 * This test depends on CONFIG_GUP_TEST to provide a kernel module that
+	 * exposes pin_user_pages() to userspace.
+	 */
+	TEST_REQUIRE(gup_test_fd != -1);
+	TEST_REQUIRE(kvm_check_cap(KVM_CAP_VM_TYPES) & BIT(KVM_X86_SW_PROTECTED_VM));
+
+	run_test();
+
+	return 0;
+}

From patchwork Tue Sep 10 23:44:07 2024
X-Patchwork-Submitter: Ackerley Tng
X-Patchwork-Id: 13799509
Date: Tue, 10 Sep 2024 23:44:07 +0000
Message-ID: <2daf579fa5d2ba223fa3a907c1048d3ea4458a57.1726009989.git.ackerleytng@google.com>
Subject: [RFC PATCH 36/39] KVM: selftests: Refactor vm_mem_add to be more flexible
From: Ackerley Tng
To: tabba@google.com, quic_eberman@quicinc.com, roypat@amazon.co.uk, jgg@nvidia.com, peterx@redhat.com, david@redhat.com, rientjes@google.com, fvdl@google.com, jthoughton@google.com, seanjc@google.com, pbonzini@redhat.com, zhiquan1.li@intel.com, fan.du@intel.com, jun.miao@intel.com, isaku.yamahata@intel.com, muchun.song@linux.dev, mike.kravetz@oracle.com
Cc: erdemaktas@google.com, vannapurve@google.com, ackerleytng@google.com, qperret@google.com, jhubbard@nvidia.com, willy@infradead.org, shuah@kernel.org, brauner@kernel.org, bfoster@redhat.com, kent.overstreet@linux.dev, pvorel@suse.cz, rppt@kernel.org, richard.weiyang@gmail.com, anup@brainfault.org, haibo1.xu@intel.com, ajones@ventanamicro.com, vkuznets@redhat.com, maciej.wieczor-retman@intel.com, pgonda@google.com, oliver.upton@linux.dev, linux-kernel@vger.kernel.org, linux-mm@kvack.org, kvm@vger.kernel.org, linux-kselftest@vger.kernel.org, linux-fsdevel@kvack.org
enum vm_mem_backing_src_type is encoding too many different possibilities
on different axes
of (1) whether to mmap() from an fd, (2) granularity of mapping for THP, and (3) size of the hugetlb mapping, and has yet to be extended to support guest_memfd. Once guest_memfd supports mmap() and we also want to test mmap()ing from guest_memfd, the number of combinations makes enumeration in vm_mem_backing_src_type unwieldy.

This refactor separates vm_mem_backing_src_type from userspace_mem_region. For now, vm_mem_backing_src_type remains a way for tests to specify, on the command line, the combination of backing memory to test.

vm_mem_add() is now the last place where vm_mem_backing_src_type is interpreted, to

1. Check validity of the requested guest_paddr
2. Align mmap_size appropriately based on the mapping's page_size and the architecture
3. Install memory appropriately according to the mapping's page size

mmap()ing an alias appears to be specific to userfaultfd tests and could be refactored out of struct userspace_mem_region and localized in the userfaultfd tests in the future.

This paves the way for replacing vm_mem_backing_src_type with multiple command-line flags that specify backing memory more flexibly. Future tests are expected to use vm_mem_region_alloc() to allocate a struct userspace_mem_region, then use more fundamental functions like vm_mem_region_mmap(), vm_mem_region_madvise_thp(), kvm_memfd_create(), vm_create_guest_memfd(), and other functions in vm_mem_add() to flexibly build up a struct userspace_mem_region before finally adding the region to the vm with vm_mem_region_add().
Signed-off-by: Ackerley Tng --- .../testing/selftests/kvm/include/kvm_util.h | 29 +- .../testing/selftests/kvm/include/test_util.h | 2 + tools/testing/selftests/kvm/lib/kvm_util.c | 413 +++++++++++------- tools/testing/selftests/kvm/lib/test_util.c | 25 ++ 4 files changed, 319 insertions(+), 150 deletions(-) diff --git a/tools/testing/selftests/kvm/include/kvm_util.h b/tools/testing/selftests/kvm/include/kvm_util.h index d336cd0c8f19..1576e7e4aefe 100644 --- a/tools/testing/selftests/kvm/include/kvm_util.h +++ b/tools/testing/selftests/kvm/include/kvm_util.h @@ -35,11 +35,26 @@ struct userspace_mem_region { struct sparsebit *protected_phy_pages; int fd; off_t offset; - enum vm_mem_backing_src_type backing_src_type; + /* + * host_mem is mmap_start aligned upwards to an address suitable for the + * architecture. In most cases, host_mem and mmap_start are the same, + * except for s390x, where the host address must be aligned to 1M (due + * to PGSTEs). + */ +#ifdef __s390x__ +#define S390X_HOST_ADDRESS_ALIGNMENT 0x100000 +#endif void *host_mem; + /* host_alias is to mmap_alias as host_mem is to mmap_start */ void *host_alias; void *mmap_start; void *mmap_alias; + /* + * mmap_size is possibly larger than region.memory_size because in some + * cases, host_mem has to be adjusted upwards (see comment for host_mem + * above). In those cases, mmap_size has to be adjusted upwards so that + * enough memory is available in this memslot. 
+ */ size_t mmap_size; struct rb_node gpa_node; struct rb_node hva_node; @@ -559,6 +574,18 @@ int __vm_set_user_memory_region2(struct kvm_vm *vm, uint32_t slot, uint32_t flag uint64_t gpa, uint64_t size, void *hva, uint32_t guest_memfd, uint64_t guest_memfd_offset); +struct userspace_mem_region *vm_mem_region_alloc(struct kvm_vm *vm); +void *vm_mem_region_mmap(struct userspace_mem_region *region, size_t length, + int flags, int fd, off_t offset); +void vm_mem_region_install_memory(struct userspace_mem_region *region, + size_t memslot_size, size_t alignment); +void vm_mem_region_madvise_thp(struct userspace_mem_region *region, int advice); +int vm_mem_region_install_guest_memfd(struct userspace_mem_region *region, + int guest_memfd); +void *vm_mem_region_mmap_alias(struct userspace_mem_region *region, int flags, + size_t alignment); +void vm_mem_region_add(struct kvm_vm *vm, struct userspace_mem_region *region); + void vm_userspace_mem_region_add(struct kvm_vm *vm, enum vm_mem_backing_src_type src_type, uint64_t guest_paddr, uint32_t slot, uint64_t npages, diff --git a/tools/testing/selftests/kvm/include/test_util.h b/tools/testing/selftests/kvm/include/test_util.h index 011e757d4e2c..983adeb54c0e 100644 --- a/tools/testing/selftests/kvm/include/test_util.h +++ b/tools/testing/selftests/kvm/include/test_util.h @@ -159,6 +159,8 @@ size_t get_trans_hugepagesz(void); size_t get_def_hugetlb_pagesz(void); const struct vm_mem_backing_src_alias *vm_mem_backing_src_alias(uint32_t i); size_t get_backing_src_pagesz(uint32_t i); +int backing_src_should_madvise(uint32_t i); +int get_backing_src_madvise_advice(uint32_t i); bool is_backing_src_hugetlb(uint32_t i); void backing_src_help(const char *flag); enum vm_mem_backing_src_type parse_backing_src_type(const char *type_name); diff --git a/tools/testing/selftests/kvm/lib/kvm_util.c b/tools/testing/selftests/kvm/lib/kvm_util.c index 56b170b725b3..9bdd03a5da90 100644 --- a/tools/testing/selftests/kvm/lib/kvm_util.c +++ 
b/tools/testing/selftests/kvm/lib/kvm_util.c @@ -774,15 +774,12 @@ void kvm_vm_free(struct kvm_vm *vmp) free(vmp); } -int kvm_memfd_alloc(size_t size, bool hugepages) +int kvm_create_memfd(size_t size, unsigned int flags) { - int memfd_flags = MFD_CLOEXEC; - int fd, r; - - if (hugepages) - memfd_flags |= MFD_HUGETLB; + int fd; + int r; - fd = memfd_create("kvm_selftest", memfd_flags); + fd = memfd_create("kvm_selftest", flags); TEST_ASSERT(fd != -1, __KVM_SYSCALL_ERROR("memfd_create()", fd)); r = ftruncate(fd, size); @@ -794,6 +791,16 @@ int kvm_memfd_alloc(size_t size, bool hugepages) return fd; } +int kvm_memfd_alloc(size_t size, bool hugepages) +{ + int memfd_flags = MFD_CLOEXEC; + + if (hugepages) + memfd_flags |= MFD_HUGETLB; + + return kvm_create_memfd(size, memfd_flags); +} + /* * Memory Compare, host virtual to guest virtual * @@ -973,185 +980,293 @@ void vm_set_user_memory_region2(struct kvm_vm *vm, uint32_t slot, uint32_t flags errno, strerror(errno)); } +/** + * Allocates and returns a struct userspace_mem_region. + */ +struct userspace_mem_region *vm_mem_region_alloc(struct kvm_vm *vm) +{ + struct userspace_mem_region *region; -/* FIXME: This thing needs to be ripped apart and rewritten. */ -void vm_mem_add(struct kvm_vm *vm, enum vm_mem_backing_src_type src_type, - uint64_t guest_paddr, uint32_t slot, uint64_t npages, - uint32_t flags, int guest_memfd, uint64_t guest_memfd_offset) + /* Allocate and initialize new mem region structure. 
*/ + region = calloc(1, sizeof(*region)); + TEST_ASSERT(region != NULL, "Insufficient Memory"); + + region->unused_phy_pages = sparsebit_alloc(); + if (vm_arch_has_protected_memory(vm)) + region->protected_phy_pages = sparsebit_alloc(); + + region->fd = -1; + region->region.guest_memfd = -1; + + return region; +} + +static size_t compute_page_size(int mmap_flags, int madvise_advice) +{ + if (mmap_flags & MAP_HUGETLB) { + int size_flags = (mmap_flags >> MAP_HUGE_SHIFT) & MAP_HUGE_MASK; + if (!size_flags) + return get_def_hugetlb_pagesz(); + + return 1ULL << size_flags; + } + + return madvise_advice == MADV_HUGEPAGE ? get_trans_hugepagesz() : getpagesize(); +} + +/** + * Calls mmap() with @length, @flags, @fd, @offset for @region. + * + * Think of this as the struct userspace_mem_region wrapper for the mmap() + * syscall. + */ +void *vm_mem_region_mmap(struct userspace_mem_region *region, size_t length, + int flags, int fd, off_t offset) +{ + void *mem; + + if (flags & MAP_SHARED) { + TEST_ASSERT(fd != -1, + "Ensure that fd is provided for shared mappings."); + TEST_ASSERT( + region->fd == fd || region->region.guest_memfd == fd, + "Ensure that fd is opened before mmap, and is either " + "set up in region->fd or region->region.guest_memfd."); + } + + mem = mmap(NULL, length, PROT_READ | PROT_WRITE, flags, fd, offset); + TEST_ASSERT(mem != MAP_FAILED, "Couldn't mmap memory"); + + region->mmap_start = mem; + region->mmap_size = length; + region->offset = offset; + + return mem; +} + +/** + * Installs mmap()ed memory in @region->mmap_start as @region->host_mem, + * checking constraints.
+ */ +void vm_mem_region_install_memory(struct userspace_mem_region *region, + size_t memslot_size, size_t alignment) +{ + TEST_ASSERT(region->mmap_size >= memslot_size, + "mmap()ed memory insufficient for memslot"); + + region->host_mem = align_ptr_up(region->mmap_start, alignment); + region->region.userspace_addr = (uint64_t)region->host_mem; + region->region.memory_size = memslot_size; +} + + +/** + * Calls madvise with @advice for @region. + * + * Think of this as the struct userspace_mem_region wrapper for the madvise() + * syscall. + */ +void vm_mem_region_madvise_thp(struct userspace_mem_region *region, int advice) { int ret; - struct userspace_mem_region *region; - size_t backing_src_pagesz = get_backing_src_pagesz(src_type); - size_t mem_size = npages * vm->page_size; - size_t alignment; - TEST_REQUIRE_SET_USER_MEMORY_REGION2(); + TEST_ASSERT( + region->host_mem && region->mmap_size, + "vm_mem_region_madvise_thp() must be called after vm_mem_region_mmap()"); - TEST_ASSERT(vm_adjust_num_guest_pages(vm->mode, npages) == npages, - "Number of guest pages is not compatible with the host. " - "Try npages=%d", vm_adjust_num_guest_pages(vm->mode, npages)); - - TEST_ASSERT((guest_paddr % vm->page_size) == 0, "Guest physical " - "address not on a page boundary.\n" - " guest_paddr: 0x%lx vm->page_size: 0x%x", - guest_paddr, vm->page_size); - TEST_ASSERT((((guest_paddr >> vm->page_shift) + npages) - 1) - <= vm->max_gfn, "Physical range beyond maximum " - "supported physical address,\n" - " guest_paddr: 0x%lx npages: 0x%lx\n" - " vm->max_gfn: 0x%lx vm->page_size: 0x%x", - guest_paddr, npages, vm->max_gfn, vm->page_size); + ret = madvise(region->host_mem, region->mmap_size, advice); + TEST_ASSERT(ret == 0, "madvise failed, addr: %p length: 0x%lx", + region->host_mem, region->mmap_size); +} + +/** + * Installs guest_memfd by setting it up in @region. + * + * Returns the guest_memfd that was installed in the @region. 
+ */ +int vm_mem_region_install_guest_memfd(struct userspace_mem_region *region, + int guest_memfd) +{ + /* + * Install a unique fd for each memslot so that the fd can be closed + * when the region is deleted without needing to track if the fd is + * owned by the framework or by the caller. + */ + guest_memfd = dup(guest_memfd); + TEST_ASSERT(guest_memfd >= 0, __KVM_SYSCALL_ERROR("dup()", guest_memfd)); + region->region.guest_memfd = guest_memfd; + + return guest_memfd; +} + +/** + * Calls mmap() to create an alias for mmap()ed memory at region->host_mem, + * exactly the same size that was mmap()ed. + * + * This is used mainly for userfaultfd tests. + */ +void *vm_mem_region_mmap_alias(struct userspace_mem_region *region, int flags, + size_t alignment) +{ + region->mmap_alias = mmap(NULL, region->mmap_size, + PROT_READ | PROT_WRITE, flags, region->fd, 0); + TEST_ASSERT(region->mmap_alias != MAP_FAILED, + __KVM_SYSCALL_ERROR("mmap()", (int)(unsigned long)MAP_FAILED)); + + region->host_alias = align_ptr_up(region->mmap_alias, alignment); + + return region->host_alias; +} + +static void vm_mem_region_assert_no_duplicate(struct kvm_vm *vm, uint32_t slot, + uint64_t gpa, size_t size) +{ + struct userspace_mem_region *region; /* * Confirm a mem region with an overlapping address doesn't * already exist.
*/ - region = (struct userspace_mem_region *) userspace_mem_region_find( - vm, guest_paddr, (guest_paddr + npages * vm->page_size) - 1); - if (region != NULL) - TEST_FAIL("overlapping userspace_mem_region already " - "exists\n" - " requested guest_paddr: 0x%lx npages: 0x%lx " - "page_size: 0x%x\n" - " existing guest_paddr: 0x%lx size: 0x%lx", - guest_paddr, npages, vm->page_size, - (uint64_t) region->region.guest_phys_addr, - (uint64_t) region->region.memory_size); + region = userspace_mem_region_find(vm, gpa, gpa + size - 1); + if (region != NULL) { + TEST_FAIL("overlapping userspace_mem_region already exists\n" + " requested gpa: 0x%lx size: 0x%lx" + " existing gpa: 0x%lx size: 0x%lx", + gpa, size, + (uint64_t) region->region.guest_phys_addr, + (uint64_t) region->region.memory_size); + } /* Confirm no region with the requested slot already exists. */ - hash_for_each_possible(vm->regions.slot_hash, region, slot_node, - slot) { + hash_for_each_possible(vm->regions.slot_hash, region, slot_node, slot) { if (region->region.slot != slot) continue; - TEST_FAIL("A mem region with the requested slot " - "already exists.\n" - " requested slot: %u paddr: 0x%lx npages: 0x%lx\n" - " existing slot: %u paddr: 0x%lx size: 0x%lx", - slot, guest_paddr, npages, - region->region.slot, - (uint64_t) region->region.guest_phys_addr, - (uint64_t) region->region.memory_size); + TEST_FAIL("A mem region with the requested slot already exists.\n" + " requested slot: %u paddr: 0x%lx size: 0x%lx\n" + " existing slot: %u paddr: 0x%lx size: 0x%lx", + slot, gpa, size, + region->region.slot, + (uint64_t) region->region.guest_phys_addr, + (uint64_t) region->region.memory_size); } +} - /* Allocate and initialize new mem region structure. */ - region = calloc(1, sizeof(*region)); - TEST_ASSERT(region != NULL, "Insufficient Memory"); - region->mmap_size = mem_size; +/** + * Add a @region to @vm. All necessary fields in region->region should already + * be populated. 
+ * + * Think of this as the struct userspace_mem_region wrapper for the + * KVM_SET_USER_MEMORY_REGION2 ioctl. + */ +void vm_mem_region_add(struct kvm_vm *vm, struct userspace_mem_region *region) +{ + uint64_t npages; + uint64_t gpa; + int ret; -#ifdef __s390x__ - /* On s390x, the host address must be aligned to 1M (due to PGSTEs) */ - alignment = 0x100000; -#else - alignment = 1; -#endif + TEST_REQUIRE_SET_USER_MEMORY_REGION2(); - /* - * When using THP mmap is not guaranteed to returned a hugepage aligned - * address so we have to pad the mmap. Padding is not needed for HugeTLB - * because mmap will always return an address aligned to the HugeTLB - * page size. - */ - if (src_type == VM_MEM_SRC_ANONYMOUS_THP) - alignment = max(backing_src_pagesz, alignment); + npages = region->region.memory_size / vm->page_size; + TEST_ASSERT(vm_adjust_num_guest_pages(vm->mode, npages) == npages, + "Number of guest pages is not compatible with the host. " + "Try npages=%d", vm_adjust_num_guest_pages(vm->mode, npages)); + + gpa = region->region.guest_phys_addr; + TEST_ASSERT((gpa % vm->page_size) == 0, + "Guest physical address not on a page boundary.\n" + " gpa: 0x%lx vm->page_size: 0x%x", + gpa, vm->page_size); + TEST_ASSERT((((gpa >> vm->page_shift) + npages) - 1) <= vm->max_gfn, + "Physical range beyond maximum supported physical address,\n" + " gpa: 0x%lx npages: 0x%lx\n" + " vm->max_gfn: 0x%lx vm->page_size: 0x%x", + gpa, npages, vm->max_gfn, vm->page_size); + + vm_mem_region_assert_no_duplicate(vm, region->region.slot, gpa, + region->mmap_size); - TEST_ASSERT_EQ(guest_paddr, align_up(guest_paddr, backing_src_pagesz)); + ret = __vm_ioctl(vm, KVM_SET_USER_MEMORY_REGION2, ®ion->region); + TEST_ASSERT(ret == 0, "KVM_SET_USER_MEMORY_REGION2 IOCTL failed,\n" + " rc: %i errno: %i\n" + " slot: %u flags: 0x%x\n" + " guest_phys_addr: 0x%lx size: 0x%llx guest_memfd: %d", + ret, errno, region->region.slot, region->region.flags, + gpa, region->region.memory_size, + 
region->region.guest_memfd); - /* Add enough memory to align up if necessary */ - if (alignment > 1) - region->mmap_size += alignment; + sparsebit_set_num(region->unused_phy_pages, gpa >> vm->page_shift, npages); - region->fd = -1; - if (backing_src_is_shared(src_type)) - region->fd = kvm_memfd_alloc(region->mmap_size, - src_type == VM_MEM_SRC_SHARED_HUGETLB); - - region->mmap_start = mmap(NULL, region->mmap_size, - PROT_READ | PROT_WRITE, - vm_mem_backing_src_alias(src_type)->flag, - region->fd, 0); - TEST_ASSERT(region->mmap_start != MAP_FAILED, - __KVM_SYSCALL_ERROR("mmap()", (int)(unsigned long)MAP_FAILED)); + /* Add to quick lookup data structures */ + vm_userspace_mem_region_gpa_insert(&vm->regions.gpa_tree, region); + vm_userspace_mem_region_hva_insert(&vm->regions.hva_tree, region); + hash_add(vm->regions.slot_hash, ®ion->slot_node, region->region.slot); +} - TEST_ASSERT(!is_backing_src_hugetlb(src_type) || - region->mmap_start == align_ptr_up(region->mmap_start, backing_src_pagesz), - "mmap_start %p is not aligned to HugeTLB page size 0x%lx", - region->mmap_start, backing_src_pagesz); +void vm_mem_add(struct kvm_vm *vm, enum vm_mem_backing_src_type src_type, + uint64_t guest_paddr, uint32_t slot, uint64_t npages, + uint32_t flags, int guest_memfd, uint64_t guest_memfd_offset) +{ + struct userspace_mem_region *region; + size_t mapping_page_size; + size_t memslot_size; + int madvise_advice; + size_t mmap_size; + size_t alignment; + int mmap_flags; + int memfd; - /* Align host address */ - region->host_mem = align_ptr_up(region->mmap_start, alignment); + memslot_size = npages * vm->page_size; + + mmap_flags = vm_mem_backing_src_alias(src_type)->flag; + madvise_advice = get_backing_src_madvise_advice(src_type); + mapping_page_size = compute_page_size(mmap_flags, madvise_advice); + + TEST_ASSERT_EQ(guest_paddr, align_up(guest_paddr, mapping_page_size)); + + alignment = mapping_page_size; +#ifdef __s390x__ + alignment = max(alignment, 
S390X_HOST_ADDRESS_ALIGNMENT); +#endif - /* As needed perform madvise */ - if ((src_type == VM_MEM_SRC_ANONYMOUS || - src_type == VM_MEM_SRC_ANONYMOUS_THP) && thp_configured()) { - ret = madvise(region->host_mem, mem_size, - src_type == VM_MEM_SRC_ANONYMOUS ? MADV_NOHUGEPAGE : MADV_HUGEPAGE); - TEST_ASSERT(ret == 0, "madvise failed, addr: %p length: 0x%lx src_type: %s", - region->host_mem, mem_size, - vm_mem_backing_src_alias(src_type)->name); + region = vm_mem_region_alloc(vm); + + memfd = -1; + if (backing_src_is_shared(src_type)) { + unsigned int memfd_flags = MFD_CLOEXEC; + if (src_type == VM_MEM_SRC_SHARED_HUGETLB) + memfd_flags |= MFD_HUGETLB; + + memfd = kvm_create_memfd(memslot_size, memfd_flags); } + region->fd = memfd; + + mmap_size = align_up(memslot_size, alignment); + vm_mem_region_mmap(region, mmap_size, mmap_flags, memfd, 0); + vm_mem_region_install_memory(region, memslot_size, alignment); - region->backing_src_type = src_type; + if (backing_src_should_madvise(src_type)) + vm_mem_region_madvise_thp(region, madvise_advice); + + if (backing_src_is_shared(src_type)) + vm_mem_region_mmap_alias(region, mmap_flags, alignment); if (flags & KVM_MEM_GUEST_MEMFD) { if (guest_memfd < 0) { - uint32_t guest_memfd_flags = 0; - TEST_ASSERT(!guest_memfd_offset, - "Offset must be zero when creating new guest_memfd"); - guest_memfd = vm_create_guest_memfd(vm, mem_size, guest_memfd_flags); - } else { - /* - * Install a unique fd for each memslot so that the fd - * can be closed when the region is deleted without - * needing to track if the fd is owned by the framework - * or by the caller. 
- */ - guest_memfd = dup(guest_memfd); - TEST_ASSERT(guest_memfd >= 0, __KVM_SYSCALL_ERROR("dup()", guest_memfd)); + TEST_ASSERT( + guest_memfd_offset == 0, + "Offset must be zero when creating new guest_memfd"); + guest_memfd = vm_create_guest_memfd(vm, memslot_size, 0); } - region->region.guest_memfd = guest_memfd; - region->region.guest_memfd_offset = guest_memfd_offset; - } else { - region->region.guest_memfd = -1; + vm_mem_region_install_guest_memfd(region, guest_memfd); } - region->unused_phy_pages = sparsebit_alloc(); - if (vm_arch_has_protected_memory(vm)) - region->protected_phy_pages = sparsebit_alloc(); - sparsebit_set_num(region->unused_phy_pages, - guest_paddr >> vm->page_shift, npages); region->region.slot = slot; region->region.flags = flags; region->region.guest_phys_addr = guest_paddr; - region->region.memory_size = npages * vm->page_size; - region->region.userspace_addr = (uintptr_t) region->host_mem; - ret = __vm_ioctl(vm, KVM_SET_USER_MEMORY_REGION2, ®ion->region); - TEST_ASSERT(ret == 0, "KVM_SET_USER_MEMORY_REGION2 IOCTL failed,\n" - " rc: %i errno: %i\n" - " slot: %u flags: 0x%x\n" - " guest_phys_addr: 0x%lx size: 0x%lx guest_memfd: %d", - ret, errno, slot, flags, - guest_paddr, (uint64_t) region->region.memory_size, - region->region.guest_memfd); - - /* Add to quick lookup data structures */ - vm_userspace_mem_region_gpa_insert(&vm->regions.gpa_tree, region); - vm_userspace_mem_region_hva_insert(&vm->regions.hva_tree, region); - hash_add(vm->regions.slot_hash, ®ion->slot_node, slot); - - /* If shared memory, create an alias. 
*/ - if (region->fd >= 0) { - region->mmap_alias = mmap(NULL, region->mmap_size, - PROT_READ | PROT_WRITE, - vm_mem_backing_src_alias(src_type)->flag, - region->fd, 0); - TEST_ASSERT(region->mmap_alias != MAP_FAILED, - __KVM_SYSCALL_ERROR("mmap()", (int)(unsigned long)MAP_FAILED)); - - /* Align host alias address */ - region->host_alias = align_ptr_up(region->mmap_alias, alignment); - } + region->region.guest_memfd_offset = guest_memfd_offset; + vm_mem_region_add(vm, region); } void vm_userspace_mem_region_add(struct kvm_vm *vm, diff --git a/tools/testing/selftests/kvm/lib/test_util.c b/tools/testing/selftests/kvm/lib/test_util.c index d0a9b5ee0c01..cbcc1e7ad578 100644 --- a/tools/testing/selftests/kvm/lib/test_util.c +++ b/tools/testing/selftests/kvm/lib/test_util.c @@ -351,6 +351,31 @@ size_t get_private_mem_backing_src_pagesz(uint32_t i) } } +int backing_src_should_madvise(uint32_t i) +{ + switch (i) { + case VM_MEM_SRC_ANONYMOUS: + case VM_MEM_SRC_SHMEM: + case VM_MEM_SRC_ANONYMOUS_THP: + return true; + default: + return false; + } +} + +int get_backing_src_madvise_advice(uint32_t i) +{ + switch (i) { + case VM_MEM_SRC_ANONYMOUS: + case VM_MEM_SRC_SHMEM: + return MADV_NOHUGEPAGE; + case VM_MEM_SRC_ANONYMOUS_THP: + return MADV_HUGEPAGE; + default: + return 0; + } +} + bool is_backing_src_hugetlb(uint32_t i) { return !!(vm_mem_backing_src_alias(i)->flag & MAP_HUGETLB); From patchwork Tue Sep 10 23:44:08 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ackerley Tng X-Patchwork-Id: 13799510
Date: Tue, 10 Sep 2024 23:44:08 +0000 Subject: [RFC PATCH 37/39] KVM: selftests: Add helper to perform madvise by memslots From: Ackerley Tng To: tabba@google.com, quic_eberman@quicinc.com, roypat@amazon.co.uk, jgg@nvidia.com, peterx@redhat.com, david@redhat.com, rientjes@google.com, fvdl@google.com, jthoughton@google.com, seanjc@google.com, pbonzini@redhat.com, zhiquan1.li@intel.com, fan.du@intel.com, jun.miao@intel.com, isaku.yamahata@intel.com, muchun.song@linux.dev, mike.kravetz@oracle.com Cc: erdemaktas@google.com, vannapurve@google.com, ackerleytng@google.com, qperret@google.com, jhubbard@nvidia.com, willy@infradead.org, shuah@kernel.org, brauner@kernel.org, bfoster@redhat.com, kent.overstreet@linux.dev, pvorel@suse.cz, rppt@kernel.org, richard.weiyang@gmail.com, anup@brainfault.org, haibo1.xu@intel.com, ajones@ventanamicro.com, vkuznets@redhat.com, maciej.wieczor-retman@intel.com, pgonda@google.com, oliver.upton@linux.dev, linux-kernel@vger.kernel.org, linux-mm@kvack.org, kvm@vger.kernel.org, linux-kselftest@vger.kernel.org,
linux-fsdevel@kvack.org
A contiguous GPA range may not be contiguous in HVA. This helper performs madvise, given a GPA range, by madvising in blocks according to memslot configuration. Signed-off-by: Ackerley Tng --- tools/include/linux/kernel.h | 4 +-- .../testing/selftests/kvm/include/kvm_util.h | 2 ++ tools/testing/selftests/kvm/lib/kvm_util.c | 30 +++++++++++++++++++ 3 files changed, 34 insertions(+), 2 deletions(-) diff --git a/tools/include/linux/kernel.h b/tools/include/linux/kernel.h index 07cfad817d53..5454cd3272ed 100644 --- a/tools/include/linux/kernel.h +++ b/tools/include/linux/kernel.h @@ -54,8 +54,8 @@ _min1 < _min2 ? _min1 : _min2; }) #endif -#define max_t(type, x, y) max((type)x, (type)y) -#define min_t(type, x, y) min((type)x, (type)y) +#define max_t(type, x, y) max((type)(x), (type)(y)) +#define min_t(type, x, y) min((type)(x), (type)(y)) #define clamp(val, lo, hi) min((typeof(val))max(val, lo), hi) #ifndef BUG_ON diff --git a/tools/testing/selftests/kvm/include/kvm_util.h b/tools/testing/selftests/kvm/include/kvm_util.h index 1576e7e4aefe..58b516c23574 100644 --- a/tools/testing/selftests/kvm/include/kvm_util.h +++ b/tools/testing/selftests/kvm/include/kvm_util.h @@ -433,6 +433,8 @@ static inline void vm_mem_set_shared(struct kvm_vm *vm, uint64_t gpa, void vm_guest_mem_fallocate(struct kvm_vm *vm, uint64_t gpa, uint64_t size, bool punch_hole); +void vm_guest_mem_madvise(struct kvm_vm *vm, vm_paddr_t gpa_start, uint64_t size, + int advice); static inline void vm_guest_mem_punch_hole(struct kvm_vm *vm, uint64_t gpa, uint64_t size) { diff --git a/tools/testing/selftests/kvm/lib/kvm_util.c b/tools/testing/selftests/kvm/lib/kvm_util.c index 9bdd03a5da90..21ea6616124c 100644 --- a/tools/testing/selftests/kvm/lib/kvm_util.c +++ b/tools/testing/selftests/kvm/lib/kvm_util.c @@ -1416,6 +1416,36 @@ void vm_guest_mem_fallocate(struct kvm_vm *vm, uint64_t base, uint64_t size, } } +void vm_guest_mem_madvise(struct kvm_vm *vm, vm_paddr_t
gpa_start, uint64_t size, + int advice) +{ + size_t madvise_len; + vm_paddr_t gpa_end; + vm_paddr_t gpa; + + gpa_end = gpa_start + size; + for (gpa = gpa_start; gpa < gpa_end; gpa += madvise_len) { + struct userspace_mem_region *region; + void *hva_start; + uint64_t memslot_end; + int ret; + + region = userspace_mem_region_find(vm, gpa, gpa); + TEST_ASSERT(region, "Memory region not found for GPA 0x%lx", gpa); + + hva_start = addr_gpa2hva(vm, gpa); + memslot_end = region->region.userspace_addr + + region->region.memory_size; + madvise_len = min_t(size_t, memslot_end - (uint64_t)hva_start, + gpa_end - gpa); + + ret = madvise(hva_start, madvise_len, advice); + TEST_ASSERT(!ret, "madvise(addr=%p, len=%lx, advice=%x) failed\n", + hva_start, madvise_len, advice); + } +} + + /* Returns the size of a vCPU's kvm_run structure. */ static int vcpu_mmap_sz(void) { From patchwork Tue Sep 10 23:44:09 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ackerley Tng X-Patchwork-Id: 13799511 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 2DE02EE01F4 for ; Tue, 10 Sep 2024 23:46:28 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 643978D00F1; Tue, 10 Sep 2024 19:45:42 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 5A7028D00E2; Tue, 10 Sep 2024 19:45:42 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 3AAAA8D00F1; Tue, 10 Sep 2024 19:45:42 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0017.hostedemail.com [216.40.44.17]) by kanga.kvack.org (Postfix) with ESMTP id 11EE88D00E2 for ; Tue, 10 Sep 2024 19:45:42 -0400 (EDT) Received: from smtpin21.hostedemail.com (a10.router.float.18 [10.200.18.1]) by 
Date: Tue, 10 Sep 2024 23:44:09 +0000
Message-ID: <3ef4b32d32dca6e1b506e967c950dc2d4c3bc7ae.1726009989.git.ackerleytng@google.com>
Subject: [RFC PATCH 38/39] KVM: selftests: Update private_mem_conversions_test
 for mmap()able guest_memfd
From: Ackerley Tng
To: tabba@google.com, quic_eberman@quicinc.com, roypat@amazon.co.uk,
 jgg@nvidia.com, peterx@redhat.com, david@redhat.com, rientjes@google.com,
 fvdl@google.com, jthoughton@google.com, seanjc@google.com,
 pbonzini@redhat.com, zhiquan1.li@intel.com, fan.du@intel.com,
 jun.miao@intel.com, isaku.yamahata@intel.com, muchun.song@linux.dev,
 mike.kravetz@oracle.com
Cc: erdemaktas@google.com, vannapurve@google.com, ackerleytng@google.com,
 qperret@google.com, jhubbard@nvidia.com, willy@infradead.org,
 shuah@kernel.org, brauner@kernel.org, bfoster@redhat.com,
 kent.overstreet@linux.dev, pvorel@suse.cz, rppt@kernel.org,
 richard.weiyang@gmail.com, anup@brainfault.org, haibo1.xu@intel.com,
 ajones@ventanamicro.com, vkuznets@redhat.com,
 maciej.wieczor-retman@intel.com, pgonda@google.com, oliver.upton@linux.dev,
 linux-kernel@vger.kernel.org, linux-mm@kvack.org, kvm@vger.kernel.org,
 linux-kselftest@vger.kernel.org, linux-fsdevel@kvack.org

Signed-off-by: Ackerley Tng
---
 .../kvm/x86_64/private_mem_conversions_test.c | 146 +++++++++++++++---
 .../x86_64/private_mem_conversions_test.sh    |   3 +
 2 files changed, 124 insertions(+), 25 deletions(-)

diff --git a/tools/testing/selftests/kvm/x86_64/private_mem_conversions_test.c b/tools/testing/selftests/kvm/x86_64/private_mem_conversions_test.c
index 71f480c19f92..6524ef398584 100644
--- a/tools/testing/selftests/kvm/x86_64/private_mem_conversions_test.c
+++ b/tools/testing/selftests/kvm/x86_64/private_mem_conversions_test.c
@@ -11,6 +11,8 @@
 #include
 #include
 #include
+#include
+#include
 #include
 #include
@@ -202,15 +204,19 @@ static void guest_test_explicit_conversion(uint64_t base_gpa, bool do_fallocate)
 		guest_sync_shared(gpa, size, p3, p4);
 		memcmp_g(gpa, p4, size);
 
-		/* Reset the shared memory back to the initial pattern. */
-		memset((void *)gpa, init_p, size);
-
 		/*
 		 * Free (via PUNCH_HOLE) *all* private memory so that the next
 		 * iteration starts from a clean slate, e.g. with respect to
 		 * whether or not there are pages/folios in guest_mem.
 		 */
 		guest_map_shared(base_gpa, PER_CPU_DATA_SIZE, true);
+
+		/*
+		 * Reset the entire block back to the initial pattern. Do this
+		 * after fallocate(PUNCH_HOLE) because hole-punching zeroes
+		 * memory.
+		 */
+		memset((void *)base_gpa, init_p, PER_CPU_DATA_SIZE);
 	}
 }
@@ -286,7 +292,8 @@ static void guest_code(uint64_t base_gpa)
 	GUEST_DONE();
 }
 
-static void handle_exit_hypercall(struct kvm_vcpu *vcpu)
+static void handle_exit_hypercall(struct kvm_vcpu *vcpu,
+				  bool back_shared_memory_with_guest_memfd)
 {
 	struct kvm_run *run = vcpu->run;
 	uint64_t gpa = run->hypercall.args[0];
@@ -303,17 +310,46 @@ static void handle_exit_hypercall(struct kvm_vcpu *vcpu)
 	if (do_fallocate)
 		vm_guest_mem_fallocate(vm, gpa, size, map_shared);
 
-	if (set_attributes)
+	if (set_attributes) {
+		if (back_shared_memory_with_guest_memfd && !map_shared)
+			vm_guest_mem_madvise(vm, gpa, size, MADV_DONTNEED);
 		vm_set_memory_attributes(vm, gpa, size,
 					 map_shared ? 0 : KVM_MEMORY_ATTRIBUTE_PRIVATE);
+	}
 
 	run->hypercall.ret = 0;
 }
 
+static void assert_not_faultable(uint8_t *address)
+{
+	pid_t child_pid;
+
+	child_pid = fork();
+	TEST_ASSERT(child_pid != -1, "fork failed");
+
+	if (child_pid == 0) {
+		*address = 'A';
+	} else {
+		int status;
+		waitpid(child_pid, &status, 0);
+
+		TEST_ASSERT(WIFSIGNALED(status),
+			    "Child should have exited with a signal");
+		TEST_ASSERT_EQ(WTERMSIG(status), SIGBUS);
+	}
+}
+
 static bool run_vcpus;
 
-static void *__test_mem_conversions(void *__vcpu)
+struct test_thread_args
 {
-	struct kvm_vcpu *vcpu = __vcpu;
+	struct kvm_vcpu *vcpu;
+	bool back_shared_memory_with_guest_memfd;
+};
+
+static void *__test_mem_conversions(void *params)
+{
+	struct test_thread_args *args = params;
+	struct kvm_vcpu *vcpu = args->vcpu;
 	struct kvm_run *run = vcpu->run;
 	struct kvm_vm *vm = vcpu->vm;
 	struct ucall uc;
@@ -325,7 +361,8 @@ static void *__test_mem_conversions(void *__vcpu)
 		vcpu_run(vcpu);
 
 		if (run->exit_reason == KVM_EXIT_HYPERCALL) {
-			handle_exit_hypercall(vcpu);
+			handle_exit_hypercall(vcpu,
+					      args->back_shared_memory_with_guest_memfd);
 			continue;
 		}
@@ -349,8 +386,18 @@ static void *__test_mem_conversions(void *__vcpu)
 			size_t nr_bytes = min_t(size_t, vm->page_size, size - i);
 			uint8_t *hva = addr_gpa2hva(vm, gpa + i);
 
-			/* In all cases, the host should observe the shared data. */
-			memcmp_h(hva, gpa + i, uc.args[3], nr_bytes);
+			/* Check contents of memory */
+			if (args->back_shared_memory_with_guest_memfd &&
+			    uc.args[0] == SYNC_PRIVATE) {
+				assert_not_faultable(hva);
+			} else {
+				/*
+				 * If shared and private memory use
+				 * separate backing memory, the host
+				 * should always observe shared data.
+				 */
+				memcmp_h(hva, gpa + i, uc.args[3], nr_bytes);
+			}
 
 			/* For shared, write the new pattern to guest memory. */
 			if (uc.args[0] == SYNC_SHARED)
@@ -366,11 +413,41 @@
 	}
 }
 
-static void
-test_mem_conversions(enum vm_mem_backing_src_type src_type,
-		     enum vm_private_mem_backing_src_type private_mem_src_type,
-		     uint32_t nr_vcpus,
-		     uint32_t nr_memslots)
+static void add_memslot(struct kvm_vm *vm, uint64_t gpa, uint32_t slot,
+			uint64_t size, int guest_memfd,
+			uint64_t guest_memfd_offset,
+			enum vm_mem_backing_src_type src_type,
+			bool back_shared_memory_with_guest_memfd)
+{
+	struct userspace_mem_region *region;
+
+	if (!back_shared_memory_with_guest_memfd) {
+		vm_mem_add(vm, src_type, gpa, slot, size / vm->page_size,
+			   KVM_MEM_GUEST_MEMFD, guest_memfd,
+			   guest_memfd_offset);
+		return;
+	}
+
+	region = vm_mem_region_alloc(vm);
+
+	guest_memfd = vm_mem_region_install_guest_memfd(region, guest_memfd);
+
+	vm_mem_region_mmap(region, size, MAP_SHARED, guest_memfd, guest_memfd_offset);
+	vm_mem_region_install_memory(region, size, getpagesize());
+
+	region->region.slot = slot;
+	region->region.flags = KVM_MEM_GUEST_MEMFD;
+	region->region.guest_phys_addr = gpa;
+	region->region.guest_memfd_offset = guest_memfd_offset;
+
+	vm_mem_region_add(vm, region);
+}
+
+static void test_mem_conversions(enum vm_mem_backing_src_type src_type,
+				 enum vm_private_mem_backing_src_type private_mem_src_type,
+				 uint32_t nr_vcpus,
+				 uint32_t nr_memslots,
+				 bool back_shared_memory_with_guest_memfd)
 {
 	/*
 	 * Allocate enough memory so that each vCPU's chunk of memory can be
@@ -381,6 +458,7 @@ test_mem_conversions(enum vm_mem_backing_src_type src_type,
 		  get_private_mem_backing_src_pagesz(private_mem_src_type),
 		  get_backing_src_pagesz(src_type)));
 	const size_t per_cpu_size = align_up(PER_CPU_DATA_SIZE, alignment);
+	struct test_thread_args *thread_args[KVM_MAX_VCPUS];
 	const size_t memfd_size = per_cpu_size * nr_vcpus;
 	const size_t slot_size = memfd_size / nr_memslots;
 	struct kvm_vcpu *vcpus[KVM_MAX_VCPUS];
@@ -404,13 +482,14 @@ test_mem_conversions(enum vm_mem_backing_src_type src_type,
 		vm, memfd_size,
 		vm_private_mem_backing_src_alias(private_mem_src_type)->flag);
 
-	for (i = 0; i < nr_memslots; i++)
-		vm_mem_add(vm, src_type, BASE_DATA_GPA + slot_size * i,
-			   BASE_DATA_SLOT + i, slot_size / vm->page_size,
-			   KVM_MEM_GUEST_MEMFD, memfd, slot_size * i);
+	for (i = 0; i < nr_memslots; i++) {
+		add_memslot(vm, BASE_DATA_GPA + slot_size * i,
+			    BASE_DATA_SLOT + i, slot_size, memfd, slot_size * i,
+			    src_type, back_shared_memory_with_guest_memfd);
+	}
 
 	for (i = 0; i < nr_vcpus; i++) {
-		uint64_t gpa = BASE_DATA_GPA + i * per_cpu_size;
+		uint64_t gpa = BASE_DATA_GPA + i * per_cpu_size;
 
 		vcpu_args_set(vcpus[i], 1, gpa);
@@ -420,13 +499,23 @@
 		 */
 		virt_map(vm, gpa, gpa, PER_CPU_DATA_SIZE / vm->page_size);
 
-		pthread_create(&threads[i], NULL, __test_mem_conversions, vcpus[i]);
+		thread_args[i] = malloc(sizeof(struct test_thread_args));
+		TEST_ASSERT(thread_args[i] != NULL,
+			    "Could not allocate memory for thread parameters");
+		thread_args[i]->vcpu = vcpus[i];
+		thread_args[i]->back_shared_memory_with_guest_memfd =
+			back_shared_memory_with_guest_memfd;
+
+		pthread_create(&threads[i], NULL, __test_mem_conversions,
+			       (void *)thread_args[i]);
 	}
 
 	WRITE_ONCE(run_vcpus, true);
 
-	for (i = 0; i < nr_vcpus; i++)
+	for (i = 0; i < nr_vcpus; i++) {
 		pthread_join(threads[i], NULL);
+		free(thread_args[i]);
+	}
 
 	kvm_vm_free(vm);
@@ -448,7 +537,7 @@
 static void usage(const char *cmd)
 {
 	puts("");
-	printf("usage: %s [-h] [-m nr_memslots] [-s mem_type] [-p private_mem_type] [-n nr_vcpus]\n", cmd);
+	printf("usage: %s [-h] [-m nr_memslots] [-s mem_type] [-p private_mem_type] [-n nr_vcpus] [-g]\n", cmd);
 	puts("");
 	backing_src_help("-s");
 	puts("");
@@ -458,19 +547,22 @@ static void usage(const char *cmd)
 	puts("");
 	puts(" -m: specify the number of memslots (default: 1)");
 	puts("");
+	puts(" -g: back shared memory with guest_memfd (default: false)");
+	puts("");
 }
 
 int main(int argc, char *argv[])
 {
 	enum vm_mem_backing_src_type src_type = DEFAULT_VM_MEM_SRC;
 	enum vm_private_mem_backing_src_type private_mem_src_type = DEFAULT_VM_PRIVATE_MEM_SRC;
+	bool back_shared_memory_with_guest_memfd = false;
 	uint32_t nr_memslots = 1;
 	uint32_t nr_vcpus = 1;
 	int opt;
 
 	TEST_REQUIRE(kvm_check_cap(KVM_CAP_VM_TYPES) & BIT(KVM_X86_SW_PROTECTED_VM));
 
-	while ((opt = getopt(argc, argv, "hm:s:p:n:")) != -1) {
+	while ((opt = getopt(argc, argv, "hgm:s:p:n:")) != -1) {
 		switch (opt) {
 		case 's':
 			src_type = parse_backing_src_type(optarg);
@@ -484,6 +576,9 @@ int main(int argc, char *argv[])
 		case 'm':
 			nr_memslots = atoi_positive("nr_memslots", optarg);
 			break;
+		case 'g':
+			back_shared_memory_with_guest_memfd = true;
+			break;
 		case 'h':
 		default:
 			usage(argv[0]);
@@ -491,7 +586,8 @@ int main(int argc, char *argv[])
 		}
 	}
 
-	test_mem_conversions(src_type, private_mem_src_type, nr_vcpus, nr_memslots);
+	test_mem_conversions(src_type, private_mem_src_type, nr_vcpus, nr_memslots,
+			     back_shared_memory_with_guest_memfd);
 
 	return 0;
 }
diff --git a/tools/testing/selftests/kvm/x86_64/private_mem_conversions_test.sh b/tools/testing/selftests/kvm/x86_64/private_mem_conversions_test.sh
index fb6705fef466..c7f3dfee0336 100755
--- a/tools/testing/selftests/kvm/x86_64/private_mem_conversions_test.sh
+++ b/tools/testing/selftests/kvm/x86_64/private_mem_conversions_test.sh
@@ -75,6 +75,9 @@ TEST_EXECUTABLE="$(dirname "$0")/private_mem_conversions_test"
 	$TEST_EXECUTABLE -s "$src_type" -p "$private_mem_src_type" -n $num_vcpus_to_test
 	$TEST_EXECUTABLE -s "$src_type" -p "$private_mem_src_type" -n $num_vcpus_to_test -m $num_memslots_to_test
+	$TEST_EXECUTABLE -s "$src_type" -p "$private_mem_src_type" -n $num_vcpus_to_test -g
+	$TEST_EXECUTABLE -s "$src_type" -p "$private_mem_src_type" -n $num_vcpus_to_test -m $num_memslots_to_test -g
+
 	{ set +x; } 2>/dev/null
 
 	echo

From patchwork Tue Sep 10 23:44:10 2024
Date: Tue, 10 Sep 2024 23:44:10 +0000
Message-ID: <38723c5d5e9b530e52f28b9f9f4a6d862ed69bcd.1726009989.git.ackerleytng@google.com>
Subject: [RFC PATCH 39/39] KVM: guest_memfd: Dynamically split/reconstruct
 HugeTLB page
From: Ackerley Tng
To: tabba@google.com, quic_eberman@quicinc.com, roypat@amazon.co.uk,
 jgg@nvidia.com, peterx@redhat.com, david@redhat.com, rientjes@google.com,
 fvdl@google.com, jthoughton@google.com, seanjc@google.com,
 pbonzini@redhat.com, zhiquan1.li@intel.com, fan.du@intel.com,
 jun.miao@intel.com, isaku.yamahata@intel.com, muchun.song@linux.dev,
 mike.kravetz@oracle.com
Cc: erdemaktas@google.com, vannapurve@google.com, ackerleytng@google.com,
 qperret@google.com, jhubbard@nvidia.com, willy@infradead.org,
 shuah@kernel.org, brauner@kernel.org, bfoster@redhat.com,
 kent.overstreet@linux.dev, pvorel@suse.cz, rppt@kernel.org,
 richard.weiyang@gmail.com, anup@brainfault.org, haibo1.xu@intel.com,
 ajones@ventanamicro.com, vkuznets@redhat.com,
 maciej.wieczor-retman@intel.com, pgonda@google.com, oliver.upton@linux.dev,
 linux-kernel@vger.kernel.org, linux-mm@kvack.org, kvm@vger.kernel.org,
 linux-kselftest@vger.kernel.org, linux-fsdevel@kvack.org

From: Vishal Annapurve

The faultability of a page is used to determine whether to split or
reconstruct a page.

If there is any page in a folio that is faultable, split the folio. If
all pages in a folio are not faultable, reconstruct the folio.

On truncation, always reconstruct and free regardless of faultability
(as long as a HugeTLB page's worth of pages is truncated).

Co-developed-by: Vishal Annapurve
Signed-off-by: Vishal Annapurve
Co-developed-by: Ackerley Tng
Signed-off-by: Ackerley Tng
---
 virt/kvm/guest_memfd.c | 678 +++++++++++++++++++++++++++--------------
 1 file changed, 456 insertions(+), 222 deletions(-)

diff --git a/virt/kvm/guest_memfd.c b/virt/kvm/guest_memfd.c
index fb292e542381..0afc111099c0 100644
--- a/virt/kvm/guest_memfd.c
+++ b/virt/kvm/guest_memfd.c
@@ -99,6 +99,23 @@ static bool kvm_gmem_is_faultable(struct inode *inode, pgoff_t index)
 	return xa_to_value(xa_load(faultability, index)) == KVM_GMEM_FAULTABILITY_VALUE;
 }
 
+/**
+ * Return true if any of the @nr_pages beginning at @index is allowed to be
+ * faulted in.
+ */
+static bool kvm_gmem_is_any_faultable(struct inode *inode, pgoff_t index,
+				      int nr_pages)
+{
+	pgoff_t i;
+
+	for (i = index; i < index + nr_pages; ++i) {
+		if (kvm_gmem_is_faultable(inode, i))
+			return true;
+	}
+
+	return false;
+}
+
 /**
  * folio_file_pfn - like folio_file_page, but return a pfn.
  * @folio: The folio which contains this index.
@@ -312,6 +329,40 @@ static int kvm_gmem_hugetlb_filemap_add_folio(struct address_space *mapping,
 	return 0;
 }
 
+static inline void kvm_gmem_hugetlb_filemap_remove_folio(struct folio *folio)
+{
+	folio_lock(folio);
+
+	folio_clear_dirty(folio);
+	folio_clear_uptodate(folio);
+	filemap_remove_folio(folio);
+
+	folio_unlock(folio);
+}
+
+/*
+ * Locks a block of nr_pages (1 << huge_page_order(h)) pages within @mapping
+ * beginning at @index. Take either this or filemap_invalidate_lock() whenever
+ * the filemap is accessed.
+ */
+static u32 hugetlb_fault_mutex_lock(struct address_space *mapping, pgoff_t index)
+{
+	pgoff_t hindex;
+	u32 hash;
+
+	hindex = index >> huge_page_order(kvm_gmem_hgmem(mapping->host)->h);
+	hash = hugetlb_fault_mutex_hash(mapping, hindex);
+
+	mutex_lock(&hugetlb_fault_mutex_table[hash]);
+
+	return hash;
+}
+
+static void hugetlb_fault_mutex_unlock(u32 hash)
+{
+	mutex_unlock(&hugetlb_fault_mutex_table[hash]);
+}
+
 struct kvm_gmem_split_stash {
 	struct {
 		unsigned long _flags_2;
@@ -394,15 +445,136 @@ static int kvm_gmem_hugetlb_reconstruct_folio(struct hstate *h, struct folio *fo
 	}
 
 	__folio_set_hugetlb(folio);
-
-	folio_set_count(folio, 1);
+	hugetlb_folio_list_add(folio, &h->hugepage_activelist);
 
 	hugetlb_vmemmap_optimize_folio(h, folio);
 
+	folio_set_count(folio, 1);
+
 	return 0;
 }
 
+/**
+ * Reconstruct a HugeTLB folio out of folio_nr_pages(@first_folio) pages. Will
+ * clean up subfolios from filemap and add back the reconstructed folio. Folios
+ * to be reconstructed must not be locked, and reconstructed folio will not be
+ * locked. Return 0 on success or negative error otherwise.
+ *
+ * hugetlb_fault_mutex_lock() has to be held when calling this function.
+ *
+ * Expects that before this call, the filemap's refcounts are the only refcounts
+ * for the folios in the filemap. After this function returns, the filemap's
+ * refcount will be the only refcount on the reconstructed folio.
+ */
+static int kvm_gmem_reconstruct_folio_in_filemap(struct hstate *h,
+						 struct folio *first_folio)
+{
+	struct address_space *mapping;
+	struct folio_batch fbatch;
+	unsigned long end;
+	pgoff_t index;
+	pgoff_t next;
+	int ret;
+	int i;
+
+	if (folio_order(first_folio) == huge_page_order(h))
+		return 0;
+
+	index = first_folio->index;
+	mapping = first_folio->mapping;
+
+	next = index;
+	end = index + (1UL << huge_page_order(h));
+	folio_batch_init(&fbatch);
+	while (filemap_get_folios(mapping, &next, end - 1, &fbatch)) {
+		for (i = 0; i < folio_batch_count(&fbatch); ++i) {
+			struct folio *folio;
+
+			folio = fbatch.folios[i];
+
+			/*
+			 * Before removing from filemap, take a reference so
+			 * sub-folios don't get freed when removing from
+			 * filemap.
+			 */
+			folio_get(folio);
+
+			kvm_gmem_hugetlb_filemap_remove_folio(folio);
+		}
+		folio_batch_release(&fbatch);
+	}
+
+	ret = kvm_gmem_hugetlb_reconstruct_folio(h, first_folio);
+	if (ret) {
+		/* TODO: handle cleanup properly. */
+		WARN_ON(ret);
+		return ret;
+	}
+
+	kvm_gmem_hugetlb_filemap_add_folio(mapping, first_folio, index,
+					   htlb_alloc_mask(h));
+
+	folio_unlock(first_folio);
+	folio_put(first_folio);
+
+	return ret;
+}
+
+/**
+ * Reconstruct any HugeTLB folios in range [@start, @end), if all the subfolios
+ * are not faultable. Return 0 on success or negative error otherwise.
+ *
+ * Will skip any folios that are already reconstructed.
+ */
+static int kvm_gmem_try_reconstruct_folios_range(struct inode *inode,
+						 pgoff_t start, pgoff_t end)
+{
+	unsigned int nr_pages;
+	pgoff_t aligned_start;
+	pgoff_t aligned_end;
+	struct hstate *h;
+	pgoff_t index;
+	int ret;
+
+	if (!is_kvm_gmem_hugetlb(inode))
+		return 0;
+
+	h = kvm_gmem_hgmem(inode)->h;
+	nr_pages = 1UL << huge_page_order(h);
+
+	aligned_start = round_up(start, nr_pages);
+	aligned_end = round_down(end, nr_pages);
+
+	ret = 0;
+	for (index = aligned_start; !ret && index < aligned_end; index += nr_pages) {
+		struct folio *folio;
+		u32 hash;
+
+		hash = hugetlb_fault_mutex_lock(inode->i_mapping, index);
+
+		folio = filemap_get_folio(inode->i_mapping, index);
+		if (!IS_ERR(folio)) {
+			/*
+			 * Drop refcount because reconstruction expects an equal number
+			 * of refcounts for all subfolios - just keep the refcount taken
+			 * by the filemap.
+			 */
+			folio_put(folio);
+
+			/* Merge only when the entire block of nr_pages is not faultable. */
+			if (!kvm_gmem_is_any_faultable(inode, index, nr_pages)) {
+				ret = kvm_gmem_reconstruct_folio_in_filemap(h, folio);
+				WARN_ON(ret);
+			}
+		}
+
+		hugetlb_fault_mutex_unlock(hash);
+	}
+
+	return ret;
+}
+
+/* Basically folio_set_order() without the checks. */
 static inline void kvm_gmem_folio_set_order(struct folio *folio, unsigned int order)
 {
 	folio->_flags_1 = (folio->_flags_1 & ~0xffUL) | order;
@@ -414,8 +586,8 @@ static inline void kvm_gmem_folio_set_order(struct folio *folio, unsigned int or
 
 /**
  * Split a HugeTLB @folio of size huge_page_size(@h).
  *
- * After splitting, each split folio has a refcount of 1. There are no checks on
- * refcounts before splitting.
+ * Folio must have refcount of 1 when this function is called. After splitting,
+ * each split folio has a refcount of 1.
  *
  * Return 0 on success and negative error otherwise.
  */
@@ -423,14 +595,18 @@ static int kvm_gmem_hugetlb_split_folio(struct hstate *h, struct folio *folio)
 {
 	int ret;
 
+	VM_WARN_ON_ONCE_FOLIO(folio_ref_count(folio) != 1, folio);
+
+	folio_set_count(folio, 0);
+
 	ret = hugetlb_vmemmap_restore_folio(h, folio);
 	if (ret)
-		return ret;
+		goto out;
 
 	ret = kvm_gmem_hugetlb_stash_metadata(folio);
 	if (ret) {
 		hugetlb_vmemmap_optimize_folio(h, folio);
-		return ret;
+		goto out;
 	}
 
 	kvm_gmem_folio_set_order(folio, 0);
@@ -439,109 +615,183 @@ static int kvm_gmem_hugetlb_split_folio(struct hstate *h, struct folio *folio)
 	__folio_clear_hugetlb(folio);
 
 	/*
-	 * Remove the first folio from h->hugepage_activelist since it is no
+	 * Remove the original folio from h->hugepage_activelist since it is no
	 * longer a HugeTLB page. The other split pages should not be on any
	 * lists.
	 */
	hugetlb_folio_list_del(folio);
 
-	return 0;
+	ret = 0;
+out:
+	folio_set_count(folio, 1);
+	return ret;
 }
 
-static struct folio *kvm_gmem_hugetlb_alloc_and_cache_folio(struct inode *inode,
-							    pgoff_t index)
+/**
+ * Split a HugeTLB folio into folio_nr_pages(@folio) pages. Will clean up folio
+ * from filemap and add back the split folios. @folio must not be locked, and
+ * all split folios will not be locked. Return 0 on success or negative error
+ * otherwise.
+ *
+ * hugetlb_fault_mutex_lock() has to be held when calling this function.
+ *
+ * Expects that before this call, the filemap's refcounts are the only refcounts
+ * for the folio. After this function returns, the filemap's refcounts will be
+ * the only refcounts on the split folios.
+ */ +static int kvm_gmem_split_folio_in_filemap(struct hstate *h, struct folio *folio) { - struct folio *allocated_hugetlb_folio; - pgoff_t hugetlb_first_subpage_index; - struct page *hugetlb_first_subpage; - struct kvm_gmem_hugetlb *hgmem; - struct page *requested_page; + struct address_space *mapping; + struct page *first_subpage; + pgoff_t index; int ret; int i; - hgmem = kvm_gmem_hgmem(inode); - allocated_hugetlb_folio = kvm_gmem_hugetlb_alloc_folio(hgmem->h, hgmem->spool); - if (IS_ERR(allocated_hugetlb_folio)) - return allocated_hugetlb_folio; + if (folio_order(folio) == 0) + return 0; - requested_page = folio_file_page(allocated_hugetlb_folio, index); - hugetlb_first_subpage = folio_file_page(allocated_hugetlb_folio, 0); - hugetlb_first_subpage_index = index & (huge_page_mask(hgmem->h) >> PAGE_SHIFT); + index = folio->index; + mapping = folio->mapping; - ret = kvm_gmem_hugetlb_split_folio(hgmem->h, allocated_hugetlb_folio); + first_subpage = folio_page(folio, 0); + + /* + * Take reference so that folio will not be released when removed from + * filemap. + */ + folio_get(folio); + + kvm_gmem_hugetlb_filemap_remove_folio(folio); + + ret = kvm_gmem_hugetlb_split_folio(h, folio); if (ret) { - folio_put(allocated_hugetlb_folio); - return ERR_PTR(ret); + WARN_ON(ret); + kvm_gmem_hugetlb_filemap_add_folio(mapping, folio, index, + htlb_alloc_mask(h)); + folio_put(folio); + return ret; } - for (i = 0; i < pages_per_huge_page(hgmem->h); ++i) { - struct folio *folio = page_folio(nth_page(hugetlb_first_subpage, i)); + for (i = 0; i < pages_per_huge_page(h); ++i) { + struct folio *folio = page_folio(nth_page(first_subpage, i)); - ret = kvm_gmem_hugetlb_filemap_add_folio(inode->i_mapping, - folio, - hugetlb_first_subpage_index + i, - htlb_alloc_mask(hgmem->h)); + ret = kvm_gmem_hugetlb_filemap_add_folio(mapping, folio, + index + i, + htlb_alloc_mask(h)); if (ret) { /* TODO: handle cleanup properly. 
*/ - pr_err("Handle cleanup properly index=%lx, ret=%d\n", - hugetlb_first_subpage_index + i, ret); - dump_page(nth_page(hugetlb_first_subpage, i), "check"); - return ERR_PTR(ret); + WARN_ON(ret); + return ret; } + folio_unlock(folio); + /* - * Skip unlocking for the requested index since - * kvm_gmem_get_folio() returns a locked folio. - * - * Do folio_put() to drop the refcount that came with the folio, - * from splitting the folio. Splitting the folio has a refcount - * to be in line with hugetlb_alloc_folio(), which returns a - * folio with refcount 1. - * - * Skip folio_put() for requested index since - * kvm_gmem_get_folio() returns a folio with refcount 1. + * Drop reference so that the only remaining reference is the + * one held by the filemap. */ - if (hugetlb_first_subpage_index + i != index) { - folio_unlock(folio); - folio_put(folio); - } + folio_put(folio); } + return ret; +} + +/* + * Allocates and then caches a folio in the filemap. Returns a folio with + * refcount of 2: 1 after allocation, and 1 taken by the filemap. + */ +static struct folio *kvm_gmem_hugetlb_alloc_and_cache_folio(struct inode *inode, + pgoff_t index) +{ + struct kvm_gmem_hugetlb *hgmem; + pgoff_t aligned_index; + struct folio *folio; + int nr_pages; + int ret; + + hgmem = kvm_gmem_hgmem(inode); + folio = kvm_gmem_hugetlb_alloc_folio(hgmem->h, hgmem->spool); + if (IS_ERR(folio)) + return folio; + + nr_pages = 1UL << huge_page_order(hgmem->h); + aligned_index = round_down(index, nr_pages); + + ret = kvm_gmem_hugetlb_filemap_add_folio(inode->i_mapping, folio, + aligned_index, + htlb_alloc_mask(hgmem->h)); + WARN_ON(ret); + spin_lock(&inode->i_lock); inode->i_blocks += blocks_per_huge_page(hgmem->h); spin_unlock(&inode->i_lock); - return page_folio(requested_page); + return folio; +} + +/** + * Split @folio if any of the subfolios are faultable. Returns the split + * (locked, refcount=2) folio at @index. 
+ * + * Expects a locked folio with 1 refcount in addition to filemap's refcounts. + * + * After splitting, the subfolios in the filemap will be unlocked and have + * refcount 1 (other than the returned folio, which will be locked and have + * refcount 2). + */ +static struct folio *kvm_gmem_maybe_split_folio(struct folio *folio, pgoff_t index) +{ + pgoff_t aligned_index; + struct inode *inode; + struct hstate *h; + int nr_pages; + int ret; + + inode = folio->mapping->host; + h = kvm_gmem_hgmem(inode)->h; + nr_pages = 1UL << huge_page_order(h); + aligned_index = round_down(index, nr_pages); + + if (!kvm_gmem_is_any_faultable(inode, aligned_index, nr_pages)) + return folio; + + /* Drop lock and refcount in preparation for splitting. */ + folio_unlock(folio); + folio_put(folio); + + ret = kvm_gmem_split_folio_in_filemap(h, folio); + if (ret) { + kvm_gmem_hugetlb_filemap_remove_folio(folio); + return ERR_PTR(ret); + } + + /* + * At this point, the filemap has the only reference on the folio. Take + * lock and refcount on folio to align with kvm_gmem_get_folio(). + */ + return filemap_lock_folio(inode->i_mapping, index); } static struct folio *kvm_gmem_get_hugetlb_folio(struct inode *inode, pgoff_t index) { - struct address_space *mapping; struct folio *folio; - struct hstate *h; - pgoff_t hindex; u32 hash; - h = kvm_gmem_hgmem(inode)->h; - hindex = index >> huge_page_order(h); - mapping = inode->i_mapping; - - /* To lock, we calculate the hash using the hindex and not index. */ - hash = hugetlb_fault_mutex_hash(mapping, hindex); - mutex_lock(&hugetlb_fault_mutex_table[hash]); + hash = hugetlb_fault_mutex_lock(inode->i_mapping, index); /* - * The filemap is indexed with index and not hindex. Taking lock on - * folio to align with kvm_gmem_get_regular_folio() + * The filemap is indexed with index and not hindex. 
Take lock on folio + * to align with kvm_gmem_get_regular_folio() */ - folio = filemap_lock_folio(mapping, index); + folio = filemap_lock_folio(inode->i_mapping, index); + if (IS_ERR(folio)) + folio = kvm_gmem_hugetlb_alloc_and_cache_folio(inode, index); + if (!IS_ERR(folio)) - goto out; + folio = kvm_gmem_maybe_split_folio(folio, index); - folio = kvm_gmem_hugetlb_alloc_and_cache_folio(inode, index); -out: - mutex_unlock(&hugetlb_fault_mutex_table[hash]); + hugetlb_fault_mutex_unlock(hash); return folio; } @@ -610,17 +860,6 @@ static void kvm_gmem_invalidate_end(struct kvm_gmem *gmem, pgoff_t start, } } -static inline void kvm_gmem_hugetlb_filemap_remove_folio(struct folio *folio) -{ - folio_lock(folio); - - folio_clear_dirty(folio); - folio_clear_uptodate(folio); - filemap_remove_folio(folio); - - folio_unlock(folio); -} - /** * Removes folios in range [@lstart, @lend) from page cache/filemap (@mapping), * returning the number of HugeTLB pages freed. @@ -631,61 +870,30 @@ static int kvm_gmem_hugetlb_filemap_remove_folios(struct address_space *mapping, struct hstate *h, loff_t lstart, loff_t lend) { - const pgoff_t end = lend >> PAGE_SHIFT; - pgoff_t next = lstart >> PAGE_SHIFT; - LIST_HEAD(folios_to_reconstruct); - struct folio_batch fbatch; - struct folio *folio, *tmp; - int num_freed = 0; - int i; - - /* - * TODO: Iterate over huge_page_size(h) blocks to avoid taking and - * releasing hugetlb_fault_mutex_table[hash] lock so often. When - * truncating, lstart and lend should be clipped to the size of this - * guest_memfd file, otherwise there would be too many iterations. 
- */ - folio_batch_init(&fbatch); - while (filemap_get_folios(mapping, &next, end - 1, &fbatch)) { - for (i = 0; i < folio_batch_count(&fbatch); ++i) { - struct folio *folio; - pgoff_t hindex; - u32 hash; - - folio = fbatch.folios[i]; + loff_t offset; + int num_freed; - hindex = folio->index >> huge_page_order(h); - hash = hugetlb_fault_mutex_hash(mapping, hindex); - mutex_lock(&hugetlb_fault_mutex_table[hash]); + num_freed = 0; + for (offset = lstart; offset < lend; offset += huge_page_size(h)) { + struct folio *folio; + pgoff_t index; + u32 hash; - /* - * Collect first pages of HugeTLB folios for - * reconstruction later. - */ - if ((folio->index & ~(huge_page_mask(h) >> PAGE_SHIFT)) == 0) - list_add(&folio->lru, &folios_to_reconstruct); + index = offset >> PAGE_SHIFT; + hash = hugetlb_fault_mutex_lock(mapping, index); - /* - * Before removing from filemap, take a reference so - * sub-folios don't get freed. Don't free the sub-folios - * until after reconstruction. - */ - folio_get(folio); + folio = filemap_get_folio(mapping, index); + if (!IS_ERR(folio)) { + /* Drop refcount so that filemap holds only reference. */ + folio_put(folio); + kvm_gmem_reconstruct_folio_in_filemap(h, folio); kvm_gmem_hugetlb_filemap_remove_folio(folio); - mutex_unlock(&hugetlb_fault_mutex_table[hash]); + num_freed++; } - folio_batch_release(&fbatch); - cond_resched(); - } - - list_for_each_entry_safe(folio, tmp, &folios_to_reconstruct, lru) { - kvm_gmem_hugetlb_reconstruct_folio(h, folio); - hugetlb_folio_list_move(folio, &h->hugepage_activelist); - folio_put(folio); - num_freed++; + hugetlb_fault_mutex_unlock(hash); } return num_freed; @@ -705,6 +913,10 @@ static void kvm_gmem_hugetlb_truncate_folios_range(struct inode *inode, int gbl_reserve; int num_freed; + /* No point truncating more than inode size. 
*/ + lstart = min(lstart, inode->i_size); + lend = min(lend, inode->i_size); + hgmem = kvm_gmem_hgmem(inode); h = hgmem->h; @@ -1042,13 +1254,27 @@ static vm_fault_t kvm_gmem_fault(struct vm_fault *vmf) bool is_prepared; inode = file_inode(vmf->vma->vm_file); - if (!kvm_gmem_is_faultable(inode, vmf->pgoff)) + + /* + * Use filemap_invalidate_lock_shared() to make sure + * kvm_gmem_get_folio() doesn't race with faultability updates. + */ + filemap_invalidate_lock_shared(inode->i_mapping); + + if (!kvm_gmem_is_faultable(inode, vmf->pgoff)) { + filemap_invalidate_unlock_shared(inode->i_mapping); return VM_FAULT_SIGBUS; + } folio = kvm_gmem_get_folio(inode, vmf->pgoff); + + filemap_invalidate_unlock_shared(inode->i_mapping); + if (!folio) return VM_FAULT_SIGBUS; + WARN(folio_test_hugetlb(folio), "should not be faulting in hugetlb folio=%p\n", folio); + is_prepared = folio_test_uptodate(folio); if (!is_prepared) { unsigned long nr_pages; @@ -1731,8 +1957,6 @@ static bool kvm_gmem_no_mappings_range(struct inode *inode, pgoff_t start, pgoff pgoff_t index; bool checked_indices_unmapped; - filemap_invalidate_lock_shared(inode->i_mapping); - /* TODO: replace iteration with filemap_get_folios() for efficiency. */ checked_indices_unmapped = true; for (index = start; checked_indices_unmapped && index < end;) { @@ -1754,98 +1978,130 @@ static bool kvm_gmem_no_mappings_range(struct inode *inode, pgoff_t start, pgoff folio_put(folio); } - filemap_invalidate_unlock_shared(inode->i_mapping); return checked_indices_unmapped; } /** - * Returns true if pages in range [@start, @end) in memslot @slot have no - * userspace mappings. + * Split any HugeTLB folios in range [@start, @end), if any of the offsets in + * the folio are faultable. Return 0 on success or negative error otherwise. + * + * Will skip any folios that are already split. 
*/ -static bool kvm_gmem_no_mappings_slot(struct kvm_memory_slot *slot, - gfn_t start, gfn_t end) +static int kvm_gmem_try_split_folios_range(struct inode *inode, + pgoff_t start, pgoff_t end) { - pgoff_t offset_start; - pgoff_t offset_end; - struct file *file; - bool ret; - - offset_start = start - slot->base_gfn + slot->gmem.pgoff; - offset_end = end - slot->base_gfn + slot->gmem.pgoff; - - file = kvm_gmem_get_file(slot); - if (!file) - return false; - - ret = kvm_gmem_no_mappings_range(file_inode(file), offset_start, offset_end); + unsigned int nr_pages; + pgoff_t aligned_start; + pgoff_t aligned_end; + struct hstate *h; + pgoff_t index; + int ret; - fput(file); + if (!is_kvm_gmem_hugetlb(inode)) + return 0; - return ret; -} + h = kvm_gmem_hgmem(inode)->h; + nr_pages = 1UL << huge_page_order(h); -/** - * Returns true if pages in range [@start, @end) have no host userspace mappings. - */ -static bool kvm_gmem_no_mappings(struct kvm *kvm, gfn_t start, gfn_t end) -{ - int i; + aligned_start = round_down(start, nr_pages); + aligned_end = round_up(end, nr_pages); - lockdep_assert_held(&kvm->slots_lock); + ret = 0; + for (index = aligned_start; !ret && index < aligned_end; index += nr_pages) { + struct folio *folio; + u32 hash; - for (i = 0; i < kvm_arch_nr_memslot_as_ids(kvm); i++) { - struct kvm_memslot_iter iter; - struct kvm_memslots *slots; + hash = hugetlb_fault_mutex_lock(inode->i_mapping, index); - slots = __kvm_memslots(kvm, i); - kvm_for_each_memslot_in_gfn_range(&iter, slots, start, end) { - struct kvm_memory_slot *slot; - gfn_t gfn_start; - gfn_t gfn_end; - - slot = iter.slot; - gfn_start = max(start, slot->base_gfn); - gfn_end = min(end, slot->base_gfn + slot->npages); + folio = filemap_get_folio(inode->i_mapping, index); + if (!IS_ERR(folio)) { + /* + * Drop refcount so that the only references held are refcounts + * from the filemap. 
+ */ + folio_put(folio); - if (iter.slot->flags & KVM_MEM_GUEST_MEMFD && - !kvm_gmem_no_mappings_slot(iter.slot, gfn_start, gfn_end)) - return false; + if (kvm_gmem_is_any_faultable(inode, index, nr_pages)) { + ret = kvm_gmem_split_folio_in_filemap(h, folio); + if (ret) { + /* TODO cleanup properly. */ + WARN_ON(ret); + } + } } + + hugetlb_fault_mutex_unlock(hash); } - return true; + return ret; } /** - * Set faultability of given range of gfns [@start, @end) in memslot @slot to - * @faultable. + * Returns 0 if guest_memfd permits setting range [@start, @end) with + * faultability @faultable within memslot @slot, or negative error otherwise. + * + * If a request was made to set the memory to PRIVATE (not faultable), the pages + * in the range must not be pinned or mapped for the request to be permitted. + * + * Because this may allow pages to be faulted in to userspace when requested to + * set attributes to shared, this must only be called after the pages have been + * invalidated from guest page tables. */ -static void kvm_gmem_set_faultable_slot(struct kvm_memory_slot *slot, gfn_t start, - gfn_t end, bool faultable) +static int kvm_gmem_try_set_faultable_slot(struct kvm_memory_slot *slot, + gfn_t start, gfn_t end, + bool faultable) { pgoff_t start_offset; + struct inode *inode; pgoff_t end_offset; struct file *file; + int ret; file = kvm_gmem_get_file(slot); if (!file) - return; + return 0; start_offset = start - slot->base_gfn + slot->gmem.pgoff; end_offset = end - slot->base_gfn + slot->gmem.pgoff; - WARN_ON(kvm_gmem_set_faultable(file_inode(file), start_offset, end_offset, - faultable)); + inode = file_inode(file); + + /* + * Use filemap_invalidate_lock_shared() to make sure + * splitting/reconstruction doesn't race with faultability updates. 
+ */ + filemap_invalidate_lock(inode->i_mapping); + + kvm_gmem_set_faultable(inode, start_offset, end_offset, faultable); + + if (faultable) { + ret = kvm_gmem_try_split_folios_range(inode, start_offset, + end_offset); + } else { + if (kvm_gmem_no_mappings_range(inode, start_offset, end_offset)) { + ret = kvm_gmem_try_reconstruct_folios_range(inode, + start_offset, + end_offset); + } else { + ret = -EINVAL; + } + } + + filemap_invalidate_unlock(inode->i_mapping); fput(file); + + return ret; } /** - * Set faultability of given range of gfns [@start, @end) in memslot @slot to - * @faultable. + * Returns 0 if guest_memfd permits setting range [@start, @end) with + * faultability @faultable within VM @kvm, or negative error otherwise. + * + * See kvm_gmem_try_set_faultable_slot() for details. */ -static void kvm_gmem_set_faultable_vm(struct kvm *kvm, gfn_t start, gfn_t end, - bool faultable) +static int kvm_gmem_try_set_faultable_vm(struct kvm *kvm, gfn_t start, gfn_t end, + bool faultable) { int i; @@ -1866,43 +2122,15 @@ static void kvm_gmem_set_faultable_vm(struct kvm *kvm, gfn_t start, gfn_t end, gfn_end = min(end, slot->base_gfn + slot->npages); if (iter.slot->flags & KVM_MEM_GUEST_MEMFD) { - kvm_gmem_set_faultable_slot(slot, gfn_start, - gfn_end, faultable); + int ret; + + ret = kvm_gmem_try_set_faultable_slot(slot, gfn_start, + gfn_end, faultable); + if (ret) + return ret; } } } -} - -/** - * Returns true if guest_memfd permits setting range [@start, @end) to PRIVATE. - * - * If memory is faulted in to host userspace and a request was made to set the - * memory to PRIVATE, the faulted in pages must not be pinned for the request to - * be permitted. 
- */ -static int kvm_gmem_should_set_attributes_private(struct kvm *kvm, gfn_t start, - gfn_t end) -{ - kvm_gmem_set_faultable_vm(kvm, start, end, false); - - if (kvm_gmem_no_mappings(kvm, start, end)) - return 0; - - kvm_gmem_set_faultable_vm(kvm, start, end, true); - return -EINVAL; -} - -/** - * Returns true if guest_memfd permits setting range [@start, @end) to SHARED. - * - * Because this allows pages to be faulted in to userspace, this must only be - * called after the pages have been invalidated from guest page tables. - */ -static int kvm_gmem_should_set_attributes_shared(struct kvm *kvm, gfn_t start, - gfn_t end) -{ - /* Always okay to set shared, hence set range faultable here. */ - kvm_gmem_set_faultable_vm(kvm, start, end, true); return 0; } @@ -1922,10 +2150,16 @@ static int kvm_gmem_should_set_attributes_shared(struct kvm *kvm, gfn_t start, int kvm_gmem_should_set_attributes(struct kvm *kvm, gfn_t start, gfn_t end, unsigned long attrs) { - if (attrs & KVM_MEMORY_ATTRIBUTE_PRIVATE) - return kvm_gmem_should_set_attributes_private(kvm, start, end); - else - return kvm_gmem_should_set_attributes_shared(kvm, start, end); + bool faultable; + int ret; + + faultable = !(attrs & KVM_MEMORY_ATTRIBUTE_PRIVATE); + + ret = kvm_gmem_try_set_faultable_vm(kvm, start, end, faultable); + if (ret) + WARN_ON(kvm_gmem_try_set_faultable_vm(kvm, start, end, !faultable)); + + return ret; } #endif