From patchwork Tue Aug 13 12:02:45 2024
From: Usama Arif <usamaarif642@gmail.com>
To: akpm@linux-foundation.org, linux-mm@kvack.org
Cc: hannes@cmpxchg.org, riel@surriel.com, shakeel.butt@linux.dev,
    roman.gushchin@linux.dev, yuzhao@google.com, david@redhat.com,
    baohua@kernel.org, ryan.roberts@arm.com, rppt@kernel.org,
    willy@infradead.org, cerasuolodomenico@gmail.com, corbet@lwn.net,
    linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org,
    kernel-team@meta.com, Shuang Zhai, Usama Arif
Subject: [PATCH v3 2/6] mm: remap unused subpages to shared zeropage when splitting isolated thp
Date: Tue, 13 Aug 2024 13:02:45 +0100
Message-ID: <20240813120328.1275952-3-usamaarif642@gmail.com>
In-Reply-To: <20240813120328.1275952-1-usamaarif642@gmail.com>
References: <20240813120328.1275952-1-usamaarif642@gmail.com>

From: Yu Zhao

Here, "unused" means that a subpage contains only zeros and is
inaccessible to userspace. When splitting an isolated thp under reclaim
or migration, such unused subpages can be mapped to the shared
zeropage, saving memory. This is particularly helpful when the internal
fragmentation of a thp is high, i.e. when it has many untouched
subpages.

This is also a prerequisite for the THP low-utilization shrinker
introduced in later patches of this series, which splits underutilized
THPs and frees their zero-filled subpages to reclaim memory.
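To make the target scenario concrete, here is a minimal userspace
sketch (illustration only, not part of this patch; it assumes x86-64
with 2M PMD-sized THPs and anonymous THP enabled) that produces a thp
with high internal fragmentation:

	/*
	 * Illustration only: build a thp where all but one subpage
	 * contain only zeros.
	 */
	#include <stdlib.h>
	#include <string.h>
	#include <sys/mman.h>

	#define THP_SIZE (2UL << 20)	/* assumes 2M PMD-sized THPs */

	int main(void)
	{
		/* 2M-aligned region, eligible to fault in as one thp */
		char *buf = aligned_alloc(THP_SIZE, THP_SIZE);

		if (!buf)
			return 1;
		madvise(buf, THP_SIZE, MADV_HUGEPAGE);
		memset(buf, 0, THP_SIZE);	/* fault in the whole range */
		buf[0] = 1;			/* only subpage 0 holds data */
		/*
		 * If this thp is later split while isolated (reclaim or
		 * migration), subpages 1..511 contain only zeros; with
		 * this patch their ptes are remapped to the shared
		 * zeropage instead of keeping 511 private zero-filled
		 * pages.
		 */
		return 0;
	}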
Signed-off-by: Yu Zhao
Tested-by: Shuang Zhai
Signed-off-by: Usama Arif
---
 include/linux/rmap.h |  7 ++++-
 mm/huge_memory.c     |  8 ++---
 mm/migrate.c         | 71 ++++++++++++++++++++++++++++++++++++++------
 mm/migrate_device.c  |  4 +--
 4 files changed, 74 insertions(+), 16 deletions(-)

diff --git a/include/linux/rmap.h b/include/linux/rmap.h
index 0978c64f49d8..07854d1f9ad6 100644
--- a/include/linux/rmap.h
+++ b/include/linux/rmap.h
@@ -745,7 +745,12 @@ int folio_mkclean(struct folio *);
 int pfn_mkclean_range(unsigned long pfn, unsigned long nr_pages,
 		      pgoff_t pgoff, struct vm_area_struct *vma);
 
-void remove_migration_ptes(struct folio *src, struct folio *dst, bool locked);
+enum rmp_flags {
+	RMP_LOCKED = 1 << 0,
+	RMP_USE_SHARED_ZEROPAGE = 1 << 1,
+};
+
+void remove_migration_ptes(struct folio *src, struct folio *dst, int flags);
 
 /*
  * rmap_walk_control: To control rmap traversing for specific needs
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 85a424e954be..6df0e9f4f56c 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -2911,7 +2911,7 @@ bool unmap_huge_pmd_locked(struct vm_area_struct *vma, unsigned long addr,
 	return false;
 }
 
-static void remap_page(struct folio *folio, unsigned long nr)
+static void remap_page(struct folio *folio, unsigned long nr, int flags)
 {
 	int i = 0;
 
@@ -2919,7 +2919,7 @@ static void remap_page(struct folio *folio, unsigned long nr)
 	if (!folio_test_anon(folio))
 		return;
 	for (;;) {
-		remove_migration_ptes(folio, folio, true);
+		remove_migration_ptes(folio, folio, RMP_LOCKED | flags);
 		i += folio_nr_pages(folio);
 		if (i >= nr)
 			break;
@@ -3129,7 +3129,7 @@ static void __split_huge_page(struct page *page, struct list_head *list,
 
 	if (nr_dropped)
 		shmem_uncharge(folio->mapping->host, nr_dropped);
-	remap_page(folio, nr);
+	remap_page(folio, nr, PageAnon(head) ? RMP_USE_SHARED_ZEROPAGE : 0);
 
 	/*
 	 * set page to its compound_head when split to non order-0 pages, so
@@ -3424,7 +3424,7 @@ int split_huge_page_to_list_to_order(struct page *page, struct list_head *list,
 		if (mapping)
 			xas_unlock(&xas);
 		local_irq_enable();
-		remap_page(folio, folio_nr_pages(folio));
+		remap_page(folio, folio_nr_pages(folio), 0);
 		ret = -EAGAIN;
 	}
 
diff --git a/mm/migrate.c b/mm/migrate.c
index 66a5f73ebfdf..3288ac041d03 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -178,13 +178,56 @@ void putback_movable_pages(struct list_head *l)
 	}
 }
 
+static bool try_to_map_unused_to_zeropage(struct page_vma_mapped_walk *pvmw,
+					  struct folio *folio,
+					  unsigned long idx)
+{
+	struct page *page = folio_page(folio, idx);
+	bool contains_data;
+	pte_t newpte;
+	void *addr;
+
+	VM_BUG_ON_PAGE(PageCompound(page), page);
+	VM_BUG_ON_PAGE(!PageAnon(page), page);
+	VM_BUG_ON_PAGE(!PageLocked(page), page);
+	VM_BUG_ON_PAGE(pte_present(*pvmw->pte), page);
+
+	if (PageMlocked(page) || (pvmw->vma->vm_flags & VM_LOCKED))
+		return false;
+
+	/*
+	 * The pmd entry mapping the old thp was flushed and the pte mapping
+	 * this subpage has been non present. If the subpage is only zero-filled
+	 * then map it to the shared zeropage.
+	 */
+	addr = kmap_local_page(page);
+	contains_data = memchr_inv(addr, 0, PAGE_SIZE);
+	kunmap_local(addr);
+
+	if (contains_data || mm_forbids_zeropage(pvmw->vma->vm_mm))
+		return false;
+
+	newpte = pte_mkspecial(pfn_pte(my_zero_pfn(pvmw->address),
+					pvmw->vma->vm_page_prot));
+	set_pte_at(pvmw->vma->vm_mm, pvmw->address, pvmw->pte, newpte);
+
+	dec_mm_counter(pvmw->vma->vm_mm, mm_counter(folio));
+	return true;
+}
+
+struct rmap_walk_arg {
+	struct folio *folio;
+	bool map_unused_to_zeropage;
+};
+
 /*
  * Restore a potential migration pte to a working pte entry
  */
 static bool remove_migration_pte(struct folio *folio,
-		struct vm_area_struct *vma, unsigned long addr, void *old)
+		struct vm_area_struct *vma, unsigned long addr, void *arg)
 {
-	DEFINE_FOLIO_VMA_WALK(pvmw, old, vma, addr, PVMW_SYNC | PVMW_MIGRATION);
+	struct rmap_walk_arg *rmap_walk_arg = arg;
+	DEFINE_FOLIO_VMA_WALK(pvmw, rmap_walk_arg->folio, vma, addr, PVMW_SYNC | PVMW_MIGRATION);
 
 	while (page_vma_mapped_walk(&pvmw)) {
 		rmap_t rmap_flags = RMAP_NONE;
@@ -208,6 +251,9 @@ static bool remove_migration_pte(struct folio *folio,
 			continue;
 		}
 #endif
+		if (rmap_walk_arg->map_unused_to_zeropage &&
+		    try_to_map_unused_to_zeropage(&pvmw, folio, idx))
+			continue;
 
 		folio_get(folio);
 		pte = mk_pte(new, READ_ONCE(vma->vm_page_prot));
@@ -286,14 +332,21 @@ static bool remove_migration_pte(struct folio *folio,
  * Get rid of all migration entries and replace them by
  * references to the indicated page.
  */
-void remove_migration_ptes(struct folio *src, struct folio *dst, bool locked)
+void remove_migration_ptes(struct folio *src, struct folio *dst, int flags)
 {
+	struct rmap_walk_arg rmap_walk_arg = {
+		.folio = src,
+		.map_unused_to_zeropage = flags & RMP_USE_SHARED_ZEROPAGE,
+	};
+
 	struct rmap_walk_control rwc = {
 		.rmap_one = remove_migration_pte,
-		.arg = src,
+		.arg = &rmap_walk_arg,
 	};
 
-	if (locked)
+	VM_BUG_ON_FOLIO((flags & RMP_USE_SHARED_ZEROPAGE) && (src != dst), src);
+
+	if (flags & RMP_LOCKED)
 		rmap_walk_locked(dst, &rwc);
 	else
 		rmap_walk(dst, &rwc);
@@ -903,7 +956,7 @@ static int writeout(struct address_space *mapping, struct folio *folio)
 	 * At this point we know that the migration attempt cannot
 	 * be successful.
	 */
-	remove_migration_ptes(folio, folio, false);
+	remove_migration_ptes(folio, folio, 0);
 
 	rc = mapping->a_ops->writepage(&folio->page, &wbc);
 
@@ -1067,7 +1120,7 @@ static void migrate_folio_undo_src(struct folio *src,
 				   struct list_head *ret)
 {
 	if (page_was_mapped)
-		remove_migration_ptes(src, src, false);
+		remove_migration_ptes(src, src, 0);
 	/* Drop an anon_vma reference if we took one */
 	if (anon_vma)
 		put_anon_vma(anon_vma);
@@ -1305,7 +1358,7 @@ static int migrate_folio_move(free_folio_t put_new_folio, unsigned long private,
 		lru_add_drain();
 
 	if (old_page_state & PAGE_WAS_MAPPED)
-		remove_migration_ptes(src, dst, false);
+		remove_migration_ptes(src, dst, 0);
 
 out_unlock_both:
 	folio_unlock(dst);
@@ -1443,7 +1496,7 @@ static int unmap_and_move_huge_page(new_folio_t get_new_folio,
 
 	if (page_was_mapped)
 		remove_migration_ptes(src,
-			rc == MIGRATEPAGE_SUCCESS ? dst : src, false);
+			rc == MIGRATEPAGE_SUCCESS ? dst : src, 0);
 
 unlock_put_anon:
 	folio_unlock(dst);
diff --git a/mm/migrate_device.c b/mm/migrate_device.c
index 6d66dc1c6ffa..8f875636b35b 100644
--- a/mm/migrate_device.c
+++ b/mm/migrate_device.c
@@ -424,7 +424,7 @@ static unsigned long migrate_device_unmap(unsigned long *src_pfns,
 			continue;
 
 		folio = page_folio(page);
-		remove_migration_ptes(folio, folio, false);
+		remove_migration_ptes(folio, folio, 0);
 
 		src_pfns[i] = 0;
 		folio_unlock(folio);
@@ -837,7 +837,7 @@ void migrate_device_finalize(unsigned long *src_pfns,
 
 		src = page_folio(page);
 		dst = page_folio(newpage);
-		remove_migration_ptes(src, dst, false);
+		remove_migration_ptes(src, dst, 0);
 
 		folio_unlock(src);
 		if (is_zone_device_page(page))
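
For reference, the whole "is this subpage unused?" test above reduces
to memchr_inv(addr, 0, PAGE_SIZE). A userspace analogue of that check
(sketch only; page_is_zero_filled() is a hypothetical helper name, and
memchr_inv() itself has no libc equivalent):

	#include <stdbool.h>
	#include <string.h>

	#define PAGE_SIZE 4096UL	/* assumed 4K pages; real code would
					   use sysconf(_SC_PAGESIZE) */

	/*
	 * Analogue of the memchr_inv() check in
	 * try_to_map_unused_to_zeropage(): memcmp() of the buffer against
	 * itself shifted by one byte proves every byte equals the first
	 * byte, so together with p[0] == 0 it proves the page is all zeros.
	 */
	static bool page_is_zero_filled(const unsigned char *p)
	{
		return p[0] == 0 && !memcmp(p, p + 1, PAGE_SIZE - 1);
	}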