From patchwork Thu Aug 10 14:29:38 2023
X-Patchwork-Submitter: Ryan Roberts
X-Patchwork-Id: 13349526
From: Ryan Roberts <ryan.roberts@arm.com>
To: Andrew Morton, Matthew Wilcox, Yin Fengwei, David Hildenbrand,
    Yu Zhao, Catalin Marinas, Anshuman Khandual, Yang Shi,
    "Huang, Ying", Zi Yan, Luis Chamberlain, Itaru Kitayama,
    "Kirill A. Shutemov"
Cc: Ryan Roberts, linux-mm@kvack.org, linux-kernel@vger.kernel.org,
    linux-arm-kernel@lists.infradead.org
Subject: [PATCH v5 1/5] mm: Allow deferred splitting of arbitrary large anon folios
Date: Thu, 10 Aug 2023 15:29:38 +0100
Message-Id: <20230810142942.3169679-2-ryan.roberts@arm.com>
In-Reply-To: <20230810142942.3169679-1-ryan.roberts@arm.com>
References: <20230810142942.3169679-1-ryan.roberts@arm.com>

In preparation for the introduction of large folios for anonymous
memory, we would like to be able to split them when they have unmapped
subpages, in order to free those unused pages under memory pressure. So
remove the artificial requirement that the large folio needed to be at
least PMD-sized.

Reviewed-by: Yu Zhao
Reviewed-by: Yin Fengwei
Reviewed-by: Matthew Wilcox (Oracle)
Reviewed-by: David Hildenbrand
Signed-off-by: Ryan Roberts
---
 mm/rmap.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/mm/rmap.c b/mm/rmap.c
index 1f04debdc87a..769fcabc6c56 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -1446,11 +1446,11 @@ void page_remove_rmap(struct page *page, struct vm_area_struct *vma,
 		__lruvec_stat_mod_folio(folio, idx, -nr);
 
 		/*
-		 * Queue anon THP for deferred split if at least one
+		 * Queue anon large folio for deferred split if at least one
 		 * page of the folio is unmapped and at least one page
 		 * is still mapped.
 		 */
-		if (folio_test_pmd_mappable(folio) && folio_test_anon(folio))
+		if (folio_test_large(folio) && folio_test_anon(folio))
 			if (!compound || nr < nr_pmdmapped)
 				deferred_split_folio(folio);
 	}
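
For context, the state this patch cares about - a large folio with some
subpages unmapped while others remain mapped - can be produced from user
space with a minimal sketch like the one below (illustration only, not
part of the patch; the 2 MiB size and the MADV_HUGEPAGE opt-in are
assumptions for a typical configuration):

	/* laf_partial_unmap.c: leave a large anon folio partially mapped.
	 * Build: gcc -O2 laf_partial_unmap.c -o laf_partial_unmap
	 */
	#include <stdio.h>
	#include <string.h>
	#include <sys/mman.h>

	int main(void)
	{
		size_t sz = 2 * 1024 * 1024;	/* assume 2M PMD size */
		char *mem = mmap(NULL, sz, PROT_READ | PROT_WRITE,
				 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
		if (mem == MAP_FAILED) { perror("mmap"); return 1; }

		/* Request large/huge folio backing and fault it all in. */
		(void)madvise(mem, sz, MADV_HUGEPAGE);
		memset(mem, 1, sz);

		/* Unmap the second half: any large folio straddling the cut
		 * is now partially mapped, so it is a candidate for the
		 * deferred split queue. */
		if (munmap(mem + sz / 2, sz / 2)) { perror("munmap"); return 1; }

		getchar();	/* park so the state can be inspected via smaps */
		return 0;
	}

Under memory pressure the deferred-split shrinker can then split the
folio and free the unmapped tail pages - which, before this patch, only
happened for PMD-sized folios.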
"Kirill A. Shutemov" Cc: Ryan Roberts , linux-mm@kvack.org, linux-kernel@vger.kernel.org, linux-arm-kernel@lists.infradead.org Subject: [PATCH v5 2/5] mm: Non-pmd-mappable, large folios for folio_add_new_anon_rmap() Date: Thu, 10 Aug 2023 15:29:39 +0100 Message-Id: <20230810142942.3169679-3-ryan.roberts@arm.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20230810142942.3169679-1-ryan.roberts@arm.com> References: <20230810142942.3169679-1-ryan.roberts@arm.com> MIME-Version: 1.0 X-Rspamd-Queue-Id: 1328CC000D X-Rspam-User: X-Rspamd-Server: rspam02 X-Stat-Signature: r31o8976hz4u8oicbjyrro48pwfjoico X-HE-Tag: 1691677798-238511 X-HE-Meta: U2FsdGVkX18f+Ak3tSAa/sO9BsD7T7QQaVgkerfIiHXBvOGtBVAi/hLf5rRiOh431oi82sYS5U4jqGZ9oKGZg+pvmjsO3yBYtJJ2gDYW2v4BHNOmKVUB27MdBCgI3WhG7L7XCx5D6Y5wzQzevXA/zifBP1wxHZ0yxbDBvn5/po96/lDQZQfi1qjVWOh+Is3PmP9cwE57ecYrjjymc6yn1sf1RKBiAOSI80ZW/H3w6FA1TLIxcZBuQxDYLp3ldoHAvbn+tkdk4j8hCGITWzmM0z8gECqL3NIOT0F7s9zLu21+5s4oR81btyTJVj65fq650b5Yky+PDBuKTvLUDjcQvRB2NR5b8Hr8HqkipJL1HEHmK6/UUk1gprN2nNKGxGF+uWDvyzAAck21amCXfcv7wkKk4K2u26FrtcueUGELz7uO2+QIzjmzHMIm8TJMyjacvd8PbNDdiCyBym9aLwJylTpuNfh/88+yCB1cNT6nvbyYL/QBVN8KXgUimipT93u5q3l2cCZyl+8Rp+cj9k5ANgTVxNsLd/est+9U3lkg88yDnfipfUyybOmQjgaDzgeVIqkdIBYIIlKoBfahlUPesu/xrzlxl5LqeAONpabzDCu9Vih6ET7w7SgVN5NdEAVsrOWCd0VW2GUKjVay75zQe2mGBFQh/Ldo1G0Jzd9VdTdnqaTdBZehLmUIWlZU4dCCS5i8l6FMb9OHRlq444FS6aOnGuP8QWOO83SsG65UjdnISS0Giwcd376ey0StV6HYq1oKMIKgx08ejZCLUl3kfp6txd/e3mvxD7kqBSr/a8YHGgh05YhVBfed847b4JUPM7r8pMSdgHFZSHWvUcg4wgYGbj7WKKO0hQLSfifAXIqfpJv1Iz7W8coekHqzY5QoLLhlaHTaqmxUaY6b1d1NmM6s9ORIJYEVvphLmV63AbEiMecaT1bGnaFYsFTZWMDH9QuZwCnXXpcuOrFLbfA DJibGIln K7fcdOVfeRFwnyXO3XpaVmBBuRjGHgWgVC0QIZnDG38lJVylmQe66bddvh7E5yZwtWKqAhhPukW2I4s2VLknFUAoDVubjRCRBXTWgO/b1VZ6TiNMIV0bjeINYRKrfNbEpe3bvwsCIDowql5+0s6SgP/nHu7lCnFoUUVK+ZyGRDk2cUgxKcLB9qsOXe93z4pggQt2JNf7V/WfYHlkB2mHTl4Rtr1i9daklfl4D3Vvp/jqXk+NvQnjcYk/9TZyKmHMw8m4KH1NYeYNUI8rgxU0NbrQgeIwpG0J6kaDayFki1HS3ywhZgAJ4qBPKJioqFpMWvqlm X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: In preparation for LARGE_ANON_FOLIO support, improve folio_add_new_anon_rmap() to allow a non-pmd-mappable, large folio to be passed to it. In this case, all contained pages are accounted using the order-0 folio (or base page) scheme. Reviewed-by: Yu Zhao Reviewed-by: Yin Fengwei Signed-off-by: Ryan Roberts --- mm/rmap.c | 27 ++++++++++++++++++++------- 1 file changed, 20 insertions(+), 7 deletions(-) diff --git a/mm/rmap.c b/mm/rmap.c index 769fcabc6c56..d1ff92b4bf6b 100644 --- a/mm/rmap.c +++ b/mm/rmap.c @@ -1266,31 +1266,44 @@ void page_add_anon_rmap(struct page *page, struct vm_area_struct *vma, * This means the inc-and-test can be bypassed. * The folio does not have to be locked. * - * If the folio is large, it is accounted as a THP. As the folio + * If the folio is pmd-mappable, it is accounted as a THP. As the folio * is new, it's assumed to be mapped exclusively by a single process. 
  */
 void folio_add_new_anon_rmap(struct folio *folio, struct vm_area_struct *vma,
 		unsigned long address)
 {
-	int nr;
+	int nr = folio_nr_pages(folio);
 
-	VM_BUG_ON_VMA(address < vma->vm_start || address >= vma->vm_end, vma);
+	VM_BUG_ON_VMA(address < vma->vm_start ||
+			address + (nr << PAGE_SHIFT) > vma->vm_end, vma);
 	__folio_set_swapbacked(folio);
 
-	if (likely(!folio_test_pmd_mappable(folio))) {
+	if (likely(!folio_test_large(folio))) {
 		/* increment count (starts at -1) */
 		atomic_set(&folio->_mapcount, 0);
-		nr = 1;
+		__page_set_anon_rmap(folio, &folio->page, vma, address, 1);
+	} else if (!folio_test_pmd_mappable(folio)) {
+		int i;
+
+		for (i = 0; i < nr; i++) {
+			struct page *page = folio_page(folio, i);
+
+			/* increment count (starts at -1) */
+			atomic_set(&page->_mapcount, 0);
+			__page_set_anon_rmap(folio, page, vma,
+					address + (i << PAGE_SHIFT), 1);
+		}
+
+		atomic_set(&folio->_nr_pages_mapped, nr);
 	} else {
 		/* increment count (starts at -1) */
 		atomic_set(&folio->_entire_mapcount, 0);
 		atomic_set(&folio->_nr_pages_mapped, COMPOUND_MAPPED);
-		nr = folio_nr_pages(folio);
+		__page_set_anon_rmap(folio, &folio->page, vma, address, 1);
 		__lruvec_stat_mod_folio(folio, NR_ANON_THPS, nr);
 	}
 
 	__lruvec_stat_mod_folio(folio, NR_ANON_MAPPED, nr);
-	__page_set_anon_rmap(folio, &folio->page, vma, address, 1);
 }
 
 /**
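
The new intermediate branch walks each subpage with folio_page(folio, i)
and derives its virtual address as address + (i << PAGE_SHIFT). A
trivial standalone sketch of that address math and of the widened VMA
bounds check (hypothetical numbers; PAGE_SHIFT of 12 assumed):

	#include <stdio.h>

	#define PAGE_SHIFT 12	/* assume 4K pages */

	int main(void)
	{
		unsigned long address = 0x7f0000400000UL; /* folio-aligned fault address */
		unsigned long vm_end  = 0x7f0000403000UL; /* VMA ends after 3 pages */
		int nr = 4;                               /* order-2 folio: 4 subpages */

		/* Mirrors the reworked VM_BUG_ON_VMA(): the whole folio, not
		 * just the first page, must sit inside the VMA. */
		if (address + ((unsigned long)nr << PAGE_SHIFT) > vm_end)
			printf("folio would overrun the vma: rejected\n");

		for (int i = 0; i < nr; i++)
			printf("subpage %d rmap address: %#lx\n",
			       i, address + ((unsigned long)i << PAGE_SHIFT));
		return 0;
	}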
From patchwork Thu Aug 10 14:29:40 2023
X-Patchwork-Submitter: Ryan Roberts
X-Patchwork-Id: 13349528
From: Ryan Roberts <ryan.roberts@arm.com>
To: Andrew Morton, Matthew Wilcox, Yin Fengwei, David Hildenbrand,
    Yu Zhao, Catalin Marinas, Anshuman Khandual, Yang Shi,
    "Huang, Ying", Zi Yan, Luis Chamberlain, Itaru Kitayama,
    "Kirill A. Shutemov"
Cc: Ryan Roberts, linux-mm@kvack.org, linux-kernel@vger.kernel.org,
    linux-arm-kernel@lists.infradead.org
Subject: [PATCH v5 3/5] mm: LARGE_ANON_FOLIO for improved performance
Date: Thu, 10 Aug 2023 15:29:40 +0100
Message-Id: <20230810142942.3169679-4-ryan.roberts@arm.com>
In-Reply-To: <20230810142942.3169679-1-ryan.roberts@arm.com>
References: <20230810142942.3169679-1-ryan.roberts@arm.com>

Introduce the LARGE_ANON_FOLIO feature, which allows anonymous memory
to be allocated in large folios of a determined order. All pages of the
large folio are pte-mapped during the same page fault, significantly
reducing the number of page faults. The number of per-page operations
(e.g. ref counting, rmap management, lru list management) is also
significantly reduced since those ops now become per-folio.

The new behaviour is hidden behind the new LARGE_ANON_FOLIO Kconfig,
which defaults to disabled for now; the long term aim is for this to
default to enabled, but there are some risks around internal
fragmentation that need to be better understood first.

Large anonymous folio (LAF) allocation is integrated with the existing
(PMD-order) THP and single (S) page allocation according to this
policy, where fallback (>) is performed for various reasons, such as
the proposed folio order not fitting within the bounds of the VMA:

                | prctl=dis | prctl=ena   | prctl=ena     | prctl=ena
                | sysfs=X   | sysfs=never | sysfs=madvise | sysfs=always
----------------|-----------|-------------|---------------|-------------
no hint         |     S     |    LAF>S    |     LAF>S     |  THP>LAF>S
MADV_HUGEPAGE   |     S     |    LAF>S    |   THP>LAF>S   |  THP>LAF>S
MADV_NOHUGEPAGE |     S     |      S      |       S       |      S

This approach ensures that we don't violate existing hints to only
allocate single pages - this is required for QEMU's VM live migration
implementation to work correctly - while allowing us to use LAF
independently of THP (when sysfs=never). This makes wide scale
performance characterization simpler, while avoiding exposing any new
ABI to user space.
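
All of the controls named in the table rows and columns are
pre-existing ABI; nothing new is exposed. As a hedged sketch (not part
of the patch), this is how a process would land in the various cells:

	#include <stdio.h>
	#include <sys/mman.h>
	#include <sys/prctl.h>

	#ifndef PR_SET_THP_DISABLE
	#define PR_SET_THP_DISABLE 41	/* from linux/prctl.h */
	#endif

	int main(void)
	{
		size_t len = 2 * 1024 * 1024;
		char *mem = mmap(NULL, len, PROT_READ | PROT_WRITE,
				 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
		if (mem == MAP_FAILED) { perror("mmap"); return 1; }

		/* Table rows: per-VMA hints. MADV_NOHUGEPAGE forces the all-"S"
		 * row regardless of the other knobs; MADV_HUGEPAGE opts in. */
		(void)madvise(mem, len, MADV_HUGEPAGE);	    /* row: MADV_HUGEPAGE */
		/* madvise(mem, len, MADV_NOHUGEPAGE);         row: MADV_NOHUGEPAGE */

		/* Table columns: the per-process prctl, and the global sysfs
		 * setting in /sys/kernel/mm/transparent_hugepage/enabled
		 * (never/madvise/always). */
		if (prctl(PR_SET_THP_DISABLE, 1, 0, 0, 0))  /* column: prctl=dis */
			perror("prctl");

		mem[0] = 1;	/* with prctl=dis, this fault takes the "S" path */
		return 0;
	}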
When using LAF for allocation, the folio order is determined as
follows: the return value of arch_wants_pte_order() is used. For vmas
that have not explicitly opted-in to use transparent hugepages (e.g.
where sysfs=madvise and the vma does not have MADV_HUGEPAGE, or
sysfs=never), then arch_wants_pte_order() is limited to 64K (or
PAGE_SIZE, whichever is bigger). This allows for a performance boost
without requiring any explicit opt-in from the workload while limiting
internal fragmentation.

If the preferred order can't be used (e.g. because the folio would
breach the bounds of the vma, or because ptes in the region are already
mapped) then we fall back to a suitable lower order; first
PAGE_ALLOC_COSTLY_ORDER, then order-0.

arch_wants_pte_order() can be overridden by the architecture if
desired. Some architectures (e.g. arm64) can coalesce TLB entries if a
contiguous set of ptes map physically contiguous, naturally aligned
memory, so this mechanism allows the architecture to optimize as
required. Here we add the default implementation of
arch_wants_pte_order(), used when the architecture does not define it,
which returns -1, implying that the HW has no preference. In this case,
mm will choose its own default order.
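
As an illustration only (no such override is added by this series), an
architecture whose TLB can coalesce 16 contiguous, naturally aligned
ptes might override the hook like this; the hook name and contract come
from the patch below, everything else is hypothetical:

	/* arch/foo/include/asm/pgtable.h (hypothetical) */
	#define arch_wants_pte_order arch_wants_pte_order
	static inline int arch_wants_pte_order(void)
	{
		/* HW folds 16 contiguous ptes into one TLB entry, so prefer
		 * order-4 folios (16 * 4K = 64K). */
		return 4;
	}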
Signed-off-by: Ryan Roberts
---
 include/linux/pgtable.h |  13 ++++
 mm/Kconfig              |  10 +++
 mm/memory.c             | 144 +++++++++++++++++++++++++++++++++++++---
 3 files changed, 158 insertions(+), 9 deletions(-)

diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h
index 222a33b9600d..4b488cc66ddc 100644
--- a/include/linux/pgtable.h
+++ b/include/linux/pgtable.h
@@ -369,6 +369,19 @@ static inline bool arch_has_hw_pte_young(void)
 }
 #endif
 
+#ifndef arch_wants_pte_order
+/*
+ * Returns preferred folio order for pte-mapped memory. Must be in range [0,
+ * PMD_SHIFT-PAGE_SHIFT) and must not be order-1 since THP requires large folios
+ * to be at least order-2. Negative value implies that the HW has no preference
+ * and mm will choose its own default order.
+ */
+static inline int arch_wants_pte_order(void)
+{
+	return -1;
+}
+#endif
+
 #ifndef __HAVE_ARCH_PTEP_GET_AND_CLEAR
 static inline pte_t ptep_get_and_clear(struct mm_struct *mm,
 				       unsigned long address,
diff --git a/mm/Kconfig b/mm/Kconfig
index 721dc88423c7..a1e28b8ddc24 100644
--- a/mm/Kconfig
+++ b/mm/Kconfig
@@ -1243,4 +1243,14 @@ config LOCK_MM_AND_FIND_VMA
 
 source "mm/damon/Kconfig"
 
+config LARGE_ANON_FOLIO
+	bool "Allocate large folios for anonymous memory"
+	depends on TRANSPARENT_HUGEPAGE
+	default n
+	help
+	  Use large (bigger than order-0) folios to back anonymous memory where
+	  possible, even for pte-mapped memory. This reduces the number of page
+	  faults, as well as other per-page overheads to improve performance for
+	  many workloads.
+
 endmenu
diff --git a/mm/memory.c b/mm/memory.c
index d003076b218d..bbc7d4ce84f7 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -4073,6 +4073,123 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
 	return ret;
 }
 
+static bool vmf_pte_range_changed(struct vm_fault *vmf, int nr_pages)
+{
+	int i;
+
+	if (nr_pages == 1)
+		return vmf_pte_changed(vmf);
+
+	for (i = 0; i < nr_pages; i++) {
+		if (!pte_none(ptep_get_lockless(vmf->pte + i)))
+			return true;
+	}
+
+	return false;
+}
+
+#ifdef CONFIG_LARGE_ANON_FOLIO
+#define ANON_FOLIO_MAX_ORDER_UNHINTED \
+		(ilog2(max_t(unsigned long, SZ_64K, PAGE_SIZE)) - PAGE_SHIFT)
+
+static int anon_folio_order(struct vm_area_struct *vma)
+{
+	int order;
+
+	/*
+	 * If the vma is eligible for thp, allocate a large folio of the size
+	 * preferred by the arch. Or if the arch requested a very small size or
+	 * didn't request a size, then use PAGE_ALLOC_COSTLY_ORDER, which still
+	 * meets the arch's requirements but means we still take advantage of SW
+	 * optimizations (e.g. fewer page faults).
+	 *
+	 * If the vma isn't eligible for thp, take the arch-preferred size and
+	 * limit it to ANON_FOLIO_MAX_ORDER_UNHINTED. This ensures workloads
+	 * that have not explicitly opted-in take benefit while capping the
+	 * potential for internal fragmentation.
+	 */
+
+	order = max(arch_wants_pte_order(), PAGE_ALLOC_COSTLY_ORDER);
+
+	if (!hugepage_vma_check(vma, vma->vm_flags, false, true, true))
+		order = min(order, ANON_FOLIO_MAX_ORDER_UNHINTED);
+
+	return order;
+}
+
+static struct folio *alloc_anon_folio(struct vm_fault *vmf)
+{
+	int i;
+	gfp_t gfp;
+	pte_t *pte;
+	unsigned long addr;
+	struct folio *folio;
+	struct vm_area_struct *vma = vmf->vma;
+	int prefer = anon_folio_order(vma);
+	int orders[] = {
+		prefer,
+		prefer > PAGE_ALLOC_COSTLY_ORDER ? PAGE_ALLOC_COSTLY_ORDER : 0,
+		0,
+	};
+
+	/*
+	 * If uffd is active for the vma we need per-page fault fidelity to
+	 * maintain the uffd semantics.
+	 */
+	if (userfaultfd_armed(vma))
+		goto fallback;
+
+	/*
+	 * If hugepages are explicitly disabled for the vma (either
+	 * MADV_NOHUGEPAGE or prctl) fallback to order-0. Failure to do this
+	 * breaks correctness for user space. We ignore the sysfs global knob.
+	 */
+	if (!hugepage_vma_check(vma, vma->vm_flags, false, true, false))
+		goto fallback;
+
+	for (i = 0; orders[i]; i++) {
+		addr = ALIGN_DOWN(vmf->address, PAGE_SIZE << orders[i]);
+		if (addr >= vma->vm_start &&
+		    addr + (PAGE_SIZE << orders[i]) <= vma->vm_end)
+			break;
+	}
+
+	if (!orders[i])
+		goto fallback;
+
+	pte = pte_offset_map(vmf->pmd, vmf->address & PMD_MASK);
+	if (!pte)
+		return ERR_PTR(-EAGAIN);
+
+	for (; orders[i]; i++) {
+		addr = ALIGN_DOWN(vmf->address, PAGE_SIZE << orders[i]);
+		vmf->pte = pte + pte_index(addr);
+		if (!vmf_pte_range_changed(vmf, 1 << orders[i]))
+			break;
+	}
+
+	vmf->pte = NULL;
+	pte_unmap(pte);
+
+	gfp = vma_thp_gfp_mask(vma);
+
+	for (; orders[i]; i++) {
+		addr = ALIGN_DOWN(vmf->address, PAGE_SIZE << orders[i]);
+		folio = vma_alloc_folio(gfp, orders[i], vma, addr, true);
+		if (folio) {
+			clear_huge_page(&folio->page, addr, 1 << orders[i]);
+			return folio;
+		}
+	}
+
+fallback:
+	return vma_alloc_zeroed_movable_folio(vma, vmf->address);
+}
+#else
+#define alloc_anon_folio(vmf) \
+		vma_alloc_zeroed_movable_folio((vmf)->vma, (vmf)->address)
+#endif
+
 /*
  * We enter with non-exclusive mmap_lock (to exclude vma changes,
  * but allow concurrent faults), and pte mapped but not yet locked.
@@ -4080,6 +4197,9 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
  */
 static vm_fault_t do_anonymous_page(struct vm_fault *vmf)
 {
+	int i;
+	int nr_pages = 1;
+	unsigned long addr = vmf->address;
 	bool uffd_wp = vmf_orig_pte_uffd_wp(vmf);
 	struct vm_area_struct *vma = vmf->vma;
 	struct folio *folio;
@@ -4124,10 +4244,15 @@ static vm_fault_t do_anonymous_page(struct vm_fault *vmf)
 	/* Allocate our own private page. */
 	if (unlikely(anon_vma_prepare(vma)))
 		goto oom;
-	folio = vma_alloc_zeroed_movable_folio(vma, vmf->address);
+	folio = alloc_anon_folio(vmf);
+	if (IS_ERR(folio))
+		return 0;
 	if (!folio)
 		goto oom;
 
+	nr_pages = folio_nr_pages(folio);
+	addr = ALIGN_DOWN(vmf->address, nr_pages * PAGE_SIZE);
+
 	if (mem_cgroup_charge(folio, vma->vm_mm, GFP_KERNEL))
 		goto oom_free_page;
 	folio_throttle_swaprate(folio, GFP_KERNEL);
@@ -4144,12 +4269,12 @@ static vm_fault_t do_anonymous_page(struct vm_fault *vmf)
 	if (vma->vm_flags & VM_WRITE)
 		entry = pte_mkwrite(pte_mkdirty(entry));
 
-	vmf->pte = pte_offset_map_lock(vma->vm_mm, vmf->pmd, vmf->address,
-			&vmf->ptl);
+	vmf->pte = pte_offset_map_lock(vma->vm_mm, vmf->pmd, addr, &vmf->ptl);
 	if (!vmf->pte)
 		goto release;
-	if (vmf_pte_changed(vmf)) {
-		update_mmu_tlb(vma, vmf->address, vmf->pte);
+	if (vmf_pte_range_changed(vmf, nr_pages)) {
+		for (i = 0; i < nr_pages; i++)
+			update_mmu_tlb(vma, addr + PAGE_SIZE * i, vmf->pte + i);
 		goto release;
 	}
 
@@ -4164,16 +4289,17 @@ static vm_fault_t do_anonymous_page(struct vm_fault *vmf)
 		return handle_userfault(vmf, VM_UFFD_MISSING);
 	}
 
-	inc_mm_counter(vma->vm_mm, MM_ANONPAGES);
-	folio_add_new_anon_rmap(folio, vma, vmf->address);
+	folio_ref_add(folio, nr_pages - 1);
+	add_mm_counter(vma->vm_mm, MM_ANONPAGES, nr_pages);
+	folio_add_new_anon_rmap(folio, vma, addr);
 	folio_add_lru_vma(folio, vma);
 setpte:
 	if (uffd_wp)
 		entry = pte_mkuffd_wp(entry);
-	set_pte_at(vma->vm_mm, vmf->address, vmf->pte, entry);
+	set_ptes(vma->vm_mm, addr, vmf->pte, entry, nr_pages);
 
 	/* No need to invalidate - it was non-present before */
-	update_mmu_cache_range(vmf, vma, vmf->address, vmf->pte, 1);
+	update_mmu_cache_range(vmf, vma, addr, vmf->pte, nr_pages);
 unlock:
 	if (vmf->pte)
 		pte_unmap_unlock(vmf->pte, vmf->ptl);
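
To make the order fallback concrete, here is a standalone userspace
model of the bounds check that alloc_anon_folio() applies to each
candidate order (a sketch with assumed values: 4K pages, preferred
order 4; pick_order() and the example addresses are hypothetical):

	#include <stdio.h>

	#define PAGE_SHIFT 12
	#define PAGE_SIZE  (1UL << PAGE_SHIFT)
	#define PAGE_ALLOC_COSTLY_ORDER 3	/* same value as the kernel's */
	#define ALIGN_DOWN(x, a) ((x) & ~((a) - 1))

	/* A candidate order is usable only if the naturally aligned folio
	 * containing the fault address fits entirely inside the VMA. */
	static int pick_order(unsigned long vm_start, unsigned long vm_end,
			      unsigned long fault_addr, int prefer)
	{
		int orders[] = { prefer,
				 prefer > PAGE_ALLOC_COSTLY_ORDER ?
					PAGE_ALLOC_COSTLY_ORDER : 0,
				 0 };

		for (int i = 0; ; i++) {
			unsigned long size = PAGE_SIZE << orders[i];
			unsigned long addr = ALIGN_DOWN(fault_addr, size);

			if (addr >= vm_start && addr + size <= vm_end)
				return orders[i];
			if (!orders[i])
				return 0;	/* order-0 always terminates */
		}
	}

	int main(void)
	{
		/* VMA starts on a 32K (order-3) but not a 64K (order-4)
		 * boundary, so a fault at vm_start cannot fit an order-4
		 * folio and falls back to order-3. */
		unsigned long start = 0x7f0000008000UL;
		unsigned long end = start + 24 * PAGE_SIZE;

		printf("chosen order: %d\n", pick_order(start, end, start, 4));
		return 0;
	}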
From patchwork Thu Aug 10 14:29:41 2023
X-Patchwork-Submitter: Ryan Roberts
X-Patchwork-Id: 13349529
From: Ryan Roberts <ryan.roberts@arm.com>
To: Andrew Morton, Matthew Wilcox, Yin Fengwei, David Hildenbrand,
    Yu Zhao, Catalin Marinas, Anshuman Khandual, Yang Shi,
    "Huang, Ying", Zi Yan, Luis Chamberlain, Itaru Kitayama,
    "Kirill A. Shutemov"
Shutemov" Cc: Ryan Roberts , linux-mm@kvack.org, linux-kernel@vger.kernel.org, linux-arm-kernel@lists.infradead.org Subject: [PATCH v5 4/5] selftests/mm/cow: Generalize do_run_with_thp() helper Date: Thu, 10 Aug 2023 15:29:41 +0100 Message-Id: <20230810142942.3169679-5-ryan.roberts@arm.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20230810142942.3169679-1-ryan.roberts@arm.com> References: <20230810142942.3169679-1-ryan.roberts@arm.com> MIME-Version: 1.0 X-Rspamd-Queue-Id: 2FD7B80012 X-Rspam-User: X-Rspamd-Server: rspam04 X-Stat-Signature: siqgr7ysxzjt6hgoykomo8hyx9jmhums X-HE-Tag: 1691677803-748231 X-HE-Meta: U2FsdGVkX1/Vqp6OdBjBMlyX/txMJbCDDsI73ApvWcxhq0fVfflhczR6MIoo3YwtEj5FAjG1syFlZrsbNMqjdDVZ5AS9mQRGYbrZYEVav7zKKuTOm3CZaCF8Suvdf9pfkUX37zd6tNNpvpsvxCDdrYoeM8N0KNCJJC+Uemo79IhsJM052vLCSjHeVrdVgHx8wyCSiHO/V5jjJud4PZd7VIWwkFPHHgmufqHvD82zQjFdw1Zo9rTMbJdhH06KVlFIKCRym0BZqg1pml7AM+m2AT7wWzV38V6Ey8kZ1ejAaggpmyLY/PuE4/5y/UAfGwEHrrz3yEiIATzmb5b2eXj7r27YtHLpf4EwLMwm76CiyN5HAQpnfdtyg3O5daWj46+o4luoFP+dT/sziti3d+arjcitsReKRo3lr70OzSRm5gfxPSglSchWKLdajmZHXdYkbfzlhJXki+kVz/Q5M048tAFVbsZmNCKYIsTgpqKMToxDcVyRehS/jHaTuiX3Gg6bBKlgNfJTFpCWkxcJVAg5b/mvzG4evM2YtL2IBSSl/1HqaKvsf1zdR6/f0VOTlkrYpb09lCDYBAzQV2B/J+IX+rci1ewQp/FakmgS1n5HkEG7Wqs7SvPWKDjAM8nP/nXjfgV/DCsRiln34euYbAeMH1i5byLj5va5RGX/HluqEMngTyUvSqTKqgHd5IXEHEPoNwPSZjaLq/AKECnB0HORnRYr8FhUH44HMQ0PcnPuYteERYh99PnD2HNPOH5GrzIUHvR9Ce3ZFR9Ir2d0gwcbGWvyZ8j+NXkyWHfvK5wPakc65YYSjeD5w+DaZb2867SBLT3fD++boDnAXt8O2ZBwYIdw9oSpjMxBPytQ96Pa4wZ8q93XUcgb4sgQtjmevLBI+Mn+/VRSe5c7Df4KhGZa8d2EnFtMjCGZgmB4ZhaF5iUGFDTu5YdCsm/ylwxVTf/bfjdrmvbRqBipREdCQA9 oc+69xkm ggMQvTt+nHv9bmVpo43ucZw3oUqh2KVfY+djrxE9BgjRoSipeursgr5uQZKb+pvDctIJ7wqCcd7Q014fTJA0RW3xUclrR6CtTIXhe3SAGPquQ554r9GTZhAoTT/Bl+yL6Qra8gkTfOl14x+OJNhONyNpjJd6kId41/ibXZ4md68Ve+Dmz/puNEZ6v1gjCquKH5nGNKTH8rZprL2G0nJiJiQfWpCkeKaKIsfPHyaIVEW6VOUVPdzBA9CJNmKvT53fV95eAXIRw1ww1hPGKXuD1qIGyLg== X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: do_run_with_thp() prepares THP memory into different states before running tests. We would like to reuse this logic to also test large anon folios. So let's add a size parameter which tells the function what size of memory it should operate on. Remove references to THP and replace with LARGE, and fix up all existing call sites to pass thpsize as the required size. No functional change intended here, but a separate commit will add new large anon folio tests that use this new capability. 
Signed-off-by: Ryan Roberts
---
 tools/testing/selftests/mm/cow.c | 118 ++++++++++++++++---------------
 1 file changed, 61 insertions(+), 57 deletions(-)

diff --git a/tools/testing/selftests/mm/cow.c b/tools/testing/selftests/mm/cow.c
index 7324ce5363c0..304882bf2e5d 100644
--- a/tools/testing/selftests/mm/cow.c
+++ b/tools/testing/selftests/mm/cow.c
@@ -723,25 +723,25 @@ static void run_with_base_page_swap(test_fn fn, const char *desc)
 	do_run_with_base_page(fn, true);
 }
 
-enum thp_run {
-	THP_RUN_PMD,
-	THP_RUN_PMD_SWAPOUT,
-	THP_RUN_PTE,
-	THP_RUN_PTE_SWAPOUT,
-	THP_RUN_SINGLE_PTE,
-	THP_RUN_SINGLE_PTE_SWAPOUT,
-	THP_RUN_PARTIAL_MREMAP,
-	THP_RUN_PARTIAL_SHARED,
+enum large_run {
+	LARGE_RUN_PMD,
+	LARGE_RUN_PMD_SWAPOUT,
+	LARGE_RUN_PTE,
+	LARGE_RUN_PTE_SWAPOUT,
+	LARGE_RUN_SINGLE_PTE,
+	LARGE_RUN_SINGLE_PTE_SWAPOUT,
+	LARGE_RUN_PARTIAL_MREMAP,
+	LARGE_RUN_PARTIAL_SHARED,
 };
 
-static void do_run_with_thp(test_fn fn, enum thp_run thp_run)
+static void do_run_with_large(test_fn fn, enum large_run large_run, size_t size)
 {
 	char *mem, *mmap_mem, *tmp, *mremap_mem = MAP_FAILED;
-	size_t size, mmap_size, mremap_size;
+	size_t mmap_size, mremap_size;
 	int ret;
 
-	/* For alignment purposes, we need twice the thp size. */
-	mmap_size = 2 * thpsize;
+	/* For alignment purposes, we need twice the requested size. */
+	mmap_size = 2 * size;
 	mmap_mem = mmap(NULL, mmap_size, PROT_READ | PROT_WRITE,
 			MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
 	if (mmap_mem == MAP_FAILED) {
@@ -749,36 +749,40 @@ static void do_run_with_thp(test_fn fn, enum thp_run thp_run)
 		return;
 	}
 
-	/* We need a THP-aligned memory area. */
-	mem = (char *)(((uintptr_t)mmap_mem + thpsize) & ~(thpsize - 1));
+	/* We need to naturally align the memory area. */
+	mem = (char *)(((uintptr_t)mmap_mem + size) & ~(size - 1));
 
-	ret = madvise(mem, thpsize, MADV_HUGEPAGE);
+	ret = madvise(mem, size, MADV_HUGEPAGE);
 	if (ret) {
 		ksft_test_result_fail("MADV_HUGEPAGE failed\n");
 		goto munmap;
 	}
 
 	/*
-	 * Try to populate a THP. Touch the first sub-page and test if we get
-	 * another sub-page populated automatically.
+	 * Try to populate a large folio. Touch the first sub-page and test if
+	 * we get the last sub-page populated automatically.
 	 */
 	mem[0] = 0;
-	if (!pagemap_is_populated(pagemap_fd, mem + pagesize)) {
-		ksft_test_result_skip("Did not get a THP populated\n");
+	if (!pagemap_is_populated(pagemap_fd, mem + size - pagesize)) {
+		ksft_test_result_skip("Did not get fully populated\n");
 		goto munmap;
 	}
-	memset(mem, 0, thpsize);
+	memset(mem, 0, size);
 
-	size = thpsize;
-	switch (thp_run) {
-	case THP_RUN_PMD:
-	case THP_RUN_PMD_SWAPOUT:
+	switch (large_run) {
+	case LARGE_RUN_PMD:
+	case LARGE_RUN_PMD_SWAPOUT:
+		if (size != thpsize) {
+			ksft_test_result_fail("test bug: can't PMD-map size\n");
+			goto munmap;
+		}
 		break;
-	case THP_RUN_PTE:
-	case THP_RUN_PTE_SWAPOUT:
+	case LARGE_RUN_PTE:
+	case LARGE_RUN_PTE_SWAPOUT:
 		/*
-		 * Trigger PTE-mapping the THP by temporarily mapping a single
-		 * subpage R/O.
+		 * Trigger PTE-mapping the large folio by temporarily mapping a
+		 * single subpage R/O. This is a noop if the large folio is not
+		 * thpsize (and therefore already PTE-mapped).
 		 */
 		ret = mprotect(mem + pagesize, pagesize, PROT_READ);
 		if (ret) {
@@ -791,25 +795,25 @@ static void do_run_with_thp(test_fn fn, enum thp_run thp_run)
 			goto munmap;
 		}
 		break;
-	case THP_RUN_SINGLE_PTE:
-	case THP_RUN_SINGLE_PTE_SWAPOUT:
+	case LARGE_RUN_SINGLE_PTE:
+	case LARGE_RUN_SINGLE_PTE_SWAPOUT:
 		/*
-		 * Discard all but a single subpage of that PTE-mapped THP. What
What - * remains is a single PTE mapping a single subpage. + * Discard all but a single subpage of that PTE-mapped large + * folio. What remains is a single PTE mapping a single subpage. */ - ret = madvise(mem + pagesize, thpsize - pagesize, MADV_DONTNEED); + ret = madvise(mem + pagesize, size - pagesize, MADV_DONTNEED); if (ret) { ksft_test_result_fail("MADV_DONTNEED failed\n"); goto munmap; } size = pagesize; break; - case THP_RUN_PARTIAL_MREMAP: + case LARGE_RUN_PARTIAL_MREMAP: /* - * Remap half of the THP. We need some new memory location - * for that. + * Remap half of the lareg folio. We need some new memory + * location for that. */ - mremap_size = thpsize / 2; + mremap_size = size / 2; mremap_mem = mmap(NULL, mremap_size, PROT_NONE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0); if (mem == MAP_FAILED) { @@ -824,13 +828,13 @@ static void do_run_with_thp(test_fn fn, enum thp_run thp_run) } size = mremap_size; break; - case THP_RUN_PARTIAL_SHARED: + case LARGE_RUN_PARTIAL_SHARED: /* - * Share the first page of the THP with a child and quit the - * child. This will result in some parts of the THP never - * have been shared. + * Share the first page of the large folio with a child and quit + * the child. This will result in some parts of the large folio + * never have been shared. */ - ret = madvise(mem + pagesize, thpsize - pagesize, MADV_DONTFORK); + ret = madvise(mem + pagesize, size - pagesize, MADV_DONTFORK); if (ret) { ksft_test_result_fail("MADV_DONTFORK failed\n"); goto munmap; @@ -844,7 +848,7 @@ static void do_run_with_thp(test_fn fn, enum thp_run thp_run) } wait(&ret); /* Allow for sharing all pages again. */ - ret = madvise(mem + pagesize, thpsize - pagesize, MADV_DOFORK); + ret = madvise(mem + pagesize, size - pagesize, MADV_DOFORK); if (ret) { ksft_test_result_fail("MADV_DOFORK failed\n"); goto munmap; @@ -854,10 +858,10 @@ static void do_run_with_thp(test_fn fn, enum thp_run thp_run) assert(false); } - switch (thp_run) { - case THP_RUN_PMD_SWAPOUT: - case THP_RUN_PTE_SWAPOUT: - case THP_RUN_SINGLE_PTE_SWAPOUT: + switch (large_run) { + case LARGE_RUN_PMD_SWAPOUT: + case LARGE_RUN_PTE_SWAPOUT: + case LARGE_RUN_SINGLE_PTE_SWAPOUT: madvise(mem, size, MADV_PAGEOUT); if (!range_is_swapped(mem, size)) { ksft_test_result_skip("MADV_PAGEOUT did not work, is swap enabled?\n"); @@ -878,49 +882,49 @@ static void do_run_with_thp(test_fn fn, enum thp_run thp_run) static void run_with_thp(test_fn fn, const char *desc) { ksft_print_msg("[RUN] %s ... with THP\n", desc); - do_run_with_thp(fn, THP_RUN_PMD); + do_run_with_large(fn, LARGE_RUN_PMD, thpsize); } static void run_with_thp_swap(test_fn fn, const char *desc) { ksft_print_msg("[RUN] %s ... with swapped-out THP\n", desc); - do_run_with_thp(fn, THP_RUN_PMD_SWAPOUT); + do_run_with_large(fn, LARGE_RUN_PMD_SWAPOUT, thpsize); } static void run_with_pte_mapped_thp(test_fn fn, const char *desc) { ksft_print_msg("[RUN] %s ... with PTE-mapped THP\n", desc); - do_run_with_thp(fn, THP_RUN_PTE); + do_run_with_large(fn, LARGE_RUN_PTE, thpsize); } static void run_with_pte_mapped_thp_swap(test_fn fn, const char *desc) { ksft_print_msg("[RUN] %s ... with swapped-out, PTE-mapped THP\n", desc); - do_run_with_thp(fn, THP_RUN_PTE_SWAPOUT); + do_run_with_large(fn, LARGE_RUN_PTE_SWAPOUT, thpsize); } static void run_with_single_pte_of_thp(test_fn fn, const char *desc) { ksft_print_msg("[RUN] %s ... 
 	ksft_print_msg("[RUN] %s ... with single PTE of THP\n", desc);
-	do_run_with_thp(fn, THP_RUN_SINGLE_PTE);
+	do_run_with_large(fn, LARGE_RUN_SINGLE_PTE, thpsize);
 }
 
 static void run_with_single_pte_of_thp_swap(test_fn fn, const char *desc)
 {
 	ksft_print_msg("[RUN] %s ... with single PTE of swapped-out THP\n", desc);
-	do_run_with_thp(fn, THP_RUN_SINGLE_PTE_SWAPOUT);
+	do_run_with_large(fn, LARGE_RUN_SINGLE_PTE_SWAPOUT, thpsize);
 }
 
 static void run_with_partial_mremap_thp(test_fn fn, const char *desc)
 {
 	ksft_print_msg("[RUN] %s ... with partially mremap()'ed THP\n", desc);
-	do_run_with_thp(fn, THP_RUN_PARTIAL_MREMAP);
+	do_run_with_large(fn, LARGE_RUN_PARTIAL_MREMAP, thpsize);
 }
 
 static void run_with_partial_shared_thp(test_fn fn, const char *desc)
 {
 	ksft_print_msg("[RUN] %s ... with partially shared THP\n", desc);
-	do_run_with_thp(fn, THP_RUN_PARTIAL_SHARED);
+	do_run_with_large(fn, LARGE_RUN_PARTIAL_SHARED, thpsize);
 }
 
 static void run_with_hugetlb(test_fn fn, const char *desc, size_t hugetlbsize)
@@ -1338,7 +1342,7 @@ static void run_anon_thp_test_cases(void)
 		struct test_case const *test_case = &anon_thp_test_cases[i];
 
 		ksft_print_msg("[RUN] %s\n", test_case->desc);
-		do_run_with_thp(test_case->fn, THP_RUN_PMD);
+		do_run_with_large(test_case->fn, LARGE_RUN_PMD, thpsize);
 	}
 }
From patchwork Thu Aug 10 14:29:42 2023
X-Patchwork-Submitter: Ryan Roberts
X-Patchwork-Id: 13349530
From: Ryan Roberts <ryan.roberts@arm.com>
To: Andrew Morton, Matthew Wilcox, Yin Fengwei, David Hildenbrand,
    Yu Zhao, Catalin Marinas, Anshuman Khandual, Yang Shi,
    "Huang, Ying", Zi Yan, Luis Chamberlain, Itaru Kitayama,
    "Kirill A. Shutemov"
Cc: Ryan Roberts, linux-mm@kvack.org, linux-kernel@vger.kernel.org,
    linux-arm-kernel@lists.infradead.org
Subject: [PATCH v5 5/5] selftests/mm/cow: Add large anon folio tests
Date: Thu, 10 Aug 2023 15:29:42 +0100
Message-Id: <20230810142942.3169679-6-ryan.roberts@arm.com>
In-Reply-To: <20230810142942.3169679-1-ryan.roberts@arm.com>
References: <20230810142942.3169679-1-ryan.roberts@arm.com>

Add tests similar to the existing THP tests, but which operate on
memory backed by large anonymous folios, which are smaller than THP.
This reuses all the existing infrastructure. If the test suite detects
that large anonymous folios are not supported by the kernel, the new
tests are skipped.
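
The detection (large_anon_folio_size() below) leans on the selftests'
pagemap helpers. For reference, a minimal standalone sketch of that
pagemap probe (page_is_populated() here is a hypothetical stand-in for
the suite's pagemap_is_populated(); bit 63 of a /proc/self/pagemap
entry means "present", bit 62 "swapped"):

	#include <fcntl.h>
	#include <stdint.h>
	#include <stdio.h>
	#include <unistd.h>

	static int page_is_populated(int pagemap_fd, void *vaddr)
	{
		size_t pagesize = (size_t)sysconf(_SC_PAGESIZE);
		/* One 64-bit entry per virtual page number. */
		off_t off = ((uintptr_t)vaddr / pagesize) * sizeof(uint64_t);
		uint64_t entry;

		if (pread(pagemap_fd, &entry, sizeof(entry), off) != sizeof(entry))
			return 0;
		return (entry >> 63) & 1;	/* present bit */
	}

	int main(void)
	{
		static char page[1];
		int fd = open("/proc/self/pagemap", O_RDONLY);

		if (fd < 0) { perror("open"); return 1; }
		page[0] = 1;	/* fault the page in so the entry shows present */
		printf("populated: %d\n", page_is_populated(fd, page));
		close(fd);
		return 0;
	}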
Signed-off-by: Ryan Roberts
---
 tools/testing/selftests/mm/cow.c | 111 +++++++++++++++++++++++++++++--
 1 file changed, 106 insertions(+), 5 deletions(-)

diff --git a/tools/testing/selftests/mm/cow.c b/tools/testing/selftests/mm/cow.c
index 304882bf2e5d..932242c965a4 100644
--- a/tools/testing/selftests/mm/cow.c
+++ b/tools/testing/selftests/mm/cow.c
@@ -33,6 +33,7 @@ static size_t pagesize;
 static int pagemap_fd;
 static size_t thpsize;
+static size_t lafsize;
 static int nr_hugetlbsizes;
 static size_t hugetlbsizes[10];
 static int gup_fd;
@@ -927,6 +928,42 @@ static void run_with_partial_shared_thp(test_fn fn, const char *desc)
 	do_run_with_large(fn, LARGE_RUN_PARTIAL_SHARED, thpsize);
 }
 
+static void run_with_laf(test_fn fn, const char *desc)
+{
+	ksft_print_msg("[RUN] %s ... with large anon folio\n", desc);
+	do_run_with_large(fn, LARGE_RUN_PTE, lafsize);
+}
+
+static void run_with_laf_swap(test_fn fn, const char *desc)
+{
+	ksft_print_msg("[RUN] %s ... with swapped-out large anon folio\n", desc);
+	do_run_with_large(fn, LARGE_RUN_PTE_SWAPOUT, lafsize);
+}
+
+static void run_with_single_pte_of_laf(test_fn fn, const char *desc)
+{
+	ksft_print_msg("[RUN] %s ... with single PTE of large anon folio\n", desc);
+	do_run_with_large(fn, LARGE_RUN_SINGLE_PTE, lafsize);
+}
+
+static void run_with_single_pte_of_laf_swap(test_fn fn, const char *desc)
+{
+	ksft_print_msg("[RUN] %s ... with single PTE of swapped-out large anon folio\n", desc);
+	do_run_with_large(fn, LARGE_RUN_SINGLE_PTE_SWAPOUT, lafsize);
+}
+
+static void run_with_partial_mremap_laf(test_fn fn, const char *desc)
+{
+	ksft_print_msg("[RUN] %s ... with partially mremap()'ed large anon folio\n", desc);
+	do_run_with_large(fn, LARGE_RUN_PARTIAL_MREMAP, lafsize);
+}
+
+static void run_with_partial_shared_laf(test_fn fn, const char *desc)
+{
+	ksft_print_msg("[RUN] %s ... with partially shared large anon folio\n", desc);
+	do_run_with_large(fn, LARGE_RUN_PARTIAL_SHARED, lafsize);
+}
+
 static void run_with_hugetlb(test_fn fn, const char *desc, size_t hugetlbsize)
 {
 	int flags = MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB;
@@ -1105,6 +1142,14 @@ static void run_anon_test_case(struct test_case const *test_case)
 		run_with_partial_mremap_thp(test_case->fn, test_case->desc);
 		run_with_partial_shared_thp(test_case->fn, test_case->desc);
 	}
+	if (lafsize) {
+		run_with_laf(test_case->fn, test_case->desc);
+		run_with_laf_swap(test_case->fn, test_case->desc);
+		run_with_single_pte_of_laf(test_case->fn, test_case->desc);
+		run_with_single_pte_of_laf_swap(test_case->fn, test_case->desc);
+		run_with_partial_mremap_laf(test_case->fn, test_case->desc);
+		run_with_partial_shared_laf(test_case->fn, test_case->desc);
+	}
 	for (i = 0; i < nr_hugetlbsizes; i++)
 		run_with_hugetlb(test_case->fn, test_case->desc,
 				 hugetlbsizes[i]);
@@ -1126,6 +1171,8 @@ static int tests_per_anon_test_case(void)
 
 	if (thpsize)
 		tests += 8;
+	if (lafsize)
+		tests += 6;
 	return tests;
 }
 
@@ -1680,15 +1727,74 @@ static int tests_per_non_anon_test_case(void)
 	return tests;
 }
 
+static size_t large_anon_folio_size(void)
+{
+	/*
+	 * There is no interface to query this. But we know that it must be
+	 * less than thpsize. So we map a thpsize area, aligned to thpsize,
+	 * offset by thpsize/2 (to avoid a hugepage being allocated), then
+	 * touch the first page and see how many pages get faulted in.
+	 */
+
+	int max_order = __builtin_ctz(thpsize);
+	size_t mmap_size = thpsize * 3;
+	char *mmap_mem = NULL;
+	int order = 0;
+	char *mem;
+	size_t offset;
+	int ret;
+
+	/* For alignment purposes, we need 2.5x the requested size. */
+	mmap_mem = mmap(NULL, mmap_size, PROT_READ | PROT_WRITE,
+			MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
+	if (mmap_mem == MAP_FAILED)
+		goto out;
+
+	/* Align the memory area to thpsize then offset it by thpsize/2. */
+	mem = (char *)(((uintptr_t)mmap_mem + thpsize) & ~(thpsize - 1));
+	mem += thpsize / 2;
+
+	/* We might get a bigger large anon folio when MADV_HUGEPAGE is set. */
+	ret = madvise(mem, thpsize, MADV_HUGEPAGE);
+	if (ret)
+		goto out;
+
+	/* Probe the memory to see how much is populated. */
+	mem[0] = 0;
+	for (order = 0; order < max_order; order++) {
+		offset = (1 << order) * pagesize;
+		if (!pagemap_is_populated(pagemap_fd, mem + offset))
+			break;
+	}
+
+out:
+	if (mmap_mem)
+		munmap(mmap_mem, mmap_size);
+
+	if (order == 0)
+		return 0;
+
+	return offset;
+}
+
 int main(int argc, char **argv)
 {
 	int err;
 
+	gup_fd = open("/sys/kernel/debug/gup_test", O_RDWR);
+	pagemap_fd = open("/proc/self/pagemap", O_RDONLY);
+	if (pagemap_fd < 0)
+		ksft_exit_fail_msg("opening pagemap failed\n");
+
 	pagesize = getpagesize();
 	thpsize = read_pmd_pagesize();
 	if (thpsize)
 		ksft_print_msg("[INFO] detected THP size: %zu KiB\n",
 			       thpsize / 1024);
+	lafsize = large_anon_folio_size();
+	if (lafsize)
+		ksft_print_msg("[INFO] detected large anon folio size: %zu KiB\n",
+			       lafsize / 1024);
 	nr_hugetlbsizes = detect_hugetlb_page_sizes(hugetlbsizes,
 						    ARRAY_SIZE(hugetlbsizes));
 	detect_huge_zeropage();
@@ -1698,11 +1804,6 @@ int main(int argc, char **argv)
 		      ARRAY_SIZE(anon_thp_test_cases) * tests_per_anon_thp_test_case() +
 		      ARRAY_SIZE(non_anon_test_cases) * tests_per_non_anon_test_case());
 
-	gup_fd = open("/sys/kernel/debug/gup_test", O_RDWR);
-	pagemap_fd = open("/proc/self/pagemap", O_RDONLY);
-	if (pagemap_fd < 0)
-		ksft_exit_fail_msg("opening pagemap failed\n");
-
 	run_anon_test_cases();
 	run_anon_thp_test_cases();
 	run_non_anon_test_cases();