From patchwork Wed Jul 26 09:51:42 2023
X-Patchwork-Submitter: Ryan Roberts
X-Patchwork-Id: 13327781
From: Ryan Roberts
To: Andrew Morton, Matthew Wilcox, Yin Fengwei, David Hildenbrand,
 Yu Zhao, Catalin Marinas, Will Deacon, Anshuman Khandual, Yang Shi,
 "Huang, Ying", Zi Yan, Luis Chamberlain, Itaru Kitayama
Cc: Ryan Roberts, linux-mm@kvack.org, linux-kernel@vger.kernel.org,
 linux-arm-kernel@lists.infradead.org
Subject: [PATCH v4 1/5] mm: Non-pmd-mappable, large folios for
 folio_add_new_anon_rmap()
Date: Wed, 26 Jul 2023 10:51:42 +0100
Message-Id: <20230726095146.2826796-2-ryan.roberts@arm.com>
In-Reply-To: <20230726095146.2826796-1-ryan.roberts@arm.com>
References: <20230726095146.2826796-1-ryan.roberts@arm.com>

In preparation for LARGE_ANON_FOLIO support, improve
folio_add_new_anon_rmap() to allow a non-pmd-mappable, large folio to be
passed to it. In this case, all contained pages are accounted using the
order-0 folio (or base page) scheme.

Reviewed-by: Yu Zhao
Reviewed-by: Yin Fengwei
Signed-off-by: Ryan Roberts
---
 mm/rmap.c | 27 ++++++++++++++++++++-------
 1 file changed, 20 insertions(+), 7 deletions(-)

diff --git a/mm/rmap.c b/mm/rmap.c
index 0c0d8857dfce..b3e3006738e4 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -1278,31 +1278,44 @@ void page_add_anon_rmap(struct page *page, struct vm_area_struct *vma,
  * This means the inc-and-test can be bypassed.
  * The folio does not have to be locked.
  *
- * If the folio is large, it is accounted as a THP. As the folio
+ * If the folio is pmd-mappable, it is accounted as a THP. As the folio
  * is new, it's assumed to be mapped exclusively by a single process.
  */
 void folio_add_new_anon_rmap(struct folio *folio, struct vm_area_struct *vma,
 		unsigned long address)
 {
-	int nr;
+	int nr = folio_nr_pages(folio);
 
-	VM_BUG_ON_VMA(address < vma->vm_start || address >= vma->vm_end, vma);
+	VM_BUG_ON_VMA(address < vma->vm_start ||
+			address + (nr << PAGE_SHIFT) > vma->vm_end, vma);
 	__folio_set_swapbacked(folio);
 
-	if (likely(!folio_test_pmd_mappable(folio))) {
+	if (likely(!folio_test_large(folio))) {
 		/* increment count (starts at -1) */
 		atomic_set(&folio->_mapcount, 0);
-		nr = 1;
+		__page_set_anon_rmap(folio, &folio->page, vma, address, 1);
+	} else if (!folio_test_pmd_mappable(folio)) {
+		int i;
+
+		for (i = 0; i < nr; i++) {
+			struct page *page = folio_page(folio, i);
+
+			/* increment count (starts at -1) */
+			atomic_set(&page->_mapcount, 0);
+			__page_set_anon_rmap(folio, page, vma,
+					address + (i << PAGE_SHIFT), 1);
+		}
+
+		atomic_set(&folio->_nr_pages_mapped, nr);
 	} else {
 		/* increment count (starts at -1) */
 		atomic_set(&folio->_entire_mapcount, 0);
 		atomic_set(&folio->_nr_pages_mapped, COMPOUND_MAPPED);
-		nr = folio_nr_pages(folio);
+		__page_set_anon_rmap(folio, &folio->page, vma, address, 1);
 		__lruvec_stat_mod_folio(folio, NR_ANON_THPS, nr);
 	}
 
 	__lruvec_stat_mod_folio(folio, NR_ANON_MAPPED, nr);
-	__page_set_anon_rmap(folio, &folio->page, vma, address, 1);
 }
 
 /**
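To make the three accounting cases in this patch concrete, here is a
simplified userspace model; this is an illustration only, not kernel
code, and anon_rmap_scheme() with its pmd_order parameter are
hypothetical names (pmd_order = 9 corresponds to 2M PMDs on 4K pages):

#include <stdio.h>

/* Simplified model of the branch taken by folio_add_new_anon_rmap(). */
static const char *anon_rmap_scheme(int order, int pmd_order)
{
	if (order == 0)		/* base page: bump folio->_mapcount */
		return "order-0: page-level mapcount on the single page";
	if (order < pmd_order)	/* large but not pmd-mappable */
		return "per-subpage mapcounts; _nr_pages_mapped = nr";
	/* pmd-mappable: entire mapcount, accounted as a THP */
	return "entire mapcount; _nr_pages_mapped = COMPOUND_MAPPED";
}

int main(void)
{
	int orders[] = { 0, 4, 9 };

	for (int i = 0; i < 3; i++)
		printf("order-%d -> %s\n", orders[i],
		       anon_rmap_scheme(orders[i], 9));
	return 0;
}

Compiled and run, this prints one line per case, mirroring the
if/else-if/else ladder in folio_add_new_anon_rmap() above.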
From patchwork Wed Jul 26 09:51:43 2023
X-Patchwork-Submitter: Ryan Roberts
X-Patchwork-Id: 13327782
From: Ryan Roberts
To: Andrew Morton, Matthew Wilcox, Yin Fengwei, David Hildenbrand,
 Yu Zhao, Catalin Marinas, Will Deacon, Anshuman Khandual, Yang Shi,
 "Huang, Ying", Zi Yan, Luis Chamberlain, Itaru Kitayama
Cc: Ryan Roberts, linux-mm@kvack.org, linux-kernel@vger.kernel.org,
 linux-arm-kernel@lists.infradead.org
Subject: [PATCH v4 2/5] mm: LARGE_ANON_FOLIO for improved performance
Date: Wed, 26 Jul 2023 10:51:43 +0100
Message-Id: <20230726095146.2826796-3-ryan.roberts@arm.com>
In-Reply-To: <20230726095146.2826796-1-ryan.roberts@arm.com>
References: <20230726095146.2826796-1-ryan.roberts@arm.com>

Introduce LARGE_ANON_FOLIO feature, which allows anonymous memory to be
allocated in large folios of a determined order. All pages of the large
folio are pte-mapped during the same page fault, significantly reducing
the number of page faults. The number of per-page operations (e.g.
ref counting, rmap management, lru list management) is also
significantly reduced since those ops now become per-folio.

The new behaviour is hidden behind the new LARGE_ANON_FOLIO Kconfig,
which defaults to disabled for now; the long-term aim is for this to
default to enabled, but there are some risks around internal
fragmentation that need to be better understood first.

When enabled, the folio order is determined as follows: for a vma,
process or system that has explicitly disabled THP, we continue to
allocate order-0. THP is most likely disabled to avoid any possible
internal fragmentation, so we honour that request.

Otherwise, the return value of arch_wants_pte_order() is used. For vmas
that have not explicitly opted-in to use transparent hugepages (e.g.
where thp=madvise and the vma does not have MADV_HUGEPAGE), then
arch_wants_pte_order() is limited to 64K (or PAGE_SIZE, whichever is
bigger). This allows for a performance boost without requiring any
explicit opt-in from the workload while limiting internal fragmentation.

If the preferred order can't be used (e.g. because the folio would
breach the bounds of the vma, or because ptes in the region are already
mapped) then we fall back to a suitable lower order; first
PAGE_ALLOC_COSTLY_ORDER, then order-0.

arch_wants_pte_order() can be overridden by the architecture if desired.
Some architectures (e.g. arm64) can coalesce TLB entries if a contiguous
set of ptes map physically contiguous, naturally aligned memory, so this
mechanism allows the architecture to optimize as required. Here we add
the default implementation of arch_wants_pte_order(), used when the
architecture does not define it, which returns -1, implying that the HW
has no preference. In this case, mm will choose its own default order.

Signed-off-by: Ryan Roberts
---
 include/linux/pgtable.h |  13 ++++
 mm/Kconfig              |  10 +++
 mm/memory.c             | 166 ++++++++++++++++++++++++++++++++++++----
 3 files changed, 172 insertions(+), 17 deletions(-)

diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h
index 5063b482e34f..2a1d83775837 100644
--- a/include/linux/pgtable.h
+++ b/include/linux/pgtable.h
@@ -313,6 +313,19 @@ static inline bool arch_has_hw_pte_young(void)
 }
 #endif
 
+#ifndef arch_wants_pte_order
+/*
+ * Returns preferred folio order for pte-mapped memory. Must be in range [0,
+ * PMD_SHIFT-PAGE_SHIFT) and must not be order-1 since THP requires large folios
+ * to be at least order-2. Negative value implies that the HW has no preference
+ * and mm will choose its own default order.
+ */
+static inline int arch_wants_pte_order(void)
+{
+	return -1;
+}
+#endif
+
 #ifndef __HAVE_ARCH_PTEP_GET_AND_CLEAR
 static inline pte_t ptep_get_and_clear(struct mm_struct *mm,
 				       unsigned long address,
diff --git a/mm/Kconfig b/mm/Kconfig
index 09130434e30d..fa61ea160447 100644
--- a/mm/Kconfig
+++ b/mm/Kconfig
@@ -1238,4 +1238,14 @@ config LOCK_MM_AND_FIND_VMA
 
 source "mm/damon/Kconfig"
 
+config LARGE_ANON_FOLIO
+	bool "Allocate large folios for anonymous memory"
+	depends on TRANSPARENT_HUGEPAGE
+	default n
+	help
+	  Use large (bigger than order-0) folios to back anonymous memory where
+	  possible, even for pte-mapped memory. This reduces the number of page
+	  faults, as well as other per-page overheads to improve performance for
+	  many workloads.
+
 endmenu
diff --git a/mm/memory.c b/mm/memory.c
index 01f39e8144ef..64c3f242c49a 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -4050,6 +4050,127 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
 	return ret;
 }
 
+static bool vmf_pte_range_changed(struct vm_fault *vmf, int nr_pages)
+{
+	int i;
+
+	if (nr_pages == 1)
+		return vmf_pte_changed(vmf);
+
+	for (i = 0; i < nr_pages; i++) {
+		if (!pte_none(ptep_get_lockless(vmf->pte + i)))
+			return true;
+	}
+
+	return false;
+}
+
+#ifdef CONFIG_LARGE_ANON_FOLIO
+#define ANON_FOLIO_MAX_ORDER_UNHINTED \
+		(ilog2(max_t(unsigned long, SZ_64K, PAGE_SIZE)) - PAGE_SHIFT)
+
+static int anon_folio_order(struct vm_area_struct *vma)
+{
+	int order;
+
+	/*
+	 * If THP is explicitly disabled for either the vma, the process or the
+	 * system, then this is very likely intended to limit internal
+	 * fragmentation; in this case, don't attempt to allocate a large
+	 * anonymous folio.
+	 *
+	 * Else, if the vma is eligible for thp, allocate a large folio of the
+	 * size preferred by the arch. Or if the arch requested a very small
+	 * size or didn't request a size, then use PAGE_ALLOC_COSTLY_ORDER,
+	 * which still meets the arch's requirements but means we still take
+	 * advantage of SW optimizations (e.g. fewer page faults).
+	 *
+	 * Finally if thp is enabled but the vma isn't eligible, take the
+	 * arch-preferred size and limit it to ANON_FOLIO_MAX_ORDER_UNHINTED.
+	 * This ensures workloads that have not explicitly opted-in take benefit
+	 * while capping the potential for internal fragmentation.
+	 */
+
+	if ((vma->vm_flags & VM_NOHUGEPAGE) ||
+	    test_bit(MMF_DISABLE_THP, &vma->vm_mm->flags) ||
+	    !hugepage_flags_enabled())
+		order = 0;
+	else {
+		order = max(arch_wants_pte_order(), PAGE_ALLOC_COSTLY_ORDER);
+
+		if (!hugepage_vma_check(vma, vma->vm_flags, false, true, true))
+			order = min(order, ANON_FOLIO_MAX_ORDER_UNHINTED);
+	}
+
+	return order;
+}
+
+static int alloc_anon_folio(struct vm_fault *vmf, struct folio **folio)
+{
+	int i;
+	gfp_t gfp;
+	pte_t *pte;
+	unsigned long addr;
+	struct vm_area_struct *vma = vmf->vma;
+	int prefer = anon_folio_order(vma);
+	int orders[] = {
+		prefer,
+		prefer > PAGE_ALLOC_COSTLY_ORDER ? PAGE_ALLOC_COSTLY_ORDER : 0,
+		0,
+	};
+
+	*folio = NULL;
+
+	if (vmf_orig_pte_uffd_wp(vmf))
+		goto fallback;
+
+	for (i = 0; orders[i]; i++) {
+		addr = ALIGN_DOWN(vmf->address, PAGE_SIZE << orders[i]);
+		if (addr >= vma->vm_start &&
+		    addr + (PAGE_SIZE << orders[i]) <= vma->vm_end)
+			break;
+	}
+
+	if (!orders[i])
+		goto fallback;
+
+	pte = pte_offset_map(vmf->pmd, vmf->address & PMD_MASK);
+	if (!pte)
+		return -EAGAIN;
+
+	for (; orders[i]; i++) {
+		addr = ALIGN_DOWN(vmf->address, PAGE_SIZE << orders[i]);
+		vmf->pte = pte + pte_index(addr);
+		if (!vmf_pte_range_changed(vmf, 1 << orders[i]))
+			break;
+	}
+
+	vmf->pte = NULL;
+	pte_unmap(pte);
+
+	gfp = vma_thp_gfp_mask(vma);
+
+	for (; orders[i]; i++) {
+		addr = ALIGN_DOWN(vmf->address, PAGE_SIZE << orders[i]);
+		*folio = vma_alloc_folio(gfp, orders[i], vma, addr, true);
+		if (*folio) {
+			clear_huge_page(&(*folio)->page, addr, 1 << orders[i]);
+			return 0;
+		}
+	}
+
+fallback:
+	*folio = vma_alloc_zeroed_movable_folio(vma, vmf->address);
+	return *folio ? 0 : -ENOMEM;
+}
+#else
+static inline int alloc_anon_folio(struct vm_fault *vmf, struct folio **folio)
+{
+	*folio = vma_alloc_zeroed_movable_folio(vmf->vma, vmf->address);
+	return *folio ? 0 : -ENOMEM;
+}
+#endif
+
 /*
  * We enter with non-exclusive mmap_lock (to exclude vma changes,
  * but allow concurrent faults), and pte mapped but not yet locked.
@@ -4057,6 +4178,9 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
  */
 static vm_fault_t do_anonymous_page(struct vm_fault *vmf)
 {
+	int i = 0;
+	int nr_pages = 1;
+	unsigned long addr = vmf->address;
 	bool uffd_wp = vmf_orig_pte_uffd_wp(vmf);
 	struct vm_area_struct *vma = vmf->vma;
 	struct folio *folio;
@@ -4101,10 +4225,15 @@ static vm_fault_t do_anonymous_page(struct vm_fault *vmf)
 	/* Allocate our own private page. */
 	if (unlikely(anon_vma_prepare(vma)))
 		goto oom;
-	folio = vma_alloc_zeroed_movable_folio(vma, vmf->address);
+	ret = alloc_anon_folio(vmf, &folio);
+	if (unlikely(ret == -EAGAIN))
+		return 0;
 	if (!folio)
 		goto oom;
 
+	nr_pages = folio_nr_pages(folio);
+	addr = ALIGN_DOWN(vmf->address, nr_pages * PAGE_SIZE);
+
 	if (mem_cgroup_charge(folio, vma->vm_mm, GFP_KERNEL))
 		goto oom_free_page;
 	folio_throttle_swaprate(folio, GFP_KERNEL);
@@ -4116,17 +4245,12 @@ static vm_fault_t do_anonymous_page(struct vm_fault *vmf)
 	 */
 	__folio_mark_uptodate(folio);
 
-	entry = mk_pte(&folio->page, vma->vm_page_prot);
-	entry = pte_sw_mkyoung(entry);
-	if (vma->vm_flags & VM_WRITE)
-		entry = pte_mkwrite(pte_mkdirty(entry));
-
-	vmf->pte = pte_offset_map_lock(vma->vm_mm, vmf->pmd, vmf->address,
-			&vmf->ptl);
+	vmf->pte = pte_offset_map_lock(vma->vm_mm, vmf->pmd, addr, &vmf->ptl);
 	if (!vmf->pte)
 		goto release;
-	if (vmf_pte_changed(vmf)) {
-		update_mmu_tlb(vma, vmf->address, vmf->pte);
+	if (vmf_pte_range_changed(vmf, nr_pages)) {
+		for (i = 0; i < nr_pages; i++)
+			update_mmu_tlb(vma, addr + PAGE_SIZE * i, vmf->pte + i);
 		goto release;
 	}
 
@@ -4141,16 +4265,24 @@ static vm_fault_t do_anonymous_page(struct vm_fault *vmf)
 		return handle_userfault(vmf, VM_UFFD_MISSING);
 	}
 
-	inc_mm_counter(vma->vm_mm, MM_ANONPAGES);
-	folio_add_new_anon_rmap(folio, vma, vmf->address);
+	folio_ref_add(folio, nr_pages - 1);
+	add_mm_counter(vma->vm_mm, MM_ANONPAGES, nr_pages);
+	folio_add_new_anon_rmap(folio, vma, addr);
 	folio_add_lru_vma(folio, vma);
+
+	for (i = 0; i < nr_pages; i++) {
+		entry = mk_pte(folio_page(folio, i), vma->vm_page_prot);
+		entry = pte_sw_mkyoung(entry);
+		if (vma->vm_flags & VM_WRITE)
+			entry = pte_mkwrite(pte_mkdirty(entry));
 setpte:
-	if (uffd_wp)
-		entry = pte_mkuffd_wp(entry);
-	set_pte_at(vma->vm_mm, vmf->address, vmf->pte, entry);
+		if (uffd_wp)
+			entry = pte_mkuffd_wp(entry);
+		set_pte_at(vma->vm_mm, addr + PAGE_SIZE * i, vmf->pte + i, entry);
 
-	/* No need to invalidate - it was non-present before */
-	update_mmu_cache(vma, vmf->address, vmf->pte);
+		/* No need to invalidate - it was non-present before */
+		update_mmu_cache(vma, addr + PAGE_SIZE * i, vmf->pte + i);
+	}
 unlock:
 	if (vmf->pte)
 		pte_unmap_unlock(vmf->pte, vmf->ptl);
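The order-selection policy described in the commit message condenses
into a small userspace sketch. This is a simplified model under stated
assumptions, not the kernel implementation: model_anon_folio_order() is
a hypothetical name, and the constants assume 4K pages, where
PAGE_ALLOC_COSTLY_ORDER is 3 and the 64K unhinted cap is order 4:

#include <stdio.h>

#define PAGE_ALLOC_COSTLY_ORDER	3
#define MAX_ORDER_UNHINTED	4	/* 64K with 4K pages */

static int model_anon_folio_order(int thp_disabled, int vma_eligible,
				  int arch_preference)
{
	int order;

	if (thp_disabled)
		return 0;	/* honour the request to limit fragmentation */

	/* arch_wants_pte_order() may return -1: fall back to COSTLY */
	order = arch_preference > PAGE_ALLOC_COSTLY_ORDER ?
			arch_preference : PAGE_ALLOC_COSTLY_ORDER;

	if (!vma_eligible)	/* e.g. thp=madvise without MADV_HUGEPAGE */
		order = order < MAX_ORDER_UNHINTED ? order : MAX_ORDER_UNHINTED;

	return order;
}

int main(void)
{
	printf("thp disabled:          order %d\n", model_anon_folio_order(1, 1, 9));
	printf("eligible, arch=4:      order %d\n", model_anon_folio_order(0, 1, 4));
	printf("not eligible, arch=9:  order %d\n", model_anon_folio_order(0, 0, 9));
	return 0;
}

The last case shows the unhinted cap in action: an arch preference of
order-9 is clamped to order-4 (64K) when the vma has not opted in.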
From patchwork Wed Jul 26 09:51:44 2023
X-Patchwork-Submitter: Ryan Roberts
X-Patchwork-Id: 13327783
From: Ryan Roberts
To: Andrew Morton, Matthew Wilcox, Yin Fengwei, David Hildenbrand,
 Yu Zhao, Catalin Marinas, Will Deacon, Anshuman Khandual, Yang Shi,
 "Huang, Ying", Zi Yan, Luis Chamberlain, Itaru Kitayama
Cc: Ryan Roberts, linux-mm@kvack.org, linux-kernel@vger.kernel.org,
 linux-arm-kernel@lists.infradead.org
Subject: [PATCH v4 3/5] arm64: mm: Override arch_wants_pte_order()
Date: Wed, 26 Jul 2023 10:51:44 +0100
Message-Id: <20230726095146.2826796-4-ryan.roberts@arm.com>
In-Reply-To: <20230726095146.2826796-1-ryan.roberts@arm.com>
References: <20230726095146.2826796-1-ryan.roberts@arm.com>
Define an arch-specific override of arch_wants_pte_order() so that when
LARGE_ANON_FOLIO is enabled, large folios will be allocated for
anonymous memory with an order that is compatible with arm64's contpte
mappings.

Reviewed-by: Yu Zhao
Signed-off-by: Ryan Roberts
---
 arch/arm64/include/asm/pgtable.h | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
index 0bd18de9fd97..d00bb26fe28f 100644
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -1106,6 +1106,12 @@ extern pte_t ptep_modify_prot_start(struct vm_area_struct *vma,
 extern void ptep_modify_prot_commit(struct vm_area_struct *vma,
 				    unsigned long addr, pte_t *ptep,
 				    pte_t old_pte, pte_t new_pte);
+
+#define arch_wants_pte_order arch_wants_pte_order
+static inline int arch_wants_pte_order(void)
+{
+	return CONT_PTE_SHIFT - PAGE_SHIFT;
+}
 #endif /* !__ASSEMBLY__ */
 
 #endif /* __ASM_PGTABLE_H */
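For reference, the contpte geometry this override maps onto is fixed by
the architecture. The sketch below is illustrative only and uses the
standard arm64 contiguous-bit figures (not values taken from this patch)
to show the order arch_wants_pte_order() resolves to for each page-size
configuration:

#include <stdio.h>

int main(void)
{
	/* Standard arm64 contiguous-bit geometry per page-size config. */
	struct { int page_kb, cont_ptes; } cfg[] = {
		{  4,  16 },	/*  4K pages:  16 contptes -> 64K folios */
		{ 16, 128 },	/* 16K pages: 128 contptes ->  2M folios */
		{ 64,  32 },	/* 64K pages:  32 contptes ->  2M folios */
	};

	for (int i = 0; i < 3; i++) {
		int order = __builtin_ctz(cfg[i].cont_ptes);

		printf("%2dK pages: order %d (%d KiB folios)\n",
		       cfg[i].page_kb, order,
		       cfg[i].page_kb * cfg[i].cont_ptes);
	}
	return 0;
}

So with 4K pages the override returns order-4, which also lands exactly
on the 64K unhinted cap from the previous patch.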
From patchwork Wed Jul 26 09:51:45 2023
X-Patchwork-Submitter: Ryan Roberts
X-Patchwork-Id: 13327784
From: Ryan Roberts
To: Andrew Morton, Matthew Wilcox, Yin Fengwei, David Hildenbrand,
 Yu Zhao, Catalin Marinas, Will Deacon, Anshuman Khandual, Yang Shi,
 "Huang, Ying", Zi Yan, Luis Chamberlain, Itaru Kitayama
Cc: Ryan Roberts, linux-mm@kvack.org, linux-kernel@vger.kernel.org,
 linux-arm-kernel@lists.infradead.org
Subject: [PATCH v4 4/5] selftests/mm/cow: Generalize do_run_with_thp() helper
Date: Wed, 26 Jul 2023 10:51:45 +0100
Message-Id: <20230726095146.2826796-5-ryan.roberts@arm.com>
In-Reply-To: <20230726095146.2826796-1-ryan.roberts@arm.com>
References: <20230726095146.2826796-1-ryan.roberts@arm.com>
do_run_with_thp() prepares THP memory into different states before
running tests. We would like to reuse this logic to also test large anon
folios. So let's add a size parameter which tells the function what size
of memory it should operate on.

Remove references to THP and replace with LARGE, and fix up all existing
call sites to pass thpsize as the required size.

No functional change intended here, but a separate commit will add new
large anon folio tests that use this new capability.

Signed-off-by: Ryan Roberts
---
 tools/testing/selftests/mm/cow.c | 118 ++++++++++++++++---------------
 1 file changed, 61 insertions(+), 57 deletions(-)

diff --git a/tools/testing/selftests/mm/cow.c b/tools/testing/selftests/mm/cow.c
index 7324ce5363c0..304882bf2e5d 100644
--- a/tools/testing/selftests/mm/cow.c
+++ b/tools/testing/selftests/mm/cow.c
@@ -723,25 +723,25 @@ static void run_with_base_page_swap(test_fn fn, const char *desc)
 	do_run_with_base_page(fn, true);
 }
 
-enum thp_run {
-	THP_RUN_PMD,
-	THP_RUN_PMD_SWAPOUT,
-	THP_RUN_PTE,
-	THP_RUN_PTE_SWAPOUT,
-	THP_RUN_SINGLE_PTE,
-	THP_RUN_SINGLE_PTE_SWAPOUT,
-	THP_RUN_PARTIAL_MREMAP,
-	THP_RUN_PARTIAL_SHARED,
+enum large_run {
+	LARGE_RUN_PMD,
+	LARGE_RUN_PMD_SWAPOUT,
+	LARGE_RUN_PTE,
+	LARGE_RUN_PTE_SWAPOUT,
+	LARGE_RUN_SINGLE_PTE,
+	LARGE_RUN_SINGLE_PTE_SWAPOUT,
+	LARGE_RUN_PARTIAL_MREMAP,
+	LARGE_RUN_PARTIAL_SHARED,
 };
 
-static void do_run_with_thp(test_fn fn, enum thp_run thp_run)
+static void do_run_with_large(test_fn fn, enum large_run large_run, size_t size)
 {
 	char *mem, *mmap_mem, *tmp, *mremap_mem = MAP_FAILED;
-	size_t size, mmap_size, mremap_size;
+	size_t mmap_size, mremap_size;
 	int ret;
 
-	/* For alignment purposes, we need twice the thp size. */
-	mmap_size = 2 * thpsize;
+	/* For alignment purposes, we need twice the requested size. */
+	mmap_size = 2 * size;
 	mmap_mem = mmap(NULL, mmap_size, PROT_READ | PROT_WRITE,
 			MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
 	if (mmap_mem == MAP_FAILED) {
@@ -749,36 +749,40 @@ static void do_run_with_thp(test_fn fn, enum thp_run thp_run)
 		return;
 	}
 
-	/* We need a THP-aligned memory area. */
-	mem = (char *)(((uintptr_t)mmap_mem + thpsize) & ~(thpsize - 1));
+	/* We need to naturally align the memory area. */
+	mem = (char *)(((uintptr_t)mmap_mem + size) & ~(size - 1));
 
-	ret = madvise(mem, thpsize, MADV_HUGEPAGE);
+	ret = madvise(mem, size, MADV_HUGEPAGE);
 	if (ret) {
 		ksft_test_result_fail("MADV_HUGEPAGE failed\n");
 		goto munmap;
 	}
 
 	/*
-	 * Try to populate a THP. Touch the first sub-page and test if we get
-	 * another sub-page populated automatically.
+	 * Try to populate a large folio. Touch the first sub-page and test if
+	 * we get the last sub-page populated automatically.
 	 */
 	mem[0] = 0;
-	if (!pagemap_is_populated(pagemap_fd, mem + pagesize)) {
-		ksft_test_result_skip("Did not get a THP populated\n");
+	if (!pagemap_is_populated(pagemap_fd, mem + size - pagesize)) {
+		ksft_test_result_skip("Did not get fully populated\n");
 		goto munmap;
 	}
-	memset(mem, 0, thpsize);
+	memset(mem, 0, size);
 
-	size = thpsize;
-	switch (thp_run) {
-	case THP_RUN_PMD:
-	case THP_RUN_PMD_SWAPOUT:
+	switch (large_run) {
+	case LARGE_RUN_PMD:
+	case LARGE_RUN_PMD_SWAPOUT:
+		if (size != thpsize) {
+			ksft_test_result_fail("test bug: can't PMD-map size\n");
+			goto munmap;
+		}
 		break;
-	case THP_RUN_PTE:
-	case THP_RUN_PTE_SWAPOUT:
+	case LARGE_RUN_PTE:
+	case LARGE_RUN_PTE_SWAPOUT:
 		/*
-		 * Trigger PTE-mapping the THP by temporarily mapping a single
-		 * subpage R/O.
+		 * Trigger PTE-mapping the large folio by temporarily mapping a
+		 * single subpage R/O. This is a noop if the large folio is not
+		 * thpsize (and therefore already PTE-mapped).
 		 */
 		ret = mprotect(mem + pagesize, pagesize, PROT_READ);
 		if (ret) {
@@ -791,25 +795,25 @@ static void do_run_with_thp(test_fn fn, enum thp_run thp_run)
 			goto munmap;
 		}
 		break;
-	case THP_RUN_SINGLE_PTE:
-	case THP_RUN_SINGLE_PTE_SWAPOUT:
+	case LARGE_RUN_SINGLE_PTE:
+	case LARGE_RUN_SINGLE_PTE_SWAPOUT:
 		/*
-		 * Discard all but a single subpage of that PTE-mapped THP. What
-		 * remains is a single PTE mapping a single subpage.
+		 * Discard all but a single subpage of that PTE-mapped large
+		 * folio. What remains is a single PTE mapping a single subpage.
 		 */
-		ret = madvise(mem + pagesize, thpsize - pagesize, MADV_DONTNEED);
+		ret = madvise(mem + pagesize, size - pagesize, MADV_DONTNEED);
 		if (ret) {
 			ksft_test_result_fail("MADV_DONTNEED failed\n");
 			goto munmap;
 		}
 		size = pagesize;
 		break;
-	case THP_RUN_PARTIAL_MREMAP:
+	case LARGE_RUN_PARTIAL_MREMAP:
 		/*
-		 * Remap half of the THP. We need some new memory location
-		 * for that.
+		 * Remap half of the large folio. We need some new memory
+		 * location for that.
 		 */
-		mremap_size = thpsize / 2;
+		mremap_size = size / 2;
 		mremap_mem = mmap(NULL, mremap_size, PROT_NONE,
 				  MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
 		if (mem == MAP_FAILED) {
@@ -824,13 +828,13 @@ static void do_run_with_thp(test_fn fn, enum thp_run thp_run)
 		}
 		size = mremap_size;
 		break;
-	case THP_RUN_PARTIAL_SHARED:
+	case LARGE_RUN_PARTIAL_SHARED:
 		/*
-		 * Share the first page of the THP with a child and quit the
-		 * child. This will result in some parts of the THP never
-		 * have been shared.
+		 * Share the first page of the large folio with a child and quit
+		 * the child. This will result in some parts of the large folio
+		 * never having been shared.
 		 */
-		ret = madvise(mem + pagesize, thpsize - pagesize, MADV_DONTFORK);
+		ret = madvise(mem + pagesize, size - pagesize, MADV_DONTFORK);
 		if (ret) {
 			ksft_test_result_fail("MADV_DONTFORK failed\n");
 			goto munmap;
@@ -844,7 +848,7 @@ static void do_run_with_thp(test_fn fn, enum thp_run thp_run)
 		}
 		wait(&ret);
 		/* Allow for sharing all pages again. */
-		ret = madvise(mem + pagesize, thpsize - pagesize, MADV_DOFORK);
+		ret = madvise(mem + pagesize, size - pagesize, MADV_DOFORK);
 		if (ret) {
 			ksft_test_result_fail("MADV_DOFORK failed\n");
 			goto munmap;
@@ -854,10 +858,10 @@ static void do_run_with_thp(test_fn fn, enum thp_run thp_run)
 		assert(false);
 	}
 
-	switch (thp_run) {
-	case THP_RUN_PMD_SWAPOUT:
-	case THP_RUN_PTE_SWAPOUT:
-	case THP_RUN_SINGLE_PTE_SWAPOUT:
+	switch (large_run) {
+	case LARGE_RUN_PMD_SWAPOUT:
+	case LARGE_RUN_PTE_SWAPOUT:
+	case LARGE_RUN_SINGLE_PTE_SWAPOUT:
 		madvise(mem, size, MADV_PAGEOUT);
 		if (!range_is_swapped(mem, size)) {
 			ksft_test_result_skip("MADV_PAGEOUT did not work, is swap enabled?\n");
@@ -878,49 +882,49 @@ static void do_run_with_thp(test_fn fn, enum thp_run thp_run)
 static void run_with_thp(test_fn fn, const char *desc)
 {
 	ksft_print_msg("[RUN] %s ... with THP\n", desc);
-	do_run_with_thp(fn, THP_RUN_PMD);
+	do_run_with_large(fn, LARGE_RUN_PMD, thpsize);
 }
 
 static void run_with_thp_swap(test_fn fn, const char *desc)
 {
 	ksft_print_msg("[RUN] %s ... with swapped-out THP\n", desc);
-	do_run_with_thp(fn, THP_RUN_PMD_SWAPOUT);
+	do_run_with_large(fn, LARGE_RUN_PMD_SWAPOUT, thpsize);
 }
 
 static void run_with_pte_mapped_thp(test_fn fn, const char *desc)
 {
 	ksft_print_msg("[RUN] %s ... with PTE-mapped THP\n", desc);
-	do_run_with_thp(fn, THP_RUN_PTE);
+	do_run_with_large(fn, LARGE_RUN_PTE, thpsize);
 }
 
 static void run_with_pte_mapped_thp_swap(test_fn fn, const char *desc)
 {
 	ksft_print_msg("[RUN] %s ... with swapped-out, PTE-mapped THP\n", desc);
-	do_run_with_thp(fn, THP_RUN_PTE_SWAPOUT);
+	do_run_with_large(fn, LARGE_RUN_PTE_SWAPOUT, thpsize);
 }
 
 static void run_with_single_pte_of_thp(test_fn fn, const char *desc)
 {
 	ksft_print_msg("[RUN] %s ... with single PTE of THP\n", desc);
-	do_run_with_thp(fn, THP_RUN_SINGLE_PTE);
+	do_run_with_large(fn, LARGE_RUN_SINGLE_PTE, thpsize);
 }
 
 static void run_with_single_pte_of_thp_swap(test_fn fn, const char *desc)
 {
 	ksft_print_msg("[RUN] %s ... with single PTE of swapped-out THP\n", desc);
-	do_run_with_thp(fn, THP_RUN_SINGLE_PTE_SWAPOUT);
+	do_run_with_large(fn, LARGE_RUN_SINGLE_PTE_SWAPOUT, thpsize);
 }
 
 static void run_with_partial_mremap_thp(test_fn fn, const char *desc)
 {
 	ksft_print_msg("[RUN] %s ... with partially mremap()'ed THP\n", desc);
-	do_run_with_thp(fn, THP_RUN_PARTIAL_MREMAP);
+	do_run_with_large(fn, LARGE_RUN_PARTIAL_MREMAP, thpsize);
 }
 
 static void run_with_partial_shared_thp(test_fn fn, const char *desc)
 {
 	ksft_print_msg("[RUN] %s ... with partially shared THP\n", desc);
-	do_run_with_thp(fn, THP_RUN_PARTIAL_SHARED);
+	do_run_with_large(fn, LARGE_RUN_PARTIAL_SHARED, thpsize);
 }
 
 static void run_with_hugetlb(test_fn fn, const char *desc, size_t hugetlbsize)
@@ -1338,7 +1342,7 @@ static void run_anon_thp_test_cases(void)
 		struct test_case const *test_case = &anon_thp_test_cases[i];
 
 		ksft_print_msg("[RUN] %s\n", test_case->desc);
-		do_run_with_thp(test_case->fn, THP_RUN_PMD);
+		do_run_with_large(test_case->fn, LARGE_RUN_PMD, thpsize);
 	}
 }
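The alignment trick that do_run_with_large() relies on (over-allocate
twice the requested size, then round the start up to a natural boundary)
generalizes beyond THP. A minimal standalone demo, assuming size is a
power of two and independent of the selftest harness:

#include <stdint.h>
#include <stdio.h>
#include <sys/mman.h>

int main(void)
{
	size_t size = 64 * 1024;	/* e.g. a 64K large anon folio */

	/* Map twice the size so a naturally aligned window must exist. */
	char *raw = mmap(NULL, 2 * size, PROT_READ | PROT_WRITE,
			 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (raw == MAP_FAILED)
		return 1;

	/* Round up to the next size-aligned boundary within the mapping. */
	char *mem = (char *)(((uintptr_t)raw + size) & ~(size - 1));

	printf("raw=%p aligned=%p\n", raw, mem);
	munmap(raw, 2 * size);
	return 0;
}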
From patchwork Wed Jul 26 09:51:46 2023
X-Patchwork-Submitter: Ryan Roberts
X-Patchwork-Id: 13327785
From: Ryan Roberts
To: Andrew Morton, Matthew Wilcox, Yin Fengwei, David Hildenbrand,
 Yu Zhao, Catalin Marinas, Will Deacon, Anshuman Khandual, Yang Shi,
 "Huang, Ying", Zi Yan, Luis Chamberlain, Itaru Kitayama
Cc: Ryan Roberts, linux-mm@kvack.org, linux-kernel@vger.kernel.org,
 linux-arm-kernel@lists.infradead.org
Subject: [PATCH v4 5/5] selftests/mm/cow: Add large anon folio tests
Date: Wed, 26 Jul 2023 10:51:46 +0100
Message-Id: <20230726095146.2826796-6-ryan.roberts@arm.com>
In-Reply-To: <20230726095146.2826796-1-ryan.roberts@arm.com>
References: <20230726095146.2826796-1-ryan.roberts@arm.com>

Add tests similar to the existing THP tests, but which operate on memory
backed by large anonymous folios, which are smaller than THP. This
reuses all the existing infrastructure. If the test suite detects that
large anonymous folios are not supported by the kernel, the new tests
are skipped.

Signed-off-by: Ryan Roberts
---
 tools/testing/selftests/mm/cow.c | 111 +++++++++++++++++++++++++++++--
 1 file changed, 106 insertions(+), 5 deletions(-)

diff --git a/tools/testing/selftests/mm/cow.c b/tools/testing/selftests/mm/cow.c
index 304882bf2e5d..932242c965a4 100644
--- a/tools/testing/selftests/mm/cow.c
+++ b/tools/testing/selftests/mm/cow.c
@@ -33,6 +33,7 @@
 static size_t pagesize;
 static int pagemap_fd;
 static size_t thpsize;
+static size_t lafsize;
 static int nr_hugetlbsizes;
 static size_t hugetlbsizes[10];
 static int gup_fd;
@@ -927,6 +928,42 @@ static void run_with_partial_shared_thp(test_fn fn, const char *desc)
 	do_run_with_large(fn, LARGE_RUN_PARTIAL_SHARED, thpsize);
 }
 
+static void run_with_laf(test_fn fn, const char *desc)
+{
+	ksft_print_msg("[RUN] %s ... with large anon folio\n", desc);
+	do_run_with_large(fn, LARGE_RUN_PTE, lafsize);
+}
+
+static void run_with_laf_swap(test_fn fn, const char *desc)
+{
+	ksft_print_msg("[RUN] %s ... with swapped-out large anon folio\n", desc);
+	do_run_with_large(fn, LARGE_RUN_PTE_SWAPOUT, lafsize);
+}
+
+static void run_with_single_pte_of_laf(test_fn fn, const char *desc)
+{
+	ksft_print_msg("[RUN] %s ... with single PTE of large anon folio\n", desc);
+	do_run_with_large(fn, LARGE_RUN_SINGLE_PTE, lafsize);
+}
+
+static void run_with_single_pte_of_laf_swap(test_fn fn, const char *desc)
+{
+	ksft_print_msg("[RUN] %s ... with single PTE of swapped-out large anon folio\n", desc);
+	do_run_with_large(fn, LARGE_RUN_SINGLE_PTE_SWAPOUT, lafsize);
+}
+
+static void run_with_partial_mremap_laf(test_fn fn, const char *desc)
+{
+	ksft_print_msg("[RUN] %s ... with partially mremap()'ed large anon folio\n", desc);
+	do_run_with_large(fn, LARGE_RUN_PARTIAL_MREMAP, lafsize);
+}
+
+static void run_with_partial_shared_laf(test_fn fn, const char *desc)
+{
+	ksft_print_msg("[RUN] %s ... with partially shared large anon folio\n", desc);
+	do_run_with_large(fn, LARGE_RUN_PARTIAL_SHARED, lafsize);
+}
+
 static void run_with_hugetlb(test_fn fn, const char *desc, size_t hugetlbsize)
 {
 	int flags = MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB;
@@ -1105,6 +1142,14 @@ static void run_anon_test_case(struct test_case const *test_case)
 		run_with_partial_mremap_thp(test_case->fn, test_case->desc);
 		run_with_partial_shared_thp(test_case->fn, test_case->desc);
 	}
+	if (lafsize) {
+		run_with_laf(test_case->fn, test_case->desc);
+		run_with_laf_swap(test_case->fn, test_case->desc);
+		run_with_single_pte_of_laf(test_case->fn, test_case->desc);
+		run_with_single_pte_of_laf_swap(test_case->fn, test_case->desc);
+		run_with_partial_mremap_laf(test_case->fn, test_case->desc);
+		run_with_partial_shared_laf(test_case->fn, test_case->desc);
+	}
 	for (i = 0; i < nr_hugetlbsizes; i++)
 		run_with_hugetlb(test_case->fn, test_case->desc,
 				 hugetlbsizes[i]);
@@ -1126,6 +1171,8 @@ static int tests_per_anon_test_case(void)
 
 	if (thpsize)
 		tests += 8;
+	if (lafsize)
+		tests += 6;
 	return tests;
 }
 
@@ -1680,15 +1727,74 @@ static int tests_per_non_anon_test_case(void)
 	return tests;
 }
 
+static size_t large_anon_folio_size(void)
+{
+	/*
+	 * There is no interface to query this. But we know that it must be less
+	 * than thpsize. So we map a thpsize area, aligned to thpsize, offset by
+	 * thpsize/2 (to avoid a hugepage being allocated), then touch the first
+	 * page and see how many pages get faulted in.
+	 */
+
+	int max_order = __builtin_ctz(thpsize);
+	size_t mmap_size = thpsize * 3;
+	char *mmap_mem = NULL;
+	int order = 0;
+	char *mem;
+	size_t offset;
+	int ret;
+
+	/* For alignment purposes, we need 2.5x the requested size. */
+	mmap_mem = mmap(NULL, mmap_size, PROT_READ | PROT_WRITE,
+			MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
+	if (mmap_mem == MAP_FAILED)
+		goto out;
+
+	/* Align the memory area to thpsize then offset it by thpsize/2. */
+	mem = (char *)(((uintptr_t)mmap_mem + thpsize) & ~(thpsize - 1));
+	mem += thpsize / 2;
+
+	/* We might get a bigger large anon folio when MADV_HUGEPAGE is set. */
+	ret = madvise(mem, thpsize, MADV_HUGEPAGE);
+	if (ret)
+		goto out;
+
+	/* Probe the memory to see how much is populated. */
+	mem[0] = 0;
+	for (order = 0; order < max_order; order++) {
+		offset = (1 << order) * pagesize;
+		if (!pagemap_is_populated(pagemap_fd, mem + offset))
+			break;
+	}
+
+out:
+	if (mmap_mem)
+		munmap(mmap_mem, mmap_size);
+
+	if (order == 0)
+		return 0;
+
+	return offset;
+}
+
 int main(int argc, char **argv)
 {
 	int err;
 
+	gup_fd = open("/sys/kernel/debug/gup_test", O_RDWR);
+	pagemap_fd = open("/proc/self/pagemap", O_RDONLY);
+	if (pagemap_fd < 0)
+		ksft_exit_fail_msg("opening pagemap failed\n");
+
 	pagesize = getpagesize();
 	thpsize = read_pmd_pagesize();
 	if (thpsize)
 		ksft_print_msg("[INFO] detected THP size: %zu KiB\n", thpsize / 1024);
+	lafsize = large_anon_folio_size();
+	if (lafsize)
+		ksft_print_msg("[INFO] detected large anon folio size: %zu KiB\n",
+			       lafsize / 1024);
 	nr_hugetlbsizes = detect_hugetlb_page_sizes(hugetlbsizes,
 						    ARRAY_SIZE(hugetlbsizes));
 	detect_huge_zeropage();
@@ -1698,11 +1804,6 @@ int main(int argc, char **argv)
 		ARRAY_SIZE(anon_thp_test_cases) * tests_per_anon_thp_test_case() +
 		ARRAY_SIZE(non_anon_test_cases) * tests_per_non_anon_test_case());
 
-	gup_fd = open("/sys/kernel/debug/gup_test", O_RDWR);
-	pagemap_fd = open("/proc/self/pagemap", O_RDONLY);
-	if (pagemap_fd < 0)
-		ksft_exit_fail_msg("opening pagemap failed\n");
-
 	run_anon_test_cases();
 	run_anon_thp_test_cases();
 	run_non_anon_test_cases();
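Assuming a kernel built with CONFIG_LARGE_ANON_FOLIO=y, the updated cow
tests should run through the usual selftests flow, e.g.
make -C tools/testing/selftests TARGETS=mm run_tests, or by building and
executing tools/testing/selftests/mm/cow directly; some sub-tests may
need root privileges and active swap to avoid being skipped.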