From patchwork Thu Jan 16 21:10:38 2025
X-Patchwork-Submitter: Zi Yan
X-Patchwork-Id: 13942307
From: Zi Yan
To: linux-mm@kvack.org, Andrew Morton, "Kirill A. Shutemov", "Matthew Wilcox (Oracle)"
Cc: Ryan Roberts, Hugh Dickins, David Hildenbrand, Yang Shi, Miaohe Lin, Kefeng Wang, Yu Zhao, John Hubbard, Baolin Wang, linux-kselftest@vger.kernel.org, linux-kernel@vger.kernel.org, Zi Yan
Subject: [PATCH v5 06/10] mm/huge_memory: add buddy allocator like folio_split()
Date: Thu, 16 Jan 2025 16:10:38 -0500
Message-ID: <20250116211042.741543-7-ziy@nvidia.com>
In-Reply-To: <20250116211042.741543-1-ziy@nvidia.com>
References: <20250116211042.741543-1-ziy@nvidia.com>

folio_split() splits a large folio in the same way as the buddy allocator
splits a large free page for allocation. The purpose is to minimize the
number of folios after the split. For example, if the user wants to free the
3rd subpage in an order-9 folio, folio_split() will split the order-9 folio
as:

O-0, O-0, O-0, O-0, O-2, O-3, O-4, O-5, O-6, O-7, O-8 if it is anon,
O-1, O-0, O-0, O-2, O-3, O-4, O-5, O-6, O-7, O-8 if it is pagecache,

since anon folios do not support order-1 yet.

This generates far fewer folios (11 or 10) than the existing page split
approach, which splits the order-9 folio into 512 order-0 folios.

folio_split() and the existing split_huge_page_to_list_to_order() share the
folio unmapping and remapping code in __folio_split() and the common backend
split code in __split_unmapped_folio(), using the uniform_split variable to
distinguish their operations.

uniform_split_supported() and non_uniform_split_supported() are added to
factor out the check code and will be used outside __folio_split() in the
following commit.

Signed-off-by: Zi Yan
---
 mm/huge_memory.c | 134 ++++++++++++++++++++++++++++++++++-------------
 1 file changed, 97 insertions(+), 37 deletions(-)

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 6c0089a0bcdb..d9f5ca61d78c 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -3738,12 +3738,68 @@ static int __split_unmapped_folio(struct folio *folio, int new_order,
 	return ret;
 }
 
+static bool non_uniform_split_supported(struct folio *folio, unsigned int new_order,
+		bool warns)
+{
+	/* order-1 is not supported for anonymous THP. */
+	if (folio_test_anon(folio) && new_order == 1) {
+		VM_WARN_ONCE(warns, "Cannot split to order-1 folio");
+		return false;
+	}
+
+	/*
+	 * No split if the file system does not support large folio.
+	 * Note that we might still have THPs in such mappings due to
+	 * CONFIG_READ_ONLY_THP_FOR_FS. But in that case, the mapping
+	 * does not actually support large folios properly.
+	 */
+	if (IS_ENABLED(CONFIG_READ_ONLY_THP_FOR_FS) &&
+	    !mapping_large_folio_support(folio->mapping)) {
+		VM_WARN_ONCE(warns,
+			"Cannot split file folio to non-0 order");
+		return false;
+	}
+
+	/* Only swapping a whole PMD-mapped folio is supported */
+	if (folio_test_swapcache(folio)) {
+		VM_WARN_ONCE(warns,
+			"Cannot split swapcache folio to non-0 order");
+		return false;
+	}
+
+	return true;
+}
+
+/* See comments in non_uniform_split_supported() */
+static bool uniform_split_supported(struct folio *folio, unsigned int new_order,
+		bool warns)
+{
+	if (folio_test_anon(folio) && new_order == 1) {
+		VM_WARN_ONCE(warns, "Cannot split to order-1 folio");
+		return false;
+	}
+
+	if (new_order) {
+		if (IS_ENABLED(CONFIG_READ_ONLY_THP_FOR_FS) &&
+		    !mapping_large_folio_support(folio->mapping)) {
+			VM_WARN_ONCE(warns,
+				"Cannot split file folio to non-0 order");
+			return false;
+		}
+		if (folio_test_swapcache(folio)) {
+			VM_WARN_ONCE(warns,
+				"Cannot split swapcache folio to non-0 order");
+			return false;
+		}
+	}
+	return true;
+}
+
 static int __folio_split(struct folio *folio, unsigned int new_order,
-		struct page *page, struct list_head *list)
+		struct page *page, struct list_head *list, bool uniform_split)
 {
 	struct deferred_split *ds_queue = get_deferred_split_queue(folio);
-	/* reset xarray order to new order after split */
-	XA_STATE_ORDER(xas, &folio->mapping->i_pages, folio->index, new_order);
+	XA_STATE(xas, &folio->mapping->i_pages, folio->index);
 	bool is_anon = folio_test_anon(folio);
 	struct address_space *mapping = NULL;
 	struct anon_vma *anon_vma = NULL;
@@ -3758,29 +3814,11 @@ static int __folio_split(struct folio *folio, unsigned int new_order,
 	if (new_order >= folio_order(folio))
 		return -EINVAL;
 
-	if (is_anon) {
-		/* order-1 is not supported for anonymous THP. */
-		if (new_order == 1) {
-			VM_WARN_ONCE(1, "Cannot split to order-1 folio");
-			return -EINVAL;
-		}
-	} else if (new_order) {
-		/*
-		 * No split if the file system does not support large folio.
-		 * Note that we might still have THPs in such mappings due to
-		 * CONFIG_READ_ONLY_THP_FOR_FS. But in that case, the mapping
-		 * does not actually support large folios properly.
-		 */
-		if (IS_ENABLED(CONFIG_READ_ONLY_THP_FOR_FS) &&
-		    !mapping_large_folio_support(folio->mapping)) {
-			VM_WARN_ONCE(1,
-				"Cannot split file folio to non-0 order");
-			return -EINVAL;
-		}
-	}
+	if (uniform_split && !uniform_split_supported(folio, new_order, true))
+		return -EINVAL;
 
-	/* Only swapping a whole PMD-mapped folio is supported */
-	if (folio_test_swapcache(folio) && new_order)
+	if (!uniform_split &&
+	    !non_uniform_split_supported(folio, new_order, true))
 		return -EINVAL;
 
 	is_hzp = is_huge_zero_folio(folio);
@@ -3837,10 +3875,13 @@ static int __folio_split(struct folio *folio, unsigned int new_order,
 			goto out;
 		}
 
-		xas_split_alloc(&xas, folio, folio_order(folio), gfp);
-		if (xas_error(&xas)) {
-			ret = xas_error(&xas);
-			goto out;
+		if (uniform_split) {
+			xas_set_order(&xas, folio->index, new_order);
+			xas_split_alloc(&xas, folio, folio_order(folio), gfp);
+			if (xas_error(&xas)) {
+				ret = xas_error(&xas);
+				goto out;
+			}
 		}
 
 		anon_vma = NULL;
@@ -3905,7 +3946,6 @@ static int __folio_split(struct folio *folio, unsigned int new_order,
 		if (mapping) {
 			int nr = folio_nr_pages(folio);
 
-			xas_split(&xas, folio, folio_order(folio));
 			if (folio_test_pmd_mappable(folio) &&
 			    new_order < HPAGE_PMD_ORDER) {
 				if (folio_test_swapbacked(folio)) {
@@ -3919,12 +3959,8 @@ static int __folio_split(struct folio *folio, unsigned int new_order,
 				}
 			}
 
-			if (is_anon) {
-				mod_mthp_stat(order, MTHP_STAT_NR_ANON, -1);
-				mod_mthp_stat(new_order, MTHP_STAT_NR_ANON, 1 << (order - new_order));
-			}
-			__split_huge_page(page, list, end, new_order);
-			ret = 0;
+			ret = __split_unmapped_folio(page_folio(page), new_order,
+					page, list, end, &xas, mapping, uniform_split);
 		} else {
 			spin_unlock(&ds_queue->split_queue_lock);
 fail:
@@ -4002,7 +4038,31 @@ int split_huge_page_to_list_to_order(struct page *page, struct list_head *list,
 {
 	struct folio *folio = page_folio(page);
 
-	return __folio_split(folio, new_order, page, list);
+	return __folio_split(folio, new_order, page, list, true);
+}
+
+/*
+ * folio_split: split a folio at offset_in_new_order to a new_order folio
+ * @folio: folio to split
+ * @new_order: the order of the new folio
+ * @page: a page within the new folio
+ *
+ * return: 0: successful, <0 failed (if -ENOMEM is returned, @folio might be
+ * split but not to @new_order, the caller needs to check)
+ *
+ * Split a folio at offset_in_new_order to a new_order folio, leave the
+ * remaining subpages of the original folio as large as possible. For example,
+ * split an order-9 folio at its third order-3 subpages to an order-3 folio.
+ * There are 2^6=64 order-3 subpages in an order-9 folio and the result will be
+ * a set of folios with different order and the new folio is in bracket:
+ * [order-4, {order-3}, order-3, order-5, order-6, order-7, order-8].
+ *
+ * After split, folio is left locked for caller.
+ */
+int folio_split(struct folio *folio, unsigned int new_order,
+		struct page *page, struct list_head *list)
+{
+	return __folio_split(folio, new_order, page, list, false);
 }
 
 int min_order_for_split(struct folio *folio)
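
Illustration only, not part of the patch: the split orders quoted in the
commit message and in the folio_split() comment can be reproduced with a
small userspace model. This is a sketch under assumptions. The names
sketch_split_orders() and DEMO_MAX_ORDER are hypothetical, and the loop
(halve the piece that contains the target page one order at a time, skipping
order-1 for anon) is inferred from the examples in this patch rather than
copied from __split_unmapped_folio().

```c
#include <assert.h>
#include <stdbool.h>

#define DEMO_MAX_ORDER	9	/* hypothetical cap for this sketch */
#define DEMO_MAX_PAGES	(1UL << DEMO_MAX_ORDER)

/*
 * Model of a buddy-allocator-like (non-uniform) split. Hypothetical helper,
 * not a kernel API.
 *
 * Splits an order-@old_order block so that the piece containing page index
 * @idx ends up at @new_order, keeping every other piece as large as possible.
 * Writes the resulting orders to @out in address order and returns how many
 * pieces there are, or -1 on invalid input or if @out is too small.
 */
static int sketch_split_orders(int old_order, int new_order, unsigned long idx,
			       bool is_anon, int *out, int out_cap)
{
	int order_at[DEMO_MAX_PAGES];	/* order of the piece starting at each page */
	unsigned long nr_pages, start = 0, pfn;
	int cur = old_order, split_order, n = 0;

	if (old_order > DEMO_MAX_ORDER || new_order < 0 || new_order >= old_order)
		return -1;
	/* Assumption taken from the patch: anon folios cannot be order-1. */
	if (is_anon && new_order == 1)
		return -1;

	nr_pages = 1UL << old_order;
	if (idx >= nr_pages)
		return -1;

	for (pfn = 0; pfn < nr_pages; pfn++)
		order_at[pfn] = -1;

	for (split_order = cur - 1; split_order >= new_order; split_order--) {
		unsigned long piece = 1UL << split_order;
		unsigned long end = start + (1UL << cur);

		/* Skip order-1 for anon: order-2 is then split straight to order-0. */
		if (is_anon && split_order == 1)
			continue;

		/* Cut the current piece into pieces of split_order. */
		for (pfn = start; pfn < end; pfn += piece)
			order_at[pfn] = split_order;

		/* Keep splitting only the piece that contains idx. */
		start += ((idx - start) / piece) * piece;
		cur = split_order;
	}

	/* The piece holding idx must have reached the requested order. */
	assert(order_at[start] == new_order);

	for (pfn = 0; pfn < nr_pages; pfn++) {
		if (order_at[pfn] < 0)
			continue;
		if (n == out_cap)
			return -1;
		out[n++] = order_at[pfn];
	}
	return n;
}
```

Under this model, sketch_split_orders(9, 0, 3, true, ...) yields 11 pieces
(0, 0, 0, 0, 2, 3, 4, 5, 6, 7, 8) and the same call with is_anon false yields
10 pieces (1, 0, 0, 2, 3, 4, 5, 6, 7, 8). Splitting at page 16 to order-3
(the third order-3 subpage) yields 4, 3, 3, 5, 6, 7, 8, matching the example
in the folio_split() comment.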