From patchwork Mon Mar 25 04:44:52 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: John Hubbard
X-Patchwork-Id: 13601175
From: John Hubbard
To: Andrew Morton
CC: LKML, John Hubbard, David Hildenbrand, Matthew Wilcox, Zi Yan
Subject: [PATCH] huge_memory.c: document huge page splitting rules more thoroughly
Date: Sun, 24 Mar 2024 21:44:52 -0700
Message-ID: <20240325044452.217463-1-jhubbard@nvidia.com>
X-Mailer: git-send-email 2.44.0

1. Add information about the behavior of huge page splitting, with respect
   to page/folio refcounts, and gup/pup pins.

2. Update and clarify the existing documentation, to compensate for the
   ravages of time and code change.

Cc: David Hildenbrand
Cc: Matthew Wilcox
Cc: Zi Yan
Signed-off-by: John Hubbard
Reviewed-by: David Hildenbrand
---

Hi David, Matthew, Zi,

This is a follow up from our short email thread of a week ago [1].

Zi, I've inflicted some minor violence upon your original wording, and
moved it into a Prerequisites section (item 4).

[1] https://lore.kernel.org/all/d9c06bec-805f-4d53-9f91-6b8ad29fcb6b@redhat.com/

thanks,
John Hubbard
NVIDIA

 mm/huge_memory.c | 42 +++++++++++++++++++++++++++---------------
 1 file changed, 27 insertions(+), 15 deletions(-)

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 9859aa4f7553..9f2354068359 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -3013,28 +3013,40 @@ bool can_split_folio(struct folio *folio, int *pextra_pins)
 }
 
 /*
- * This function splits huge page into pages in @new_order. @page can point to
- * any subpage of huge page to split. Split doesn't change the position of
- * @page.
+ * This function splits a large folio into smaller folios of order @new_order.
+ * @page can point to any page of the large folio to split. The split operation
+ * does not change the position of @page.
  *
- * NOTE: order-1 anonymous folio is not supported because _deferred_list,
- * which is used by partially mapped folios, is stored in subpage 2 and an
- * order-1 folio only has subpage 0 and 1. File-backed order-1 folios are OK,
- * since they do not use _deferred_list.
+ * Prerequisites:
  *
- * Only caller must hold pin on the @page, otherwise split fails with -EBUSY.
- * The huge page must be locked.
+ * 1) The caller must hold a reference on the @page's owning folio, also known
+ *    as the large folio.
+ *
+ * 2) The large folio must be locked.
+ *
+ * 3) The folio must not be pinned. Any unexpected folio references, including
+ *    GUP pins, will result in the folio not getting split; instead, the caller
+ *    will receive an -EBUSY.
+ *
+ * 4) @new_order > 1, usually. Splitting to order-1 anonymous folios is not
+ *    supported for non-file-backed folios, because folio->_deferred_list, which
+ *    is used by partially mapped folios, is stored in subpage 2, but an order-1
+ *    folio only has subpages 0 and 1. File-backed order-1 folios are supported,
+ *    since they do not use _deferred_list.
+ *
+ * After splitting, the caller's folio reference will be transferred to @page,
+ * resulting in a raised refcount of @page after this call. The other pages may
+ * be freed if they are not mapped.
  *
  * If @list is null, tail pages will be added to LRU list, otherwise, to @list.
  *
- * Pages in new_order will inherit mapping, flags, and so on from the hugepage.
+ * Pages in @new_order will inherit the mapping, flags, and so on from the
+ * huge page.
  *
- * GUP pin and PG_locked transferred to @page or the compound page @page belongs
- * to. Rest subpages can be freed if they are not mapped.
+ * Returns 0 if the huge page was split successfully.
  *
- * Returns 0 if the hugepage is split successfully.
- * Returns -EBUSY if the page is pinned or if anon_vma disappeared from under
- * us.
+ * Returns -EBUSY if @page's folio is pinned, or if the anon_vma disappeared
+ * from under us.
  */
 int split_huge_page_to_list_to_order(struct page *page, struct list_head *list,
 				     unsigned int new_order)