From patchwork Thu Jun 27 00:54:22 2024
X-Patchwork-Submitter: Alistair Popple <apopple@nvidia.com>
X-Patchwork-Id: 13713667
From: Alistair Popple <apopple@nvidia.com>
To: dan.j.williams@intel.com, vishal.l.verma@intel.com, dave.jiang@intel.com,
    logang@deltatee.com, bhelgaas@google.com, jack@suse.cz, jgg@ziepe.ca
Cc: catalin.marinas@arm.com, will@kernel.org, mpe@ellerman.id.au,
    npiggin@gmail.com, dave.hansen@linux.intel.com, ira.weiny@intel.com,
    willy@infradead.org, djwong@kernel.org, tytso@mit.edu,
    linmiaohe@huawei.com, david@redhat.com, peterx@redhat.com,
    linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org,
    linux-arm-kernel@lists.infradead.org, linuxppc-dev@lists.ozlabs.org,
    nvdimm@lists.linux.dev, linux-cxl@vger.kernel.org,
    linux-fsdevel@vger.kernel.org, linux-mm@kvack.org,
    linux-ext4@vger.kernel.org, linux-xfs@vger.kernel.org,
    jhubbard@nvidia.com, hch@lst.de, david@fromorbit.com,
    Alistair Popple <apopple@nvidia.com>
Subject: [PATCH 07/13] huge_memory: Allow mappings of PUD sized pages
Date: Thu, 27 Jun 2024 10:54:22 +1000
X-Mailer: git-send-email 2.43.0
MIME-Version: 1.0
Currently DAX folio/page reference counts are managed differently from
those of normal pages. To allow them to be managed in the same way as
normal pages, introduce dax_insert_pfn_pud(). This maps the entire
PUD-sized folio and takes references on it as it would for a normally
mapped page.

This is distinct from the current mechanism, vmf_insert_pfn_pud(), which
simply inserts a special devmap PUD entry into the page table without
holding a reference to the page for the mapping.
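
As an illustrative usage sketch only (the handler name below and the
already-resolved pfn argument are hypothetical, not part of this patch),
a filesystem DAX fault path servicing a PUD-sized fault would call the
new helper where it previously called vmf_insert_pfn_pud():

	/*
	 * Hypothetical caller sketch: @pfn is PUD-aligned and backed by
	 * a struct page.
	 */
	static vm_fault_t example_dax_pud_fault(struct vm_fault *vmf, pfn_t pfn)
	{
		bool write = vmf->flags & FAULT_FLAG_WRITE;

		/*
		 * Unlike vmf_insert_pfn_pud(), dax_insert_pfn_pud() takes
		 * a reference on the PUD-sized folio and adds a file rmap
		 * entry for it, so the mapping is accounted like any other
		 * page mapping.
		 */
		return dax_insert_pfn_pud(vmf, pfn, write);
	}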
Signed-off-by: Alistair Popple <apopple@nvidia.com>
---
 include/linux/huge_mm.h |   4 ++-
 include/linux/rmap.h    |  14 +++++-
 mm/huge_memory.c        | 108 ++++++++++++++++++++++++++++++++++++++---
 mm/rmap.c               |  48 ++++++++++++++++++-
 4 files changed, 168 insertions(+), 6 deletions(-)

diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
index 2aa986a..b98a3cc 100644
--- a/include/linux/huge_mm.h
+++ b/include/linux/huge_mm.h
@@ -39,6 +39,7 @@ int change_huge_pmd(struct mmu_gather *tlb, struct vm_area_struct *vma,
 
 vm_fault_t vmf_insert_pfn_pmd(struct vm_fault *vmf, pfn_t pfn, bool write);
 vm_fault_t vmf_insert_pfn_pud(struct vm_fault *vmf, pfn_t pfn, bool write);
+vm_fault_t dax_insert_pfn_pud(struct vm_fault *vmf, pfn_t pfn, bool write);
 
 enum transparent_hugepage_flag {
 	TRANSPARENT_HUGEPAGE_UNSUPPORTED,
@@ -106,6 +107,9 @@ extern struct kobj_attribute shmem_enabled_attr;
 #define HPAGE_PUD_MASK	(~(HPAGE_PUD_SIZE - 1))
 #define HPAGE_PUD_SIZE	((1UL) << HPAGE_PUD_SHIFT)
+#define HPAGE_PUD_ORDER (HPAGE_PUD_SHIFT-PAGE_SHIFT)
+#define HPAGE_PUD_NR (1<<HPAGE_PUD_ORDER)
[...]
diff --git a/include/linux/rmap.h b/include/linux/rmap.h
[...]
 		atomic_add(orig_nr_pages, &folio->_large_mapcount);
 		break;
 	case RMAP_LEVEL_PMD:
+	case RMAP_LEVEL_PUD:
 		atomic_inc(&folio->_entire_mapcount);
 		atomic_inc(&folio->_large_mapcount);
 		break;
@@ -434,6 +447,7 @@ static __always_inline int __folio_try_dup_anon_rmap(struct folio *folio,
 		atomic_add(orig_nr_pages, &folio->_large_mapcount);
 		break;
 	case RMAP_LEVEL_PMD:
+	case RMAP_LEVEL_PUD:
 		if (PageAnonExclusive(page)) {
 			if (unlikely(maybe_pinned))
 				return -EBUSY;
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index db7946a..e1f053e 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -1283,6 +1283,70 @@ vm_fault_t vmf_insert_pfn_pud(struct vm_fault *vmf, pfn_t pfn, bool write)
 	return VM_FAULT_NOPAGE;
 }
 EXPORT_SYMBOL_GPL(vmf_insert_pfn_pud);
+
+/**
+ * dax_insert_pfn_pud - insert a pud size pfn backed by a normal page
+ * @vmf: Structure describing the fault
+ * @pfn: pfn of the page to insert
+ * @write: whether it's a write fault
+ *
+ * Return: vm_fault_t value.
+ */
+vm_fault_t dax_insert_pfn_pud(struct vm_fault *vmf, pfn_t pfn, bool write)
+{
+	struct vm_area_struct *vma = vmf->vma;
+	unsigned long addr = vmf->address & PUD_MASK;
+	pud_t *pud = vmf->pud;
+	pgprot_t prot = vma->vm_page_prot;
+	struct mm_struct *mm = vma->vm_mm;
+	pud_t entry;
+	spinlock_t *ptl;
+	struct folio *folio;
+	struct page *page;
+
+	if (addr < vma->vm_start || addr >= vma->vm_end)
+		return VM_FAULT_SIGBUS;
+
+	track_pfn_insert(vma, &prot, pfn);
+
+	ptl = pud_lock(mm, pud);
+	if (!pud_none(*pud)) {
+		if (write) {
+			if (pud_pfn(*pud) != pfn_t_to_pfn(pfn)) {
+				WARN_ON_ONCE(!is_huge_zero_pud(*pud));
+				goto out_unlock;
+			}
+			entry = pud_mkyoung(*pud);
+			entry = maybe_pud_mkwrite(pud_mkdirty(entry), vma);
+			if (pudp_set_access_flags(vma, addr, pud, entry, 1))
+				update_mmu_cache_pud(vma, addr, pud);
+		}
+		goto out_unlock;
+	}
+
+	entry = pud_mkhuge(pfn_t_pud(pfn, prot));
+	if (pfn_t_devmap(pfn))
+		entry = pud_mkdevmap(entry);
+	if (write) {
+		entry = pud_mkyoung(pud_mkdirty(entry));
+		entry = maybe_pud_mkwrite(entry, vma);
+	}
+
+	page = pfn_t_to_page(pfn);
+	folio = page_folio(page);
+	folio_get(folio);
+	folio_add_file_rmap_pud(folio, page, vma);
+	add_mm_counter(mm, mm_counter_file(folio), HPAGE_PUD_NR);
+
+	set_pud_at(mm, addr, pud, entry);
+	update_mmu_cache_pud(vma, addr, pud);
+
+out_unlock:
+	spin_unlock(ptl);
+
+	return VM_FAULT_NOPAGE;
+}
+EXPORT_SYMBOL_GPL(dax_insert_pfn_pud);
 #endif /* CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD */
 
 void touch_pmd(struct vm_area_struct *vma, unsigned long addr,
@@ -1836,7 +1900,8 @@ int zap_huge_pmd(struct mmu_gather *tlb, struct vm_area_struct *vma,
 		zap_deposited_table(tlb->mm, pmd);
 		spin_unlock(ptl);
 	} else if (is_huge_zero_pmd(orig_pmd)) {
-		zap_deposited_table(tlb->mm, pmd);
+		if (!vma_is_dax(vma) || arch_needs_pgtable_deposit())
+			zap_deposited_table(tlb->mm, pmd);
 		spin_unlock(ptl);
 	} else {
 		struct folio *folio = NULL;
@@ -2268,20 +2333,34 @@ spinlock_t *__pud_trans_huge_lock(pud_t *pud, struct vm_area_struct *vma)
 int zap_huge_pud(struct mmu_gather *tlb, struct vm_area_struct *vma,
 		 pud_t *pud, unsigned long addr)
 {
+	pud_t orig_pud;
 	spinlock_t *ptl;
 
 	ptl = __pud_trans_huge_lock(pud, vma);
 	if (!ptl)
 		return 0;
 
-	pudp_huge_get_and_clear_full(vma, addr, pud, tlb->fullmm);
+	orig_pud = pudp_huge_get_and_clear_full(vma, addr, pud, tlb->fullmm);
 	tlb_remove_pud_tlb_entry(tlb, pud, addr);
-	if (vma_is_special_huge(vma)) {
+	if (!vma_is_dax(vma) && vma_is_special_huge(vma)) {
 		spin_unlock(ptl);
 		/* No zero page support yet */
 	} else {
-		/* No support for anonymous PUD pages yet */
-		BUG();
+		struct page *page = NULL;
+		struct folio *folio;
+
+		/* No support for anonymous PUD pages or migration yet */
+		BUG_ON(vma_is_anonymous(vma) || !pud_present(orig_pud));
+
+		page = pud_page(orig_pud);
+		folio = page_folio(page);
+		folio_remove_rmap_pud(folio, page, vma);
+		VM_BUG_ON_PAGE(page_mapcount(page) < 0, page);
+		VM_BUG_ON_PAGE(!PageHead(page), page);
+		add_mm_counter(tlb->mm, mm_counter_file(folio), -HPAGE_PUD_NR);
+
+		spin_unlock(ptl);
+		tlb_remove_page_size(tlb, page, HPAGE_PUD_SIZE);
 	}
 	return 1;
 }
@@ -2289,6 +2368,8 @@ int zap_huge_pud(struct mmu_gather *tlb, struct vm_area_struct *vma,
 static void __split_huge_pud_locked(struct vm_area_struct *vma, pud_t *pud,
 		unsigned long haddr)
 {
+	pud_t old_pud;
+
 	VM_BUG_ON(haddr & ~HPAGE_PUD_MASK);
 	VM_BUG_ON_VMA(vma->vm_start > haddr, vma);
 	VM_BUG_ON_VMA(vma->vm_end < haddr + HPAGE_PUD_SIZE, vma);
@@ -2296,7 +2377,22 @@ static void __split_huge_pud_locked(struct vm_area_struct *vma, pud_t *pud,
 	count_vm_event(THP_SPLIT_PUD);
 
-	pudp_huge_clear_flush(vma, haddr, pud);
+	old_pud = pudp_huge_clear_flush(vma, haddr, pud);
+	if (is_huge_zero_pud(old_pud))
+		return;
+
+	if (vma_is_dax(vma)) {
+		struct page *page = pud_page(old_pud);
+		struct folio *folio = page_folio(page);
+
+		if (!folio_test_dirty(folio) && pud_dirty(old_pud))
+			folio_mark_dirty(folio);
+		if (!folio_test_referenced(folio) && pud_young(old_pud))
+			folio_set_referenced(folio);
+		folio_remove_rmap_pud(folio, page, vma);
+		folio_put(folio);
+		add_mm_counter(vma->vm_mm, mm_counter_file(folio), -HPAGE_PUD_NR);
+	}
 }
 
 void __split_huge_pud(struct vm_area_struct *vma, pud_t *pud,
diff --git a/mm/rmap.c b/mm/rmap.c
index e8fc5ec..e949e4f 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -1165,6 +1165,7 @@ static __always_inline unsigned int __folio_add_rmap(struct folio *folio,
 		atomic_add(orig_nr_pages, &folio->_large_mapcount);
 		break;
 	case RMAP_LEVEL_PMD:
+	case RMAP_LEVEL_PUD:
 		first = atomic_inc_and_test(&folio->_entire_mapcount);
 		if (first) {
 			nr = atomic_add_return_relaxed(ENTIRELY_MAPPED, mapped);
@@ -1306,6 +1307,12 @@ static __always_inline void __folio_add_anon_rmap(struct folio *folio,
 		case RMAP_LEVEL_PMD:
 			SetPageAnonExclusive(page);
 			break;
+		case RMAP_LEVEL_PUD:
+			/*
+			 * Keep the compiler happy, we don't support anonymous PUD mappings.
+			 */
+			WARN_ON_ONCE(1);
+			break;
 		}
 	}
 	for (i = 0; i < nr_pages; i++) {
@@ -1489,6 +1496,26 @@ void folio_add_file_rmap_pmd(struct folio *folio, struct page *page,
 #endif
 }
 
+/**
+ * folio_add_file_rmap_pud - add a PUD mapping to a page range of a folio
+ * @folio:	The folio to add the mapping to
+ * @page:	The first page to add
+ * @vma:	The vm area in which the mapping is added
+ *
+ * The page range of the folio is defined by [page, page + HPAGE_PUD_NR)
+ *
+ * The caller needs to hold the page table lock.
+ */
+void folio_add_file_rmap_pud(struct folio *folio, struct page *page,
+		struct vm_area_struct *vma)
+{
+#ifdef CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD
+	__folio_add_file_rmap(folio, page, HPAGE_PUD_NR, vma, RMAP_LEVEL_PUD);
+#else
+	WARN_ON_ONCE(true);
+#endif
+}
+
 static __always_inline void __folio_remove_rmap(struct folio *folio,
 		struct page *page, int nr_pages, struct vm_area_struct *vma,
 		enum rmap_level level)
@@ -1521,6 +1548,7 @@ static __always_inline void __folio_remove_rmap(struct folio *folio,
 		partially_mapped = nr && atomic_read(mapped);
 		break;
 	case RMAP_LEVEL_PMD:
+	case RMAP_LEVEL_PUD:
 		atomic_dec(&folio->_large_mapcount);
 		last = atomic_add_negative(-1, &folio->_entire_mapcount);
 		if (last) {
@@ -1615,6 +1643,26 @@ void folio_remove_rmap_pmd(struct folio *folio, struct page *page,
 #endif
 }
 
+/**
+ * folio_remove_rmap_pud - remove a PUD mapping from a page range of a folio
+ * @folio:	The folio to remove the mapping from
+ * @page:	The first page to remove
+ * @vma:	The vm area from which the mapping is removed
+ *
+ * The page range of the folio is defined by [page, page + HPAGE_PUD_NR)
+ *
+ * The caller needs to hold the page table lock.
+ */
+void folio_remove_rmap_pud(struct folio *folio, struct page *page,
+		struct vm_area_struct *vma)
+{
+#ifdef CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD
+	__folio_remove_rmap(folio, page, HPAGE_PUD_NR, vma, RMAP_LEVEL_PUD);
+#else
+	WARN_ON_ONCE(true);
+#endif
+}
+
 /*
  * @arg: enum ttu_flags will be passed to this argument
  */
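
Taken together, the insert and teardown paths now balance the way they
do for any other file-backed mapping. A condensed, illustrative summary
of the accounting (paraphrased from the hunks above; locking, error
paths and the devmap special case are elided):

	/* Fault path (dax_insert_pfn_pud()): map and account the folio. */
	folio_get(folio);
	folio_add_file_rmap_pud(folio, page, vma);
	add_mm_counter(mm, mm_counter_file(folio), HPAGE_PUD_NR);
	set_pud_at(mm, addr, pud, entry);

	/*
	 * Teardown (__split_huge_pud_locked() shown; zap_huge_pud() instead
	 * drops the insert-time reference via the mmu_gather once the TLB
	 * has been flushed).
	 */
	folio_remove_rmap_pud(folio, page, vma);
	add_mm_counter(vma->vm_mm, mm_counter_file(folio), -HPAGE_PUD_NR);
	folio_put(folio);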