From patchwork Fri Nov 22 01:40:22 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alistair Popple X-Patchwork-Id: 13882582 Received: from NAM12-DM6-obe.outbound.protection.outlook.com (mail-dm6nam12on2068.outbound.protection.outlook.com [40.107.243.68]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 61F9015533B; Fri, 22 Nov 2024 01:41:15 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=fail smtp.client-ip=40.107.243.68 ARC-Seal: i=2; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1732239677; cv=fail; b=e4xAg7Ngn/6bHUmkRs3EQKIki8unxxHlzGF5Hg3B1HiUXtZX5SusNgJN2qYlcPOr1ZbI0u1DbdB8IV2QLhPGZDA4zn+46w96XYFQZ16KCthQ4/DpJ4r80VkndDrjcIAyXWmen+L+FtcQWgZ7/igPWoSUcMzAPn3yTo84MHiPv98= ARC-Message-Signature: i=2; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1732239677; c=relaxed/simple; bh=+cXy75NjXZLbraO5UZj7ueoUGm1nUBGbf57dhTWW4Qg=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: Content-Type:MIME-Version; b=s8gO21EQi5H715sorv0Dvi4vQV6aiD3FpKDoDOzALoZKHxsqw6LbnDFmhjf3Ky3Q0GQkfyT43+3jNDAlqaaaOH34VxTAh1I0KWxKiTU++8SsZIslHAkfz+zD3HEUCi2zVUpsZoOqAdI67bKUX2JT7L+IDjcKndHTdZ3CJlKxpSA= ARC-Authentication-Results: i=2; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=nvidia.com; spf=fail smtp.mailfrom=nvidia.com; dkim=pass (2048-bit key) header.d=Nvidia.com header.i=@Nvidia.com header.b=gWozrgwZ; arc=fail smtp.client-ip=40.107.243.68 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=nvidia.com Authentication-Results: smtp.subspace.kernel.org; spf=fail smtp.mailfrom=nvidia.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=Nvidia.com header.i=@Nvidia.com header.b="gWozrgwZ" ARC-Seal: i=1; a=rsa-sha256; s=arcselector10001; d=microsoft.com; cv=none; b=cNyIoyA3cQHW6lnahlQfhnRa9dZdDrQtVyyXV3H5UMizrw3Ka0YJSZP4q/npwGh1ddluWaYOyvqQGr+8hB8lVtDKjUIUmRVNV2aGTVf3Ut9d5TAFCJp7aEAxI+P+Hs+ZbfmAsoIcc0QALuAXuLw/dPqpkgP0Qi1jFQhvyk/0L14HsX7iLoNEFtuDiF0HIyruD/LklgzPQU1zIJ8AEqY2kOK5CwT2Zh8eEJ2v6C6dkJsyy1o2rF3fqrs0TJc1g0dAUxqwcVv6NU10saOqJM/YugYbbNqh0H2w7z043/ajWqAFlcW3CFEa3X+9V8pAiOXIioOQoSsFiYthFLQIqMK4vg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector10001; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=MiSKhC/DzexmAjwbx3sF4Vku37jAyZrulJ1Vd2F/Mys=; b=vSliybnA+ioxNLO2q7o9dBkeFQp3v0i2wg7UfzuGWE5pSp0DDEXXduDDEECi2t4WkFCOl7PZB5vwpwZ3tk//KW51VvU+K7aOodlAEuyborC/SfOuybH60cUQLDVfTBaPvWKrNo9xyjBH+ivLY9DnPk+NroFAkumv648HQ+gVNnsQw8Ugsdn7CgpC6aQugOI9jW7XSQi6xNb1dNujSghJDWPGGgwqDdpYZ6//Zg2UNoXdV0YY4yE2VX20iEuwlCV7dUjA46T4RiXxGDXbKr5ZbypYpmGPUKj1gT1znOnf2Xd3j0GjSOUfYf7ouT7xj4vgrSMD2O3v+R3JdVkLPA3l9w== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=nvidia.com; dmarc=pass action=none header.from=nvidia.com; dkim=pass header.d=nvidia.com; arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=Nvidia.com; s=selector2; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=MiSKhC/DzexmAjwbx3sF4Vku37jAyZrulJ1Vd2F/Mys=; b=gWozrgwZV65KyVKHeQUa/pwSg12GdWLPerOSDlWTM+1P3R8Ghp1eyWxAd5PBHcr9+32969XEuwBTUU2Oap5Em/zYLNOdTze3xEttWkNOfH/JnlJgvq9LDpijNERR6yFNZXmz7T+AgLKdJZg3Wthvd84IOWPLRLZtmYoiQj/rg9a+7hJJNfl8v92FXcbe8Lql7tfdQnj+KrA+XGJyVwPwsbRXbu4E8LZbLNnMDwtuUQR+58UJZes/f6sp/tNXNbUPbeVYwnrtDHik9w+nz/QDDkfpmaHeGsAG8nghAoUxOa3Chd3O3MTsoft5kHLOEPWyUMLZ+351CfsvZeSYTW4wbg== Authentication-Results: dkim=none (message not signed) header.d=none;dmarc=none action=none header.from=nvidia.com; Received: from DS0PR12MB7726.namprd12.prod.outlook.com (2603:10b6:8:130::6) by IA1PR12MB6305.namprd12.prod.outlook.com (2603:10b6:208:3e7::16) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.8182.17; Fri, 22 Nov 2024 01:41:12 +0000 Received: from DS0PR12MB7726.namprd12.prod.outlook.com ([fe80::953f:2f80:90c5:67fe]) by DS0PR12MB7726.namprd12.prod.outlook.com ([fe80::953f:2f80:90c5:67fe%4]) with mapi id 15.20.8182.016; Fri, 22 Nov 2024 01:41:12 +0000 From: Alistair Popple To: dan.j.williams@intel.com, linux-mm@kvack.org Cc: Alistair Popple , lina@asahilina.net, zhang.lyra@gmail.com, gerald.schaefer@linux.ibm.com, vishal.l.verma@intel.com, dave.jiang@intel.com, logang@deltatee.com, bhelgaas@google.com, jack@suse.cz, jgg@ziepe.ca, catalin.marinas@arm.com, will@kernel.org, mpe@ellerman.id.au, npiggin@gmail.com, dave.hansen@linux.intel.com, ira.weiny@intel.com, willy@infradead.org, djwong@kernel.org, tytso@mit.edu, linmiaohe@huawei.com, david@redhat.com, peterx@redhat.com, linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org, linux-arm-kernel@lists.infradead.org, linuxppc-dev@lists.ozlabs.org, nvdimm@lists.linux.dev, linux-cxl@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-ext4@vger.kernel.org, linux-xfs@vger.kernel.org, jhubbard@nvidia.com, hch@lst.de, david@fromorbit.com, Vivek Goyal Subject: [PATCH v3 01/25] fuse: Fix dax truncate/punch_hole fault path Date: Fri, 22 Nov 2024 12:40:22 +1100 Message-ID: X-Mailer: git-send-email 2.45.2 In-Reply-To: References: X-ClientProxiedBy: SYBPR01CA0190.ausprd01.prod.outlook.com (2603:10c6:10:52::34) To DS0PR12MB7726.namprd12.prod.outlook.com (2603:10b6:8:130::6) Precedence: bulk X-Mailing-List: linux-xfs@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-MS-PublicTrafficType: Email X-MS-TrafficTypeDiagnostic: DS0PR12MB7726:EE_|IA1PR12MB6305:EE_ X-MS-Office365-Filtering-Correlation-Id: 61f8c056-7dfc-451d-5a9a-08dd0a96bedf X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam: BCL:0;ARA:13230040|1800799024|366016|7416014|376014; X-Microsoft-Antispam-Message-Info: NPTJ8tG6oj3aZQ0anSPm2zBQaJ1sUykEpXCOAzAZUisWMTqckCBZIfoRsUT1Fg8dPSVEtr0zKsrhMMRb8PU7rQ0M36sKTMnXdqG21nU6Hk2R92IbB0NXqWKyFhAlw1fy+qM1s0y1QTkpD2bFV1wi/Vbnqgmc3vtTajS+RUK0z6j4eCabkFy7Ureq+NDPqJFKBnsC2qjVWZ0NE2wDHRR789TGoCsg1TeZv7S8P1aKe/1uprpB67UbYAdCXtkaFGSx9HIbc+tk7UEcHwuq0sNr/NGyVJ0Q+pv4KZr1KsAZ/k6mFncX8pxoW7C1pKE3RZ4d1n0NR24PO+osDjPWn9pGACOumCaiZeBcrmxHwj8RIvRrYJhwliybTkRwl8lIFMOAtRXWbRyU61DT9XxtEw4FbFNf6HSEKmGtMW0pVQkFNyobdOJ8Q09GaMPEwiSAxNzJKZ5Aa1nAxYsw0A2+LlF0O2R8FBuj4zWuGTYTVRGEvC/Lrs1f/El0B16nh9xfV7/fr5blJX4fNOZFsAPpbIRwqwfg12vTXInZEpgzl9qYVEUn1gUi4Vle9EStreazAct8SwV7vJ1Nvx9KBj66V2miJu1aGXX2WHJq13Cdb71ZunMrQuqaJoXNUTJwmxQek9A18h4YqMKjsfeZgUzdynkfJ10hb4URFL/vtHeb6+7NrR/Yop2Yyf83Lg73acf/0LMhDJx55F6s0v+/xdA9tZ2pRA7WEGOENXBkEYRZn4Kmn1k4OCDPiANX4RrR/S410GymkpkKl9/H7sbk0VHfXBwR6yZJL8MejBfFqG5Ueem76z1qwGHeuQMOJ/h2s0WQK7kyLDQ52Jl3c7iohjvn3w20Cm3/bLI7KeO2bz5qzn/Oj/WgKiOtPwg10JTj6haOE7MWl9Zwx8G2YVtDtQYY8BhK+7AWG55rtkUPjathOarMlcUuBBtmKvHWhGhecaSUpApm2/g2v0ukF5sM+RSY/D7ptcUSG6rR5C7TxYlcgwaNt7iUFfkN5fyFGIm3kMnmccPjBjYObTcu9K9Q7B9bZ/3Lm6uvYXtIM64IkASIttqIGkMJ98z45pfuAzhsGsyVVBdfDj0p77PcxrSxq5K2RzMqWkWKqzXPmnvMvpzCaiEtUU+VLG7aWx75hDaiPMJvHpIuVm+T/wr3vUu0cBogV8OQZCEhKSSu6x5mgIoxIJ5SDmlZzV7C33xEkROX9ZJiB0wSrnMUWrNguOAr3YB581DMBAq3lzev7HvQIF1VIblyHo0l2iMC2NkExPMURGc44D3ABvRTCkrOvkJaXlLoZOzekGJr9hZuUmTwkdEDUXfnf1E= X-Forefront-Antispam-Report: CIP:255.255.255.255;CTRY:;LANG:en;SCL:1;SRV:;IPV:NLI;SFV:NSPM;H:DS0PR12MB7726.namprd12.prod.outlook.com;PTR:;CAT:NONE;SFS:(13230040)(1800799024)(366016)(7416014)(376014);DIR:OUT;SFP:1101; X-MS-Exchange-AntiSpam-MessageData-ChunkCount: 1 X-MS-Exchange-AntiSpam-MessageData-0: 9PeBowriKVISPg+XZjXB/CfOYHOa49E6iyTMKF9qZZyR3oP320ne6CjzN5zT8ZFoDUx4LxT3oHb1kvXEq8oVO6aWxcnF+DtpWCHPyBfWafUsg0re7c3Fj2jgBA4NslvCpLFMUj4Q0bhC7ulfLKg6/QYNVmo6F5zm6Th3MyLlZ+UtrU6pxA79mmK0ZhmqMuk8zcwZFXNwk6UgeLpIS0FCFxkXW4sG0QWEC6QUcfu6+UYI2DqSoUneZ5+jx3x5JHzBWRgrB9VJUxk1fI7r3dD26wmTIcumnf7jZP8KI67lk9M8se8tYUKKO5PDP8slS0z1A+pzLnEsAcMceYm6DS+8C7HvomlZDiCW0MvUWA/p4u4EetijgXsSjnHjK0zNWQPEboqGhA3ee0j34ujgVUK+gpsHgh13RmrMfR8I/mnTstfDKRl09VOO554NUdldZlFhWvmT4xSZHLBkmousi5IHaPIeyL85bhUVdvYM4hMUlbvwHIv4Ztr7GJdAAMoxyJRTnahC0TXaoT9/KuijBPeOgQ6pJVSFKsR4MUMRSNa/1cqR1DJz3KVLcAb7xnacUqkx1ixwLLKN4RmHRVpePDFc2oSclej5F4rX8KcTcBSNPW+ogNT2KXEwrxImMR0vZWFXuClUmF72QLPeE8c90tBoZH4qOlRUuh8dd7UCGBKvoCNY/fKVPZQC3WP67HCMOjFGW2rkFVr2AWm3VF+5dw1rsruL0lOw4nIJDoE8B60XWA8NK1T5uwxLk1czIRA4yqWRRXXHgQkf5ZF/f3+16kESx36GzwICzty+JYTu0CYyEtAuA03l7UaQiyM6OlkifFLc7bxw1ANPoZZmnX0bg8qSTVsUL5ukKKEieJ8C2AgefCaXsiSQgt8JDroLK2rTpo47fYtcejjKGmfKrWdId6Incv1MH8RFBeLu2XvwNNY5Xwc10mwaZVe4rOtxsM6/WU5VUi0ssUcc/tWhgkmNl6RhjvsO0JkYgNGDvVAIUwTkJWPZ9Kty7brYjnZpaPB5FtyfKeNy/HDC44w7tYMZFEtzFcuhkFRB8Yn5tdMbQwdoc9bTBQ2b6gxrytsBtR8TGTIAqDP8NNzGtb7SujHKwsef3+UlGeZ42sH+mCcVCHeUFejxY0cK5IipkcFXfV6XX29CNlKkYyK96/K5GsEPgN2DNYNBnyRdROYKwbSf2V0ijNQn/qnzHAbwvA+AUui0l370cqhxKscdy2+wGlxDL2OU1ZwW4Ge1WsIo/hu2ETXmaGZcco3YGWOHZ7GsnkM3C00xNNoVHqHp1pNHCeO9if1UqmKVL6TeMpzYtU1QLBcxXd8Tzy4fAEKbGYdCJIrEiTdBIwrgavA0WkaMbU2gTyJJC4c1UsiSStVqIlGdewOrB/jCvi2/L91uJwZT1hfi3Kvk0XRpTFnKm4FvHTuDKA8QC6qloseBYbOmTQnvHO2RdYQozY2Ib2Z0JSmzaLNt29wkS+rTl5EO/donll1UiN7TDpBuo1KXF9UludWUTBv+MKvsBzUHLiWhVNMhBcfEye80WtfBu2yirr6RbxSFT3XZg8AOgvyAPH7YmsvSqmVBmvM5r3IeADmOwhwgGdql7Wkg X-OriginatorOrg: Nvidia.com X-MS-Exchange-CrossTenant-Network-Message-Id: 61f8c056-7dfc-451d-5a9a-08dd0a96bedf X-MS-Exchange-CrossTenant-AuthSource: DS0PR12MB7726.namprd12.prod.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Internal X-MS-Exchange-CrossTenant-OriginalArrivalTime: 22 Nov 2024 01:41:12.1704 (UTC) X-MS-Exchange-CrossTenant-FromEntityHeader: Hosted X-MS-Exchange-CrossTenant-Id: 43083d15-7273-40c1-b7db-39efd9ccc17a X-MS-Exchange-CrossTenant-MailboxType: HOSTED X-MS-Exchange-CrossTenant-UserPrincipalName: BH8j941Oeq5Rh9qiZ4VbXwOu3n6fYBS7biU00CJXYfWMSMmaBMeCVoX2O2dNs697wmXEMHPT4rUMSe7DXfO9Ow== X-MS-Exchange-Transport-CrossTenantHeadersStamped: IA1PR12MB6305 FS DAX requires file systems to call into the DAX layout prior to unlinking inodes to ensure there is no ongoing DMA or other remote access to the direct mapped page. The fuse file system implements fuse_dax_break_layouts() to do this which includes a comment indicating that passing dmap_end == 0 leads to unmapping of the whole file. However this is not true - passing dmap_end == 0 will not unmap anything before dmap_start, and further more dax_layout_busy_page_range() will not scan any of the range to see if there maybe ongoing DMA access to the range. Fix this by checking for dmap_end == 0 in fuse_dax_break_layouts() and pass the entire file range to dax_layout_busy_page_range(). Signed-off-by: Alistair Popple Fixes: 6ae330cad6ef ("virtiofs: serialize truncate/punch_hole and dax fault path") Cc: Vivek Goyal --- I am not at all familiar with the fuse file system driver so I have no idea if the comment is relevant or not and whether the documented behaviour for dmap_end == 0 is ever relied upon. However this seemed like the safest fix unless someone more familiar with fuse can confirm that dmap_end == 0 is never used. --- fs/fuse/dax.c | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/fs/fuse/dax.c b/fs/fuse/dax.c index 12ef91d..501a097 100644 --- a/fs/fuse/dax.c +++ b/fs/fuse/dax.c @@ -693,6 +693,10 @@ int fuse_dax_break_layouts(struct inode *inode, u64 dmap_start, ret = __fuse_dax_break_layouts(inode, &retry, dmap_start, dmap_end); } while (ret == 0 && retry); + if (!dmap_end) { + dmap_start = 0; + dmap_end = LLONG_MAX; + } return ret; } From patchwork Fri Nov 22 01:40:23 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alistair Popple X-Patchwork-Id: 13882583 Received: from NAM12-BN8-obe.outbound.protection.outlook.com (mail-bn8nam12on2040.outbound.protection.outlook.com [40.107.237.40]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 7894F15CD58; Fri, 22 Nov 2024 01:41:21 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=fail smtp.client-ip=40.107.237.40 ARC-Seal: i=2; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1732239683; cv=fail; b=dNwzRoc0mTvKSZba7ucgigyQ3Ea87pVaiwN/dpz7ouFs2mHbv2MJ+PSofJPw9IGTQ1QC1SNZha8IgTOys4Cr2sKen5cSGelua11xskuETlotM6e+sdmMCEjnTwcLPPvlWEJn0hjkiHXke4KnjG9ti2gazjQTKqC6/fu1X9l6T6E= ARC-Message-Signature: i=2; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1732239683; c=relaxed/simple; bh=OMskb6tiJXN/oVWhmqvQ06ZN7nbmfyJDrXPM0IWvd4w=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: Content-Type:MIME-Version; b=KT7TcFDrC3jFTEOxfqychkGMnk74/I12bhk4+MjrwEIf22X4wA3GOQbLn9cwFL10RGf+JutzAtaBfK6b6FvsITLsnqeSuSjbtaVonZRHN6aHB4mB8kCwQlEp06WrkotvIjblKonkTPj+HrSfwquMgDQD3/0P4zogb6cK6P/B2d8= ARC-Authentication-Results: i=2; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=nvidia.com; spf=fail smtp.mailfrom=nvidia.com; dkim=pass (2048-bit key) header.d=Nvidia.com header.i=@Nvidia.com header.b=ru/dcUVL; arc=fail smtp.client-ip=40.107.237.40 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=nvidia.com Authentication-Results: smtp.subspace.kernel.org; spf=fail smtp.mailfrom=nvidia.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=Nvidia.com header.i=@Nvidia.com header.b="ru/dcUVL" ARC-Seal: i=1; a=rsa-sha256; s=arcselector10001; d=microsoft.com; cv=none; b=BHYG9WNQ2IqBx6KlQPGT05KkTMahrGbQmdjjUlGR+2iX+xM18ap9D71jEgQRgGm0R5udJftd6jlXZZfQT/Pk8PRTLCu41i89fJC8J069UjRpyzYGJIvda3GBSfYouU0dcrEkwipAFa+hR/W1QsEYrs211gximydRC7AefGCid9PYKerSc9VQW0k95se9+hYgFst/zbxaNIsQ/zAj5gLf276sE2hqN42el5tp7sRcN+rk/tVHkljRnPvDSoLl6hQKGR47YiRiL5HkE3tEOXjYzZOynqiwuHXyBVy1AhJxYnMcxGCDjG9cJj8HgwgGYiYFdVd/QpRxvWCrDzNPFb4fhw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector10001; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=JRQQVpyhJF5f1obr6Ie6XHXs8vyPo99NnnyCrsA9+Pc=; b=KbJYiWSsoYmRC+y6JNvfPgyoQ5u5Tv+JqlAHrcjE3QvVkBINnnUXNBwDz+jyYdykkyMyYGo5AS7inwGNWHVbbgtF6EKXYEz9JXpJU84w0tiMtbwFSiGNKZMnFRxujfX6wty2QaNClKw86Iit5u3vtpt6HQD+rWqTYlzBmO9ScWuUyTzZceXoAX3RrFD7pjJaC15TcqtDU8aP0RKmPO+7YOO2Y1+bcCkIHF6m2q1UJI7EnQWtVHVmB/ySGwmlIgOqquJXnJafY9TnNiXzY8U/cH7kT2fQ5wpqTR6Oul00EOa+KkPX1hWb3IiHQ2/eaoGkC3YekgcS9ZZj27wZnGVNIA== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=nvidia.com; dmarc=pass action=none header.from=nvidia.com; dkim=pass header.d=nvidia.com; arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=Nvidia.com; s=selector2; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=JRQQVpyhJF5f1obr6Ie6XHXs8vyPo99NnnyCrsA9+Pc=; b=ru/dcUVLPZH/Ey5zbEfpLtfyBcKkXi0D3gHBAHbW9lqCvD6qaYoTzHspETvR1XZUuivurIfhPtYvz+DQF29/RbW7gOQy2FywKt+Bd0h4WbCOtBIJBgplbxndzpyNQ4bkEahmmX7Uw1fUZ2Mn2BXoFSwl1AiaU3KCaNN8SdmkLhwwaXmWFLblLWkEP+HPrLiVGUX0uuabqiLanleiI6QtTPvndZqOB2f+pB+BeG9h/If0hXV7XLJra6zL0SWreqb05tsC421w4nMWbwZraIqnPKFiqVZMXHWLyN8K596MK+r4VCzVJffCfAH2fkPdBpbW24X5rIWzFyFGU1cp+0pwXA== Authentication-Results: dkim=none (message not signed) header.d=none;dmarc=none action=none header.from=nvidia.com; Received: from DS0PR12MB7726.namprd12.prod.outlook.com (2603:10b6:8:130::6) by IA1PR12MB6305.namprd12.prod.outlook.com (2603:10b6:208:3e7::16) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.8182.17; Fri, 22 Nov 2024 01:41:18 +0000 Received: from DS0PR12MB7726.namprd12.prod.outlook.com ([fe80::953f:2f80:90c5:67fe]) by DS0PR12MB7726.namprd12.prod.outlook.com ([fe80::953f:2f80:90c5:67fe%4]) with mapi id 15.20.8182.016; Fri, 22 Nov 2024 01:41:18 +0000 From: Alistair Popple To: dan.j.williams@intel.com, linux-mm@kvack.org Cc: Alistair Popple , lina@asahilina.net, zhang.lyra@gmail.com, gerald.schaefer@linux.ibm.com, vishal.l.verma@intel.com, dave.jiang@intel.com, logang@deltatee.com, bhelgaas@google.com, jack@suse.cz, jgg@ziepe.ca, catalin.marinas@arm.com, will@kernel.org, mpe@ellerman.id.au, npiggin@gmail.com, dave.hansen@linux.intel.com, ira.weiny@intel.com, willy@infradead.org, djwong@kernel.org, tytso@mit.edu, linmiaohe@huawei.com, david@redhat.com, peterx@redhat.com, linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org, linux-arm-kernel@lists.infradead.org, linuxppc-dev@lists.ozlabs.org, nvdimm@lists.linux.dev, linux-cxl@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-ext4@vger.kernel.org, linux-xfs@vger.kernel.org, jhubbard@nvidia.com, hch@lst.de, david@fromorbit.com Subject: [PATCH v3 02/25] fs/dax: Return unmapped busy pages from dax_layout_busy_page_range() Date: Fri, 22 Nov 2024 12:40:23 +1100 Message-ID: <0884b6a6e0f9ee4bb982a043da94fd910ee800f1.1732239628.git-series.apopple@nvidia.com> X-Mailer: git-send-email 2.45.2 In-Reply-To: References: X-ClientProxiedBy: SYCPR01CA0041.ausprd01.prod.outlook.com (2603:10c6:10:e::29) To DS0PR12MB7726.namprd12.prod.outlook.com (2603:10b6:8:130::6) Precedence: bulk X-Mailing-List: linux-xfs@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-MS-PublicTrafficType: Email X-MS-TrafficTypeDiagnostic: DS0PR12MB7726:EE_|IA1PR12MB6305:EE_ X-MS-Office365-Filtering-Correlation-Id: 458cb362-5519-42b3-6989-08dd0a96c264 X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam: BCL:0;ARA:13230040|1800799024|366016|7416014|376014|7053199007; X-Microsoft-Antispam-Message-Info: lotqsT2+U5sQkuyJsAawpUl1P/RhNn3d//wZFNzkD3wbRKhpJuhIsilKfgbpPtx+VJ3pDIhSqwo0qd0IL3Xpr9JCskM8W98iBiqUJzmI6fuTQQ70A3mhoZC/uZ9drGq18rdzvEulORwJjz4IjtIOMHRcGl9gou2a5Ow8wIK8wILf/jbQVpaYMPCqvDHN4AmVD13KLa/9BQD4TnZy6PYFmnDZb0biSvNMM39tDrzBBtwUZY3TdPtnW4tyR6SPYi2e30qiAdXym2V4673aBpi7/LOyalc3Qvzv0wadeFeU6wsFDEVqO7MuUSXIQdCd1bc8fYfAyosp6eIATzKFriYHJFrhATs+K874jBF5HnnFTo2VFUOxHPgpSiJWyBVz7qZj9mMC7Q2zPy/zvMZvzt+MAlK3EvFYTb4QZfRnr/cfsIw5R+btV7vQOG6n3gf0kR3vPcc1U7OzzC+7mMO83x3STEaw3rO/K0cbcfchqqzvxLXlAZPOPZw7eYWZjI8PsKLvxQydBIpfGmMOkTE8vzlYMQ3G/BlM2s2XwKSwYg4gPws3dT3EVsWjhN+fTnRgbyj5K+s2D7BPELYNReyShXbtpH6UZ11CQyREOaXuyCQJM5P0iDrq7pqlZCfBQkcODlfj+iOT5HFDFSTKqlopCiLiqpgMhEHpp6eQogY1GoC5JkH3fIu1Mt3+WsIfuOkziky/llY9o4o3ydKBxdHOhVyz3tRt7bPMTjLPt0oAtuKBerqa1N1lu79CtW76IHMlV+pAGej1uAwN9fBo6oWxV9kHCfSTPLVfyz44GEPaTLn6663oKJfjqbvn0c3A0z/6fvZTukGzgOlSpi/+88CuUIeUY6Y92WQJf0ZOuWxNCPU9Yvb1Kng2iJeiwoGDHrigfgIVYxn9QwH66PAMmXqk+AWXgubK34Gluq0oRwUZ8saAFe18QbvTHzfYVTNDpihKcq6Mh8S/tU1pjvQ2I9wewdSy0aKb4OH26ZPENTouIroo8f6eEuw6Crqg+mkcW7lYj/SgQESWJyp9RQ5Qsode6c7gPcK2qlMky7hXoonXd5WUgbvgQbGGKJNOlXALMFMlPH5BOlN2IjMSGf/Ob5tvafXSdB6n7CAajEHQksFe7f2N2+O0PKbhays0oqCwa4/db1whZQEQETkwZRg/i16USpCm/IXIf06PFXO1jleE6ikpyi0uF9OnOZNBGBzZeXgQ7Jg27BX18K6UAIBr3CfHy3DNG1gIhl1FgAwPhhz23/4l8lAw+/0zaJf75BC8PHw9a9Lycccg09V2w0OLvcIUCEqusgYJfQta1OqUZQxISpH7/t6iqFl2Hd1xckqv05kK9vOq X-Forefront-Antispam-Report: CIP:255.255.255.255;CTRY:;LANG:en;SCL:1;SRV:;IPV:NLI;SFV:NSPM;H:DS0PR12MB7726.namprd12.prod.outlook.com;PTR:;CAT:NONE;SFS:(13230040)(1800799024)(366016)(7416014)(376014)(7053199007);DIR:OUT;SFP:1101; X-MS-Exchange-AntiSpam-MessageData-ChunkCount: 1 X-MS-Exchange-AntiSpam-MessageData-0: Qo5EiwIy0duwWJXhdBvjXurhaIUgop9Ug1ppvroOJfAjPND9hA4byrulT1vFDwy6/bXjJNicwwZGBivvM7xAB9omOIhJuVBDHL5wXSQjAqct43V6YLa5O83csa7p7WMl2Q6C2DdE/Y/alhufe9FDhZMY4aEdFQ2OxaAY/NnTcjFrXafdIk2g+pH1THOr6t0/jk2MaJ6X4w58ETaVvL82HSpxcVSnjc3ITOTXg9Xx/FrDRBCpChU8Jp9yofqG4obt6nrO+/GBvmCVF6Jtxa7iHqAEVPUjmz+s5iHd0RU+XWnI72P7syL+B9NHZR7Qhndv0R86M8eFHNR1WniIN3T/lY66bkljOvZoq6rjSTzj+cGsel6aNgbYo59TC9DIhuUuDQzNfGZCnjEx1LJ7LcXS8oLGssVNu6YWJY7MP/YkPSREX5IwgWLy0ZKGn6DIkUTVp8DREUTqfN+sATJg//oJjqzP3M7KB2nFYL01XHUAPgUxK2EHq5JEBKk/CqLRPHsVX7UuNG4gWQPJGCj1qiYR7eP7LCdyDu3fma2zPvNFoJaiCNkp2gWKtgG6AU7DgDU3HtoicLaGZTgsENBgtKB3LHJPz7L3yPv+uQYtdnQIP9/LfgBSeCbcJXiSIdlmKDSqvOBLJ52/mghxMU1EX2fqY542+bt8670DKAaBmyQtKTmruoX3PNTu31/rn4PG1GTxIUev/F7cEXQxV1t7CBqekKxRQJAP/r5U0Q/g7zcqv80ZtbItxOl3Gk6NJ804ox1kgBrq7ZAmLuUAcVCQreozKYNfPhVHTPAeCGxd+QqqZseqZk/oxg+uFzCnWam4k+uT++3ahjMQKygCd9DGcqRVI2jozUSxnFKuVbXNElXd+lR9Ff5Cv/x8hBY1eClS6sZmIFLU6YJ3dozJrPHusaqKhaSGTQJ7J8QMdgdmimASB42FOiDAjocydG0Ur/iSfjmePXxW0eNUPBpikQjDMjlOD3o7Ma5JhEJBtXY6PwzhAohM5I2EaNh5SZ0+cA/IKGZzAQEgq9RPVx5VRPthT1exL2otM1Jxi5GH6TgVrWBtB3Tmwl5B8/79fm3J2+Wc2rHttVV7LFT51ZzgzXArCXCj6stbwBEtnV071OELdEPglpJMNELBsvRsTA43h0CILCsFG1MfuHYt21Cy3imluQOkuwidaaJwN8UsfZW97I3ixi6j+CuRezM/5vv+wuIaXjJblnp/j3oWVgANVxXt4n9zSVYkGnwe5WSo26hsosWVg3/OQsI0hIrAPZa2Ceo4NhHEdfQJQeHm/Ybeshzlyy/hPlXw292wTtNZVXYnvzfMNIm7ydUfLHoRPtQqM7Q0F2KHjHA23HzbhCqTpZ9YdqasrP0OMiXa35pHLSTWirF/Q6afw9pTQHLT5x4f//6TGEc3QF/IKDRTSwVx0Zz6Hq84MWbW0CpxvEW6CwOkh6d8if3jNt+ySqiGd0V8kR62l5NWiWs57rMU8G0nXznJn8c7hzXbD4w8in1z8yokD32Z6zR0iTVsc5zgVleLXYCtgLHXRH4Mi+OMs3W4HUeV1p2kVRRHMgsTaJo8H7EQcj8q+XnEkDRKTc4R2D07gw8MTVYh X-OriginatorOrg: Nvidia.com X-MS-Exchange-CrossTenant-Network-Message-Id: 458cb362-5519-42b3-6989-08dd0a96c264 X-MS-Exchange-CrossTenant-AuthSource: DS0PR12MB7726.namprd12.prod.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Internal X-MS-Exchange-CrossTenant-OriginalArrivalTime: 22 Nov 2024 01:41:18.0397 (UTC) X-MS-Exchange-CrossTenant-FromEntityHeader: Hosted X-MS-Exchange-CrossTenant-Id: 43083d15-7273-40c1-b7db-39efd9ccc17a X-MS-Exchange-CrossTenant-MailboxType: HOSTED X-MS-Exchange-CrossTenant-UserPrincipalName: fKerGv082UjkAIE8S2+bAoi9U3PeEL1YGlo+2SSuWtVgBJ24Ok1liWEjhe40UXyOcoxk/vlhCYlxiypfY5wHAA== X-MS-Exchange-Transport-CrossTenantHeadersStamped: IA1PR12MB6305 dax_layout_busy_page_range() is used by file systems to scan the DAX page-cache to unmap mapping pages from user-space and to determine if any pages in the given range are busy, either due to ongoing DMA or other get_user_pages() usage. Currently it checks to see the file mapping is mapped into user-space with mapping_mapped() and returns early if not, skipping the check for DMA busy pages. This is wrong as pages may still be undergoing DMA access even if they have subsequently been unmapped from user-space. Fix this by dropping the check for mapping_mapped(). Signed-off-by: Alistair Popple Suggested-by: Dan Williams --- fs/dax.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/fs/dax.c b/fs/dax.c index c62acd2..a675eb2 100644 --- a/fs/dax.c +++ b/fs/dax.c @@ -690,7 +690,7 @@ struct page *dax_layout_busy_page_range(struct address_space *mapping, if (IS_ENABLED(CONFIG_FS_DAX_LIMITED)) return NULL; - if (!dax_mapping(mapping) || !mapping_mapped(mapping)) + if (!dax_mapping(mapping)) return NULL; /* If end == LLONG_MAX, all pages from start to till end of file */ From patchwork Fri Nov 22 01:40:24 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alistair Popple X-Patchwork-Id: 13882584 Received: from NAM12-DM6-obe.outbound.protection.outlook.com (mail-dm6nam12on2059.outbound.protection.outlook.com [40.107.243.59]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 483781632DC; Fri, 22 Nov 2024 01:41:25 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=fail smtp.client-ip=40.107.243.59 ARC-Seal: i=2; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1732239689; cv=fail; b=TmfO7Rs/jIdrXt3SWN0q8SxWmT5XfPDhkhLwwvRcM/ekRoM8nPmDaWS7nj/rlofZuCmosCeHvTO5kRiXaVzG6JSpUl+vKBkJCkTf+Drlm0P7uoyHXcT3QRfX/+5nmsB34HAkekXvhMjzOmS8+87l0JW5/yOZ2KSsZJFENRrq/UA= ARC-Message-Signature: i=2; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1732239689; c=relaxed/simple; bh=0UKxkxXmrUqb1sMV89E8kvLv/FXWfXj9voy4Pb7F4Ac=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: Content-Type:MIME-Version; b=GWVSWGmu4HAWNCed30vS3/PmYFo2JLm+sdHdZsmtzVN1a59IsrAyrBQP9bvqtXqVADLbRRocs+yPCu+D7e5ZO17yMoWQedcIpUFTDHAkq+eKwSjkrVUsQ8z3AR+ZRQ9VGMtL0+Nb7pdmkKRcM71UmQPvgK7QNWfJblnyeMmN8cI= ARC-Authentication-Results: i=2; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=nvidia.com; spf=fail smtp.mailfrom=nvidia.com; dkim=pass (2048-bit key) header.d=Nvidia.com header.i=@Nvidia.com header.b=G6oOuMXb; arc=fail smtp.client-ip=40.107.243.59 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=nvidia.com Authentication-Results: smtp.subspace.kernel.org; spf=fail smtp.mailfrom=nvidia.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=Nvidia.com header.i=@Nvidia.com header.b="G6oOuMXb" ARC-Seal: i=1; a=rsa-sha256; s=arcselector10001; d=microsoft.com; cv=none; b=J47TTuVUbjSQOGkYjJ370BL4576UkAzQ9KTm0eu0pvk/qX0+Xtr8tZ5Jv179WnAyI8mJ7VkBvjp9yW5NtCpFo8wxC6GKXDSSKph6ULoPFQZvBaJrNxy4PrNLqEUoZjwtmVNGE0m1Lxh6oHVgoZc40AtfByIeeOLGhm5FFHQkOfqWVz3fdkAQ4HGZ+DprVPHlwNZe0JtjurXUG1f4SuXzMVOI7pFMyBj9/QwLGeRiHQ7o7/Exdwp2q8ghos22vv/+CSlkXaMcDuFPE2DVlVBodu/6zZyYOMKxL4o53rM9n+Z2EFP9R/SCYE5iJE9ISMstlwSia1nrdmCtzqqVZKyUKA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector10001; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=0u1tQB5hAD9qXX/WaQRL1Fudxl6ATJhd8TdUy1t6AqQ=; b=vOqHPNYOmiIN4bl6NQbvYH+O+Jz7dWd0pmpvPy8+u9Au56mGkmG+acw4uRNwCrTZZdiA8fs4p2tNQaBfuJVfue6vw93Q7oEdI2sisYeRyix0ZUikhub1WFZO8AeCks0au061Db1zdYHjEXdqzqkpE+/eTjd0mIAMbruvOBF2hf93xhx/7uj6ZfchzQAAss23luw2PGDsUPVG1aPbCZ2xbVs10fcMERA2KKirtEzmk1NFtJs/LnMNajWzmLNRtlWVLA3TsC28AXayQNxbNtNVXQDUVv7CCMeP/IhtZvPEguTr9fMcZ2sbXz1Yk11je2sAIXPK3fSElUQxBP0tvwNUFQ== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=nvidia.com; dmarc=pass action=none header.from=nvidia.com; dkim=pass header.d=nvidia.com; arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=Nvidia.com; s=selector2; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=0u1tQB5hAD9qXX/WaQRL1Fudxl6ATJhd8TdUy1t6AqQ=; b=G6oOuMXb3zadelXmS5B4me2ArYXMW4AAT6imLQtQy5dULOBtgxQ8Umc4rf0JQTHpib7c4yNMUGvRLtPuKleHVIXlxXCx2hWsetMtNZzdoCSINosdgJe8GdYABCfWIRW9duIjN6i9GQUhkf4uMSTARmgpC5UnopA74I4mtjNsq3ExMVDKM1YEPvGX7cY5TH1beZALB5oyvarFQpPVh+w2TqrHn5jCQeON+LTdFk/HkbQ+7qpCYikz9BkvpUSkfjib4V8sUdQ9D6l79KKZhQdN8fhREBqW0QWy0sv1EP27NJaNOUfP5IDdv27H5gSw6aI8eYgP6uFRnc/mu4zMWPrBZA== Authentication-Results: dkim=none (message not signed) header.d=none;dmarc=none action=none header.from=nvidia.com; Received: from DS0PR12MB7726.namprd12.prod.outlook.com (2603:10b6:8:130::6) by IA1PR12MB6305.namprd12.prod.outlook.com (2603:10b6:208:3e7::16) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.8182.17; Fri, 22 Nov 2024 01:41:23 +0000 Received: from DS0PR12MB7726.namprd12.prod.outlook.com ([fe80::953f:2f80:90c5:67fe]) by DS0PR12MB7726.namprd12.prod.outlook.com ([fe80::953f:2f80:90c5:67fe%4]) with mapi id 15.20.8182.016; Fri, 22 Nov 2024 01:41:23 +0000 From: Alistair Popple To: dan.j.williams@intel.com, linux-mm@kvack.org Cc: Alistair Popple , lina@asahilina.net, zhang.lyra@gmail.com, gerald.schaefer@linux.ibm.com, vishal.l.verma@intel.com, dave.jiang@intel.com, logang@deltatee.com, bhelgaas@google.com, jack@suse.cz, jgg@ziepe.ca, catalin.marinas@arm.com, will@kernel.org, mpe@ellerman.id.au, npiggin@gmail.com, dave.hansen@linux.intel.com, ira.weiny@intel.com, willy@infradead.org, djwong@kernel.org, tytso@mit.edu, linmiaohe@huawei.com, david@redhat.com, peterx@redhat.com, linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org, linux-arm-kernel@lists.infradead.org, linuxppc-dev@lists.ozlabs.org, nvdimm@lists.linux.dev, linux-cxl@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-ext4@vger.kernel.org, linux-xfs@vger.kernel.org, jhubbard@nvidia.com, hch@lst.de, david@fromorbit.com Subject: [PATCH v3 03/25] fs/dax: Don't skip locked entries when scanning entries Date: Fri, 22 Nov 2024 12:40:24 +1100 Message-ID: <33af740975156cb93b883b4d796300657a0cf637.1732239628.git-series.apopple@nvidia.com> X-Mailer: git-send-email 2.45.2 In-Reply-To: References: X-ClientProxiedBy: SY0PR01CA0005.ausprd01.prod.outlook.com (2603:10c6:10:1bb::16) To CY8PR12MB7705.namprd12.prod.outlook.com (2603:10b6:930:84::9) Precedence: bulk X-Mailing-List: linux-xfs@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-MS-PublicTrafficType: Email X-MS-TrafficTypeDiagnostic: DS0PR12MB7726:EE_|IA1PR12MB6305:EE_ X-MS-Office365-Filtering-Correlation-Id: f3930ed2-5eb2-4d6c-cdfd-08dd0a96c519 X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam: BCL:0;ARA:13230040|1800799024|366016|7416014|376014; X-Microsoft-Antispam-Message-Info: rJgGo8BNvB5K9PFN4GtdzN3uBLY24aRcttrCk/S1ykSAWw78cKdibxtA7C6QaSAnPN98+qeUFewAqUNPmwSh4MyJpmbjbgR3ODgpFI3zn0sy2v0nM8WZ8Z6N2B9pSMlWpkV6hPDwdycVQOtgF2cDthujodz/mXirjBXQH1EVlm4GUW+/kALuSxrq3E0Ha6pn+42/Ku+Ip4J/QrOp/704HUdOU23m14dVVNbQa/Y612OUtlKMz4hGJJZDoGGj/POYIaFsFW+Vjs5yFjFXHeyIO+GOtCmOIieSQD53edF7k+o/tmICFOdJZaFvr2Fyf/Jvh5i0IirxNOn2dJWuyaqRAtbPodedJK7E/c9/xU9r6EeYbuXfzIqybCLeDQoS7FsVW35gVfOE1XKL3AFgf9CRWDwshyQMbeWhtLSOAWra8FPvaTY3ttDsTcvSV136NeYiDSkNt1E79CmWJpMHAJdtUtoxo8pR8RU5g844wjFK+SAIFf95lhmYZDX1xNPCxMMKk9Mwt6U022Ku0s3F16FQwjChj83MCt16cjs5TooUrG4woBE013596PPIoiLzr54EUwE00taYaj0KE6cKFZGaWCe5ZNDsutz1G0R3TYlcuEv0kpcYdOD6/HEaDwCI14ncR7r2tDSHHVPtUY2ASonzAtEtZtS95Mb5ZBCUC++z1ZyvzdOR4WZgs9Q0hnQ+5xZI4lVB/wTYaKOjNm7ZAV5c6Zah4wfF4B7qr3YbzvOzyubRXqlleqiN8Z3wEIPSrOWTDbvZWKxJME+2r1NGbew0Uq1FeYgMmRjYuFYrv2JXOMC/taM1BSxC4HP5kB3QixD6ASAvGp8kInsHzn921iKjtF8qoZcmGdi3tsGXU0bnZXlDl4dRscMNjW9YIRuvNbdIR+5lsUL3Kf1hYH5EIZ2Bh+i/xoY9824iLD9s3M7kZ10z5XK11Nh8M1lrXtZNvuoEUg2LxPECgjGlnln7iZzLAWzx5QmIkX/nD/K9SA8uCjry0K244CUMayQWNdHEfsL3ywDFdxjPovyRYtb/NHbuSyq7AkvWNRzSLK40uFgyyQfOOK5X64S24Q0ev2SlOzSLnZZp6d9r/8JHvb1rLYEkhzcRZOsLWrFYjP81C6rBzNSmXPoYomNCO1iH+WdHdsVEqN03yx3B6AftFCr6VSNyrykkTLh1spZQiNTvz9AKCReavn8q1sns7FmqSjkH6T0CjaHP/kSbhyCKGgTdy0CHiXeVX9bo/MZL+2MZEff5eMWcCUXbHZ3kRSkSfCosVXpRYbrkj5S/AfsSYIKlZmJ1irZlGMZjjtUadTe6fvbWvXiaCwBhKlHWjvokdn5W2Be3 X-Forefront-Antispam-Report: CIP:255.255.255.255;CTRY:;LANG:en;SCL:1;SRV:;IPV:NLI;SFV:NSPM;H:DS0PR12MB7726.namprd12.prod.outlook.com;PTR:;CAT:NONE;SFS:(13230040)(1800799024)(366016)(7416014)(376014);DIR:OUT;SFP:1101; X-MS-Exchange-AntiSpam-MessageData-ChunkCount: 1 X-MS-Exchange-AntiSpam-MessageData-0: dYdogiIFP887dt6XI7P8m1sgAvcwmLBGYcQldY1C4ZIFdhiXnoUTnpr6avpQ44heqWnRfOuOde1y/P2OMkw0W7xYcoEKDBzVoCBp8omlKog4ocoQtvL5E+yCuUuf3YykCQ/Xi6CFG5WXKoTPMnDWHi4qMPS6h9XvHKK3MKoskzUDOtrcB5q3D0nm4URjnVg8e3FqLUj5aFsb8krN2VI9JqQuQ5d7GHYVu27mNt+UuBHOPFI7j8fSsaPXY8Zgkvr2uWTHPYXU3wJn9xt9EDB4MEJErdKIvaKweZWgV6Pj2Qw42XFd1lGdkmuplP5rjkVkhlDwtjQ0NQqW5GpktRo/SGQqlQ2cLNok4hLntoTrkfqIPbrWUICYHrSIuDzXdMdctazACF+jNL1xCvkKsO6p7DByXvAR8B8bva5dhqLr67AabZE5H0yE8G9uB8wiPIkhyHBUBN1CknvwD8vHpJ9nTiCMZ7alMFqMG2eWVjHfdYsQfW7POqqDu5a073g1xsfFCBbWSZ16WjJgg410Q1mIZPqb1ntduGQro+k+PeKwKKkluxMwcmExUJ/bXhTy+gbUKZ2vMXImqRkIYzMwh2pzFYT6RItZUNhesrQI6RLXY8nhU5rECImKfURiXtvYm/iseIuUvIf0I2qqWnM4HQ2Q7uC3nPbfLXdnF8I4+tZp6Y+u8QB1pfvEIMZRSL+Ols77XOZ+fI/vUOgDhq8Z2UrRBWxoy9SxaKScl89SlUJ+Yf3EcCie6wHQZa4non4fN+YXY2/O3cjqLnAeJ4tziM7JyZAH6uQXK1AA0ggCroy1QK3vixZN+FiMVfKtIWto61FXin10c9E/5eD1KCDnn5QoxyBL4Mh+AeBWy4vQmCAG3t7Xr0Og1PnRTfjoRO1nEs4CYRuz/95Mts2E5eqdOb1T1wbYc8/M7Y1fRQ+ghhh3gVYTesJt0rj8IdzdRxCDmp/di0Hv+oRu17mjG6mNMBM+WTfVHO0YL1BQVr/xKTwWMhysix9bqPayaTlBws7go8x0vN4oQ+RZd8htaJh5TliQU1hMJy5wVjWeHUxUG3Jd0yze2K3TFBV3XEvlftazrGdXsWGFnI29tp2sGlpcMqWZ54QcY23BF1F+S/p+HG1+7II1LZuDbA/QtEvbjc6ZJtQZRDZqOSpHIo+aO1tBnwwwfFs53xzL6/eRTgV0N2gEH4fnchygkk6IV/phxWu+y+JNbw+ucOfj8kROCTxkLd03I5TrJEHh/cpQrA1flVhoGTwu0KV6W0saBxR8WJ8rA/X7HLWzUffl3w3GWyuW9cOYFUqf3K4PBg5Odl1v9zDuQHCggQmBiGeDERnn65VNd2BpZfxyj4fSRz2y8oUVJ9yXM+Jr5RYYKSrmeA0OKFbDUde2CA8v5o4FOB96AyBzDCeKkfITjUN8UIqG2cK6uXAFnoMCIhaemnYr3PjUSu4u2CltOMYDPn8HqHIKgw4Al1xhnuQJDN1WB5SJHW6nGaVoHoqKXnB7zshdvB2CzYydl0NUbDJ+EkTsVaXcbAMGJPxZgHus/OKZ2uXJRQKRR00XzuHQLk621MKhThfV/vYI9h+e865K4+ZO6MzEnqIYrUVQ X-OriginatorOrg: Nvidia.com X-MS-Exchange-CrossTenant-Network-Message-Id: f3930ed2-5eb2-4d6c-cdfd-08dd0a96c519 X-MS-Exchange-CrossTenant-AuthSource: CY8PR12MB7705.namprd12.prod.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Internal X-MS-Exchange-CrossTenant-OriginalArrivalTime: 22 Nov 2024 01:41:22.9309 (UTC) X-MS-Exchange-CrossTenant-FromEntityHeader: Hosted X-MS-Exchange-CrossTenant-Id: 43083d15-7273-40c1-b7db-39efd9ccc17a X-MS-Exchange-CrossTenant-MailboxType: HOSTED X-MS-Exchange-CrossTenant-UserPrincipalName: /OPFRNi5XuOJOS/wnkS/Ff5ecHwzMhiyIpd7DNrjO43u/+ivGmmM8WHkERaM7iu4jlHIpxDxEy580hkN4Qodow== X-MS-Exchange-Transport-CrossTenantHeadersStamped: IA1PR12MB6305 Several functions internal to FS DAX use the following pattern when trying to obtain an unlocked entry: xas_for_each(&xas, entry, end_idx) { if (dax_is_locked(entry)) entry = get_unlocked_entry(&xas, 0); This is problematic because get_unlocked_entry() will get the next present entry in the range, and the next entry may not be locked. Therefore any processing of the original locked entry will be skipped. This can cause dax_layout_busy_page_range() to miss DMA-busy pages in the range, leading file systems to free blocks whilst DMA operations are ongoing which can lead to file system corruption. Instead callers from within a xas_for_each() loop should be waiting for the current entry to be unlocked without advancing the XArray state so a new function is introduced to wait. Also while we are here rename get_unlocked_entry() to get_next_unlocked_entry() to make it clear that it may advance the iterator state. Signed-off-by: Alistair Popple --- fs/dax.c | 50 +++++++++++++++++++++++++++++++++++++++++--------- 1 file changed, 41 insertions(+), 9 deletions(-) diff --git a/fs/dax.c b/fs/dax.c index a675eb2..efc1d56 100644 --- a/fs/dax.c +++ b/fs/dax.c @@ -206,7 +206,7 @@ static void dax_wake_entry(struct xa_state *xas, void *entry, * * Must be called with the i_pages lock held. */ -static void *get_unlocked_entry(struct xa_state *xas, unsigned int order) +static void *get_next_unlocked_entry(struct xa_state *xas, unsigned int order) { void *entry; struct wait_exceptional_entry_queue ewait; @@ -236,6 +236,37 @@ static void *get_unlocked_entry(struct xa_state *xas, unsigned int order) } /* + * Wait for the given entry to become unlocked. Caller must hold the i_pages + * lock and call either put_unlocked_entry() if it did not lock the entry or + * dax_unlock_entry() if it did. Returns an unlocked entry if still present. + */ +static void *wait_entry_unlocked_exclusive(struct xa_state *xas, void *entry) +{ + struct wait_exceptional_entry_queue ewait; + wait_queue_head_t *wq; + + init_wait(&ewait.wait); + ewait.wait.func = wake_exceptional_entry_func; + + while (unlikely(dax_is_locked(entry))) { + wq = dax_entry_waitqueue(xas, entry, &ewait.key); + prepare_to_wait_exclusive(wq, &ewait.wait, + TASK_UNINTERRUPTIBLE); + xas_pause(xas); + xas_unlock_irq(xas); + schedule(); + finish_wait(wq, &ewait.wait); + xas_lock_irq(xas); + entry = xas_load(xas); + } + + if (xa_is_internal(entry)) + return NULL; + + return entry; +} + +/* * The only thing keeping the address space around is the i_pages lock * (it's cycled in clear_inode() after removing the entries from i_pages) * After we call xas_unlock_irq(), we cannot touch xas->xa. @@ -250,7 +281,7 @@ static void wait_entry_unlocked(struct xa_state *xas, void *entry) wq = dax_entry_waitqueue(xas, entry, &ewait.key); /* - * Unlike get_unlocked_entry() there is no guarantee that this + * Unlike get_next_unlocked_entry() there is no guarantee that this * path ever successfully retrieves an unlocked entry before an * inode dies. Perform a non-exclusive wait in case this path * never successfully performs its own wake up. @@ -580,7 +611,7 @@ static void *grab_mapping_entry(struct xa_state *xas, retry: pmd_downgrade = false; xas_lock_irq(xas); - entry = get_unlocked_entry(xas, order); + entry = get_next_unlocked_entry(xas, order); if (entry) { if (dax_is_conflict(entry)) @@ -716,8 +747,7 @@ struct page *dax_layout_busy_page_range(struct address_space *mapping, xas_for_each(&xas, entry, end_idx) { if (WARN_ON_ONCE(!xa_is_value(entry))) continue; - if (unlikely(dax_is_locked(entry))) - entry = get_unlocked_entry(&xas, 0); + entry = wait_entry_unlocked_exclusive(&xas, entry); if (entry) page = dax_busy_page(entry); put_unlocked_entry(&xas, entry, WAKE_NEXT); @@ -750,7 +780,7 @@ static int __dax_invalidate_entry(struct address_space *mapping, void *entry; xas_lock_irq(&xas); - entry = get_unlocked_entry(&xas, 0); + entry = get_next_unlocked_entry(&xas, 0); if (!entry || WARN_ON_ONCE(!xa_is_value(entry))) goto out; if (!trunc && @@ -776,7 +806,9 @@ static int __dax_clear_dirty_range(struct address_space *mapping, xas_lock_irq(&xas); xas_for_each(&xas, entry, end) { - entry = get_unlocked_entry(&xas, 0); + entry = wait_entry_unlocked_exclusive(&xas, entry); + if (!entry) + continue; xas_clear_mark(&xas, PAGECACHE_TAG_DIRTY); xas_clear_mark(&xas, PAGECACHE_TAG_TOWRITE); put_unlocked_entry(&xas, entry, WAKE_NEXT); @@ -940,7 +972,7 @@ static int dax_writeback_one(struct xa_state *xas, struct dax_device *dax_dev, if (unlikely(dax_is_locked(entry))) { void *old_entry = entry; - entry = get_unlocked_entry(xas, 0); + entry = get_next_unlocked_entry(xas, 0); /* Entry got punched out / reallocated? */ if (!entry || WARN_ON_ONCE(!xa_is_value(entry))) @@ -1938,7 +1970,7 @@ dax_insert_pfn_mkwrite(struct vm_fault *vmf, pfn_t pfn, unsigned int order) vm_fault_t ret; xas_lock_irq(&xas); - entry = get_unlocked_entry(&xas, order); + entry = get_next_unlocked_entry(&xas, order); /* Did we race with someone splitting entry or so? */ if (!entry || dax_is_conflict(entry) || (order == 0 && !dax_is_pte_entry(entry))) { From patchwork Fri Nov 22 01:40:25 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alistair Popple X-Patchwork-Id: 13882585 Received: from NAM12-DM6-obe.outbound.protection.outlook.com (mail-dm6nam12on2068.outbound.protection.outlook.com [40.107.243.68]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id A205D1667DA; Fri, 22 Nov 2024 01:41:30 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=fail smtp.client-ip=40.107.243.68 ARC-Seal: i=2; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1732239692; cv=fail; b=M1k8BpD8b8WnvgURS69+eyiUZF9MH7ebE95xF1XIoTm4yKo2OHCY4JFq0VN96wXZyhWhryBRLkzyyr6NlHazAjxT+NPB54IDrfs0Ap9SkGLwmV4810NBlvmFc8EIDNl7R6ZdKzYOMiXpzCpHoorqygOuut+gk7f1W080pKDTpz0= ARC-Message-Signature: i=2; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1732239692; c=relaxed/simple; bh=cq48zLtdILuZsOSnvH3Hi+btvp1Cy3JooyYQkUZ5q1w=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: Content-Type:MIME-Version; b=RhFK+GJbliG3U62B+QrzwWwwYqAqZuphRczuxfCXoDtw3JWadRJtCaxIrcdcTj7zVlAzGu/Pw+H3lJr3BaJZ8tOcJ9kvBgdIcK12+wQnhyrQ9LU+65ItfR2N0lHbDWQcE3uBcxlRDZ2Tbq0Qe0h6zItpLElX6bzPifpab2CayAo= ARC-Authentication-Results: i=2; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=nvidia.com; spf=fail smtp.mailfrom=nvidia.com; dkim=pass (2048-bit key) header.d=Nvidia.com header.i=@Nvidia.com header.b=FPWWbZop; arc=fail smtp.client-ip=40.107.243.68 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=nvidia.com Authentication-Results: smtp.subspace.kernel.org; spf=fail smtp.mailfrom=nvidia.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=Nvidia.com header.i=@Nvidia.com header.b="FPWWbZop" ARC-Seal: i=1; a=rsa-sha256; s=arcselector10001; d=microsoft.com; cv=none; b=EjPzjwPVG+RkdHotwXs+rk+UvSi6cCWH7Py+ovslYDesIKGnjDGsdGiRfFV3lei3u6q5mmU7++7mtV4dhz9+qZsXoRqs77e4+DKXLEqGHwwoO2AejTlJJswDakMpyuUuq5AAHg+/7R5PMCwyLGx3+W6CsNzOHmqXbqATWx3ADXO1SN1TccUCA7mAhkFo2In6PGN11mrat4jz8qtuGgTgogWImkLcWXPS6e6zZPeZ4toGwShp3D9xhreqi/7XarPx2iYmjxkAkn9j59oH8/FAwnsFenb5x/sgXsYUAUrafO4Py1nh/ps3UpZPjFwIjrlswWLwZ/kUDXLQKv5NhtdAOA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector10001; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=41LuzWbsulYp/PRvvGucQX4mO89Nz3ZyduYN8QkpUTM=; b=FYBM7dYVa5ntezYAa2T4U9/vCpzhefD5lRr8AYFDiEybbScdj94AebJ0Bd2gjO3qrxqbm0lpxr6MlDLj/cIPLodI285anPfkL1mAps62QmaYelIecxIXAseWkFwXedU529xDlTqQOpfNdDE38v8Tw1dr6CbD3A8UlTYQ/bLUr4jFZELlksvfTTzHY2vQVmEsS/mmQr6UCyN6p4p5uHnuujdGVfGdogLjbjLnhw/eXHtRDz3WuVeE3VpNzLfwlkwdv0y4Uwf5avpwUt0K8pIGOhLdLAIieWDYNjVH73bkA+2mGxXR+D2l6j3ITHZ1gA6OBclVslpC5sszQqRIIawwZg== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=nvidia.com; dmarc=pass action=none header.from=nvidia.com; dkim=pass header.d=nvidia.com; arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=Nvidia.com; s=selector2; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=41LuzWbsulYp/PRvvGucQX4mO89Nz3ZyduYN8QkpUTM=; b=FPWWbZopysFhz6vWMmcnRjBsqRUKZP1kRo4DEu4Bq+bTI2/lrB0fvi2har2/b2tDgzBlXCPToCioKWG9JUYAfYiv+zEKOPiABqCXvx5lrl/KBIyvbnrhrDfbF0gJ/w2BdQcSDT3Sxb2QKaoPHlRBfXxVvHv+px0X8akqyG8EN/mkWyyNKrpP0XdZORj+JwbvWSL80Jr3eVAhOKanFOln+GdUv2DNi1G2JgrNEzd9Dg7r1onrEudPV5vcJa7Nd8LMeqefFQBvLgMQkcy0RMomzUGXPBUuL785P2CP6IzwBj5nrN8DqhOA7G1VOgovum29I7c4uJVGiEAvRuLLSjDl4g== Authentication-Results: dkim=none (message not signed) header.d=none;dmarc=none action=none header.from=nvidia.com; Received: from DS0PR12MB7726.namprd12.prod.outlook.com (2603:10b6:8:130::6) by IA1PR12MB6305.namprd12.prod.outlook.com (2603:10b6:208:3e7::16) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.8182.17; Fri, 22 Nov 2024 01:41:27 +0000 Received: from DS0PR12MB7726.namprd12.prod.outlook.com ([fe80::953f:2f80:90c5:67fe]) by DS0PR12MB7726.namprd12.prod.outlook.com ([fe80::953f:2f80:90c5:67fe%4]) with mapi id 15.20.8182.016; Fri, 22 Nov 2024 01:41:27 +0000 From: Alistair Popple To: dan.j.williams@intel.com, linux-mm@kvack.org Cc: Alistair Popple , lina@asahilina.net, zhang.lyra@gmail.com, gerald.schaefer@linux.ibm.com, vishal.l.verma@intel.com, dave.jiang@intel.com, logang@deltatee.com, bhelgaas@google.com, jack@suse.cz, jgg@ziepe.ca, catalin.marinas@arm.com, will@kernel.org, mpe@ellerman.id.au, npiggin@gmail.com, dave.hansen@linux.intel.com, ira.weiny@intel.com, willy@infradead.org, djwong@kernel.org, tytso@mit.edu, linmiaohe@huawei.com, david@redhat.com, peterx@redhat.com, linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org, linux-arm-kernel@lists.infradead.org, linuxppc-dev@lists.ozlabs.org, nvdimm@lists.linux.dev, linux-cxl@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-ext4@vger.kernel.org, linux-xfs@vger.kernel.org, jhubbard@nvidia.com, hch@lst.de, david@fromorbit.com Subject: [PATCH v3 04/25] fs/dax: Refactor wait for dax idle page Date: Fri, 22 Nov 2024 12:40:25 +1100 Message-ID: X-Mailer: git-send-email 2.45.2 In-Reply-To: References: X-ClientProxiedBy: SY5PR01CA0119.ausprd01.prod.outlook.com (2603:10c6:10:246::15) To CY8PR12MB7705.namprd12.prod.outlook.com (2603:10b6:930:84::9) Precedence: bulk X-Mailing-List: linux-xfs@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-MS-PublicTrafficType: Email X-MS-TrafficTypeDiagnostic: DS0PR12MB7726:EE_|IA1PR12MB6305:EE_ X-MS-Office365-Filtering-Correlation-Id: b89cb45e-9681-4324-9a99-08dd0a96c7d6 X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam: BCL:0;ARA:13230040|1800799024|366016|7416014|376014|7053199007; X-Microsoft-Antispam-Message-Info: vBDsC22s/9jYkS3kHl2ZsPXUrK/pbOQ/z09tipXFcvpu+e/bpZcmUMHW3QSoafvbBwMYjEvS9CzFRRQiBnHgRKzGcvznQgUXZEHoKuM6NjHUiWoyJt2J3E/gS6C5zH0L1dUmz08ifDDkfWBfucjUiDjGVDGodWNPUoNtOsmZKFIE8YlMlDwse4QoJKW2ANnrVRy1espwxTyA73XHWPP7EcbvHYAPGtcN74degHg33lllaM/MNjSBBalxEEvHvdA2LFyKXH4wVqA+vANy2nM+zT2wSYt0o1DQaFUiwmy9vChvCwtFCxfD2BTcl6p5JksQfSVPBb1c4DSnY76GqXiI6ca9iyZar+05n7IBUeDR9wZnX6IAvjneIhKiOpirlAEycJHFmnec6rCjYwUK0BFiy+2N2rIhJJqN6a9E6bKqutcySPa+xD9MuJvJkBnp+VZHYxxJNplrzZRcojreOFbOSpYQOxEhlRgXlmibVGLRXjtAom5568XFBvhTQMahp3DKHkg79wK3RwLvh6xb8GcNp7D2DCU1E5YljIILGRBvgUbRZKgJkxR/LrhaS2F73mEiUMV4efSVz6eHTWP4RNR8hQFJV8KCtCnDagO4EXX+/305AKk4moXdgoy9nPUohKHHYZjlEabHOdFkPZMDZC9Bi4gPc1vlxxU5xfRfRaHtVqvp45o6yXfANGmT1wHZr3+w4YbjdWi8OdOr53t3/Rba6Evx4a0kmE9ypzJjH4/EHwipGiVNYZ+dE0i/8r8ccRYFya+l2q5WxQNMlYFRCivdi2ubmrDieas1w/22uDxsIaMYLpXb1Kve0rjDrFLa36RofdtMG8PtUVQQSF/kyaNfPBm6e5Tss2EdYzMJlzKy1vCFpeWYW1OEa03bjoRY1GCOyBhiN5vurPGcaA1JmfC46B1S6DKYPdeWNEndWO9S7+rDfSSf7W/YQEcVWXKJ3SZAFPYHuIiYZ3X8BFrOoqFjdujqLlsqGUwXFgy6nFX9CpB1HYJ9EOkv7nFV/7dgQ+69fSWGx4T05XfLccpSgIts5j5qv1jGrmuS/a/DA019cAHvF/cpqXcEew2tH/P/sv8bSg6alwTitF+/g8a2ZZeOKwOCXKGPQ/YFPR23rGFuhn76pBhvvMqgYi3+bOcL0WuGi8Glj/YkfrWpZ1fDuQCGUpOOfx+jdS8prOaw4BszGPmtsShEntCJHYbBZ2pxS4pU95oxMY+GPCyDGM3nTi/dvqS9CLuVg+3TtnhOokLT6yH5wgNNfheo1xIghWR/K0yKKxc8oiBVEYj0R6k5w+8J4PF9m3I49BbYg7M5uEm6wgmW/KjGgm1LNfCFlRcADzzl X-Forefront-Antispam-Report: CIP:255.255.255.255;CTRY:;LANG:en;SCL:1;SRV:;IPV:NLI;SFV:NSPM;H:DS0PR12MB7726.namprd12.prod.outlook.com;PTR:;CAT:NONE;SFS:(13230040)(1800799024)(366016)(7416014)(376014)(7053199007);DIR:OUT;SFP:1101; X-MS-Exchange-AntiSpam-MessageData-ChunkCount: 1 X-MS-Exchange-AntiSpam-MessageData-0: LrcS7mIOCApa64fh6J/d+9KUHo2OE/MqUVwi4Afgpuz5ZyH0XJGnQxC2u+ATcOBplEqH4Qp4vkX7RqDxQ/qEH9EsOrnlCTYaY4725TAHNFd94ROkc2G88oQ3Q2dYl8XG/upiW40Wt4SfJKeedWXvd+pfUUybJ3FIL2RUXC0kkFb01+Bq5ILlzdSdKn4Czbl4vmEUhPYWC/62L/W5F4Lv+h6Y+uk1LmJNles7c8HUDnJtbYryWTBq+NS7YhtTtPZHtBtG4KuOfhdbx5NWORm4x5O+O10EVifzaD7C4MYTUWCkodetZyF84gdJ6GMFUqyVeRQN5S9+Ty1wO2N+zZ3WWWwr28J+Y5CBPvUH4e2YYNNNRPPiCgfQnA2mswg06JUhLYULwNs8+7FoJSAER9k5fE6K+5EUghMINSCuzjrUsACtOc5PpkxfZaPdCKhgSN2gAnOzF++iL2wCmvLnyLuZflnzS8iIMaYdvzo2q5BligJYZJ383Jrzroqwqw4UBoVfQFSD1o6Vz95kPbiPcccgRRcdo0K/nKapVfAgnolFgcrwqrjO0RYWx8V9EneqrvUpkDw+NvCdIg9DfARtDnGwQV7StwOTQWH2hEMpdZW1/XWlGbXFajbaEWgFuv6elF2ssUgSbAaWODcHN/eYGH4IZ81ThZRxLfW2hC1y0eDxcA9/g66NPsfY1KGkvDCF7KOxja3k1Q2HH++497D+YJLZ6U2aEE2CF5EJUw2h7lwV3bfOP4A2aplLSm2Mj/IDQTclhG3jbD8IZDFHyaoZkFQRhQWov1ofUGHROOitd/8Fszgb97dWFctcof3SjxIcVvoR93BR/EbYPQKQe8xWLTogt3Vonblrup7W9yYcBXkYSNIKrgIqkR2NCVlV8F1oENamhllR1J1JEaBD9dYnMhIKny3VxGgcxDFwjFm2QFiy1UVf3uOiwxIOIzHOSZFNwnwqdlEKF/fPQD/7NsHYEkdxapUygtVJiLam9w/ArSW/WqlCggmfM/SU4+Q+YKly1KkwAuymnXS2h4r3ecSp4rdRXz07aLmJdTIaVbB32Xsc0aRbZQMIZLjnhp5nYcIO772cVDduxVlCjM4GYru9ACRbGaDGE3EtwbBLnvZmxL67fHdgEGJdLpKTO2yQjUFeKNF2YMRcTo1G6aKl6GNFAIDj1SMGDZ87lNfC/Q4bxXTf4gpFeUt3iotxv/gs6Gx3MYsSVK1bNpvjbYvtoZj+wmOoiCbq/KlrYX1VpMSej7H0ql3OHlU5QJRyBJs0oUz1KX9gImoUsyvS4mRBqsu4EmeIer8D9/tEklbnWpjid7kvivTx6NddrE7iZBmprBx9/t2wAB75Epgd69L95psVzPPm2BCN4wskuy3GKLk8GOGlmGEnLcYgjg8psuMeFuTxNPLDJ/wP/1NHkRCpAZlCeZ50ro6bJZ2Lu5P1kdKylsoYd9XE34jKuGydxjmsl4zArSeE8V/cqO1M3E06WNSbiji8TDjwuxxVLY7cDC1QLTtwsP5WmwnVZLJUVrOas9dyBOg1r8jpm7XyGEangCOA4iJQLT5t1zShkCitvlin/reSy62UaYyrwrxeNRguTSXkERbT X-OriginatorOrg: Nvidia.com X-MS-Exchange-CrossTenant-Network-Message-Id: b89cb45e-9681-4324-9a99-08dd0a96c7d6 X-MS-Exchange-CrossTenant-AuthSource: CY8PR12MB7705.namprd12.prod.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Internal X-MS-Exchange-CrossTenant-OriginalArrivalTime: 22 Nov 2024 01:41:27.5129 (UTC) X-MS-Exchange-CrossTenant-FromEntityHeader: Hosted X-MS-Exchange-CrossTenant-Id: 43083d15-7273-40c1-b7db-39efd9ccc17a X-MS-Exchange-CrossTenant-MailboxType: HOSTED X-MS-Exchange-CrossTenant-UserPrincipalName: CI5cCDec1pLcjWyBDZawH7QH6iwOF1Iqcgj7aBlDi0whYUccbeQeit737RuRreXXEWIJybBhLxPts8yEyACfzg== X-MS-Exchange-Transport-CrossTenantHeadersStamped: IA1PR12MB6305 A FS DAX page is considered idle when its refcount drops to one. This is currently open-coded in all file systems supporting FS DAX. Move the idle detection to a common function to make future changes easier. Signed-off-by: Alistair Popple Reviewed-by: Jan Kara Reviewed-by: Christoph Hellwig Reviewed-by: Dan Williams Acked-by: Theodore Ts'o --- fs/ext4/inode.c | 5 +---- fs/fuse/dax.c | 4 +--- fs/xfs/xfs_inode.c | 4 +--- include/linux/dax.h | 8 ++++++++ 4 files changed, 11 insertions(+), 10 deletions(-) diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c index 54bdd48..cf87c5b 100644 --- a/fs/ext4/inode.c +++ b/fs/ext4/inode.c @@ -3890,10 +3890,7 @@ int ext4_break_layouts(struct inode *inode) if (!page) return 0; - error = ___wait_var_event(&page->_refcount, - atomic_read(&page->_refcount) == 1, - TASK_INTERRUPTIBLE, 0, 0, - ext4_wait_dax_page(inode)); + error = dax_wait_page_idle(page, ext4_wait_dax_page, inode); } while (error == 0); return error; diff --git a/fs/fuse/dax.c b/fs/fuse/dax.c index 501a097..af436b5 100644 --- a/fs/fuse/dax.c +++ b/fs/fuse/dax.c @@ -676,9 +676,7 @@ static int __fuse_dax_break_layouts(struct inode *inode, bool *retry, return 0; *retry = true; - return ___wait_var_event(&page->_refcount, - atomic_read(&page->_refcount) == 1, TASK_INTERRUPTIBLE, - 0, 0, fuse_wait_dax_page(inode)); + return dax_wait_page_idle(page, fuse_wait_dax_page, inode); } /* dmap_end == 0 leads to unmapping of whole file */ diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c index bcc277f..eb12123 100644 --- a/fs/xfs/xfs_inode.c +++ b/fs/xfs/xfs_inode.c @@ -2989,9 +2989,7 @@ xfs_break_dax_layouts( return 0; *retry = true; - return ___wait_var_event(&page->_refcount, - atomic_read(&page->_refcount) == 1, TASK_INTERRUPTIBLE, - 0, 0, xfs_wait_dax_page(inode)); + return dax_wait_page_idle(page, xfs_wait_dax_page, inode); } int diff --git a/include/linux/dax.h b/include/linux/dax.h index 9d3e332..773dfc4 100644 --- a/include/linux/dax.h +++ b/include/linux/dax.h @@ -213,6 +213,14 @@ int dax_zero_range(struct inode *inode, loff_t pos, loff_t len, bool *did_zero, int dax_truncate_page(struct inode *inode, loff_t pos, bool *did_zero, const struct iomap_ops *ops); +static inline int dax_wait_page_idle(struct page *page, + void (cb)(struct inode *), + struct inode *inode) +{ + return ___wait_var_event(page, page_ref_count(page) == 1, + TASK_INTERRUPTIBLE, 0, 0, cb(inode)); +} + #if IS_ENABLED(CONFIG_DAX) int dax_read_lock(void); void dax_read_unlock(int id); From patchwork Fri Nov 22 01:40:26 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alistair Popple X-Patchwork-Id: 13882586 Received: from NAM12-DM6-obe.outbound.protection.outlook.com (mail-dm6nam12on2060.outbound.protection.outlook.com [40.107.243.60]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 80B49154454; Fri, 22 Nov 2024 01:41:35 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=fail smtp.client-ip=40.107.243.60 ARC-Seal: i=2; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1732239697; cv=fail; b=PnBy1abzyLkWWvHrXRqghsyLDVVHev2TtYeGu/7H5Q1bQuNdECF9hRSZpk2ER+/BrmtDofsVvRU4AF/pv9K01tcilEswQaw8Y62kA571gLT9rpCIUeG3LzWCClOV1OlTs790iDKvy6p75zD9ezokdvSYDjSjor+zr4kv+0eL4fw= ARC-Message-Signature: i=2; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1732239697; c=relaxed/simple; bh=Hf5/6AmFN0iOHNDqOKhGh9Em4PNSCVXfqUDfDg3fCSc=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: Content-Type:MIME-Version; b=bOQtTeRCjK2DF4jhX2LnC+xPVPtyoHXvoIJZUBt61+aTDfqbyRIIDuiuO+MvoTsCgngD2pE3/Xazjzj7WqKMrQvjdj8KGVh69eejKGLkOyxIBLT93yldnjJIQ5qkjjsggKKxi8SRFWtgHeCFE+41fBYCptW7soZiWAk3ng0gNYQ= ARC-Authentication-Results: i=2; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=nvidia.com; spf=fail smtp.mailfrom=nvidia.com; dkim=pass (2048-bit key) header.d=Nvidia.com header.i=@Nvidia.com header.b=fXfJZsOt; arc=fail smtp.client-ip=40.107.243.60 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=nvidia.com Authentication-Results: smtp.subspace.kernel.org; spf=fail smtp.mailfrom=nvidia.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=Nvidia.com header.i=@Nvidia.com header.b="fXfJZsOt" ARC-Seal: i=1; a=rsa-sha256; s=arcselector10001; d=microsoft.com; cv=none; b=iSrP2T/B6wO4pIczzVM+G8nBu1enZOu5AxFNup28+y9HwzW8Ak1zcLBFHJ6J5so0KRVKfW+m61X5HlSYV6oJQH3zjSheJ2gp360LnrbqefnzBjrUtDe48yd9PlsYWa2ptS/Uie7K1tAdqxZ3bvxfD0Xk6thqQiHsyGG1Z8l/95OGkrB4V/rTBWjC78OiTEMlgYkSCBLks0PMrUe/N3qTWXjqwpJn3KzxMF7sWazX6cyesspc5+/6dp32WNt4+0hTTieVzmVQxWdyHyufaXz9bcYwvMYfmD7CIo+nZnysF/uUEQnhfk7P3c37szVsvTYAT4qZTI2QFEtcBbCkGPQVNw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector10001; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=aaxjcauunadh3WMcr+ZT+bNHtY/w01B57wxPB0uQu2U=; b=C7/aHwiAH2I0tg6CbwYFW2Rda8T469R512PgTTZ5b437eUPdsdaBALB+78O2Sog6KoMSJHSsBzkaBsaDLGL2AdTxMG1+EHYCxWMVJSypA+vuHhOsqGSQEHIzJqPskyMTmtRW2MPO3sO/1jyRH89ZDMLqOBKlHwXb5r16lQ0B3ps2OIbJ4P4gi3DqHIXs4Sb3c2haDlAY5rpGi6r4eKBNxJD6fLDxsJtzIhUquxpnT2msSP678pXDwrDmg2sFJpTAgYsmhro9maZ2fxSevR3RafYEWltzL87SGcbeoXfFyCmENziJWg5lMT1ASJLpaEl5aPrHbTesDGa2WqZvARxB9A== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=nvidia.com; dmarc=pass action=none header.from=nvidia.com; dkim=pass header.d=nvidia.com; arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=Nvidia.com; s=selector2; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=aaxjcauunadh3WMcr+ZT+bNHtY/w01B57wxPB0uQu2U=; b=fXfJZsOtnkUhmJx4rzDUjpuCg+e90sSXpcYYl+e4SFtrCbTHeC3Zph0H3zqiS9XZdx6b1i3w6hduwsOZ2a2AvC0n3838OSsLs3XhGhW/mILbPXpeSYlepOcHh3HsGvLzJkwDf/099BFvjFVPkWXdnhhw/SlpaizeQQ1DG0ak3YtzW2zSYnyRAcLl1RWQCpWuJ7r3MCtfn3H8h6Z4G5NAm/nZhOcawOxGNH6SJU2qn0jzf58jXrEUR6zFlAC9Yo1Dn5Jkl3sya8D9D1H3STgbYgsrLc+qdEKM7Q/1xFJnJMjR/c8eaHdsw3f2fDX8Ee/1IyUD3bNEAiHJYiuYV58lkg== Authentication-Results: dkim=none (message not signed) header.d=none;dmarc=none action=none header.from=nvidia.com; Received: from DS0PR12MB7726.namprd12.prod.outlook.com (2603:10b6:8:130::6) by IA1PR12MB6305.namprd12.prod.outlook.com (2603:10b6:208:3e7::16) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.8182.17; Fri, 22 Nov 2024 01:41:32 +0000 Received: from DS0PR12MB7726.namprd12.prod.outlook.com ([fe80::953f:2f80:90c5:67fe]) by DS0PR12MB7726.namprd12.prod.outlook.com ([fe80::953f:2f80:90c5:67fe%4]) with mapi id 15.20.8182.016; Fri, 22 Nov 2024 01:41:32 +0000 From: Alistair Popple To: dan.j.williams@intel.com, linux-mm@kvack.org Cc: Alistair Popple , lina@asahilina.net, zhang.lyra@gmail.com, gerald.schaefer@linux.ibm.com, vishal.l.verma@intel.com, dave.jiang@intel.com, logang@deltatee.com, bhelgaas@google.com, jack@suse.cz, jgg@ziepe.ca, catalin.marinas@arm.com, will@kernel.org, mpe@ellerman.id.au, npiggin@gmail.com, dave.hansen@linux.intel.com, ira.weiny@intel.com, willy@infradead.org, djwong@kernel.org, tytso@mit.edu, linmiaohe@huawei.com, david@redhat.com, peterx@redhat.com, linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org, linux-arm-kernel@lists.infradead.org, linuxppc-dev@lists.ozlabs.org, nvdimm@lists.linux.dev, linux-cxl@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-ext4@vger.kernel.org, linux-xfs@vger.kernel.org, jhubbard@nvidia.com, hch@lst.de, david@fromorbit.com Subject: [PATCH v3 05/25] fs/dax: Create a common implementation to break DAX layouts Date: Fri, 22 Nov 2024 12:40:26 +1100 Message-ID: <31b18e4f813bf9e661d5d98d928c79556c8c2c57.1732239628.git-series.apopple@nvidia.com> X-Mailer: git-send-email 2.45.2 In-Reply-To: References: X-ClientProxiedBy: SYAPR01CA0016.ausprd01.prod.outlook.com (2603:10c6:1::28) To DS0PR12MB7726.namprd12.prod.outlook.com (2603:10b6:8:130::6) Precedence: bulk X-Mailing-List: linux-xfs@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-MS-PublicTrafficType: Email X-MS-TrafficTypeDiagnostic: DS0PR12MB7726:EE_|IA1PR12MB6305:EE_ X-MS-Office365-Filtering-Correlation-Id: d59761eb-5554-413a-9dc6-08dd0a96cac4 X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam: BCL:0;ARA:13230040|1800799024|366016|7416014|376014; X-Microsoft-Antispam-Message-Info: U9oStiO4tFbQPF4xT7/v9D0qjrruE8SFXv5DzWPTMfSu8Ta7+QNz0m3CnMLr0SE62m/m7AWSC6V/Dwv6umqy1VB7mnlpF5UMFgXnOdZFMFw5g5yotKhJzGrrQcIfxkSrp1nmDA2aqdYaLa0mlROh2e2taHWKH4ksH/zFhJpTrk6Eugslwe1r2Ge83xRNSHCOXeHlJOhX1Xd/JZYq9e8PSFQXzfEYIe+0GrH8lUXCuO1oSFHUiVPshCGSyvLkdB8ur503YvMJgfX7yWESQauzqeOT9tgE52E39KmMZL7wbGRgWTJyo7zBp+xZiC0aokAAnQdrga9H3OQmptXMTE+lviiDCK/XI0j3OjTjs8oK2jsY89GNqhKyUbVOgPFYBEGEXgMWK8Zl822Gnv96qJD8YQ+MHNCIyW66eMYDspve1LOFklzF4bClVqbeO1bpTfUCpTwxjJKDAB1D+SZlBOyTY6EmA+JOYb2qCcKj05w2VJMrIqe1MG5ZYAc5vX9G+TEeBh0ENhiYZKJ6z3vBb/Cn4lmDtP8QAew6ekqRnzcTkRU6ofnpCg3fJlgonSb8IzjuT6xF7h/CHazL3AUZWsyDzP8q0Ql7U83ryKdxYoXvBN5gkn8NimnW0Q6BiSkoNs4JWIiLPQsyaHYUH4QvVHyTvGCjiQiFqGzF7ZPuXVhVKn65rturzaPoXxtq0LezILpWxJbihbn9pS9V3aPHgMnPB0qvvXZCP3ZpKRNy/R2jhIZQQfyOqjXtAyDhhUwd+ewigF19+pBTTFoQr/PxV667zGIpb3WqPqukhtKKJhweFT47mVfaFg/xuheBgaEo7sSAozt1wCOqMXC/ZOIkY+yW7/l5v7N2NL9l2bX17yK8KorCqjuzXnaK4exG2h8EoFOLKqyn4/DkhaTrO6Ds8FWuHpKVfXO/rHkjKPAZ3Q77EV8lwkybkOG2f5NU8W1tV0HX/o/Sjo60cb5ccwqDGwXjUxvbDzQMbHW7X6PkWIE6uvVwd2PWztpI6O+rE0UJRKySPLZ524ARtVlptGB2Svt9kHKixEtXk0HLY2NFd0YtToCipE1GWh7AAJnpxIvNAxViwurN6hEn5d+VsaGjjemFwo6wkaF2a7YEenRiS8v3E5u3bv0tQAfk7m0Um6q5K5MVahKEmY/Ll8YkpsdFpLcc6p9lNJr+BP6r5pZJFXK7Yn5AfBulVh1MKygZyiOM4PoHE4R4VTu/A1gE2ExMlUBFLH+lwG5+6tQyimDLaWllOduGuRIP//0ii7g4126CTOZftkfsdCeMp8jFjtiGVi7Y/o2Xj2Pc6l/1LgMzGBHBgUSm6AGBg4pmVNv9BMWjAziW X-Forefront-Antispam-Report: CIP:255.255.255.255;CTRY:;LANG:en;SCL:1;SRV:;IPV:NLI;SFV:NSPM;H:DS0PR12MB7726.namprd12.prod.outlook.com;PTR:;CAT:NONE;SFS:(13230040)(1800799024)(366016)(7416014)(376014);DIR:OUT;SFP:1101; X-MS-Exchange-AntiSpam-MessageData-ChunkCount: 1 X-MS-Exchange-AntiSpam-MessageData-0: ncQnbCYswH438j+Gl33h1L3OT93krl8ZMps+F3vYjjTdl8obBGQloC8rBAELHz7A1JCXVG+m9wFzeZE7xy+hRpXDpOEPU1MJUL+UAppCHehrlJVyQA6TBsbAAC6FToArV2vsA+yCAvVKWkc/KQEUiTX5MgPW8Z4LkQUUcD7emCjL7dm0mNEV8SYLlK9HpzbZp+LmCmiTjAiJFMceR6Sb2Hl823chWOCqownAvb7YDyLQbZ1bZVWtIcHJIpTGU3tPEloLTN89EVDjmtFcGS1WX8hd6cBLHGoqMUlfpzytfvjTRq7QKCHs1ufjdLdWzmdlTd6PryX+p9xN4xXjKMsC9spraLsgnOlG+Dl8bm+Q0LNZClFvByTWOZdAkarjWy1Adjm8L7gOJVEF7EC+2UG6XLA7o4j4ePu311AH/LImshCGSUk+1jRzW21uxoTZAvZF9GQYQQKlcvyrSVxXkJVuAT+0EeGW723c4RS9S+G6Tos09JpgwqyqGg7HPMlPF1ZVewEfNCRcLv2X/IeOynMII3Q87YD7EXwUYxEsWR5ueQqXW6kwGCNAkp57nRduBjQN6RcL/5m8PqFwjByr1mGPjIDOuUs9n/1LC9Sbs2QO/Q+bypflhGrJh0gRbouFf/DDidJAb5upAwSIzTz9tobDtx6ERbUXd4mT0ZA1GLHe7rAXKEZsChmfmGLSjOVNBZAU+/8W+OM6lhT3KqoAGSJ10IUJ/1bNIfJkQntnM7BqE4u37YmrIDyFlHWMzRNia0R2uLNDCijusDixgo/XxVvp/J61IWQF15FX8xl2LrI6zcNCk3Bi966MCMpJ9YErCJ0ehfP4+Sh6Rv597q8AZkhrVEgby5QC9SrDVFHSBqzKC0e0l1V7Ls3Po7SGS+OHr3/E+gLJAM7Fo5FLvc6afaJBE/Sg7S+G1Wuvs2+evn7ICjHrj4s7+mBMww4N9MzKnfNthqSnpN/FOP07i1iICusy+f+KYZS2yKDNr0sqrmrwd7jNxCEN/lmdnC3G5nkAxDu0wvawV8SAiE8s90UZhQ+pU95BFTxEC5EipKEcTvejcrqoSSCiw6Y9z3aaGBqh6OvJWx2A9LUFlk9c5/g6hWQrbgWER/DN14w4eP1WW3O/YRZX3WStkptiRl4fFAMRsgbx7Y+/JvKjG4rmGQj8QlWmzFMYSdNLteWT7x0R12ViwqaH4K8JE2ypInX39TLKCF9hyDRC6BIPYGVd9XFHqbhZFNIqpwsi+aWtFVSrBYf0cYbj+9Sx8kjEYO7J9TBAfhyAQnh3Z7Pyj54HTPxnMrt4iftliOR/igEZsSCR+25kkWd2OgZTvcHbEXDt4RHq3Prpsgubd5L4b53fp37Gq5/O25wFqoaBmib+rt3mbN6TwFpDe6yO1ZFFXg00XvZ3pv8OcTjuLyLJfnCJDoSG3aXXyz0AuWIZVHLu3w1hY2VUxb2X3fFDPJCoVjsJPgzsBy95g0PsVxLZoDgFjiDRHFHcvGgTbajx7jitBFyOUjz81rK9wkwfekUrBxBwOO8AHHPc1alt3iaGwc0mHJzfej7kvUg0MfkN3Mg55cDjdnpqnXNrwcyW+OldyXRscRfa7NBA X-OriginatorOrg: Nvidia.com X-MS-Exchange-CrossTenant-Network-Message-Id: d59761eb-5554-413a-9dc6-08dd0a96cac4 X-MS-Exchange-CrossTenant-AuthSource: DS0PR12MB7726.namprd12.prod.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Internal X-MS-Exchange-CrossTenant-OriginalArrivalTime: 22 Nov 2024 01:41:32.2541 (UTC) X-MS-Exchange-CrossTenant-FromEntityHeader: Hosted X-MS-Exchange-CrossTenant-Id: 43083d15-7273-40c1-b7db-39efd9ccc17a X-MS-Exchange-CrossTenant-MailboxType: HOSTED X-MS-Exchange-CrossTenant-UserPrincipalName: mM7BBtDtbubqz7V/PPzMsfZRWxBIFWD2NdLn1JOEaXhKh83UKsIQJQqdBK5n8QajQgUWpXBPEPSYEdv8UKC1vQ== X-MS-Exchange-Transport-CrossTenantHeadersStamped: IA1PR12MB6305 Prior to freeing a block file systems supporting FS DAX must check that the associated pages are both unmapped from user-space and not undergoing DMA or other access from eg. get_user_pages(). This is achieved by unmapping the file range and scanning the FS DAX page-cache to see if any pages within the mapping have an elevated refcount. This is done using two functions - dax_layout_busy_page_range() which returns a page to wait for the refcount to become idle on. Rather than open-code this introduce a common implementation to both unmap and wait for the page to become idle. Signed-off-by: Alistair Popple --- fs/dax.c | 29 +++++++++++++++++++++++++++++ fs/ext4/inode.c | 10 +--------- fs/fuse/dax.c | 29 +++++------------------------ fs/xfs/xfs_inode.c | 23 +++++------------------ fs/xfs/xfs_inode.h | 2 +- include/linux/dax.h | 7 +++++++ 6 files changed, 48 insertions(+), 52 deletions(-) diff --git a/fs/dax.c b/fs/dax.c index efc1d56..b1ad813 100644 --- a/fs/dax.c +++ b/fs/dax.c @@ -845,6 +845,35 @@ int dax_delete_mapping_entry(struct address_space *mapping, pgoff_t index) return ret; } +static int wait_page_idle(struct page *page, + void (cb)(struct inode *), + struct inode *inode) +{ + return ___wait_var_event(page, page_ref_count(page) == 1, + TASK_INTERRUPTIBLE, 0, 0, cb(inode)); +} + +/* + * Unmaps the inode and waits for any DMA to complete prior to deleting the + * DAX mapping entries for the range. + */ +int dax_break_mapping(struct inode *inode, loff_t start, loff_t end, + void (cb)(struct inode *)) +{ + struct page *page; + int error; + + do { + page = dax_layout_busy_page_range(inode->i_mapping, start, end); + if (!page) + break; + + error = wait_page_idle(page, cb, inode); + } while (error == 0); + + return error; +} + /* * Invalidate DAX entry if it is clean. */ diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c index cf87c5b..d42c011 100644 --- a/fs/ext4/inode.c +++ b/fs/ext4/inode.c @@ -3885,15 +3885,7 @@ int ext4_break_layouts(struct inode *inode) if (WARN_ON_ONCE(!rwsem_is_locked(&inode->i_mapping->invalidate_lock))) return -EINVAL; - do { - page = dax_layout_busy_page(inode->i_mapping); - if (!page) - return 0; - - error = dax_wait_page_idle(page, ext4_wait_dax_page, inode); - } while (error == 0); - - return error; + return dax_break_mapping_inode(inode, ext4_wait_dax_page); } /* diff --git a/fs/fuse/dax.c b/fs/fuse/dax.c index af436b5..2493f9c 100644 --- a/fs/fuse/dax.c +++ b/fs/fuse/dax.c @@ -665,38 +665,19 @@ static void fuse_wait_dax_page(struct inode *inode) filemap_invalidate_lock(inode->i_mapping); } -/* Should be called with mapping->invalidate_lock held exclusively */ -static int __fuse_dax_break_layouts(struct inode *inode, bool *retry, - loff_t start, loff_t end) -{ - struct page *page; - - page = dax_layout_busy_page_range(inode->i_mapping, start, end); - if (!page) - return 0; - - *retry = true; - return dax_wait_page_idle(page, fuse_wait_dax_page, inode); -} - -/* dmap_end == 0 leads to unmapping of whole file */ +/* Should be called with mapping->invalidate_lock held exclusively. + * dmap_end == 0 leads to unmapping of whole file. + */ int fuse_dax_break_layouts(struct inode *inode, u64 dmap_start, u64 dmap_end) { - bool retry; - int ret; - - do { - retry = false; - ret = __fuse_dax_break_layouts(inode, &retry, dmap_start, - dmap_end); - } while (ret == 0 && retry); if (!dmap_end) { dmap_start = 0; dmap_end = LLONG_MAX; } - return ret; + return dax_break_mapping(inode, dmap_start, dmap_end, + fuse_wait_dax_page); } ssize_t fuse_dax_read_iter(struct kiocb *iocb, struct iov_iter *to) diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c index eb12123..120597a 100644 --- a/fs/xfs/xfs_inode.c +++ b/fs/xfs/xfs_inode.c @@ -2704,21 +2704,17 @@ xfs_mmaplock_two_inodes_and_break_dax_layout( struct xfs_inode *ip2) { int error; - bool retry; struct page *page; if (ip1->i_ino > ip2->i_ino) swap(ip1, ip2); again: - retry = false; /* Lock the first inode */ xfs_ilock(ip1, XFS_MMAPLOCK_EXCL); - error = xfs_break_dax_layouts(VFS_I(ip1), &retry); - if (error || retry) { + error = xfs_break_dax_layouts(VFS_I(ip1)); + if (error) { xfs_iunlock(ip1, XFS_MMAPLOCK_EXCL); - if (error == 0 && retry) - goto again; return error; } @@ -2977,19 +2973,11 @@ xfs_wait_dax_page( int xfs_break_dax_layouts( - struct inode *inode, - bool *retry) + struct inode *inode) { - struct page *page; - xfs_assert_ilocked(XFS_I(inode), XFS_MMAPLOCK_EXCL); - page = dax_layout_busy_page(inode->i_mapping); - if (!page) - return 0; - - *retry = true; - return dax_wait_page_idle(page, xfs_wait_dax_page, inode); + return dax_break_mapping_inode(inode, xfs_wait_dax_page); } int @@ -3007,8 +2995,7 @@ xfs_break_layouts( retry = false; switch (reason) { case BREAK_UNMAP: - error = xfs_break_dax_layouts(inode, &retry); - if (error || retry) + if (xfs_break_dax_layouts(inode)) break; fallthrough; case BREAK_WRITE: diff --git a/fs/xfs/xfs_inode.h b/fs/xfs/xfs_inode.h index 97ed912..0db27ba 100644 --- a/fs/xfs/xfs_inode.h +++ b/fs/xfs/xfs_inode.h @@ -564,7 +564,7 @@ xfs_itruncate_extents( return xfs_itruncate_extents_flags(tpp, ip, whichfork, new_size, 0); } -int xfs_break_dax_layouts(struct inode *inode, bool *retry); +int xfs_break_dax_layouts(struct inode *inode); int xfs_break_layouts(struct inode *inode, uint *iolock, enum layout_break_reason reason); diff --git a/include/linux/dax.h b/include/linux/dax.h index 773dfc4..7419c88 100644 --- a/include/linux/dax.h +++ b/include/linux/dax.h @@ -257,6 +257,13 @@ vm_fault_t dax_finish_sync_fault(struct vm_fault *vmf, int dax_delete_mapping_entry(struct address_space *mapping, pgoff_t index); int dax_invalidate_mapping_entry_sync(struct address_space *mapping, pgoff_t index); +int __must_check dax_break_mapping(struct inode *inode, loff_t start, + loff_t end, void (cb)(struct inode *)); +static inline int __must_check dax_break_mapping_inode(struct inode *inode, + void (cb)(struct inode *)) +{ + return dax_break_mapping(inode, 0, LLONG_MAX, cb); +} int dax_dedupe_file_range_compare(struct inode *src, loff_t srcoff, struct inode *dest, loff_t destoff, loff_t len, bool *is_same, From patchwork Fri Nov 22 01:40:27 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alistair Popple X-Patchwork-Id: 13882587 Received: from NAM11-DM6-obe.outbound.protection.outlook.com (mail-dm6nam11on2071.outbound.protection.outlook.com [40.107.223.71]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id F22FF171671; Fri, 22 Nov 2024 01:41:39 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=fail smtp.client-ip=40.107.223.71 ARC-Seal: i=2; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1732239701; cv=fail; b=B4GdAKEDp+FVF0UU0SkV3RHTpC2/vt5xyc1iqrIRJ+whT+PJIU3T3Iy2p1aP6RAmwLt9uaRqYCWVb/cIMQu8a9KtrMo/95G+SJ7wZ/dXQEhYPJc9S3STyz4isvWsn1KdC12ddW3cIu6o90lLSS4vBzk6Bc3w/aubPF1ZlrKcuh8= ARC-Message-Signature: i=2; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1732239701; c=relaxed/simple; bh=v1vWy54ggy2EI1mTvBEOjq8Oc8D+/HW1Cg8DFzugicE=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: Content-Type:MIME-Version; b=YBbYE8B78jcrxPs4KrkWIuElzkCKxpQg9PDg8DSSJwKVWqa8lVbsu3k9pihWfJlP2MxoGMgc4gfG/0pKXE+sCmgBbqP4JXgcZRs1xs5PuwxpojCoqt3UpxnwfG9sRdpGNllGJtY63fia6MCsFjp8ZH1T9hUbuH2lBTS9rsIzosU= ARC-Authentication-Results: i=2; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=nvidia.com; spf=fail smtp.mailfrom=nvidia.com; dkim=pass (2048-bit key) header.d=Nvidia.com header.i=@Nvidia.com header.b=elmvitDI; arc=fail smtp.client-ip=40.107.223.71 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=nvidia.com Authentication-Results: smtp.subspace.kernel.org; spf=fail smtp.mailfrom=nvidia.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=Nvidia.com header.i=@Nvidia.com header.b="elmvitDI" ARC-Seal: i=1; a=rsa-sha256; s=arcselector10001; d=microsoft.com; cv=none; b=q2pEOuUQgd9g7nAveAhiFEfugvzq2/PI1yLOGRRnHG/fepZne0UIbkVdN7JS773LJrJvzrW0BNaSesCPo+82MT9VcFkaVrH0361/gCanjgCoKoHDvZ3Bvm4KEpQe7y2CTqJIjH7WKlwEXBqerFHVIQEn63d+5xKM1XGydwTiUKfiQILomM+HltPsoPD7cK9wQsI7gnrlMWhw1qoKFfyfINk/xUGJYSkaJhw9DKThuiGSIOEFUzXx2tE8MNV3B3uT5iXaBLbjxTxQUimQZTstRZxMM5xoYbCk8Xb1yqx7DR02az2rNSpX1aYO88XsJ+VbSs3xHeI6z2VhhpAJOeDz9Q== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector10001; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=f593KhRS1mNvsuFcUZphZN3rMXQcZUUvYCTPQhqFF+o=; b=kMBhEyfQA8+UhQBBZvRkus6HJcOkBIyfMB3tf30BXZXnNO66j65nuSZCSWLHuyEI2twjr07sM1N59WzaSuQeU1sHMlqVFMrHrwN7C9YLdSjXX3BKtKAj54A9ux4yXu1odEA5ZWvilCBZQx5WCc+eZzYlAULABpmLWdN/DuwoBl64DQ2ZQ/9NDIZJF0aJa6eMrgtPjXsKyNtca4ix8r4rp4hxhdcfEou5e6knACzyYgqq98z91e0GMQYJpQszQHH5vZZ4+kM1zzYCg7rxC3MtDS8M6XrnKuAmImqyNtcVCKGJYwLFBkrs4PKbDvy3GARusFjW0wdlWU8rNsPvzAXMng== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=nvidia.com; dmarc=pass action=none header.from=nvidia.com; dkim=pass header.d=nvidia.com; arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=Nvidia.com; s=selector2; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=f593KhRS1mNvsuFcUZphZN3rMXQcZUUvYCTPQhqFF+o=; b=elmvitDIuGrq68txIRIc803dcxHM/7NULS1c6UzCczxzMMj+A27iWWwl8lZyBtH1k4D4NrSIMoix8uO/0FSs3OGBNFfgSOB8oscFwmCMwNWNz2U1YOrWzqQJy8qVUqfC9nUfeTz9NbnW5XmwOnQnfwIbM0njWWJ1onsp8lkvs/V1iMz95t/SEOyeOaID2qaBG/FQhpOoQ4cbTP4anr3dm6AKZC20xgKWW6/9P/bYL13CGttssve12oVBSsrmTLkCw4pNcqKLrgIRKU7FDWlqTfxCCg5q4H0otlFELEA4UdFIf2Gq0KMv84EbhPKmXRJUzDxVzMPzEE91Aq8B32yfdQ== Authentication-Results: dkim=none (message not signed) header.d=none;dmarc=none action=none header.from=nvidia.com; Received: from DS0PR12MB7726.namprd12.prod.outlook.com (2603:10b6:8:130::6) by IA1PR12MB6305.namprd12.prod.outlook.com (2603:10b6:208:3e7::16) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.8182.17; Fri, 22 Nov 2024 01:41:37 +0000 Received: from DS0PR12MB7726.namprd12.prod.outlook.com ([fe80::953f:2f80:90c5:67fe]) by DS0PR12MB7726.namprd12.prod.outlook.com ([fe80::953f:2f80:90c5:67fe%4]) with mapi id 15.20.8182.016; Fri, 22 Nov 2024 01:41:37 +0000 From: Alistair Popple To: dan.j.williams@intel.com, linux-mm@kvack.org Cc: Alistair Popple , lina@asahilina.net, zhang.lyra@gmail.com, gerald.schaefer@linux.ibm.com, vishal.l.verma@intel.com, dave.jiang@intel.com, logang@deltatee.com, bhelgaas@google.com, jack@suse.cz, jgg@ziepe.ca, catalin.marinas@arm.com, will@kernel.org, mpe@ellerman.id.au, npiggin@gmail.com, dave.hansen@linux.intel.com, ira.weiny@intel.com, willy@infradead.org, djwong@kernel.org, tytso@mit.edu, linmiaohe@huawei.com, david@redhat.com, peterx@redhat.com, linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org, linux-arm-kernel@lists.infradead.org, linuxppc-dev@lists.ozlabs.org, nvdimm@lists.linux.dev, linux-cxl@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-ext4@vger.kernel.org, linux-xfs@vger.kernel.org, jhubbard@nvidia.com, hch@lst.de, david@fromorbit.com Subject: [PATCH v3 06/25] fs/dax: Always remove DAX page-cache entries when breaking layouts Date: Fri, 22 Nov 2024 12:40:27 +1100 Message-ID: <06c5c055f211642fe46444b7784437d08381632c.1732239628.git-series.apopple@nvidia.com> X-Mailer: git-send-email 2.45.2 In-Reply-To: References: X-ClientProxiedBy: SYAPR01CA0001.ausprd01.prod.outlook.com (2603:10c6:1::13) To DS0PR12MB7726.namprd12.prod.outlook.com (2603:10b6:8:130::6) Precedence: bulk X-Mailing-List: linux-xfs@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-MS-PublicTrafficType: Email X-MS-TrafficTypeDiagnostic: DS0PR12MB7726:EE_|IA1PR12MB6305:EE_ X-MS-Office365-Filtering-Correlation-Id: 805a7d9d-3b26-4c00-8f7d-08dd0a96cd94 X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam: BCL:0;ARA:13230040|1800799024|366016|7416014|376014; X-Microsoft-Antispam-Message-Info: GMx/Nbr2NlZxpDMnbGOEmbrnfr+OUua8VSittoSGQ7H9zTYjS6Z45x1ikGCDJE89HMu8tej3EDf73440S2zhtEUUFHz87LcEGAnXuQDj8T2OOJEjFQPb/+LX2ebLfoxzFxPEYlWYt+TDooTG352fCUHSz+MzoOAUxyQxbTCeM8QcuUfZYvgA7+EqqLjLHsBY7/wr3wsOPRVAyQUKTefycd2ljlbqrMmpCNqgyFLOJRp4auo+42skNrKnRFbcnUnarzuq1g+ZR2X4Ggv4oFlCqh6oHP2J6nADYQ63pC/lOPlf9hSCgoaocqlMf4TzEyC8Eq2Tu2aZ8R+P8RuE7V1OPaUqsrEFe6A9BMarPXgpOnCHEEwnpqj9xA8LRuw/BlYHsY1X+HHjc9+Y8vlhwpKSsxAx0yLmTDDOFVth7b1DtNKhNSVFO+7IzxYmxfbef3HdEfoS6VxF8FZ75QnCNA9AVAgB55F7gIpsryT73/JVcnt4Sao35HCZZCyibYzoMxnjyTbnkRNzqGikgjapef64x2NooM2pcyUqO2WB0bmtmfsk0AJ6fdchTakblZXOm1VjB2rFd702RQUX0dFz9slyNlEle3LP6hBJalc+SlLVPy3E60HJT/pNqkty/eZFzeZQd3gSWH4po0PkCGFfLPy1ePZ8ZRs2uZVGaZX9FoPIA6jwdd6qHvdMuDwNvv5kTTs7Do4XVCt6bXugeuxUJgJwL+bo25Q+xxP7+FYAAko21dKn1tX08uAdo4EDxvfPIcGauYlTtfmk26H0WGqC5zjun7M6BzoreBF8cRbveV8BfydPePNv/Yd94xiMvYNyrgrfsT/FH0bFmXcOwFTZPaUmYQx65VZa0tFPmpP1eibVeWjL7T5k85vUk6p7vzUuG9j1XyCQI5tZ/bEhFTqf2PQTHVJ2OVYB6396wQaGD4ysx0UNUrzWQCGinmbuV6TVO42/mmruRUaePnwG0hJgXIHHinPSN8KUuwZlzjP5G9IyEj/GEOh22y8BF1YgCpj/WyeNABmV8UvaFTd62cg4l2Njz9EJoX7KSONutD2a01NqnDhr7D21XSNmSljrN2mXGCTb7DJW142f8ig0mftXRvgw0IKbir8DxfOizRPDzE8NpRPfH55HuW0A44VlTDz8P6GHDMNnW325Or07FqWiPO4qcX8JZV7pX404TM6b76n4Kji+MYeqf50DBDwocOPs52xnbAkFUwybfN4kUiuCyzeiFF9hBB/d+sqbwuGfFu86pCQFyKuyGqepG/qz9JCmPbQ2rM3ZoZ42b8+x5ILbwMzGNwXsfHuh9O9NHYiiDCiZed+OrsqQ1E+67y/Iq90oMSCl X-Forefront-Antispam-Report: CIP:255.255.255.255;CTRY:;LANG:en;SCL:1;SRV:;IPV:NLI;SFV:NSPM;H:DS0PR12MB7726.namprd12.prod.outlook.com;PTR:;CAT:NONE;SFS:(13230040)(1800799024)(366016)(7416014)(376014);DIR:OUT;SFP:1101; X-MS-Exchange-AntiSpam-MessageData-ChunkCount: 1 X-MS-Exchange-AntiSpam-MessageData-0: /m466kVqvHjPoaqhdtAVnZW3fPzNhSDHnH0YonvvwNahgGl4B++61fimYeJ0hsZN+RuoDm7lTSVtUzOk/n1WvJkykSvFpmcDXvXOzS3s0yq1S+SVq7mT6ig/5sO4CmJFlDLaZGs4lzkW9PWjwrj9iPT23BgCTrBmtOvU0zSP88xaAz7U1+LjBRZAwBGUh54oetqEx/gjzrDxHq2HnLQ8aiY5wFw1/yxXLs7KUzGJVfk6xoGT5gbfKC0+pJ9cjwcvc826JJLxLof/2TLFO3uKmmy3ecOOVxY13kgbpcX8AG8gkxKLyhJxBAZ+O1/d5a8kPt1Ma+5sw0iTHw+FDtLkCLlr1QF5gWxn7ZEiibsBma8rEXyRFIdKF1+mk02tr6/IOGIB5xJqZhM1LYMZ1YAGE/Om79qy9mqCBpLqZTcr/yzUBHd5Qx72N7AE2O4+64XCZf3stzBwpZacI4N4fGaoHcm56aSuX+ekcqZFIRnkHvcPVXfaql4fmn/W8wYwmATJItGJBMBlZKOXLx0e90iKB+o8rq7Fr41+zY7Ompp2lWs4Q0IpRQBVifU9Ds5mOHiB07tZoaWgtBZUNnYLR3DVlUq1WuZnNEDBjfYQvbIjMKrB+re6o8slt6taniWJzWKO6b4FoBp6qaNV2iiZpdgBxjPSFCBVfPoEt6IWu+lBwRK6N/8krRPX6uzhr4nwn7RxYmh6AQHnkgRBmw5mMh5rdopuBvhsZWzRypRmHOo/w5Dd93Ig9uw/hJ7Yq6ldtvJTAZH6Txr2s6dB5iJ8HTGOCJdIWV2d3Ja6SkQ0fC5wJfA7fVs4IYt++qZ0OEVpiNmJ5rEoUmuwA9mvAwLN1r84DOHdnibFkC1dMzVc72q1IOrtdv4mOcHkb58JQ+4XD+MB+GhHSQWuzxWb6sWi3EnYamB9dvHSNjThRUwTBDvny+bIu8vR5YBhTWPTgmXP7nziRd+dw83Sq2yjJYtpWT0cVJnR8fS9O5ZiNmKuKvbJ6t3AUpbjAAwKV37FQabBArIw+QWd4cNT7Y2fXtnflIpokbUVHn4iuUM7I85nNZ/QZLYNpE2fSidhy5UOM/lAqVxsgkHDFXxN6s/rf7luKtjbeGh99pURksbGuwfr6FLBlBl4bpvaBkrpCLre46tlxFTXo0e5Opz3lvGPD15MS+8ecm9n9hQl0LRJov9TAc73nx/KXA/LhNQPRrzX7kdqCmxmMsCaiBo1rKpPCU3WPivnA93v4iqEjghxLw4lj03HLOSRimeRSG95L4obWxF+5WPf1ezy+m5heERNfnU97zAhivE9pgt8EurPIa6vpd+ZaiWGni4CBm8WKScSHnilFQungTcIZS6NAp7q5pNEKWpClhDiM4GVUhV2bxYwAehvH4VvdPveIZ7Id8Hrzmfedfi+nVZemzDem+GJGjwP2LJS8T8zlhgsZpbedN8h2TXWa/I+FUJKDDTDlTCv2lg5n6+tHGwh6Q8pFQ9KG7g4WNvGeqhBQ4dHsc+iHKKecxRI/LXY+Vuavigt0GGU9OtTepSEw5bsyeiBIQJ+pUNIf9T9BCrvvRNtEY158icdY3e4PNdkmRnZF7KEX1mMbLrmVBIv X-OriginatorOrg: Nvidia.com X-MS-Exchange-CrossTenant-Network-Message-Id: 805a7d9d-3b26-4c00-8f7d-08dd0a96cd94 X-MS-Exchange-CrossTenant-AuthSource: DS0PR12MB7726.namprd12.prod.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Internal X-MS-Exchange-CrossTenant-OriginalArrivalTime: 22 Nov 2024 01:41:36.9717 (UTC) X-MS-Exchange-CrossTenant-FromEntityHeader: Hosted X-MS-Exchange-CrossTenant-Id: 43083d15-7273-40c1-b7db-39efd9ccc17a X-MS-Exchange-CrossTenant-MailboxType: HOSTED X-MS-Exchange-CrossTenant-UserPrincipalName: fYUpJOGu2p2HQXW178gltjaARvDV5JBelaEJUOIboMa04QPuq9iBL664JWsBDxX7Y5o2UeHN8LLAOPS42dKc/Q== X-MS-Exchange-Transport-CrossTenantHeadersStamped: IA1PR12MB6305 Prior to any truncation operations file systems call dax_break_mapping() to ensure pages in the range are not under going DMA. Later DAX page-cache entries will be removed by truncate_folio_batch_exceptionals() in the generic page-cache code. However this makes it possible for folios to be removed from the page-cache even though they are still DMA busy if the file-system hasn't called dax_break_mapping(). It also means they can never be waited on in future because FS DAX will lose track of them once the page-cache entry has been deleted. Instead it is better to delete the FS DAX entry when the file-system calls dax_break_mapping() as part of it's truncate operation. This ensures only idle pages can be removed from the FS DAX page-cache and makes it easy to detect if a file-system hasn't called dax_break_mapping() prior to a truncate operation. Signed-off-by: Alistair Popple --- Ideally I think we would move the whole wait-for-idle logic directly into the truncate paths. However this is difficult for a few reasons. Each filesystem needs it's own wait callback, although a new address space operation could address that. More problematic is that the wait-for-idle can fail as the wait is TASK_INTERRUPTIBLE, but none of the generic truncate paths allow for failure. So it ends up being easier to continue to let file systems call this and check that they behave as expected. --- fs/dax.c | 32 ++++++++++++++++++++++++++++++++ fs/xfs/xfs_inode.c | 6 ++++++ include/linux/dax.h | 2 ++ mm/truncate.c | 12 ++++++++++++ 4 files changed, 52 insertions(+) diff --git a/fs/dax.c b/fs/dax.c index b1ad813..78c7040 100644 --- a/fs/dax.c +++ b/fs/dax.c @@ -845,6 +845,35 @@ int dax_delete_mapping_entry(struct address_space *mapping, pgoff_t index) return ret; } +void dax_delete_mapping_range(struct address_space *mapping, + loff_t start, loff_t end) +{ + void *entry; + pgoff_t start_idx = start >> PAGE_SHIFT; + pgoff_t end_idx; + XA_STATE(xas, &mapping->i_pages, start_idx); + + /* If end == LLONG_MAX, all pages from start to till end of file */ + if (end == LLONG_MAX) + end_idx = ULONG_MAX; + else + end_idx = end >> PAGE_SHIFT; + + xas_lock_irq(&xas); + xas_for_each(&xas, entry, end_idx) { + if (!xa_is_value(entry)) + continue; + entry = wait_entry_unlocked_exclusive(&xas, entry); + if (!entry) + continue; + dax_disassociate_entry(entry, mapping, true); + xas_store(&xas, NULL); + mapping->nrpages -= 1UL << dax_entry_order(entry); + put_unlocked_entry(&xas, entry, WAKE_ALL); + } + xas_unlock_irq(&xas); +} + static int wait_page_idle(struct page *page, void (cb)(struct inode *), struct inode *inode) @@ -871,6 +900,9 @@ int dax_break_mapping(struct inode *inode, loff_t start, loff_t end, error = wait_page_idle(page, cb, inode); } while (error == 0); + if (!page) + dax_delete_mapping_range(inode->i_mapping, start, end); + return error; } diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c index 120597a..25f82ab 100644 --- a/fs/xfs/xfs_inode.c +++ b/fs/xfs/xfs_inode.c @@ -2735,6 +2735,12 @@ xfs_mmaplock_two_inodes_and_break_dax_layout( goto again; } + /* + * Normally xfs_break_dax_layouts() would delete the mapping entries as well so + * do that here. + */ + dax_delete_mapping_range(VFS_I(ip2)->i_mapping, 0, LLONG_MAX); + return 0; } diff --git a/include/linux/dax.h b/include/linux/dax.h index 7419c88..e8d584c 100644 --- a/include/linux/dax.h +++ b/include/linux/dax.h @@ -255,6 +255,8 @@ vm_fault_t dax_iomap_fault(struct vm_fault *vmf, unsigned int order, vm_fault_t dax_finish_sync_fault(struct vm_fault *vmf, unsigned int order, pfn_t pfn); int dax_delete_mapping_entry(struct address_space *mapping, pgoff_t index); +void dax_delete_mapping_range(struct address_space *mapping, + loff_t start, loff_t end); int dax_invalidate_mapping_entry_sync(struct address_space *mapping, pgoff_t index); int __must_check dax_break_mapping(struct inode *inode, loff_t start, diff --git a/mm/truncate.c b/mm/truncate.c index 0668cd3..ee2f890 100644 --- a/mm/truncate.c +++ b/mm/truncate.c @@ -102,6 +102,18 @@ static void truncate_folio_batch_exceptionals(struct address_space *mapping, } if (unlikely(dax)) { + /* + * File systems should already have called + * dax_break_mapping_entry() to remove all DAX entries + * while holding a lock to prevent establishing new + * entries. Therefore we shouldn't find any here. + */ + WARN_ON_ONCE(1); + + /* + * Delete the mapping so truncate_pagecache() doesn't + * loop forever. + */ dax_delete_mapping_entry(mapping, index); continue; } From patchwork Fri Nov 22 01:40:28 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alistair Popple X-Patchwork-Id: 13882588 Received: from NAM12-MW2-obe.outbound.protection.outlook.com (mail-mw2nam12on2057.outbound.protection.outlook.com [40.107.244.57]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 7F5E9176231; Fri, 22 Nov 2024 01:41:44 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=fail smtp.client-ip=40.107.244.57 ARC-Seal: i=2; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1732239706; cv=fail; b=P1WahW3nCIQaZ4LttYUKyfTVcPLJ4x04YARSDVpV7nZFf3k0kYyAr1vpauWyVImTsTFzSSvUZmXBfROdzI8+5haI5HNYUqAlfujEvACiGWtIEivWdgs4aQDlVu4FUa22JvbtIVBQxYHypu8Ga9VAA1IO5tMv/LkWx3Qs0h3h4I4= ARC-Message-Signature: i=2; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1732239706; c=relaxed/simple; bh=lN6BUUiCigqpWLj4Lr/n33B5+9jfrL4ukep5ombzTF0=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: Content-Type:MIME-Version; b=usltY+b4i122rporNePAMcLAyyBC/4i37k5HL2d9JoOcPKuK79GpxW5dOLPrKVytaxLy5biJaSvrqLqzsqQsekenjNSXcyLeWslbikdE4hh3qKfv+dJm8xqnvOLOqVtdSN9IfUzTlyguIjNRBn8s+sfM8h40psJv+4W7RrSQ1Pw= ARC-Authentication-Results: i=2; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=nvidia.com; spf=fail smtp.mailfrom=nvidia.com; dkim=pass (2048-bit key) header.d=Nvidia.com header.i=@Nvidia.com header.b=DEG3dCUc; arc=fail smtp.client-ip=40.107.244.57 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=nvidia.com Authentication-Results: smtp.subspace.kernel.org; spf=fail smtp.mailfrom=nvidia.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=Nvidia.com header.i=@Nvidia.com header.b="DEG3dCUc" ARC-Seal: i=1; a=rsa-sha256; s=arcselector10001; d=microsoft.com; cv=none; b=Nlh+hyDPb8784anoufSXlmZAsth0OECIlz/r4o165KyGLPnfaO9VY5zFCxuf+7ABeOG959nzudbVZum9cwBq29ZDZ4O09BgGzz8f57yAVw+2qNp36VAQ2rGZayxmjnE+MzEsHtyScF3HDKf6jBkV3nHNJFtkXHBe1pPKKa/SADV4vo3X+MFzUqXe+5SHVdIGv2OW2Ab+yLuGXFEOJHwoBQej+dJeVoMZwgjw760I5OPFSxBRCgrmhOlTwaZl1oeY5Nh1Qj8svqNL2W+hzjD2FcbJAQlUbYv1is/Mmi/z3OtW6T+zfXUQ1nVHXR1b6ycO/bYN8zNxhI/zLf69t9LShA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector10001; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=wy91m0zgEkIdTV0vxkx864/RgDyaRUi24jYEQ6+6du0=; b=LGc0oIoAfJn85RdgUpMyN0Mn+G7NZUEQX989t9lvzHAUBGWkyiMYmBSBmoSAW0Mv3efU07MtNMDgAUvGL+MGkU976v64rP7G/26vdwv1AE/0kaer9rgYQ+GnGCwburB36NuUiMVDhNFZpMwjS/oOJ67v0dWArDrvv7yfAO+PMnzCdm+XL5Bc/DQuNqzBI61mFlV6d9EG6NesLO/5C8gc0z/vqllEuoxelyX0rmUQy8PLFr1RQr8lYm8dp+EvEMy31IdOi1kRE3AFidm3A2X3Wzgfre6R9RCwPuZeuxjj/6lG2wpxL35JkoPy439wGKq8cWf5OqO0p5BIkvhnUEpRDA== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=nvidia.com; dmarc=pass action=none header.from=nvidia.com; dkim=pass header.d=nvidia.com; arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=Nvidia.com; s=selector2; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=wy91m0zgEkIdTV0vxkx864/RgDyaRUi24jYEQ6+6du0=; b=DEG3dCUcHKPSMmVhGG2v6vup6t70gcszvVUGuaozMNYdRGDC0wbjFwXQGxaJ3WvBSfwOtL/8zImRWcPYlwAG7BEOOETo7Q35xRZZ6K8wyJ0Y5Qbnc5WebgK853pRb+0ZptLRgtdnS4CTD3i/SsnV+pkdOWgvmW2q+O2JMpT48UeC9b1xCjUMaoolqJRaPJXW9VyFBg8mzc1ewzLkcLA5QzHG1DM2ChibuSMFuEgTBUjS1hEcuCYgT5Y3hqItnQ8PO7DQvABzb2eZeDATFB53/YN9Y37Mbt9vIJCNujD3Ik68mQbIavOeKfCYlp26R9NysKAJbsGDxK1LUBaJVnM2RQ== Authentication-Results: dkim=none (message not signed) header.d=none;dmarc=none action=none header.from=nvidia.com; Received: from DS0PR12MB7726.namprd12.prod.outlook.com (2603:10b6:8:130::6) by IA1PR12MB6305.namprd12.prod.outlook.com (2603:10b6:208:3e7::16) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.8182.17; Fri, 22 Nov 2024 01:41:41 +0000 Received: from DS0PR12MB7726.namprd12.prod.outlook.com ([fe80::953f:2f80:90c5:67fe]) by DS0PR12MB7726.namprd12.prod.outlook.com ([fe80::953f:2f80:90c5:67fe%4]) with mapi id 15.20.8182.016; Fri, 22 Nov 2024 01:41:41 +0000 From: Alistair Popple To: dan.j.williams@intel.com, linux-mm@kvack.org Cc: Alistair Popple , lina@asahilina.net, zhang.lyra@gmail.com, gerald.schaefer@linux.ibm.com, vishal.l.verma@intel.com, dave.jiang@intel.com, logang@deltatee.com, bhelgaas@google.com, jack@suse.cz, jgg@ziepe.ca, catalin.marinas@arm.com, will@kernel.org, mpe@ellerman.id.au, npiggin@gmail.com, dave.hansen@linux.intel.com, ira.weiny@intel.com, willy@infradead.org, djwong@kernel.org, tytso@mit.edu, linmiaohe@huawei.com, david@redhat.com, peterx@redhat.com, linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org, linux-arm-kernel@lists.infradead.org, linuxppc-dev@lists.ozlabs.org, nvdimm@lists.linux.dev, linux-cxl@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-ext4@vger.kernel.org, linux-xfs@vger.kernel.org, jhubbard@nvidia.com, hch@lst.de, david@fromorbit.com Subject: [PATCH v3 07/25] fs/dax: Ensure all pages are idle prior to filesystem unmount Date: Fri, 22 Nov 2024 12:40:28 +1100 Message-ID: X-Mailer: git-send-email 2.45.2 In-Reply-To: References: X-ClientProxiedBy: SYYP282CA0012.AUSP282.PROD.OUTLOOK.COM (2603:10c6:10:b4::22) To CY8PR12MB7705.namprd12.prod.outlook.com (2603:10b6:930:84::9) Precedence: bulk X-Mailing-List: linux-xfs@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-MS-PublicTrafficType: Email X-MS-TrafficTypeDiagnostic: DS0PR12MB7726:EE_|IA1PR12MB6305:EE_ X-MS-Office365-Filtering-Correlation-Id: ad24861f-40f0-4684-6098-08dd0a96d013 X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam: BCL:0;ARA:13230040|1800799024|366016|7416014|376014; X-Microsoft-Antispam-Message-Info: mCSPVX8VqOStD88Qo+Kue6nKdGxIh3U4X6ZgLmEeJ6ATjnQt9CKe3fWQNIzkxLvzyUYQlRKsKbokQCA/LM+VkGka8lGIdt3mZ5F6bH0n/rVX7DQKTqCMMqlkp04Ua6EmfMSfdy1ocNk0K/XwfawBcW219AyFPNRebKG3I0b/dugDiRtqvajsaetAnJaUgxQVePde8tc2tTQ0fFm0wIqRLWexxXNGehcYe6tbTTXSgPbcJpUzyw+PKPwFpJpt+R4Nlt87tSRAv1T060ii4m1Z43aYl3zdGoG/UaWgTVt7XusdvCWUhgxuIBoOhZVMSJFa+z3yi6HrwIBvJ3BUjzQuhIGvP3Rpxs7B2Li/QLejNoxCNK1kDiyCWbG41kO45obe7S7+cCvf9bsGvCtYg5i4E3uQIQY36iTJEn8xJ2M/zn1KzIYTsVq7HmED5bnyGlGEjgU9YxYhdhI6fzIlT7+ENriXvWGKsOuLaDR19/KQdLKi6iVRZNaqOMFHFfbtcBqsdL8dGcggIYj8Oalhom6oSrzlzzp1s3r4zacc1HUg2i2K0mxXkbv81ZyxqgN9lGMszgEi5Xi9ufCJEiaAZtNeDN1g5TxWXh8PT8XK+GnuR6LxkVBwc1QCiJQnBXmymR3zxKKlQfocQGSoAajhgDMfWwXpCpUJ8X1vww+t6iun45jTKZj5cIakKreltVh38T5R4Zwq64Ea3eRS+vffkBEj3fw7IxzwKgfKjOgchvx7IJLnBiKeWF7Rpx25kcWJEFUBS5cLMGitKHpzPzo2rBSjDe9iQ1qkQ/iUyYqweS6mLbBvYij0ItfmnESSfA4qYDxngOz0Ir94wGtM2C1uZRsV4ror+tGp7fXlruiYcHZGvaY6y4NWFgeW3nv3i6BgYSksJQgjqZzc9ziqQSFCgz7X+kvnA26Osb38JN5cxG9EeBtWME8F8DFqD4GA/qlAaDZ+//VwbQvnMNO5Rlx+C00k+RTfTvqQztsZC2VN67uHgcf6ZozIew0jR95JnqZ0x9tkmJEUIJgoCalrMlSfyg+3Y7w++EQPAhiLd3mz+hZMjkFj6x2KKspHcIwP4YDK2TpcmUq7szNXglBnnJvxawEtHRZrKQ11nbi5DybWpUzCKCWjv07EBjtLnbUbNPJ/fFQD1yUdrOQL17Sw3+b32y9JsigjcPKk9qe09peaegu6VTw0B/u3hr9DL8iQ8p2nF8TM7JB2Y/z9d5M63nEipW2HPoYnr93XH44cjtwhm00kUitV8wgXC+bHLbaiHJKsd80kih2rKx7mn/+gRHykfdXIxPGUJBIqt697DXjnsppQBtyTZG7SjZ9smgnvl0X3WlBg X-Forefront-Antispam-Report: CIP:255.255.255.255;CTRY:;LANG:en;SCL:1;SRV:;IPV:NLI;SFV:NSPM;H:DS0PR12MB7726.namprd12.prod.outlook.com;PTR:;CAT:NONE;SFS:(13230040)(1800799024)(366016)(7416014)(376014);DIR:OUT;SFP:1101; X-MS-Exchange-AntiSpam-MessageData-ChunkCount: 1 X-MS-Exchange-AntiSpam-MessageData-0: RiwU5ecWg3oCiyyI/RHAWrjLErqMpTxFq8ywqlE/I/FwAqqa9njHmy2b+W6pEVPAwpcAyYQY8H2ZE5s2raWzAs9T2tqttC7YyGiCPoYUW67n0iEE5hnVjo50fCFcUHImYPjqQUXJeIizpfuWBpQ7SWkQZtsELePzDuAWv4gqCISbptsVsi/JGUMn9a9ApWoH9fCAokIf81bs1Mic9OUUaNfFl5L2rSnnjjyB/xaHBlHDmXxafBIoS8w7F6d50278CLikAKehzo6GttRTWj0HJWDzgH9bIJ8ITQo9tSmQAa0U3XDfVEF6l8Smh0ZZMTj1lhAO/x1cAhGd+C0UZX7XVxrNo7n5rCNr8z4tSjWeM5374i320wMaFdSZlHoXz7AGIEpl/6mJz1ycK+MASo7jGz3i8ahNvxdSML7EPgg666X4nWjsbeb/lrsc4veowaY+KWTBk+hAVN6uROExku3jnTMnIle+7GLa7FKypy+6ugwQylS3ps68Q3ljLYeM4HmUKalswPoPTRYsmmnrMxI6AUF/lVB0BgN8x2AWCAXAMeGXwDow21raadOWM76K4IznXI+JqbNEtkmY/1ruZHTA+oAyGaOPdFoTsOBMHOiN9y9BS8SpDRYHVF20IvrY9CKAVUrj4RuRTLk7BYOZqIs//NOYAif3l7U2icis4czx6DqN1x01fljea37T6tqJiotaZtwYeAr8VyCpvbGTtBHnyUQ+7lbeIhWNIFZsVBsy/OXUULorJ8CBmP9rx6JEOas8q/fpjR3F5giYTzhZbnxrxV4hUA972aH1K5Yp64FJSGob986tdrVLSfJfWpZ5arzJYE3+3KXzqK9DO9YGVgc6ppqWjJLzfeCKfzbGgcLiouEhkExspCcISbxlTFYdmy6UzW1jnvfcdmeCOlxWjNeq0gxJh/IYeS2Ll5zTbPpTm0oN44Ota2WMTTtZEA6HgT4iZV8+4FXITmVak/Y9kJGCrYd4wqD9JjEBxtIhoM4oX8D+GcVnHO5vnIGZD0/Gu1GKvkT5vkmqQ7BJLw4wXcnqnc1p/VGSI+T57IH8cNojsJkbYifsHyto35rZuOWxt+DRstDC31yIk5AUFmjmC8lGN9ValB7AdIhEZgGvaGWDlRwQbRiGMGtPxa7f7zGYNUL9Z3KBOzBCpiRrJNG6KN3RtXvK097uH6YCnidBzFu2/1JdaubzNLLcG6OdEqNiz+AeQVx18sRiJcNEEqWKGOTo5Pe4fKKicvZnp46NyuNVWn7QDGUaYD216QvSZqrNCBzJ+mDQKkbduVar50xLBdO3lWCTO62Xpo09smWSOxCUqKZBQe/ak70sZfZ/30EZYm+KXNwz31vmo8gz4KutT1ep2mcQA2SyE2xDBrXS7lz8CkDuilYTRmWNPmZ//OGi1KBRkDMbLFt+yz8VF3YJ6S8K/lm9cM/payVSYq8CLkZO97UW3KvSkDQi5aOREd8v/KsNlouyX5eHhd1odg4fyHPn7V4WX59+qzPrJejU2u7bwPNjbSZsSwnAaMrQZZmBjRwxhsXzxz4OjqnT3suUkD0uku722pM5saCJaq90kxBBjp4SY9C8OEeYmnYjSODVn2sU X-OriginatorOrg: Nvidia.com X-MS-Exchange-CrossTenant-Network-Message-Id: ad24861f-40f0-4684-6098-08dd0a96d013 X-MS-Exchange-CrossTenant-AuthSource: CY8PR12MB7705.namprd12.prod.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Internal X-MS-Exchange-CrossTenant-OriginalArrivalTime: 22 Nov 2024 01:41:41.3458 (UTC) X-MS-Exchange-CrossTenant-FromEntityHeader: Hosted X-MS-Exchange-CrossTenant-Id: 43083d15-7273-40c1-b7db-39efd9ccc17a X-MS-Exchange-CrossTenant-MailboxType: HOSTED X-MS-Exchange-CrossTenant-UserPrincipalName: 6jcpsiwTd+9kgsQzzoYxDiBs4kyq8tmzsfHVeqhrFjlxYl/oHbGakWwMGV+DPS9FRW6y9lNTbEeVR8Xwewy8QA== X-MS-Exchange-Transport-CrossTenantHeadersStamped: IA1PR12MB6305 File systems call dax_break_mapping() prior to reallocating file system blocks to ensure the page is not undergoing any DMA or other accesses. Generally this is needed when a file is truncated to ensure that if a block is reallocated nothing is writing to it. However filesystems currently don't call this when an FS DAX inode is evicted. This can cause problems when the file system is unmounted as a page can continue to be under going DMA or other remote access after unmount. This means if the file system is remounted any truncate or other operation which requires the underlying file system block to be freed will not wait for the remote access to complete. Therefore a busy block may be reallocated to a new file leading to corruption. Signed-off-by: Alistair Popple --- fs/dax.c | 25 +++++++++++++++++++++++++ fs/ext4/inode.c | 32 ++++++++++++++------------------ fs/xfs/xfs_inode.c | 9 +++++++++ fs/xfs/xfs_inode.h | 1 + fs/xfs/xfs_super.c | 18 ++++++++++++++++++ include/linux/dax.h | 2 ++ 6 files changed, 69 insertions(+), 18 deletions(-) diff --git a/fs/dax.c b/fs/dax.c index 78c7040..0267feb 100644 --- a/fs/dax.c +++ b/fs/dax.c @@ -882,6 +882,14 @@ static int wait_page_idle(struct page *page, TASK_INTERRUPTIBLE, 0, 0, cb(inode)); } +static void wait_page_idle_uninterruptible(struct page *page, + void (cb)(struct inode *), + struct inode *inode) +{ + ___wait_var_event(page, page_ref_count(page) == 1, + TASK_UNINTERRUPTIBLE, 0, 0, cb(inode)); +} + /* * Unmaps the inode and waits for any DMA to complete prior to deleting the * DAX mapping entries for the range. @@ -906,6 +914,23 @@ int dax_break_mapping(struct inode *inode, loff_t start, loff_t end, return error; } +void dax_break_mapping_uninterruptible(struct inode *inode, + void (cb)(struct inode *)) +{ + struct page *page; + + do { + page = dax_layout_busy_page_range(inode->i_mapping, 0, + LLONG_MAX); + if (!page) + break; + + wait_page_idle_uninterruptible(page, cb, inode); + } while (true); + + dax_delete_mapping_range(inode->i_mapping, 0, LLONG_MAX); +} + /* * Invalidate DAX entry if it is clean. */ diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c index d42c011..9c8bcd8 100644 --- a/fs/ext4/inode.c +++ b/fs/ext4/inode.c @@ -163,6 +163,18 @@ int ext4_inode_is_fast_symlink(struct inode *inode) (inode->i_size < EXT4_N_BLOCKS * 4); } +static void ext4_wait_dax_page(struct inode *inode) +{ + filemap_invalidate_unlock(inode->i_mapping); + schedule(); + filemap_invalidate_lock(inode->i_mapping); +} + +int ext4_break_layouts(struct inode *inode) +{ + return dax_break_mapping_inode(inode, ext4_wait_dax_page); +} + /* * Called at the last iput() if i_nlink is zero. */ @@ -181,6 +193,8 @@ void ext4_evict_inode(struct inode *inode) trace_ext4_evict_inode(inode); + dax_break_mapping_uninterruptible(inode, ext4_wait_dax_page); + if (EXT4_I(inode)->i_flags & EXT4_EA_INODE_FL) ext4_evict_ea_inode(inode); if (inode->i_nlink) { @@ -3870,24 +3884,6 @@ int ext4_update_disksize_before_punch(struct inode *inode, loff_t offset, return ret; } -static void ext4_wait_dax_page(struct inode *inode) -{ - filemap_invalidate_unlock(inode->i_mapping); - schedule(); - filemap_invalidate_lock(inode->i_mapping); -} - -int ext4_break_layouts(struct inode *inode) -{ - struct page *page; - int error; - - if (WARN_ON_ONCE(!rwsem_is_locked(&inode->i_mapping->invalidate_lock))) - return -EINVAL; - - return dax_break_mapping_inode(inode, ext4_wait_dax_page); -} - /* * ext4_punch_hole: punches a hole in a file by releasing the blocks * associated with the given offset and length diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c index 25f82ab..bdb335c 100644 --- a/fs/xfs/xfs_inode.c +++ b/fs/xfs/xfs_inode.c @@ -2986,6 +2986,15 @@ xfs_break_dax_layouts( return dax_break_mapping_inode(inode, xfs_wait_dax_page); } +void +xfs_break_dax_layouts_uninterruptible( + struct inode *inode) +{ + xfs_assert_ilocked(XFS_I(inode), XFS_MMAPLOCK_EXCL); + + dax_break_mapping_uninterruptible(inode, xfs_wait_dax_page); +} + int xfs_break_layouts( struct inode *inode, diff --git a/fs/xfs/xfs_inode.h b/fs/xfs/xfs_inode.h index 0db27ba..aea3d3a 100644 --- a/fs/xfs/xfs_inode.h +++ b/fs/xfs/xfs_inode.h @@ -565,6 +565,7 @@ xfs_itruncate_extents( } int xfs_break_dax_layouts(struct inode *inode); +void xfs_break_dax_layouts_uninterruptible(struct inode *inode); int xfs_break_layouts(struct inode *inode, uint *iolock, enum layout_break_reason reason); diff --git a/fs/xfs/xfs_super.c b/fs/xfs/xfs_super.c index fbb3a15..e25a880 100644 --- a/fs/xfs/xfs_super.c +++ b/fs/xfs/xfs_super.c @@ -747,6 +747,23 @@ xfs_fs_drop_inode( return generic_drop_inode(inode); } +STATIC void +xfs_fs_evict_inode( + struct inode *inode) +{ + struct xfs_inode *ip = XFS_I(inode); + uint iolock = XFS_IOLOCK_EXCL | XFS_MMAPLOCK_EXCL; + + if (IS_DAX(inode)) { + xfs_ilock(ip, iolock); + xfs_break_dax_layouts_uninterruptible(inode); + xfs_iunlock(ip, iolock); + } + + truncate_inode_pages_final(&inode->i_data); + clear_inode(inode); +} + static void xfs_mount_free( struct xfs_mount *mp) @@ -1184,6 +1201,7 @@ static const struct super_operations xfs_super_operations = { .destroy_inode = xfs_fs_destroy_inode, .dirty_inode = xfs_fs_dirty_inode, .drop_inode = xfs_fs_drop_inode, + .evict_inode = xfs_fs_evict_inode, .put_super = xfs_fs_put_super, .sync_fs = xfs_fs_sync_fs, .freeze_fs = xfs_fs_freeze, diff --git a/include/linux/dax.h b/include/linux/dax.h index e8d584c..53a7482 100644 --- a/include/linux/dax.h +++ b/include/linux/dax.h @@ -266,6 +266,8 @@ static inline int __must_check dax_break_mapping_inode(struct inode *inode, { return dax_break_mapping(inode, 0, LLONG_MAX, cb); } +void dax_break_mapping_uninterruptible(struct inode *inode, + void (cb)(struct inode *)); int dax_dedupe_file_range_compare(struct inode *src, loff_t srcoff, struct inode *dest, loff_t destoff, loff_t len, bool *is_same, From patchwork Fri Nov 22 01:40:29 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alistair Popple X-Patchwork-Id: 13882589 Received: from NAM11-DM6-obe.outbound.protection.outlook.com (mail-dm6nam11on2041.outbound.protection.outlook.com [40.107.223.41]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 97A1B17C992; Fri, 22 Nov 2024 01:41:49 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=fail smtp.client-ip=40.107.223.41 ARC-Seal: i=2; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1732239711; cv=fail; b=Qhv2Eku5erS7W41oPUs3MfRvVrfCIBDZfIQP3XI1ApoOzqkHD9ft2+bhtguirsuo8jsNRYAB6zbAY5ZRAbDvYXUJFcAJWQXSa7RRFngduUaFmCjQ7wI1VtLuRLCJO0VrInUOl6DGTQb4wIan5bf75qexNnQmN56X2PVZKKQBwjw= ARC-Message-Signature: i=2; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1732239711; c=relaxed/simple; bh=oLJn0ydv3Skk7dAjaq2CEFBCSOacGYu0euAiI/gK9kc=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: Content-Type:MIME-Version; b=upZenKTOD0A7gXQJkE1IdnGg0U8IoyKDy4Idjusae6l8KDD2KHP7pTXhQXgP5SIPJLHvrJ6pfM2sG1CNb3zcuc4w8Z37llVcoGcsvKzayq0xQx4hJUyPia97ScqijH3knIKoQxym1DHHVBT62Qe3ZTY2bCLV2iEAqjMAGdPLFG0= ARC-Authentication-Results: i=2; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=nvidia.com; spf=fail smtp.mailfrom=nvidia.com; dkim=pass (2048-bit key) header.d=Nvidia.com header.i=@Nvidia.com header.b=kyr1ePZh; arc=fail smtp.client-ip=40.107.223.41 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=nvidia.com Authentication-Results: smtp.subspace.kernel.org; spf=fail smtp.mailfrom=nvidia.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=Nvidia.com header.i=@Nvidia.com header.b="kyr1ePZh" ARC-Seal: i=1; a=rsa-sha256; s=arcselector10001; d=microsoft.com; cv=none; b=S2FMWEv3hox6U/JyUE+rlzz6iFINTjUGWYFcLUXTDfIgfBT3h9V3HzuLrnt9F+YL0eUSiZiVMOxMX7855cD4GBtjSJrxQpySlTM2o65fVWah4sLPRvlYmWCDteDfQovXnZHWwYTbPb5J6prZshAa9vk5DfYW1hpxQ2/InISNYT/7JcbYY167448E0c+L8ahI9zoLonrcZPnuXsZXYSj9jxLh/yufemhlopdbTywE2w9wp6NyePpk+OPzvADwKfAODjbWcntxLTizBDAxVCKobMAhMnVS9ds0n2q8saTnbhggyZGN7gMkGNHH67hCaLK4wST2V3Wgd6TNOxKwEra9rA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector10001; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=Nm3xJqkUNMV4Ed55ikpwm4J38qbiY5kdy1mnIz5BbWw=; b=iHxby60TV+4AvzVMAUO4VL5LGKOHtuJW/E2rud03DvlXa7ubDzxdbrPtyuxwi8aWPWrqHctw5nYIpwxdMQDzWVVLFUxIdhPBJdKxATGckxh3ljUKqcz+/RhRKUhLaJ5DFd2OaDwDwpC4aCSj6Nlnbgi6+63ARIQPPRG8mRNoLGMDKOQAJbdHx3IhlvRUoK0PpWIp/eBxanRKfYtX1W5102v4CV348iwDFokj0/u72XFViv+RHVTCdPcUaxkhNK4IEOytC6BSdo1m2Yfb+kqNgIw4FDsAJus19rFnJWUB4c3aD8GPzGeOcR+jTDirUHjBSU9s35Qe5Ry3m8yjF60ISA== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=nvidia.com; dmarc=pass action=none header.from=nvidia.com; dkim=pass header.d=nvidia.com; arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=Nvidia.com; s=selector2; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=Nm3xJqkUNMV4Ed55ikpwm4J38qbiY5kdy1mnIz5BbWw=; b=kyr1ePZh3/VATxqd/QmUpzIa+3W4WQriZgirLY6BgXlUbfudmR6Tgddi81Qw/0t3j9SRPvqW0qqDIy1VolC5l8P/V+Stv91P5g5j2BWf0jOeVGuEs40P+bpd3qfdkERc2VEUB7XtnQmg+rJ9vljsGbe+ckHeP2fZBOJhj5iyVWmt9omPaKgz9VxDu02+k5UXiWoB3mWslI7Javzhm0TIw/z6ym8RtLDuE36GSVmM/w7rlctZwTPh2IxwMsBZJAb+G0FifW1Ia8tD1KdDVsHS0OiZufwgV4cyN+FOveTOxWkoz5yWgDh4XaP2+U8vj7+dc5d7fDdxdjJhjNeqCdgRjw== Authentication-Results: dkim=none (message not signed) header.d=none;dmarc=none action=none header.from=nvidia.com; Received: from DS0PR12MB7726.namprd12.prod.outlook.com (2603:10b6:8:130::6) by IA1PR12MB6305.namprd12.prod.outlook.com (2603:10b6:208:3e7::16) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.8182.17; Fri, 22 Nov 2024 01:41:46 +0000 Received: from DS0PR12MB7726.namprd12.prod.outlook.com ([fe80::953f:2f80:90c5:67fe]) by DS0PR12MB7726.namprd12.prod.outlook.com ([fe80::953f:2f80:90c5:67fe%4]) with mapi id 15.20.8182.016; Fri, 22 Nov 2024 01:41:46 +0000 From: Alistair Popple To: dan.j.williams@intel.com, linux-mm@kvack.org Cc: Alistair Popple , lina@asahilina.net, zhang.lyra@gmail.com, gerald.schaefer@linux.ibm.com, vishal.l.verma@intel.com, dave.jiang@intel.com, logang@deltatee.com, bhelgaas@google.com, jack@suse.cz, jgg@ziepe.ca, catalin.marinas@arm.com, will@kernel.org, mpe@ellerman.id.au, npiggin@gmail.com, dave.hansen@linux.intel.com, ira.weiny@intel.com, willy@infradead.org, djwong@kernel.org, tytso@mit.edu, linmiaohe@huawei.com, david@redhat.com, peterx@redhat.com, linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org, linux-arm-kernel@lists.infradead.org, linuxppc-dev@lists.ozlabs.org, nvdimm@lists.linux.dev, linux-cxl@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-ext4@vger.kernel.org, linux-xfs@vger.kernel.org, jhubbard@nvidia.com, hch@lst.de, david@fromorbit.com Subject: [PATCH v3 08/25] fs/dax: Remove PAGE_MAPPING_DAX_SHARED mapping flag Date: Fri, 22 Nov 2024 12:40:29 +1100 Message-ID: <3e85c274cb0bb589e90167fd7c4fc0946b3ec94b.1732239628.git-series.apopple@nvidia.com> X-Mailer: git-send-email 2.45.2 In-Reply-To: References: X-ClientProxiedBy: SY6PR01CA0026.ausprd01.prod.outlook.com (2603:10c6:10:eb::13) To DS0PR12MB7726.namprd12.prod.outlook.com (2603:10b6:8:130::6) Precedence: bulk X-Mailing-List: linux-xfs@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-MS-PublicTrafficType: Email X-MS-TrafficTypeDiagnostic: DS0PR12MB7726:EE_|IA1PR12MB6305:EE_ X-MS-Office365-Filtering-Correlation-Id: 15ddf1f9-09b2-47de-c1d0-08dd0a96d340 X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam: BCL:0;ARA:13230040|1800799024|366016|7416014|376014; X-Microsoft-Antispam-Message-Info: XXB03acsgYf2X21ySJppxVpGgTw0jn7y7xMwAs2cd/SsRkEn72y6VdCkB7a9GWiWMch9wGGKWsEwN0TtrsU/3JhWdgbkMdHpVKbmBhq4hjfJdWWjgHnjMUx7lhLwnKJmiQ1UvIyC06t7rcxjOPByzOZNuVXEaP7g1tXCKvvY50t2cZq3U/oOi8Ni9fyt4Iv88bb6yxAXQx3BjLtSvtov+aTAwDxM8N1nAE8MnJO9F65I20LFdOUaX3pBJgxqGXtXH4LUYvXDd+7NbEplXCmp8HA1jB8++25wTi9vZIGZDk92tJAoPL/QmxfDgSykKtkQjhEoBJxIwI+UntgmCE6VM6muIqONPz71lmjXTa1+Dz8oPWCP586YweKT2VXKslK5GwYfkBAwWTVBdJj6lUcH6/uqrjF41jezbWUobIbFgG7QM/+E622pUOqEb/jv96ipLdYPgMiO5P8/VOFVTbgEeNmEVoPeuS7x5iBOpsIJYlsoMzXrp4QuXSNH46yC+MtRb4oCOGc1MZsdl9cuxhGCYKogzlBxFp7ixWlcxVfbOXwkZFf6keks6RzQ6CkaYdFWoUyBj6fbqrtC904Jejp2NDtSFaypafKzwyD3ienj59AxYTOS9Y2+81aO9tQ6lcgg0liEroU5CCtn/alk2b5WSiw9SfCYBfuvMMmRqu/Bgm/uxQ27auL2uL+P25F2XGV34OpY5WLCyq6dRSKPwhA5y1z5OoTFsaZ69W8kwqpEAc2AILmAjpu+ayzWZb36UOqgKES/38lSZZPuDsPJHMqIzRUXFbZcWSIotlbNYa5mE+yrE4aRKdygq4qQU3cbQp3gLyLvf9rOo2es/xkXcQBkHOykgv/H4AOnnoqTUJFcExrOuARc3bIYiFQjyoJ1wlH70G8jC6GJk+XIXD7cl/VpbwKmjCsaWmwglnvxmDp7xcbWOf+ewtDvreivX2EZ0Lb8K8HkyD8VfmShLvsh3cx0xyV4aDvKqfqeT8xJAs5mYWwWMRmUOd75rWpDfv9Dzqfpv8pOkWXEKAPfIgeeeTBMjbwp8+uArRGhvbLScV5eJHT79pFdRGSkuAQFZJzJe8it4o9HDhaDXnZmS+nudDQQ/IIVCj/ntUpaWRLnHujFlP0SpWOwVmiG2bGJKLRmJkI1OkxbJLjfTo/OMpRP7w3dryY/sCLoOtkm7RUXQZ0IaPx6uY3jrBKUHxYuObh052jgn8JsO4eNRU3YGENJDInnuRyqxHHaRyyPJbqCbNFi7IzfBR2vP8/ndd+FQCFtrfSGHUy5IuEnJwel8Pc7/hM1AQspz6QDQVyQ4EQE9xiQErEU1C8TZFdy8Cy79M3WbwQK X-Forefront-Antispam-Report: CIP:255.255.255.255;CTRY:;LANG:en;SCL:1;SRV:;IPV:NLI;SFV:NSPM;H:DS0PR12MB7726.namprd12.prod.outlook.com;PTR:;CAT:NONE;SFS:(13230040)(1800799024)(366016)(7416014)(376014);DIR:OUT;SFP:1101; X-MS-Exchange-AntiSpam-MessageData-ChunkCount: 1 X-MS-Exchange-AntiSpam-MessageData-0: mwhQ8DMcESTEN+zfBSdtF/EOYmb88y4Aq4ePI7FCujvppROU4AT+szwCMnv6XaJj0T1eHci9qoFmziq4xkWBv5F5MjHNMOvrudlFBUny3fFRkvazSLMu60Hz7vMMnT6nMXyUUQMyA/zfGc8ThCCriCjOTnL9ZOg/AZkprT6l5MNG0tvdST2T4k3P2BeiVo9Fj9WKSmm8/CDpatDWWz4huQkgAHNtuIcgxZ5Rjvh6m9E+hC1zcMS13hItDGmPDODwU+3HmmsVU2tYXOAbtJTHibbtG1esERh6zA62qgcEwjJHKuoGoMGvTcLfW3GgUDAFQQIUIXz0NiuCEYVK5DRfIslcMKaVgAVBsO/Px/4IGFGQqf4uj5rIPKVk5dhq/fGl1JTBbEl2fSFXlcrfPm7fq14Sx3pPw/CLwJeqjCZ2oXhet6H+jKZlzw9MTc5BJxR0CzzviF8vMNrf5SpvE3mEbwxVfXKi4SkvGMdYtmaJBHidjKLL5Hh68J0UUHlIfqP2T1ATtqLyMneHcE8oZGCtDhrRcGEXLMqxLDjwhl1Jq00uYGw2u+OrY/xa3cgU0COdPmLyvcljI9km1Wnco2FuL2iSU4dwZMIlFK2E2k91SgvZo8ItGPp2Y+TMBWxhApirU5r9ZcRoHpXUgP3EkTMZ3l3RpyaFrlUrGAC3iFUrAlqyVbh/UW1QGEtWkVdsVSrlJOpKy93v6TtfULiAxunkaFVbtEh/+3m5VETq6lg0kFeHXHiUI+0CdKk0692aoFvRrgYt9lCHBgi+6mJM0Dk+TZrShNLekZNhD0LRelvamqVqUFndJVZ5pNjqyYxjAzFIcA4Qs6Gj03+jxiAbfxfc875Lf73ZtG+9yaCctxQHqYJA7RDs7FhamTXHfYuJvm1f3oGkmsCCXf6rpGlqwD6AcmvicokSM6ZuRI2EPd6oIME9qkyCbttRCDowdKleb65He7FIzW3qIUY2noeGNM5vVSKnSYhacpoubkRQeJhrQcGU2xMUlb/42PkHKDajOlziXaJ45hr6nYhCIhx54HBiqFLp1bpWeGZunV+raU5beySgwOcbvdjIaFht2DX/Vx09JeR0y99lx9dYHNL0UkBILTiGZP5u7uwvw2CtoUvHS1TuRzzdd/8QRsPu/fK0q2yoz40imDDEZAw0JqdCI0Z89b9Dis0+Km9yR+XGS8gwBMdDf1atFADF/vysTbJ/CmUrsiUObKxVdlXeHt/kairi7kroBR4L0m/m9+6Qsrq7RKGY+zTHHIBJJBh2vHoObr6M1fDCFt19AnAOB3tIb3dqIeGwumSZMynKQE7GED35v9DnLgH3OGAtxeBBaxYlS3wnD5qrZtnfbCbutxAA+m7VkCPfvwoB7nFK/3ozocYjbETS4Ge1OCoGc2MTFiqoEQVwz0vmLhRCqFyuG5OQ7a394u3PKVuldi4jBL+sbFxUcBbjnlrmKz0oYemaSQysba+mcltpum3+Uoxg1iHTnPS/szkzPJIsJspYCgnMdhjUAgWJ92N39X2RTsBwCGRn8sSLBAI2Fm3APUI6AW5C0IW4VW8hwQ0rc0HFVDXbUy9B6+8IlxYBFlpsz3Ef5QAqd8PG X-OriginatorOrg: Nvidia.com X-MS-Exchange-CrossTenant-Network-Message-Id: 15ddf1f9-09b2-47de-c1d0-08dd0a96d340 X-MS-Exchange-CrossTenant-AuthSource: DS0PR12MB7726.namprd12.prod.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Internal X-MS-Exchange-CrossTenant-OriginalArrivalTime: 22 Nov 2024 01:41:46.5115 (UTC) X-MS-Exchange-CrossTenant-FromEntityHeader: Hosted X-MS-Exchange-CrossTenant-Id: 43083d15-7273-40c1-b7db-39efd9ccc17a X-MS-Exchange-CrossTenant-MailboxType: HOSTED X-MS-Exchange-CrossTenant-UserPrincipalName: j9ZBgneg2q3qlygm7JYg48uz5kjZBtI+gXFUX4GXEDDecU1ipZudlGcCZGxiFE168rImu3Zl9jegmxI9gXuyjg== X-MS-Exchange-Transport-CrossTenantHeadersStamped: IA1PR12MB6305 PAGE_MAPPING_DAX_SHARED is the same as PAGE_MAPPING_ANON. This isn't currently a problem because FS DAX pages are treated specially. However a future change will make FS DAX pages more like normal pages, so folio_test_anon() must not return true for a FS DAX page. We could explicitly test for a FS DAX page in folio_test_anon(), etc. however the PAGE_MAPPING_DAX_SHARED flag isn't actually needed. Instead we can use the page->mapping field to implicitly track the first mapping of a page. If page->mapping is non-NULL it implies the page is associated with a single mapping at page->index. If the page is associated with a second mapping clear page->mapping and set page->share to 1. This is possible because a shared mapping implies the file-system implements dax_holder_operations which makes the ->mapping and ->index, which is a union with ->share, unused. The page is considered shared when page->mapping == NULL and page->share > 0 or page->mapping != NULL, implying it is present in at least one address space. This also makes it easier for a future change to detect when a page is first mapped into an address space which requires special handling. Signed-off-by: Alistair Popple --- fs/dax.c | 45 +++++++++++++++++++++++++-------------- include/linux/page-flags.h | 6 +----- 2 files changed, 29 insertions(+), 22 deletions(-) diff --git a/fs/dax.c b/fs/dax.c index 0267feb..d193846 100644 --- a/fs/dax.c +++ b/fs/dax.c @@ -351,38 +351,41 @@ static unsigned long dax_end_pfn(void *entry) for (pfn = dax_to_pfn(entry); \ pfn < dax_end_pfn(entry); pfn++) +/* + * A DAX page is considered shared if it has no mapping set and ->share (which + * shares the ->index field) is non-zero. Note this may return false even if the + * page if shared between multiple files but has not yet actually been mapped + * into multiple address spaces. + */ static inline bool dax_page_is_shared(struct page *page) { - return page->mapping == PAGE_MAPPING_DAX_SHARED; + return !page->mapping && page->share; } /* - * Set the page->mapping with PAGE_MAPPING_DAX_SHARED flag, increase the - * refcount. + * Increase the page share refcount, warning if the page is not marked as shared. */ static inline void dax_page_share_get(struct page *page) { - if (page->mapping != PAGE_MAPPING_DAX_SHARED) { - /* - * Reset the index if the page was already mapped - * regularly before. - */ - if (page->mapping) - page->share = 1; - page->mapping = PAGE_MAPPING_DAX_SHARED; - } + WARN_ON_ONCE(!page->share); + WARN_ON_ONCE(page->mapping); page->share++; } static inline unsigned long dax_page_share_put(struct page *page) { + WARN_ON_ONCE(!page->share); return --page->share; } /* - * When it is called in dax_insert_entry(), the shared flag will indicate that - * whether this entry is shared by multiple files. If so, set the page->mapping - * PAGE_MAPPING_DAX_SHARED, and use page->share as refcount. + * When it is called in dax_insert_entry(), the shared flag will indicate + * whether this entry is shared by multiple files. If the page has not + * previously been associated with any mappings the ->mapping and ->index + * fields will be set. If it has already been associated with a mapping + * the mapping will be cleared and the share count set. It's then up the + * file-system to track which mappings contain which pages, ie. by implementing + * dax_holder_operations. */ static void dax_associate_entry(void *entry, struct address_space *mapping, struct vm_area_struct *vma, unsigned long address, bool shared) @@ -397,7 +400,17 @@ static void dax_associate_entry(void *entry, struct address_space *mapping, for_each_mapped_pfn(entry, pfn) { struct page *page = pfn_to_page(pfn); - if (shared) { + if (shared && page->mapping && page->share) { + if (page->mapping) { + page->mapping = NULL; + + /* + * Page has already been mapped into one address + * space so set the share count. + */ + page->share = 1; + } + dax_page_share_get(page); } else { WARN_ON_ONCE(page->mapping); diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h index 1b3a767..b905018 100644 --- a/include/linux/page-flags.h +++ b/include/linux/page-flags.h @@ -668,12 +668,6 @@ PAGEFLAG_FALSE(VmemmapSelfHosted, vmemmap_self_hosted) #define PAGE_MAPPING_KSM (PAGE_MAPPING_ANON | PAGE_MAPPING_MOVABLE) #define PAGE_MAPPING_FLAGS (PAGE_MAPPING_ANON | PAGE_MAPPING_MOVABLE) -/* - * Different with flags above, this flag is used only for fsdax mode. It - * indicates that this page->mapping is now under reflink case. - */ -#define PAGE_MAPPING_DAX_SHARED ((void *)0x1) - static __always_inline bool folio_mapping_flags(const struct folio *folio) { return ((unsigned long)folio->mapping & PAGE_MAPPING_FLAGS) != 0; From patchwork Fri Nov 22 01:40:30 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alistair Popple X-Patchwork-Id: 13882590 Received: from outbound.mail.protection.outlook.com (mail-dm6nam12on2060.outbound.protection.outlook.com [40.107.243.60]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id D12B5186E34; Fri, 22 Nov 2024 01:41:53 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=fail smtp.client-ip=40.107.243.60 ARC-Seal: i=2; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1732239715; cv=fail; b=CIzMA7RsMR1skJQMI5SLMWKUXw7U5JDW+6DHEGuqj+gSfwWJhmA+hssv9odq3gs/t0Dnm4kdgzPazRNs5p30GAf+K+tu/IcpFul/mS6k9N0Ryt07atwcZzaseheZZc8/jfm7WOSoWelAOhQmzVoj2IOCargfO1DDM1KMme9KUWE= ARC-Message-Signature: i=2; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1732239715; c=relaxed/simple; bh=UxMjaFoX1XecxjsEqzxRhn/ytYG90McKoDDNSu+KFPE=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: Content-Type:MIME-Version; b=B28M9bSLdkuNKbXHIXCJ0XfTp+/E8VSfxinLxPh8GjYhtRI6EivxBgluRqDWAcC7DsFqCMAZeamyeYa+wPUpkmPgXV01EMM4BVOf1zH2UHk3H/pPQqBugw0RgM9oAqsqXvRDkMXs96VBYABFbPLaHPE3SKT7OsmxCMb/j0WymhU= ARC-Authentication-Results: i=2; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=nvidia.com; spf=fail smtp.mailfrom=nvidia.com; dkim=pass (2048-bit key) header.d=Nvidia.com header.i=@Nvidia.com header.b=SaC8VsqI; arc=fail smtp.client-ip=40.107.243.60 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=nvidia.com Authentication-Results: smtp.subspace.kernel.org; spf=fail smtp.mailfrom=nvidia.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=Nvidia.com header.i=@Nvidia.com header.b="SaC8VsqI" ARC-Seal: i=1; a=rsa-sha256; s=arcselector10001; d=microsoft.com; cv=none; b=q74vPdUt07D1cdNw7Ya/LRPqiC8rZXjBnxu5S7qU9y7UPB8n975ZJRCIr6NfHHJQvn/crImYfSTzYEWlOgp/1u2KxGGOkrjk4PCic5MNmFKuuABhCGVRC7UMrZuBbRxI+XWZ7KYyJYnXIHTcgHuIyGvgoPFdYhDUzynqIpeejyXVJM4hHFR6LllkZWty/rkF/nkh1/kOGKANFQnx5bRICUf4Ythf10q9gUWa1NHCFuALASMZqgAEAR6mCQsR+o7cBcf3P60Rz2TpIJlJkgqimm67qvEQtbMfrofqgg5XfEDOZK0NUFwJFmrcVnDAil8F4NAnhWyeHu+Q4sIE8TRKCQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector10001; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=5+Odtnntc8K4ZZvVtUiGn9anzLC7nGpzGIZ4FsGVPrs=; b=vd1xH+pmY8U/EYlfw+CF8oE90a8vo+Mswi9Uk7KIgSngStnnaN3qrlgzbhE9yEgz2EoPkr7q4j/IJCTp2Lfb1jCKBfeOMByligGIk1ohXAJySSuLbywAjcgCSvu8ty7jJJssxv+v+8JtRjwh6G+pO32OQ4IVu5kr8OnDcXvw8X92NN4Vfd3WpOTRO/ajfcUZJ4twraML9Gbjge3+FV6P7BMiVxbcfuXl6Y7DpbnowPGXqRhC4BTViznnQ1NI5mj0U/GK8CI1/mD6jv66GqxkH20MKS6fspNQ/SYcW2ZuH0a5N3BbXQCMHTW7hqTAU21Sn1Y/szwTQbtq17Tg1VAlbA== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=nvidia.com; dmarc=pass action=none header.from=nvidia.com; dkim=pass header.d=nvidia.com; arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=Nvidia.com; s=selector2; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=5+Odtnntc8K4ZZvVtUiGn9anzLC7nGpzGIZ4FsGVPrs=; b=SaC8VsqI4GK9Rz37vqiEMEqVpT5mWsM2CHMeaiGveJo4ciKA1pT79ccCrKkhYR4O6UKnYOuBNZN0dcDMVIjGGhRv7EdFjX7RctUlH2IURjGwMrMuSTBtgEU19Nnbs3LwuHOvz6LAdgywGywJdp+MLoly200I72oyxSQX9uvzE9f/43+7i5p0lr3//eZNxWrkIgYQebIWZDuqnJewFouqTT8B4xhaRlct86+d4MTKZievfeMEe+26vcUoioGou+e+d400/ItAstQ6ylz62K5RRTuFTi5m2SD07IKvIU7nl/svWaTfzJjkkegix1vm5a0DJhu7IBQB0vBoIeLPl7pk8A== Authentication-Results: dkim=none (message not signed) header.d=none;dmarc=none action=none header.from=nvidia.com; Received: from DS0PR12MB7726.namprd12.prod.outlook.com (2603:10b6:8:130::6) by IA1PR12MB6305.namprd12.prod.outlook.com (2603:10b6:208:3e7::16) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.8182.17; Fri, 22 Nov 2024 01:41:50 +0000 Received: from DS0PR12MB7726.namprd12.prod.outlook.com ([fe80::953f:2f80:90c5:67fe]) by DS0PR12MB7726.namprd12.prod.outlook.com ([fe80::953f:2f80:90c5:67fe%4]) with mapi id 15.20.8182.016; Fri, 22 Nov 2024 01:41:50 +0000 From: Alistair Popple To: dan.j.williams@intel.com, linux-mm@kvack.org Cc: Alistair Popple , lina@asahilina.net, zhang.lyra@gmail.com, gerald.schaefer@linux.ibm.com, vishal.l.verma@intel.com, dave.jiang@intel.com, logang@deltatee.com, bhelgaas@google.com, jack@suse.cz, jgg@ziepe.ca, catalin.marinas@arm.com, will@kernel.org, mpe@ellerman.id.au, npiggin@gmail.com, dave.hansen@linux.intel.com, ira.weiny@intel.com, willy@infradead.org, djwong@kernel.org, tytso@mit.edu, linmiaohe@huawei.com, david@redhat.com, peterx@redhat.com, linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org, linux-arm-kernel@lists.infradead.org, linuxppc-dev@lists.ozlabs.org, nvdimm@lists.linux.dev, linux-cxl@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-ext4@vger.kernel.org, linux-xfs@vger.kernel.org, jhubbard@nvidia.com, hch@lst.de, david@fromorbit.com, Jason Gunthorpe Subject: [PATCH v3 09/25] mm/gup.c: Remove redundant check for PCI P2PDMA page Date: Fri, 22 Nov 2024 12:40:30 +1100 Message-ID: X-Mailer: git-send-email 2.45.2 In-Reply-To: References: X-ClientProxiedBy: SY6PR01CA0018.ausprd01.prod.outlook.com (2603:10c6:10:e8::23) To DS0PR12MB7726.namprd12.prod.outlook.com (2603:10b6:8:130::6) Precedence: bulk X-Mailing-List: linux-xfs@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-MS-PublicTrafficType: Email X-MS-TrafficTypeDiagnostic: DS0PR12MB7726:EE_|IA1PR12MB6305:EE_ X-MS-Office365-Filtering-Correlation-Id: 2b094f9d-e52a-475f-479b-08dd0a96d5eb X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam: BCL:0;ARA:13230040|1800799024|366016|7416014|376014|7053199007; X-Microsoft-Antispam-Message-Info: NvA6NCRRYiVzdu6A3dLyR9pI/Qm0Gcp6VuJKvryvuqy0TN0+lAZZqs1BRE5Rc4y+sHxiB9jGA0lLWBueA/4hJXX9ohcy70Qe5MVK5eUDaTJupumNpL4pyPCE0lLU+s9VDOqs/jlo9GH3lrQXKBGsGQmDZm0KV24Sx6jbaAHdX0mVgimsPpO5esBYGbJSdsOus9QjSGUPPPJHnZpMejrgJXOzJmqasqqWSLfANhdVPwURPZpEqmaJE5zljaAPuG/6POOSMz1YgfDghVyhuhmsinLPV8XiiJcDpFzzqVLp8oYP3wNxnxut03RCL04d/zb2uMpjf8o+P+lRg0knrs+ifW/PuiuA39Qz5wN9yrjqCC1PTjY8B8i0npNAUS5wCTC4AAX7hO1SlDZYPQtt3+G1O/4JoYeRSR4vaQFegCnhHVacbAfbsymQxOQvKU3dZ9V0P6lsCHlQdoym3HiZMHt7FaiedrUIb70xwWz7wWw/zdyzSmC60xsa6tL1DnJ6dalob2fG+pSuZBvrArrXvBsmmc3UPFU7QnQYyxS8J+znhbrLwOgT4qSUTbuE639MMplMjGABU8jkmXNfuJQRzfXLjKlbvYHrkd19eCQ/ifUPFNwMqJUM1Ez6wJEuieJBaJi389/5GsS8aE5+d6T7HdYK5STWPgj9mewTg9W0suDMIHvd+Ug2HvddvTRDjuksiOt+u2nyhWLvzAeaFoy5H0XlB/Ta53Tt6+kjv1p1wT1yD9j6S72r6HmTmEpq03m42xYpSo0t6BXwxSPp6QAS1Ri1oASyr7jpCeVNY+QJzZvqVrzSr72Y4VJNu5jhtjMfB37xkW44o6rBObmVDnijfx7J+NBm4ddlU9hLyEl2JO74VQNG2wUS0OhpcjS6/4QvWSxCSRmnq2Tjt0ipZLwAM8JOEbK7to5RfMKA85G5pANg/eqTr51K47YxSr/cEpXwBNFj8HxfkHgL/quESgSliK2502lBNu12FEMpD5GzvyKVufBnNC82n3XyAMsW7eYUIsjNerWX4w7ZYZ6xMNvltvlSF5KaTYv1ux6B/Sr/upn6w3opRNDQYajVMiP7sAbnO50sgyWh1vCsemym2gZcZ2OpXOp4zaR/BUYg7dG6PzyXsyhLPWBe/B4cYRZJfoKxJUK523sDNETpZWG8lBycui7zeB9WDbsT/N1xrFndL7jVl9+lFAwUydxgPuqR5Yp4OVIL9zH8APsA06mgSPNWiawmRuZiGZ4qkr2b5lrsppNfocApcn1jWNpH6/wZsHWdYnkx6agVJnKSQglw2miNrqi65dQPTMbZ4k9jAN8ngPbPZ3PZgyJSwX7uEEi/FK1mosX+ X-Forefront-Antispam-Report: CIP:255.255.255.255;CTRY:;LANG:en;SCL:1;SRV:;IPV:NLI;SFV:NSPM;H:DS0PR12MB7726.namprd12.prod.outlook.com;PTR:;CAT:NONE;SFS:(13230040)(1800799024)(366016)(7416014)(376014)(7053199007);DIR:OUT;SFP:1101; X-MS-Exchange-AntiSpam-MessageData-ChunkCount: 1 X-MS-Exchange-AntiSpam-MessageData-0: UE6LSHOB7M+4wt9f9x2EUfiUNnfbOXCrmm+PRQ8BYLS1TRHeD28KoER4MhpvxZ0bobXSiUJHQdeldOADQhyUuTb4+FHzcmGYbjSxj7HaKAit+6O3VgUDnl7HrFfSx+F/Dh8fYd2mlZ5EQWYxSdCAV093aCbZUfbtSFqwKmlL+msWVT3aP9ioeYqXglCkujWC2bKZHCadyIwUULXQI8WHW/RKY8BwTUSafpA4JGxvUR4qJ7d3hu07Wt0KI7EeByUNeAwYIqYQC2Fh7IXzg1CShrGheyniLZ8CWUCJOEUaSTa8llFTCh7nY4SCdx9AaiFW0njjcdqbjsN4Y4eWKaGi88Gx2WvGnjzRQyQstJ3ZPts94K7m7KL1CGJFequQqd45k8edwOXKwiS00wHVhilwVoSBk74/DDBQtovCOhdHNQfJ7rbyKByTliK2OCXSsfK+P5ItXIJBLUq3fyqp8j1UHThta6JkDu9qL5CaSU3ZpNEUZvHVDaH/zD2esgybFjfbxAJTLUHLaVVqrBFl0nSg6848OBDLwoUyZ7NWoKcK/ELOF1RX06RVT3f7f8ecI3n/IOdjDUXAlZrye18Z5m7nMErXB/HkulrNzKIuM5PNB3Qfldidynv1O/OEp+goC+NMT8anMho5JfNXlokvid9Uh6c9XszWEdY2aVgcKipdherISVZzC+WPFvj6pzdynkRW1npcfJq6MTrk2e20QJo7UrIB6plE27R0Z1MgcBaTlCbjELh8r/nV8ldvfzdWupVeTVvU9kzITAxCkcruAIelkDNvBtKCb9CguiW96fQ+9BqEJE8SLrqjWNfkZJUKkpeE0q1xfZqvKK7qM4VxK+pqYoRNqyWbUnCJtt1vpimYuKuVGWIVZYQAfoQxuIZGzMONsXVO1DKKr73KnTQRabDiVnzNC4yjHlEP9OZ+ETSw3/zGeUGM2ZmJSwBoN/pJvw/YXUEEyjRhuTZNzRpFx0t1IGcXFafokE5AFWTWpqc22ws3MBeqNN/SQOs1d8iaPiCgtGwAjou2xHgCKJzERbfOEL9n2AToH3QScu/FnK0DNJSHR8y6AUu80h5J/nZ9BwQ3qcXPDgGIIOau2AXEmZcz3HTcLbfOXg07UGDKEENIsHoj4FRYg8OvlZVWBgFoYRCI9gRdQLFhkWRetC13vumQ8BRaPMDtEHMXZZtKvgeAXYQtM3HnKaE7ZPR9XjZwlRk33TRMWF/RCPQbOJGmFqL5zNVF/W00taO0GBLJHr7BkI224BTW/HuLCMTZwfgRMjlSQQbcmFlsIxirEc3z6k9C1pBr4XSDH+7HqNFyuLTRwBIv+c7pQb6ttVqSQNyoQZJq+z+omc551HX/901CjjdsqRql99m9tD+G1G+cSybElC2Fyp/DmFq49YwjQffCMzVhSUJ2aubo6oyLZi6d01l4ZYF4QBEruwisVMq2Mxs0IldIzPzFzkRxGx3QnljuChdKKzhrkgDsKtARUF9EMCZKzLY++Jdc4pWzWgKtwiEMM7elUqFXMpjc3Qah4iHC8mukV7XBovoncXTqG+d4S7C7E+na7OD4c5X73T8P4XPJGMv7gHUtRDxXQ9T0nfTsFsV+ X-OriginatorOrg: Nvidia.com X-MS-Exchange-CrossTenant-Network-Message-Id: 2b094f9d-e52a-475f-479b-08dd0a96d5eb X-MS-Exchange-CrossTenant-AuthSource: DS0PR12MB7726.namprd12.prod.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Internal X-MS-Exchange-CrossTenant-OriginalArrivalTime: 22 Nov 2024 01:41:50.8025 (UTC) X-MS-Exchange-CrossTenant-FromEntityHeader: Hosted X-MS-Exchange-CrossTenant-Id: 43083d15-7273-40c1-b7db-39efd9ccc17a X-MS-Exchange-CrossTenant-MailboxType: HOSTED X-MS-Exchange-CrossTenant-UserPrincipalName: Aq+1tlNeYOAQTduTJa+ioKMrgVJXYTZ/vzCal096EEGIVW8Kajmsx0GULOu9SYw2UxBRDvtF2Tu/YmffcqQANw== X-MS-Exchange-Transport-CrossTenantHeadersStamped: IA1PR12MB6305 PCI P2PDMA pages are not mapped with pXX_devmap PTEs therefore the check in __gup_device_huge() is redundant. Remove it Signed-off-by: Alistair Popple Reviewed-by: Jason Gunthorpe Reviewed-by: Dan Wiliams Acked-by: David Hildenbrand --- mm/gup.c | 5 ----- 1 file changed, 5 deletions(-) diff --git a/mm/gup.c b/mm/gup.c index a82890b..cef8bff 100644 --- a/mm/gup.c +++ b/mm/gup.c @@ -2967,11 +2967,6 @@ static int gup_fast_devmap_leaf(unsigned long pfn, unsigned long addr, break; } - if (!(flags & FOLL_PCI_P2PDMA) && is_pci_p2pdma_page(page)) { - gup_fast_undo_dev_pagemap(nr, nr_start, flags, pages); - break; - } - folio = try_grab_folio_fast(page, 1, flags); if (!folio) { gup_fast_undo_dev_pagemap(nr, nr_start, flags, pages); From patchwork Fri Nov 22 01:40:31 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alistair Popple X-Patchwork-Id: 13882591 Received: from NAM10-MW2-obe.outbound.protection.outlook.com (mail-mw2nam10on2055.outbound.protection.outlook.com [40.107.94.55]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id B49FE18990D; Fri, 22 Nov 2024 01:41:58 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=fail smtp.client-ip=40.107.94.55 ARC-Seal: i=2; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1732239720; cv=fail; b=j7ZLiH87t4zDbqbzxr1QVvTvCGfYx1qw4mJbEqtexxwSkAsSTN3Tl95OzF1D4adPCMkuXci1EeuDdIbtZUCFiiwM8EOPNGoW9X+FHcwTmjAZltuz1cU2qkuwhOjOLD2TyitlY9OPsdEoCRSi+KHok4wAK5jzcblHcTB7T1dPv9g= ARC-Message-Signature: i=2; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1732239720; c=relaxed/simple; bh=3BM3Zryn5BD2q0zRtbNas0SwqZfEJ5/aGK6S0dUdGnY=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: Content-Type:MIME-Version; b=BlTBgiz81g/yB2Kh2oXr4UyH/u41Lf7FsnG8GXlF5AabCo4hNZZPGGPjzNl3DMT7y8+ZdjojS4UL3UjwaBFvmNL4gc9WoiznpjQD0EqzVCcr2zA9XPyWq2rJBRoMRWLD7IeBXvoUKCohQLYs5I1eqBMV9t/YdJ9jhRWvVuWbVvs= ARC-Authentication-Results: i=2; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=nvidia.com; spf=fail smtp.mailfrom=nvidia.com; dkim=pass (2048-bit key) header.d=Nvidia.com header.i=@Nvidia.com header.b=Z0DXHulI; arc=fail smtp.client-ip=40.107.94.55 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=nvidia.com Authentication-Results: smtp.subspace.kernel.org; spf=fail smtp.mailfrom=nvidia.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=Nvidia.com header.i=@Nvidia.com header.b="Z0DXHulI" ARC-Seal: i=1; a=rsa-sha256; s=arcselector10001; d=microsoft.com; cv=none; b=IvSsFWVpCDcn3DXD8tYh4PKczkIgdcMaO7OD5+SxO7bBJImgK5If7t5ZXd/M+hXSrDBuoMfSve+Dh97QyYGHSXgrNDqwOD/ElmqIVE7rWwNscotP0CfLLHdQa6goOX820mfTRhCZuC7+gxezsU9LXH2BxrlWxrSUU8v4jGGfq/F3eI3spSnSNtaOaTOfuheKlT/F2P90ECI6izLGQdBU11b1NK4/Am7PA+thltE5d+vYQi2q5SVOAwTLl18k7bvq8l/mEmgm03gh8yEpzXRROiAze3V3zPp8G1uXwoImH+LLLqwlFDfOc4hJnMRjYBYKAXna7e6z6G27obebqkzFzA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector10001; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=ESy7x46loh4IEoR5LJkN4qdt6dpTfnJr5eSjyPduQ/Q=; b=kdTZN3CvNoxTygZDF+0Z3BhZEZq49KDu+TkrfN5voeHRP8BXzcEL6qTfXiIcyDiaSx4k/wR5zPlJII0W0KFoKlcjvTbPGH78+/SH5z8dT44sYmjA7Aqj0LELQSI6qnq86uWLcNb+BtHb8ost7BM2HMPprA1lqDrylJTQJjq1my4YDLaMpJ+J7BUr6Yy8KRJ4x2YVo4dchTludjvcP54Ly/nzuEOIbEMiREA7/aAOdau8Gj5yJSj7H3/g/I1182uTc3QMhrLVtbrnshDLa+sLvyskLaxkzMAAPMDF1EYoX6T/DUzdpeTRUlQY5Hh5XITWHtvGhpf3SxEv1d+NV7AvXw== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=nvidia.com; dmarc=pass action=none header.from=nvidia.com; dkim=pass header.d=nvidia.com; arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=Nvidia.com; s=selector2; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=ESy7x46loh4IEoR5LJkN4qdt6dpTfnJr5eSjyPduQ/Q=; b=Z0DXHulIaXHnocHapL/4RkNKv0vd841JpCLXuYJTP5JiVnH4jU064lFe9X1DcYZKQ0ThJdGLFcwHRXqZJuSzd4eo/+nRz1Zz2c67475OaCbykQGpxcfwQNTz34H1nt7N8yhUbfSAbM7TAqRgZ3WOAGyrhU2nUs7gYa8VH6qsP9lKYA3WsU2b1trD38DZFwUTqC/zAHcM9evlkWmpyeci0urR9iS3dCcYholdtPp6JSLzMnNPYxa7B+Vt1mTBv3gYbWoh7a2VTFUBN5LrS4NCrL93ULyFWP6Rxe0sbND1wHPKLvLE+My18iP3PS32vz+01gGlYCcVdNgljMdQn6xLvA== Authentication-Results: dkim=none (message not signed) header.d=none;dmarc=none action=none header.from=nvidia.com; Received: from DS0PR12MB7726.namprd12.prod.outlook.com (2603:10b6:8:130::6) by IA1PR12MB6305.namprd12.prod.outlook.com (2603:10b6:208:3e7::16) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.8182.17; Fri, 22 Nov 2024 01:41:55 +0000 Received: from DS0PR12MB7726.namprd12.prod.outlook.com ([fe80::953f:2f80:90c5:67fe]) by DS0PR12MB7726.namprd12.prod.outlook.com ([fe80::953f:2f80:90c5:67fe%4]) with mapi id 15.20.8182.016; Fri, 22 Nov 2024 01:41:55 +0000 From: Alistair Popple To: dan.j.williams@intel.com, linux-mm@kvack.org Cc: Alistair Popple , lina@asahilina.net, zhang.lyra@gmail.com, gerald.schaefer@linux.ibm.com, vishal.l.verma@intel.com, dave.jiang@intel.com, logang@deltatee.com, bhelgaas@google.com, jack@suse.cz, jgg@ziepe.ca, catalin.marinas@arm.com, will@kernel.org, mpe@ellerman.id.au, npiggin@gmail.com, dave.hansen@linux.intel.com, ira.weiny@intel.com, willy@infradead.org, djwong@kernel.org, tytso@mit.edu, linmiaohe@huawei.com, david@redhat.com, peterx@redhat.com, linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org, linux-arm-kernel@lists.infradead.org, linuxppc-dev@lists.ozlabs.org, nvdimm@lists.linux.dev, linux-cxl@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-ext4@vger.kernel.org, linux-xfs@vger.kernel.org, jhubbard@nvidia.com, hch@lst.de, david@fromorbit.com Subject: [PATCH v3 10/25] pci/p2pdma: Don't initialise page refcount to one Date: Fri, 22 Nov 2024 12:40:31 +1100 Message-ID: <27381b50b65a218da99a2448023b774dd75540df.1732239628.git-series.apopple@nvidia.com> X-Mailer: git-send-email 2.45.2 In-Reply-To: References: X-ClientProxiedBy: SY5PR01CA0107.ausprd01.prod.outlook.com (2603:10c6:10:246::23) To CY8PR12MB7705.namprd12.prod.outlook.com (2603:10b6:930:84::9) Precedence: bulk X-Mailing-List: linux-xfs@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-MS-PublicTrafficType: Email X-MS-TrafficTypeDiagnostic: DS0PR12MB7726:EE_|IA1PR12MB6305:EE_ X-MS-Office365-Filtering-Correlation-Id: ad6702c7-2253-454f-780e-08dd0a96d8a1 X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam: BCL:0;ARA:13230040|1800799024|366016|7416014|376014|7053199007; X-Microsoft-Antispam-Message-Info: AAzer2QAd2rCsPipO8VaHOtjoBkhfWVCEAJeqKdWioOKNMg9GfbgNwvi+Y2lKGX5bfcN381QeVg6wRE7ULK/ArbwzEi+e/WZ7OxtjTkr9qxovFvmZWbtxv8pNyt0rqtfz2VmNxey67D5/C0M8N5tXYfwtAzTmqQEYzs4A1oIczpusZ9qwV0kZsuQlwPCmogUrrUEFudCimM7ZTb8gXw7Rulsd7IG8I8wD4mnrbyIw44yPy5vHH/oWBbOnvmMjpDpRRYxBNNb3xjjmiu8RjDk0+h3ANRjoUFzfv8Cs7XpyvC0aIBEn1etYz3qjGT20YQ8r4tyf7KBtYDIiYzE4/YvApUHG/WQDOVqbxSL/xEWxWXafrNBrBNolQkX3G9VCdxtzL4LsoYO02RLGuOngL73JYjVe2lyFogkM8XOvrT0DVM6sZFVsW1Mm4+G3kq2Lzp3Af0iK59Ptoo7NRZponcv46lm1v+KP7pFpSyMbc6WYJ/m+5dmLfV/Wr3ruzAcRfDWgLXEN9PgYUsJiT/tdFzWspicfXpsj1fpp88sG1ed0oloVo3C1F8dZL6RhuUXCqq0Yw+ImKS8bi+f/b3d/x3SkT2d8c/+CJSuSB1KB6SPoDwIhMj/06iZjuK/T/FveUXTYz/1vDosNmQbMwXZzqCh4wdPSJqevthT88s6y2/GpIEZSibDATe0eIi5S3P3WoV6ocKxzjy5ExMxatCYuLcfX5SLbUvrYrNK9dOJgx+C+3ZU/UGpF6Psld4asEU3zcKFBuBRBrqtdpSQgdXhI1TY8hQ2UPaUG2Lv3SHVvRs7lD0sb8Ae81mQS9E4Mct+LprvswsM1jF5M9dFZwIYzrrgSWLbXrU/8BHUzV3ZSGcZMy/SerUsPa8P9vQZE5ZZuePMeSDh5kH+2ycZ+RjoKqtH3qQb90s0Z63TY+uRg6abVHIJsvi8eXqN2tP/v3drDAjAqkf/v3jYW4fLInbDQCqn542yxmVRmXpLnySldJ+TuFnYcbQdQNpHOvGhPBTAUFrgxNWHiMcMZyu8hxXcdqCNNPgPRdGDlv/daNstUyS2Rx6BvkUqyBRAljogoZIjpCakBd89TGkY42dgqHeZeFTZgCTfltGrvM5UJKFLOBRv6JNC7hxLjExKbX4CgCe+JPF3jzVwHVzvKjgjv7V8fYWD0kCQAyvjda1ZtD1rBjH0zOBWz5JEyWqEGfpr0BX6+IEep/aBdI3I6O4CkgBMRc1puGSnst+kLcflnBsZD2PY/VCWEHX4JwukQxBdpg0NSnHnLdZ640cLiy7+T1OSnbwU2U80rwvxITmb0wHRe2euFkI= X-Forefront-Antispam-Report: CIP:255.255.255.255;CTRY:;LANG:en;SCL:1;SRV:;IPV:NLI;SFV:NSPM;H:DS0PR12MB7726.namprd12.prod.outlook.com;PTR:;CAT:NONE;SFS:(13230040)(1800799024)(366016)(7416014)(376014)(7053199007);DIR:OUT;SFP:1101; X-MS-Exchange-AntiSpam-MessageData-ChunkCount: 1 X-MS-Exchange-AntiSpam-MessageData-0: ovEzYvTYrcZpUogth9ZSNVD7lRTF9lR3Nlbfaev1FSK3PsvDX3qpVUs/2VzImFev4QyEZmodhJD3qUc/Gvkghfp733lj8kdPSn47fIFgwLWo0AFjf2rmfCGsA+anCbm4VFLWylQeVJbpQV2hHgwib7X2INBQ1DlQD7C3OVDxKN4pvZELU5Pn/y0APV9x2ux1zAc/M7dlsT0/wpkB5RK0Bm3VSZpzFduYcWOZI8/SC2qlZMQ3tCjQ41UzGCYLYuHMvx9YRfkgvCjJSQY+TlnQMG2AqHeDI1uLcm3gHJ9WNSbuoTEl5Co5XsU7jHfOTnOXFz4qXR92dzSzo2qq6+kmji2dz3e2QGgsRCzlWzeEEp12WW6/7EkjqiAjlVq/A6zn7C2+LdWeQrNpK3KWNIZtuSWdP9P1+jtltlXYCyaCxvkZoNJ8laob6bxzvKnjCumtb4Zq0wy6H3W/25W6jRFS1SRK4z0qP5dbhyIy5O5wHfcsuP7F22G4Utr73dsJL8hSZUl5HtDb0V1wPX1X23RRaqyQW/8SLSTphKP58awxdtntLHfL4FHIgLxF4BOoivslIKON6vZ+axURmRhfBolN+xfDFI0EOcKMy5HLgca9kXbD70eZgEcEa8Dm5Pqex8V14nnYIAGj1jj+fcUo3J6LxPi3QB6BeW9qWS3fBWkTlQwLvI8X70ttM3qAppE/ZAmReFuaonwt060dxNkiiqRDVRkYkJigTihym0JHsdFTcfvlZ9hfzCRgX9dgyWYkAew0QLyHThmFn5q6arX2bJ8u4eQytjDr+q1AnIfW2/27dyI2jSocq24pTk7zgSEWzCqmWNsIaGpDIsQbfAXEwapbKEDhop+p24YQt1pgMTmH7nGswjcTUQbrvwo3srtMLkorBE5GXYWYRWPRwvG/aPowC8eY5xoFKYjPVVzo+gsgOiaUQ/HPNmESP/zbQBy23LnvdHEOweIE9ayzZf6Aa9WeM+6EXN4382Qx7zzMiCY/OOOHgpXiYMbFsozbvvxK0JkEJAxuZglZQWPA6ecozgIDWH9H/1xBrrqohFv9YfI0FbHkXXyVSn85MAphLALzO0KL6A+1SUefiDWWsEtLWel3Ps1YScOXJAUTWC+V+AbXqL7A/lxwqNZle+Y35rFcz83z1w8qx3lp4j9BvrRahfy5WJSBjw3hSCb5/+DTvU+G5vtmj79yA4jmoQTYaPTTA83YL4id+4p1Ejs37Vn2ebkK9FDC/83rieZXvLqnW8VjT85JSt7HJ/Z77bfO5/EjqR+pSoWuDeRXO/NxAdANZZkN0vV6UIT07vGTuomAtZDbaXXDP62AhbxNzDAOymE0c94sQCOrvx/sPSPgZdZeK6ZxdbXfztmxR4kw9v4sJ0jA5Q7GRQD6D/csHTveYtKyxtk1yp3sFFTpTMRkoJrVauVXsxOW4S0IrWcvryMEgCuroAzLaDphMVfi7EOJISm4ErjjoorSbnk9AsJV0lPj7vC7TtsDgKFsqRl7Jo2meINTQEwlqBBVZKnFOXArZQqV0s0gd/qopVO93xE0hA1DzDdSwMDexNg0fR+jw1J4+dxFUkjYysBSl4a5mbPyGUtRboEq X-OriginatorOrg: Nvidia.com X-MS-Exchange-CrossTenant-Network-Message-Id: ad6702c7-2253-454f-780e-08dd0a96d8a1 X-MS-Exchange-CrossTenant-AuthSource: CY8PR12MB7705.namprd12.prod.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Internal X-MS-Exchange-CrossTenant-OriginalArrivalTime: 22 Nov 2024 01:41:55.7004 (UTC) X-MS-Exchange-CrossTenant-FromEntityHeader: Hosted X-MS-Exchange-CrossTenant-Id: 43083d15-7273-40c1-b7db-39efd9ccc17a X-MS-Exchange-CrossTenant-MailboxType: HOSTED X-MS-Exchange-CrossTenant-UserPrincipalName: K7S7vTYjSjo4WhSh/8ji4NUilaH21lz8ZDp+4PyOyCA7vOI7JQlsNI1b6H5L9+3wv6DZi88+bPQLF8Lmin/F0Q== X-MS-Exchange-Transport-CrossTenantHeadersStamped: IA1PR12MB6305 The reference counts for ZONE_DEVICE private pages should be initialised by the driver when the page is actually allocated by the driver allocator, not when they are first created. This is currently the case for MEMORY_DEVICE_PRIVATE and MEMORY_DEVICE_COHERENT pages but not MEMORY_DEVICE_PCI_P2PDMA pages so fix that up. Signed-off-by: Alistair Popple Reviewed-by: Dan Williams --- Changes since v2: - Initialise the page refcount for all pages covered by the kaddr --- drivers/pci/p2pdma.c | 13 +++++++++++-- mm/memremap.c | 17 +++++++++++++---- mm/mm_init.c | 22 ++++++++++++++++++---- 3 files changed, 42 insertions(+), 10 deletions(-) diff --git a/drivers/pci/p2pdma.c b/drivers/pci/p2pdma.c index 4f47a13..2c5ac4a 100644 --- a/drivers/pci/p2pdma.c +++ b/drivers/pci/p2pdma.c @@ -140,13 +140,22 @@ static int p2pmem_alloc_mmap(struct file *filp, struct kobject *kobj, rcu_read_unlock(); for (vaddr = vma->vm_start; vaddr < vma->vm_end; vaddr += PAGE_SIZE) { - ret = vm_insert_page(vma, vaddr, virt_to_page(kaddr)); + struct page *page = virt_to_page(kaddr); + + /* + * Initialise the refcount for the freshly allocated page. As + * we have just allocated the page no one else should be + * using it. + */ + VM_WARN_ON_ONCE_PAGE(!page_ref_count(page), page); + set_page_count(page, 1); + ret = vm_insert_page(vma, vaddr, page); if (ret) { gen_pool_free(p2pdma->pool, (uintptr_t)kaddr, len); return ret; } percpu_ref_get(ref); - put_page(virt_to_page(kaddr)); + put_page(page); kaddr += PAGE_SIZE; len -= PAGE_SIZE; } diff --git a/mm/memremap.c b/mm/memremap.c index 40d4547..07bbe0e 100644 --- a/mm/memremap.c +++ b/mm/memremap.c @@ -488,15 +488,24 @@ void free_zone_device_folio(struct folio *folio) folio->mapping = NULL; folio->page.pgmap->ops->page_free(folio_page(folio, 0)); - if (folio->page.pgmap->type != MEMORY_DEVICE_PRIVATE && - folio->page.pgmap->type != MEMORY_DEVICE_COHERENT) + switch (folio->page.pgmap->type) { + case MEMORY_DEVICE_PRIVATE: + case MEMORY_DEVICE_COHERENT: + put_dev_pagemap(folio->page.pgmap); + break; + + case MEMORY_DEVICE_FS_DAX: + case MEMORY_DEVICE_GENERIC: /* * Reset the refcount to 1 to prepare for handing out the page * again. */ folio_set_count(folio, 1); - else - put_dev_pagemap(folio->page.pgmap); + break; + + case MEMORY_DEVICE_PCI_P2PDMA: + break; + } } void zone_device_page_init(struct page *page) diff --git a/mm/mm_init.c b/mm/mm_init.c index 4ba5607..0489820 100644 --- a/mm/mm_init.c +++ b/mm/mm_init.c @@ -1015,12 +1015,26 @@ static void __ref __init_zone_device_page(struct page *page, unsigned long pfn, } /* - * ZONE_DEVICE pages are released directly to the driver page allocator - * which will set the page count to 1 when allocating the page. + * ZONE_DEVICE pages other than MEMORY_TYPE_GENERIC and + * MEMORY_TYPE_FS_DAX pages are released directly to the driver page + * allocator which will set the page count to 1 when allocating the + * page. + * + * MEMORY_TYPE_GENERIC and MEMORY_TYPE_FS_DAX pages automatically have + * their refcount reset to one whenever they are freed (ie. after + * their refcount drops to 0). */ - if (pgmap->type == MEMORY_DEVICE_PRIVATE || - pgmap->type == MEMORY_DEVICE_COHERENT) + switch (pgmap->type) { + case MEMORY_DEVICE_PRIVATE: + case MEMORY_DEVICE_COHERENT: + case MEMORY_DEVICE_PCI_P2PDMA: set_page_count(page, 0); + break; + + case MEMORY_DEVICE_FS_DAX: + case MEMORY_DEVICE_GENERIC: + break; + } } /* From patchwork Fri Nov 22 01:40:32 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alistair Popple X-Patchwork-Id: 13882592 Received: from NAM12-MW2-obe.outbound.protection.outlook.com (mail-mw2nam12on2043.outbound.protection.outlook.com [40.107.244.43]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 35C20189F43; Fri, 22 Nov 2024 01:42:02 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=fail smtp.client-ip=40.107.244.43 ARC-Seal: i=2; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1732239725; cv=fail; b=ef1nnaMKFExvScylO2VbgTsG11cAm7SKa0MDCOQtuYnBVZBoZu5ywcAcmodcHy48/0Fh3GmCuSYFnbVjWAFy/cH/kSCEtF+fgjX1eWCaTTjmlQOpRuWRXNDNwF9+RlNvJaIz1pjEe3FKaIlZN40zho/B+SoNgskw8ttslKGRDVM= ARC-Message-Signature: i=2; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1732239725; c=relaxed/simple; bh=sJ5Vy/ETsMV56ox6ODm2ei1GasZtCj/OhuxBEbsezbc=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: Content-Type:MIME-Version; b=dy7svHtZSVNfWH8+kfijnPRkr7Z1/EnE+zl7nK9UadwqOwqtZOs+VDuku2YnqRIoc3BoGjiC3yTBxrePGqLEELSMQ7RmAHNJ/3vu/EBHvvzCerAeKWq5qIHzQMMmp4MVMUr2ZVW0L9NfaGBjN4a7eDeIOfZy64q+FPuD6rHuHbM= ARC-Authentication-Results: i=2; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=nvidia.com; spf=fail smtp.mailfrom=nvidia.com; dkim=pass (2048-bit key) header.d=Nvidia.com header.i=@Nvidia.com header.b=OYJW1dvI; arc=fail smtp.client-ip=40.107.244.43 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=nvidia.com Authentication-Results: smtp.subspace.kernel.org; spf=fail smtp.mailfrom=nvidia.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=Nvidia.com header.i=@Nvidia.com header.b="OYJW1dvI" ARC-Seal: i=1; a=rsa-sha256; s=arcselector10001; d=microsoft.com; cv=none; b=gWBmTTYYzU1NCk0+cmFiijQTTh+mKxErkBZdblt5WylATjaq2qhmSl9dVXS70nZ64xK5tAwMN0E038hnXw0Aaf6bv75jZVglj+BGQ5kCamC2IRtWdS4vQSuu9W2/KF6F6aZW78ruqEHyJCAmePr75KnKxcM4H2wwKPSnOpem+sBY7UhhEjqMVjshTsH5oECyUC825YQPSP9aR2nAWrh9Wx82f9MGIomltllLMglC1itioi8lf1t4QM2yifDUbnIHcYn9eaJofxvj6kimeTrVra0iDHMZJoFb+/HAXbDZxIifD4PQNdlZtr9phTkF0OV56CTiNktWm2g3iGN/V0+FRA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector10001; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=ygxL5uP66prRhNdyxIEj8y/1Y+Joj/a0lQHhMsNm4r8=; b=QWi5TdmOtO/dSb3nbtlQmS4CZL9iJNaIsT0rUHqImMEd+VzTRsZDkxokP+3K9gW8hEn5YYMUXMSKfBWPVRl06793nSVXB5huYTxVBpxjs3PHx5jbixei2Bd3oaLDrqPYofLc9H5MxYOhYwnoxspcljE7N/MpgHNgxrn6S4xrkGX16hwbSygweGPc9aXI8SHNVEUkZBBN62IKB/vtBql9vRkVXcGOK/M6seTmmZn8VB9162g6OUI2TplGPyxTKCMyDPYRrTpjQoX7PaHAWK8VSr/BSfTJryUgXwpPLL6DxL8ZnPjMFFRPWyJtsxi3K09GSo0WQt6CUPxQIBJIXWg5cw== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=nvidia.com; dmarc=pass action=none header.from=nvidia.com; dkim=pass header.d=nvidia.com; arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=Nvidia.com; s=selector2; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=ygxL5uP66prRhNdyxIEj8y/1Y+Joj/a0lQHhMsNm4r8=; b=OYJW1dvINY0KOgyRu1VNi8SFTzdXjmEzXVfN4dF2wvVOmAraGqNtWJmEJwqJ3OU2XvUR1bdEGpwV9dZ2f5WKWaBdkx9TIqeTfYZ3Rf/e0+yjGI5LZJVFeWVguPv6C6ApO08tCwkx5ReJhGSCJhLpJFPx+3sY8pWMWNoylluXlZPh/ECY1Z8L5NV8GjhA0OfvL5haHxQnIpEDE48/SXRpjxIw84kwZ5hy2GHRilwJJEBC+odGTTxJaKQ+1orNbTw8C9YScf+8ZXpDz+v1UZPNSNGqsMh0/4ZqF8RvW4m6WeEyOcmMq+cEsg37dsWP3bM/dTuOEpc2xlgAvGuv8RIgLA== Authentication-Results: dkim=none (message not signed) header.d=none;dmarc=none action=none header.from=nvidia.com; Received: from DS0PR12MB7726.namprd12.prod.outlook.com (2603:10b6:8:130::6) by IA1PR12MB6305.namprd12.prod.outlook.com (2603:10b6:208:3e7::16) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.8182.17; Fri, 22 Nov 2024 01:42:00 +0000 Received: from DS0PR12MB7726.namprd12.prod.outlook.com ([fe80::953f:2f80:90c5:67fe]) by DS0PR12MB7726.namprd12.prod.outlook.com ([fe80::953f:2f80:90c5:67fe%4]) with mapi id 15.20.8182.016; Fri, 22 Nov 2024 01:42:00 +0000 From: Alistair Popple To: dan.j.williams@intel.com, linux-mm@kvack.org Cc: Alistair Popple , lina@asahilina.net, zhang.lyra@gmail.com, gerald.schaefer@linux.ibm.com, vishal.l.verma@intel.com, dave.jiang@intel.com, logang@deltatee.com, bhelgaas@google.com, jack@suse.cz, jgg@ziepe.ca, catalin.marinas@arm.com, will@kernel.org, mpe@ellerman.id.au, npiggin@gmail.com, dave.hansen@linux.intel.com, ira.weiny@intel.com, willy@infradead.org, djwong@kernel.org, tytso@mit.edu, linmiaohe@huawei.com, david@redhat.com, peterx@redhat.com, linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org, linux-arm-kernel@lists.infradead.org, linuxppc-dev@lists.ozlabs.org, nvdimm@lists.linux.dev, linux-cxl@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-ext4@vger.kernel.org, linux-xfs@vger.kernel.org, jhubbard@nvidia.com, hch@lst.de, david@fromorbit.com, Jason Gunthorpe Subject: [PATCH v3 11/25] mm: Allow compound zone device pages Date: Fri, 22 Nov 2024 12:40:32 +1100 Message-ID: X-Mailer: git-send-email 2.45.2 In-Reply-To: References: X-ClientProxiedBy: SY4P282CA0004.AUSP282.PROD.OUTLOOK.COM (2603:10c6:10:a0::14) To DS0PR12MB7726.namprd12.prod.outlook.com (2603:10b6:8:130::6) Precedence: bulk X-Mailing-List: linux-xfs@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-MS-PublicTrafficType: Email X-MS-TrafficTypeDiagnostic: DS0PR12MB7726:EE_|IA1PR12MB6305:EE_ X-MS-Office365-Filtering-Correlation-Id: b8cf4d45-6b32-43fe-e00f-08dd0a96db92 X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam: BCL:0;ARA:13230040|1800799024|366016|7416014|376014|7053199007; X-Microsoft-Antispam-Message-Info: ZlUH+gGQBsO1Wi1tfiV11DS83nI58lYa6aYE/FUOeFANsD2570R0zJgPNeQ+TrJhL2MTaCTBpjNbqa4D6iX8iWdNxhatiScemU0D4EQPDppCG2NDiS0R+LG9ZYqQC2NR6W5L+9DNla48uj1ywRcxlYGEG0v3JoL81FejhRdQe9q/TeZB5wToevdSBSn/ThZBV6of69dAG7Y8vSrIK/3LGF0MaYdWskruKkwgPcTb/zUXLWzFnKxI5mEHu/eIoN4Xug5RlzIzuR6+i6XWGv3iFzgcBCftp3rr1iw2CTVWpbyOkm11VAW3hypUhKUyC84G03vgklMy6Xb2Acu/6S3d6cAXHsDB3S85p3vozfw4opyB5oDBKOdSWc5UYsamFGGqiKCyqv1uS7We5J/QdCFfgWuC+cd3HRsMnxRIT9SpXJ5/L2m9C+5Vq7sSLxa7TKqJ4gw7QIlJOnDew3hhCu0V3C3+XBdUtJg5xodYTLOr8Bea2goQpnk32T3+ou5W8z8yOBrmuR4cX3n6MGhz8QW8bTZM93gmDeL8HPlkSTC7pHxy7W/uXZtm08y2vNPBlwKAL8wlYbX0Ae4HuJUyIXS9W3IxUWCdNqOIG7QF61VoWcbI/HAe6A0ILV+m2wVjy44Nk4mbZkf3TyFf/6FtOQO5pRDUy5xcdKJHpZLu0p/TYaPGje54N4Wg7nchlE5oYO+U8Egf/++X5t4gHi51asCUCSD2OIf1sPpCbboZ/gA4bYv83HvcS0DNordSUPYRkeQSMSPRKQzK1ROW212227WCdFVGFHoc1YYAMepCEtjEkn4v18WFQbAY5GzRhxskTLADmtzLA5ovXWhQB5MVOema0wTfuOUsjvtUy/DbX56PCqspYzVUYF0h7eDNo7+938kQWnL4ejIPVaM66q+xnlWEwwrcFHPqTBw65rC7sGo3stmpyK8KYSHAVAYWLa6Q142kUnHOAGyygXTx1++LzxKetV4O+H27N2hnLwcr0mWUDCNiKZhA/PQiCETYYwDOrE7jF4/uVk8fXb+aL3t/jPCAgpdVNkJa4R/OMd9azVWhfr9Uc8hy4S4MTZAR8EWOSjn2an8Z4tN55qLGZw6qAOQleFGi9LXyDbi/G+bipC9v2POfqF+x3RYIUwvIpcCwjHr1Qb9rOe+TSY3EmTxUNZB7rte1n+EyADr+smJb7FYAPZOmjydIYFE+B5g334DVabLs8LA6RTpH9GG4Vjy1M/ythim2uZjcUQKOVeD3ymsgNSTU0d/EDzNBdkC+45ZyZaRTzSKkel4R5x0Z/wVtyfR312R4rZ3Go8OiBjbGkYIf942HG31tu6X5lgAv8l6+m31F X-Forefront-Antispam-Report: CIP:255.255.255.255;CTRY:;LANG:en;SCL:1;SRV:;IPV:NLI;SFV:NSPM;H:DS0PR12MB7726.namprd12.prod.outlook.com;PTR:;CAT:NONE;SFS:(13230040)(1800799024)(366016)(7416014)(376014)(7053199007);DIR:OUT;SFP:1101; X-MS-Exchange-AntiSpam-MessageData-ChunkCount: 1 X-MS-Exchange-AntiSpam-MessageData-0: EWeLFSHzFCuqHi7ZO3jkWk7AIoV7Cs/1K6Xd4y8OjzehFWIfFTZgHK19vP/XNBqZ8CQe01XA7FA5HU7MSDtXKKBWzL7D+cYQsf0PZLE/LQuR9bWBohR4e2F0Ij5GrVOfOfd+c2voZ7rORjR/0CD0TXanew4+O471mLQ96V78lxkzMoh2BQYaC+svyTfN94xbeu1REvGPTDShD9zJKw3E6uHGuAeNx9AbFBrDTpOh2k4vPhL2O2NQfQONpLUmw+OYuOPtj6eOsv93dYAFAtURyAUY7qmmes3O5qABqb/mMZZ+DZMYf42YoOSppzEwt/QsbgJ/cz8+Yy8R7ARztCljr7L0x27JfGLMdXoa3Wxq2svuhozgS6qjleR3JznTQKDRo9Umt+E4JuByEOx/d7DLuyztRlksZXy89rgpvqFi4SXWUtJwcJsABovz/ZSPCXg9J7uNOl5YVqpYJIWPC+nX70YtjjkVuvfVT9mHSwXA3kmbWwrPZoQWvBGN/ThIMnqzhByhc9h+xk3H7CihMeawGfszjwsRpRuCLI4OicKrbtPMMNX9rVIFMJ6PONtBvO8hbZoLuoP5UWpzUjhoLQ6uxMumXtnAPGqqvu/81y4eUWxquegfXvfRvy7opNLnZF40NcVDNv526hoi02YFDFl51edenWQ/pX48sAlFEMNvbR9sWzLrHh/yiSOQ5jU/bIbQr+pIzYG8XpfG9v6s1BfAdrQIc1y5UtwltKAg9xRSc7kENJYYOzFanqXixSnr3xqAkfw2VmZ6C2OOVz6RLqAox8GRcrGC+NEvLGRHb54rTDeaAcVy5Bi7i3+nk0K641W1dI1FHpQRE6C3WdpHijKDyAYTe+T7pQxKyMdZyNuH5IkR7ouXKQdCKKZ+6CMxBGqiTEOLNWVLC87N5zELm8qa1ie6hhGlGq95qH9m+YERKnDQ4iehZCsb/bf96RvIWvhu5dVMjMwah1lzl5zLEjNEzE/JoMKkf1+FqZFN7AHb9nDH+TDN94p5r2d04n9Un5rXYczkIs9+OxdIaZzJl1esGqSmP5fJfgs66i0xF3bUiyNDam712R5hHxyT8rXshmlD5G4Zim9QWG8+4itdji0NXOb8gpz//k2TajH+2oK0Mts7OyFlD5Z9ThdNF9/qqhX5ak70yope97QbMafKbFnIgIZ/GMdDdyExDsjq6AAXJggZHDVommvr3ezBeJ+fia/I3OZ4zhmfaBw1YbCdnv5U5TR/UoHFcb/YNwijLFSSwI6R0qu1BBpvvABbbl0FNAUPBxdzE0pAVCrumWTEZaTCZ3t8P3eKqrGZbiCdw6+Leq82kJEWTJuEY1D/ND+TkhVT0yINgBtCUA94wwLg1twRVja6zKBoyU/IkXKheYpEQ2rSGAGYXzpu9ZTIeHnVlXDrO4lFFFYF765xwPtYScmaaId/QXWSj1zxjsJbhPRzNA+uPJdtR6uhyLmNIY6PKoKLcsUyFnizFyfRwTX5HxJ3zsjpvijPOJlHY+EYms3N4ZUxZI48DQw6GSgehNeptGk9Rh4kdgRJU3EgSL60uD/ePK9imnqQLAyu0BONKXSlphb6TkQGA/NFG8kp4Zmy9smZ X-OriginatorOrg: Nvidia.com X-MS-Exchange-CrossTenant-Network-Message-Id: b8cf4d45-6b32-43fe-e00f-08dd0a96db92 X-MS-Exchange-CrossTenant-AuthSource: DS0PR12MB7726.namprd12.prod.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Internal X-MS-Exchange-CrossTenant-OriginalArrivalTime: 22 Nov 2024 01:42:00.3515 (UTC) X-MS-Exchange-CrossTenant-FromEntityHeader: Hosted X-MS-Exchange-CrossTenant-Id: 43083d15-7273-40c1-b7db-39efd9ccc17a X-MS-Exchange-CrossTenant-MailboxType: HOSTED X-MS-Exchange-CrossTenant-UserPrincipalName: psq+0sA9C3dJdP6qg8HdjNnKaxPxwBaJH3Ai8I8Kppyccqs+oIKLVXHsrM2Bsp5qMJOKj2geVSvXrtgIRye3zA== X-MS-Exchange-Transport-CrossTenantHeadersStamped: IA1PR12MB6305 Zone device pages are used to represent various type of device memory managed by device drivers. Currently compound zone device pages are not supported. This is because MEMORY_DEVICE_FS_DAX pages are the only user of higher order zone device pages and have their own page reference counting. A future change will unify FS DAX reference counting with normal page reference counting rules and remove the special FS DAX reference counting. Supporting that requires compound zone device pages. Supporting compound zone device pages requires compound_head() to distinguish between head and tail pages whilst still preserving the special struct page fields that are specific to zone device pages. A tail page is distinguished by having bit zero being set in page->compound_head, with the remaining bits pointing to the head page. For zone device pages page->compound_head is shared with page->pgmap. The page->pgmap field is common to all pages within a memory section. Therefore pgmap is the same for both head and tail pages and can be moved into the folio and we can use the standard scheme to find compound_head from a tail page. Signed-off-by: Alistair Popple Reviewed-by: Jason Gunthorpe Reviewed-by: Dan Williams --- Changes since v2: - Indentation fix - Rename page_dev_pagemap() to page_pgmap() - Rename folio _unused field to _unused_pgmap_compound_head - s/WARN_ON/VM_WARN_ON_ONCE_PAGE/ Changes since v1: - Move pgmap to the folio as suggested by Matthew Wilcox --- drivers/gpu/drm/nouveau/nouveau_dmem.c | 3 ++- drivers/pci/p2pdma.c | 6 +++--- include/linux/memremap.h | 6 +++--- include/linux/migrate.h | 4 ++-- include/linux/mm_types.h | 9 +++++++-- include/linux/mmzone.h | 8 +++++++- lib/test_hmm.c | 3 ++- mm/hmm.c | 2 +- mm/memory.c | 4 +++- mm/memremap.c | 14 +++++++------- mm/migrate_device.c | 7 +++++-- mm/mm_init.c | 2 +- 12 files changed, 43 insertions(+), 25 deletions(-) diff --git a/drivers/gpu/drm/nouveau/nouveau_dmem.c b/drivers/gpu/drm/nouveau/nouveau_dmem.c index 1a07256..61d0f41 100644 --- a/drivers/gpu/drm/nouveau/nouveau_dmem.c +++ b/drivers/gpu/drm/nouveau/nouveau_dmem.c @@ -88,7 +88,8 @@ struct nouveau_dmem { static struct nouveau_dmem_chunk *nouveau_page_to_chunk(struct page *page) { - return container_of(page->pgmap, struct nouveau_dmem_chunk, pagemap); + return container_of(page_pgmap(page), struct nouveau_dmem_chunk, + pagemap); } static struct nouveau_drm *page_to_drm(struct page *page) diff --git a/drivers/pci/p2pdma.c b/drivers/pci/p2pdma.c index 2c5ac4a..e4e9969 100644 --- a/drivers/pci/p2pdma.c +++ b/drivers/pci/p2pdma.c @@ -202,7 +202,7 @@ static const struct attribute_group p2pmem_group = { static void p2pdma_page_free(struct page *page) { - struct pci_p2pdma_pagemap *pgmap = to_p2p_pgmap(page->pgmap); + struct pci_p2pdma_pagemap *pgmap = to_p2p_pgmap(page_pgmap(page)); /* safe to dereference while a reference is held to the percpu ref */ struct pci_p2pdma *p2pdma = rcu_dereference_protected(pgmap->provider->p2pdma, 1); @@ -1025,8 +1025,8 @@ enum pci_p2pdma_map_type pci_p2pdma_map_segment(struct pci_p2pdma_map_state *state, struct device *dev, struct scatterlist *sg) { - if (state->pgmap != sg_page(sg)->pgmap) { - state->pgmap = sg_page(sg)->pgmap; + if (state->pgmap != page_pgmap(sg_page(sg))) { + state->pgmap = page_pgmap(sg_page(sg)); state->map = pci_p2pdma_map_type(state->pgmap, dev); state->bus_off = to_p2p_pgmap(state->pgmap)->bus_offset; } diff --git a/include/linux/memremap.h b/include/linux/memremap.h index 3f7143a..0256a42 100644 --- a/include/linux/memremap.h +++ b/include/linux/memremap.h @@ -161,7 +161,7 @@ static inline bool is_device_private_page(const struct page *page) { return IS_ENABLED(CONFIG_DEVICE_PRIVATE) && is_zone_device_page(page) && - page->pgmap->type == MEMORY_DEVICE_PRIVATE; + page_pgmap(page)->type == MEMORY_DEVICE_PRIVATE; } static inline bool folio_is_device_private(const struct folio *folio) @@ -173,13 +173,13 @@ static inline bool is_pci_p2pdma_page(const struct page *page) { return IS_ENABLED(CONFIG_PCI_P2PDMA) && is_zone_device_page(page) && - page->pgmap->type == MEMORY_DEVICE_PCI_P2PDMA; + page_pgmap(page)->type == MEMORY_DEVICE_PCI_P2PDMA; } static inline bool is_device_coherent_page(const struct page *page) { return is_zone_device_page(page) && - page->pgmap->type == MEMORY_DEVICE_COHERENT; + page_pgmap(page)->type == MEMORY_DEVICE_COHERENT; } static inline bool folio_is_device_coherent(const struct folio *folio) diff --git a/include/linux/migrate.h b/include/linux/migrate.h index 002e49b..291e62e 100644 --- a/include/linux/migrate.h +++ b/include/linux/migrate.h @@ -207,8 +207,8 @@ struct migrate_vma { unsigned long end; /* - * Set to the owner value also stored in page->pgmap->owner for - * migrating out of device private memory. The flags also need to + * Set to the owner value also stored in page_pgmap(page)->owner + * for migrating out of device private memory. The flags also need to * be set to MIGRATE_VMA_SELECT_DEVICE_PRIVATE. * The caller should always set this field when using mmu notifier * callbacks to avoid device MMU invalidations for device private diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h index 6e3bdf8..209e00a 100644 --- a/include/linux/mm_types.h +++ b/include/linux/mm_types.h @@ -129,8 +129,11 @@ struct page { unsigned long compound_head; /* Bit zero is set */ }; struct { /* ZONE_DEVICE pages */ - /** @pgmap: Points to the hosting device page map. */ - struct dev_pagemap *pgmap; + /* + * The first word is used for compound_head or folio + * pgmap + */ + void *_unused_pgmap_compound_head; void *zone_device_data; /* * ZONE_DEVICE private pages are counted as being @@ -299,6 +302,7 @@ typedef struct { * @_refcount: Do not access this member directly. Use folio_ref_count() * to find how many references there are to this folio. * @memcg_data: Memory Control Group data. + * @pgmap: Metadata for ZONE_DEVICE mappings * @virtual: Virtual address in the kernel direct map. * @_last_cpupid: IDs of last CPU and last process that accessed the folio. * @_entire_mapcount: Do not use directly, call folio_entire_mapcount(). @@ -337,6 +341,7 @@ struct folio { /* private: */ }; /* public: */ + struct dev_pagemap *pgmap; }; struct address_space *mapping; pgoff_t index; diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h index 17506e4..2c9f864 100644 --- a/include/linux/mmzone.h +++ b/include/linux/mmzone.h @@ -1134,6 +1134,12 @@ static inline bool is_zone_device_page(const struct page *page) return page_zonenum(page) == ZONE_DEVICE; } +static inline struct dev_pagemap *page_pgmap(const struct page *page) +{ + VM_WARN_ON_ONCE_PAGE(!is_zone_device_page(page), page); + return page_folio(page)->pgmap; +} + /* * Consecutive zone device pages should not be merged into the same sgl * or bvec segment with other types of pages or if they belong to different @@ -1149,7 +1155,7 @@ static inline bool zone_device_pages_have_same_pgmap(const struct page *a, return false; if (!is_zone_device_page(a)) return true; - return a->pgmap == b->pgmap; + return page_pgmap(a) == page_pgmap(b); } extern void memmap_init_zone_device(struct zone *, unsigned long, diff --git a/lib/test_hmm.c b/lib/test_hmm.c index 056f2e4..ffd0c6f 100644 --- a/lib/test_hmm.c +++ b/lib/test_hmm.c @@ -195,7 +195,8 @@ static int dmirror_fops_release(struct inode *inode, struct file *filp) static struct dmirror_chunk *dmirror_page_to_chunk(struct page *page) { - return container_of(page->pgmap, struct dmirror_chunk, pagemap); + return container_of(page_pgmap(page), struct dmirror_chunk, + pagemap); } static struct dmirror_device *dmirror_page_to_device(struct page *page) diff --git a/mm/hmm.c b/mm/hmm.c index 7e0229a..082f7b7 100644 --- a/mm/hmm.c +++ b/mm/hmm.c @@ -248,7 +248,7 @@ static int hmm_vma_handle_pte(struct mm_walk *walk, unsigned long addr, * just report the PFN. */ if (is_device_private_entry(entry) && - pfn_swap_entry_to_page(entry)->pgmap->owner == + page_pgmap(pfn_swap_entry_to_page(entry))->owner == range->dev_private_owner) { cpu_flags = HMM_PFN_VALID; if (is_writable_device_private_entry(entry)) diff --git a/mm/memory.c b/mm/memory.c index 3ccee51..24a34a4 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -4225,6 +4225,7 @@ vm_fault_t do_swap_page(struct vm_fault *vmf) vmf->page = pfn_swap_entry_to_page(entry); ret = remove_device_exclusive_entry(vmf); } else if (is_device_private_entry(entry)) { + struct dev_pagemap *pgmap; if (vmf->flags & FAULT_FLAG_VMA_LOCK) { /* * migrate_to_ram is not yet ready to operate @@ -4249,7 +4250,8 @@ vm_fault_t do_swap_page(struct vm_fault *vmf) */ get_page(vmf->page); pte_unmap_unlock(vmf->pte, vmf->ptl); - ret = vmf->page->pgmap->ops->migrate_to_ram(vmf); + pgmap = page_pgmap(vmf->page); + ret = pgmap->ops->migrate_to_ram(vmf); put_page(vmf->page); } else if (is_hwpoison_entry(entry)) { ret = VM_FAULT_HWPOISON; diff --git a/mm/memremap.c b/mm/memremap.c index 07bbe0e..68099af 100644 --- a/mm/memremap.c +++ b/mm/memremap.c @@ -458,8 +458,8 @@ EXPORT_SYMBOL_GPL(get_dev_pagemap); void free_zone_device_folio(struct folio *folio) { - if (WARN_ON_ONCE(!folio->page.pgmap->ops || - !folio->page.pgmap->ops->page_free)) + if (WARN_ON_ONCE(!folio->pgmap->ops || + !folio->pgmap->ops->page_free)) return; mem_cgroup_uncharge(folio); @@ -486,12 +486,12 @@ void free_zone_device_folio(struct folio *folio) * to clear folio->mapping. */ folio->mapping = NULL; - folio->page.pgmap->ops->page_free(folio_page(folio, 0)); + folio->pgmap->ops->page_free(folio_page(folio, 0)); - switch (folio->page.pgmap->type) { + switch (folio->pgmap->type) { case MEMORY_DEVICE_PRIVATE: case MEMORY_DEVICE_COHERENT: - put_dev_pagemap(folio->page.pgmap); + put_dev_pagemap(folio->pgmap); break; case MEMORY_DEVICE_FS_DAX: @@ -514,7 +514,7 @@ void zone_device_page_init(struct page *page) * Drivers shouldn't be allocating pages after calling * memunmap_pages(). */ - WARN_ON_ONCE(!percpu_ref_tryget_live(&page->pgmap->ref)); + WARN_ON_ONCE(!percpu_ref_tryget_live(&page_pgmap(page)->ref)); set_page_count(page, 1); lock_page(page); } @@ -523,7 +523,7 @@ EXPORT_SYMBOL_GPL(zone_device_page_init); #ifdef CONFIG_FS_DAX bool __put_devmap_managed_folio_refs(struct folio *folio, int refs) { - if (folio->page.pgmap->type != MEMORY_DEVICE_FS_DAX) + if (folio->pgmap->type != MEMORY_DEVICE_FS_DAX) return false; /* diff --git a/mm/migrate_device.c b/mm/migrate_device.c index 9cf2659..2209070 100644 --- a/mm/migrate_device.c +++ b/mm/migrate_device.c @@ -106,6 +106,7 @@ static int migrate_vma_collect_pmd(pmd_t *pmdp, arch_enter_lazy_mmu_mode(); for (; addr < end; addr += PAGE_SIZE, ptep++) { + struct dev_pagemap *pgmap; unsigned long mpfn = 0, pfn; struct folio *folio; struct page *page; @@ -133,9 +134,10 @@ static int migrate_vma_collect_pmd(pmd_t *pmdp, goto next; page = pfn_swap_entry_to_page(entry); + pgmap = page_pgmap(page); if (!(migrate->flags & MIGRATE_VMA_SELECT_DEVICE_PRIVATE) || - page->pgmap->owner != migrate->pgmap_owner) + pgmap->owner != migrate->pgmap_owner) goto next; mpfn = migrate_pfn(page_to_pfn(page)) | @@ -151,12 +153,13 @@ static int migrate_vma_collect_pmd(pmd_t *pmdp, goto next; } page = vm_normal_page(migrate->vma, addr, pte); + pgmap = page_pgmap(page); if (page && !is_zone_device_page(page) && !(migrate->flags & MIGRATE_VMA_SELECT_SYSTEM)) goto next; else if (page && is_device_coherent_page(page) && (!(migrate->flags & MIGRATE_VMA_SELECT_DEVICE_COHERENT) || - page->pgmap->owner != migrate->pgmap_owner)) + pgmap->owner != migrate->pgmap_owner)) goto next; mpfn = migrate_pfn(pfn) | MIGRATE_PFN_MIGRATE; mpfn |= pte_write(pte) ? MIGRATE_PFN_WRITE : 0; diff --git a/mm/mm_init.c b/mm/mm_init.c index 0489820..3d0611e 100644 --- a/mm/mm_init.c +++ b/mm/mm_init.c @@ -996,7 +996,7 @@ static void __ref __init_zone_device_page(struct page *page, unsigned long pfn, * and zone_device_data. It is a bug if a ZONE_DEVICE page is * ever freed or placed on a driver-private list. */ - page->pgmap = pgmap; + page_folio(page)->pgmap = pgmap; page->zone_device_data = NULL; /* From patchwork Fri Nov 22 01:40:33 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alistair Popple X-Patchwork-Id: 13882593 Received: from NAM12-DM6-obe.outbound.protection.outlook.com (mail-dm6nam12on2064.outbound.protection.outlook.com [40.107.243.64]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id AE5C918BB9B; Fri, 22 Nov 2024 01:42:07 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=fail smtp.client-ip=40.107.243.64 ARC-Seal: i=2; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1732239729; cv=fail; b=mUVxETffO4zCLTiMF7YvLZHIPUosw1y7vFa5cpBZlajRJj491/1U6Gapa6W5KVD+w3Z2/NyL6NKAghTId6orRUkzLgvkyofZFNw0J64PmHugTwOi6u+KeqXfNvp7Ovq/7Z81snveeBJ14vd3BcB7ATZLx0DsM8oXeG5OTMOzLVE= ARC-Message-Signature: i=2; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1732239729; c=relaxed/simple; bh=rLH4jadUq4KT1P+/KwUXPEJahOuUhvHdvo2VlbP3YR0=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: Content-Type:MIME-Version; b=aP2GRALfMBu97NWlK3wxzI5Q1Q0nJHdcktQfOYojhmewYMercnZ1rRPhN8K1MghMUjb5bBAqc6CQff1ALW7WJfQX1Xf3hkv8cNVbFJ9ij+Hnt1MHPef24MsiODfsdUoTuENdc7nxUrcWxkagJxLUxSGqDRHdphZ7irU6SxM5Wek= ARC-Authentication-Results: i=2; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=nvidia.com; spf=fail smtp.mailfrom=nvidia.com; dkim=pass (2048-bit key) header.d=Nvidia.com header.i=@Nvidia.com header.b=T6pL+4vK; arc=fail smtp.client-ip=40.107.243.64 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=nvidia.com Authentication-Results: smtp.subspace.kernel.org; spf=fail smtp.mailfrom=nvidia.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=Nvidia.com header.i=@Nvidia.com header.b="T6pL+4vK" ARC-Seal: i=1; a=rsa-sha256; s=arcselector10001; d=microsoft.com; cv=none; b=G1UwqxvwdOSaJ0AOiSzR8Cnmx4neIK9mW8tTOorr6heCrtphqjrgLpSENTwfm6IS0xm/jnXIZnbvA8BNT/0A4ILmHY9oYHcOqvaAMY34wy1iV8M2nD1z8CZausLMeSLub1nmHR5GdijwOX76y/BWbfSAD/tBb4YMKfXvlaZf5DSnDPDHvWKB7GzdhadnHEDPk/rRVkaA/DVbgnykiF/fyRIkPSQGfEyjdDkrQBgEdCC1JY194BNBsIF+nDqp2vFTkFVRK//hCkW5GCabneiO9as8IHMKlX1/1C9kYMd6xgy3i9gII8Ot4iRsRLudEpDNeaqf8ovvnhj6jr87QgQl5g== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector10001; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=IK7Wz+KpFlWHHtm9gXpPrlpwCU5rlt3ww3hmEypY42s=; b=iHkC6s/nta81WGT85oFbPnOo2N6JDWajJcc8uNb7PKFdT50q1NTX3HI7XQ6Nn8bUleRVEaiMnk6x9Ciy6L6AFOPQL987ngC8DBFNOPqTS0JgmLNLJ4RTO+qUgEDBDsCjKBJ7H1dqwZZxHJd3BKX8JjAzpQP0h3yLyAPCzFttEuvEZyggESNnomdxbuDL6z3N1DoFzMYHvBwuHaE5EjVW4M06i3ZvWSfb6aRsGECQIiUPsb6z5+aoJCIQ/OC740rz2ouc2oJmrzI3bqLyKNECl5RqDcQxA3QCg3VgyltETYKNRScw5BMh8Lx+mbnLrS6TiJ9Tt8z5kgHyzkVP3xPMlw== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=nvidia.com; dmarc=pass action=none header.from=nvidia.com; dkim=pass header.d=nvidia.com; arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=Nvidia.com; s=selector2; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=IK7Wz+KpFlWHHtm9gXpPrlpwCU5rlt3ww3hmEypY42s=; b=T6pL+4vKygeQGT0rYY7dPyT8mpNPast9uIx2SrMCDRonti8DMeaOVuAvL2YEQkDwLOvo9dobT/Iv/a9TVSrMpRjTqhjzBjfDldGUeAFUdCc5Wqn9enmPpTrI2+JgZ+7QnQFgS8aC7o0YbDNnd85xNy2oEqkQQnIbIw9wvdI0O27GOARKzL3jAPxb2lwMu4+Zkze5hi4fcCVlRNcaTiv7Wd2YLaAlf5MovTlYDusCpB+a2foqF2leL5LOVs8L+7DCdA1NgAc4qNrZc8IJG9c5X/cdFs2IpLIvId0sy4ydr0bzsN+EfrzUpiVyxDLdWRd7RHaz7x/cLTfiL5S/QFDRdA== Authentication-Results: dkim=none (message not signed) header.d=none;dmarc=none action=none header.from=nvidia.com; Received: from DS0PR12MB7726.namprd12.prod.outlook.com (2603:10b6:8:130::6) by IA1PR12MB6305.namprd12.prod.outlook.com (2603:10b6:208:3e7::16) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.8182.17; Fri, 22 Nov 2024 01:42:04 +0000 Received: from DS0PR12MB7726.namprd12.prod.outlook.com ([fe80::953f:2f80:90c5:67fe]) by DS0PR12MB7726.namprd12.prod.outlook.com ([fe80::953f:2f80:90c5:67fe%4]) with mapi id 15.20.8182.016; Fri, 22 Nov 2024 01:42:04 +0000 From: Alistair Popple To: dan.j.williams@intel.com, linux-mm@kvack.org Cc: Alistair Popple , lina@asahilina.net, zhang.lyra@gmail.com, gerald.schaefer@linux.ibm.com, vishal.l.verma@intel.com, dave.jiang@intel.com, logang@deltatee.com, bhelgaas@google.com, jack@suse.cz, jgg@ziepe.ca, catalin.marinas@arm.com, will@kernel.org, mpe@ellerman.id.au, npiggin@gmail.com, dave.hansen@linux.intel.com, ira.weiny@intel.com, willy@infradead.org, djwong@kernel.org, tytso@mit.edu, linmiaohe@huawei.com, david@redhat.com, peterx@redhat.com, linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org, linux-arm-kernel@lists.infradead.org, linuxppc-dev@lists.ozlabs.org, nvdimm@lists.linux.dev, linux-cxl@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-ext4@vger.kernel.org, linux-xfs@vger.kernel.org, jhubbard@nvidia.com, hch@lst.de, david@fromorbit.com Subject: [PATCH v3 12/25] mm/memory: Enhance insert_page_into_pte_locked() to create writable mappings Date: Fri, 22 Nov 2024 12:40:33 +1100 Message-ID: <995f2e7a6b0ffd9ab07bda41511e4505bfb3247d.1732239628.git-series.apopple@nvidia.com> X-Mailer: git-send-email 2.45.2 In-Reply-To: References: X-ClientProxiedBy: SYBPR01CA0196.ausprd01.prod.outlook.com (2603:10c6:10:16::16) To DS0PR12MB7726.namprd12.prod.outlook.com (2603:10b6:8:130::6) Precedence: bulk X-Mailing-List: linux-xfs@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-MS-PublicTrafficType: Email X-MS-TrafficTypeDiagnostic: DS0PR12MB7726:EE_|IA1PR12MB6305:EE_ X-MS-Office365-Filtering-Correlation-Id: 9f54407d-7881-41c3-66cd-08dd0a96de3c X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam: BCL:0;ARA:13230040|1800799024|366016|7416014|376014|7053199007; X-Microsoft-Antispam-Message-Info: q02AQzHYoR7X3WwDsntD3t7YujfA6B/+liy2x+JWQst0jYSb9GxrSiiaaK2Au2Eeky+4hhMluox+u1lkT7vnWvxgTK8ux9q1dCVf0Efh4IDFbdVc4jJvcU5FrrL4BCmNv/4MlBsKmxRD4M0QtYbZF0fNeN2aG4e+xGqEo5iXvPohaeCG+ww717eI7zTqLXfCFD2EbcGv0TSeryhzBPSn13Ftwr4pTv/c0Oz73wP7FJM9OMU3DWhKaWRlI/CtokmWOLK56L1HKmLFsacd+2fwtrnmPZRpullkHcMXGObEXj/hEL2QlOT+26DbuesKblwTD8oHAMPJDCyaJN4xu90xLI8INhUOMpBdpoJWTLUEMRmQLfIg5JVyhbuoFUXTym8y0y4zNMjfwG1DXsOtEWCYMYKAsNcbSazHA15eV3DW9LnUUlNK3fsTob5Wl7YJ9uCqJxOw17evdc69E53xRx8e2dGENZ/A88BmR+Z/y9O6koNYu9Q0X9wxxsT+W90Ge/kZBDDA2ZW1yA9ug4JT72IDxsxEZ2FkY/a7I2Cvp2mfrerwtIdxgFLpXg0yP1MGyd2JnVBD9D2rY0f2t1fFy4e4wP+9vx2tB7/JP+VaqEmirrzfIqymoQ8jEKwkxy7OlzvCCHNVzDjtMh5R3hCl1Q/lCnCE5pkN4meM1PdG7siTN6w6mi87OOkYj/Ny5lGRTj0RaWGdoWSnSYg4jXdk5w5Ne2EM7daY2L3eje4xKIwaeODOqXrLF47HCTCKDXBjCHou+AwcLUtby/PUS9xdo+6JxhP4IpdyqFGqQ+hP3BvspOevO0RyXgoSFVAgnspCn8J2lSBhiCS8FLHf6HdppPRW9CohpjM88qL3zP7GGO5tAnGTCdJtxHP8cg+mbe/U5KGXvXgyKspohihSjhU7TuQJKzjYOcljdebws1TBYwHzWCDeNA9B+86s2Id/UrO67llBUnvDhLyU+RqipX30taxJlL3Cdfhw0RlocfEout5xsoTLTEbwJFyXC0j4AvDoabLRmxVXIgJgC16LdQQ3NjhXia5jTq0dHs6Xa4ISp/NzJMYm4zF4DyokbuwglYctxKn79HzFV6OXvgu7v9hsU7QlNe2eUNJVlwYaq02HZkaa+KeYA/YZCKk8/weTEGQKILJ0ROxRGgTCyv1zg6uazHPY8+tlHsDjN1T3m5MWGiuhGIyQi8rsY4EjoGnPXLJPZTsY/AA5r0X31ZtWK2HaFkpEq0auYuhpG61PG0vX1gutn868abY7/SciXjNbtfuGVTvfbHj5p4tgnVxFfEqMGU0sIjlxq36jz31bcDq8vhIO8Og= X-Forefront-Antispam-Report: CIP:255.255.255.255;CTRY:;LANG:en;SCL:1;SRV:;IPV:NLI;SFV:NSPM;H:DS0PR12MB7726.namprd12.prod.outlook.com;PTR:;CAT:NONE;SFS:(13230040)(1800799024)(366016)(7416014)(376014)(7053199007);DIR:OUT;SFP:1101; X-MS-Exchange-AntiSpam-MessageData-ChunkCount: 1 X-MS-Exchange-AntiSpam-MessageData-0: TESPXXhbIdEw9UInSrAciKSMNGcHShHcngY16ays4dGzmVARdHI5PV3LBlZHgc0Ml4iQlNrpp6sjf3lkcYJmIlcWz5fcdtbp6KKSz5rbbioW+KaubAgTwE63ozG+XVGv6aS11ATPUl6s+WP3D+U3X8/2QK7AczL7+ksOf5AAo0/KxWcgieHRAM2lt+xHYJ5hpg6IToOEdSBcDZwAkNstE8zBBy/Nx0RdugyVj3GvdI/yzRZai0vgM+XZokKkqB8yen76y49cNRW5Vi7+U7xtvwd1R27H4NLXF+sSiXJjTT5NXdJtMs79ZdnJywtTuh3aHtJh3gWWTn7tsjX1ecAO60blBJGjJtiB2NSr6HlIpeq2H7jA5PFvJgwOc0WvAEwYqmzb6oqQcAHLQ5czducqwxeK7KBF+Nchc4ICrxQDhkY5Wv7FHPl5c7eBwoWI7JkgRVFRLO6QsPRQ4nxMbmoPPI9XZPBIE/u4sNFcJV1kmZSO9zH7Awu9IlvYQJDZqO2ixvubnyTU6bWtBNmm0qNJssmvOv5jkWOPLL4xO5RXB6RG6AlFq/EcJY8uZq2NNE1chRTDJb1bODUlriTtT1OzQ3Srcbc8K2F0eT0MMwvt+CBboNTKiLBnaYJ0s5aBbpWsojfOhFs+7bTquveViZ7n4tEiQ/o5Nl/pZCK6fSKW7VU7WWSLkP4oIIngti+D5PNbylSAALeczF/7tvSs/54kjH3CFFtpm/GEWGTXrSUZ4ULBAU6eBN0wTHPf2rcPkm/x0B6whwlM0/UqhEM3zOakur4+5hqisshF0gZco5b+Mvp1Hj29XzbYRNeHbYSjqyqaMKKXxrjahpfWoEufPJ+IdeTofmF9cGR+SrA8ErkF1+v1/6xPYVjA9gulMIyAfGJQUoBfk8ed7S2yUAMWgRiuftP+8xfmXLLZAlyqRCM6dzYW6y3Uhb+MoXk6QQgVOSj9e2Erp3p4kNqvqOoeGpWsSNwtupG8XSqTscn+oV0QwvXAFRmbgcz+0aB9InYEXgYdRYMaODdxxgo9KoO5MEKb+K1F8FVqe/jzJCkowXvFnXbXRhIP8belhJCP8qzaNKvmnIqcsHmf93XAeq2KDgywsLAjKh3EJINWyC4zZ90s4kaGwvJagVHkVog0u0+dq+lBiER/IWCJ7+O/eP51ndyv6oZ/wT3im+D52AZUC3PWsxFqWkxjNL2/N3vo+EITCjnaIkrMc90zar4EIsd4qDx8TpreBjq/DWYQIW01ofMt8hieK0Xrpb+dO9Qruzh/kxDZbJqNWXkle4CxY50qb0JtCDbXHXFnuCBr+pKrXnid6mhx9waiDu4cPksh9U6XOk+sMgfLruLNLJrDNOMAFKArU6F/m11Bmq8xT6A1eQFz9dhUKKI2qnM4qI2LC3xL8NoIL8wnwD7ZJYk+Q/a7A+Q625Kbsonl8pXVgA/gG0wVC39RntxkgklKzqGKNz493CVA6kTZAtIfb+TB9b2QFqjLOtv0iaKnamrCKQ63shGHnXtGGsrxNnvR3NAKu26vHT/XFg8PZqMDrFKslHV3198b3v/gv5NQB/NJ6f/eFFtEiS2z+fiI4xC7ks7BJCMuYD4b X-OriginatorOrg: Nvidia.com X-MS-Exchange-CrossTenant-Network-Message-Id: 9f54407d-7881-41c3-66cd-08dd0a96de3c X-MS-Exchange-CrossTenant-AuthSource: DS0PR12MB7726.namprd12.prod.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Internal X-MS-Exchange-CrossTenant-OriginalArrivalTime: 22 Nov 2024 01:42:04.7904 (UTC) X-MS-Exchange-CrossTenant-FromEntityHeader: Hosted X-MS-Exchange-CrossTenant-Id: 43083d15-7273-40c1-b7db-39efd9ccc17a X-MS-Exchange-CrossTenant-MailboxType: HOSTED X-MS-Exchange-CrossTenant-UserPrincipalName: K0aVgBabLnIqjLN+U0qc4h1iDvzT0NkPMPxhFZ/B2Z66v0/kHwBlyqmTFe36Vlw9LMOlS57Xfs0L8EqVtaikHg== X-MS-Exchange-Transport-CrossTenantHeadersStamped: IA1PR12MB6305 In preparation for using insert_page() for DAX, enhance insert_page_into_pte_locked() to handle establishing writable mappings. Recall that DAX returns VM_FAULT_NOPAGE after installing a PTE which bypasses the typical set_pte_range() in finish_fault. Signed-off-by: Alistair Popple Suggested-by: Dan Williams --- Changes since v2: - New patch split out from "mm/memory: Add dax_insert_pfn" --- mm/memory.c | 45 +++++++++++++++++++++++++++++++++++++-------- 1 file changed, 37 insertions(+), 8 deletions(-) diff --git a/mm/memory.c b/mm/memory.c index 24a34a4..323662c 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -2042,19 +2042,47 @@ static int validate_page_before_insert(struct vm_area_struct *vma, } static int insert_page_into_pte_locked(struct vm_area_struct *vma, pte_t *pte, - unsigned long addr, struct page *page, pgprot_t prot) + unsigned long addr, struct page *page, + pgprot_t prot, bool mkwrite) { struct folio *folio = page_folio(page); + pte_t entry = ptep_get(pte); pte_t pteval; - if (!pte_none(ptep_get(pte))) - return -EBUSY; + if (!pte_none(entry)) { + if (!mkwrite) + return -EBUSY; + + /* + * For read faults on private mappings the PFN passed in may not + * match the PFN we have mapped if the mapped PFN is a writeable + * COW page. In the mkwrite case we are creating a writable PTE + * for a shared mapping and we expect the PFNs to match. If they + * don't match, we are likely racing with block allocation and + * mapping invalidation so just skip the update. + */ + if (pte_pfn(entry) != page_to_pfn(page)) { + WARN_ON_ONCE(!is_zero_pfn(pte_pfn(entry))); + return -EFAULT; + } + entry = maybe_mkwrite(entry, vma); + entry = pte_mkyoung(entry); + if (ptep_set_access_flags(vma, addr, pte, entry, 1)) + update_mmu_cache(vma, addr, pte); + return 0; + } + /* Ok, finally just insert the thing.. */ pteval = mk_pte(page, prot); if (unlikely(is_zero_folio(folio))) { pteval = pte_mkspecial(pteval); } else { folio_get(folio); + entry = mk_pte(page, prot); + if (mkwrite) { + entry = pte_mkyoung(entry); + entry = maybe_mkwrite(pte_mkdirty(entry), vma); + } inc_mm_counter(vma->vm_mm, mm_counter_file(folio)); folio_add_file_rmap_pte(folio, page, vma); } @@ -2063,7 +2091,7 @@ static int insert_page_into_pte_locked(struct vm_area_struct *vma, pte_t *pte, } static int insert_page(struct vm_area_struct *vma, unsigned long addr, - struct page *page, pgprot_t prot) + struct page *page, pgprot_t prot, bool mkwrite) { int retval; pte_t *pte; @@ -2076,7 +2104,8 @@ static int insert_page(struct vm_area_struct *vma, unsigned long addr, pte = get_locked_pte(vma->vm_mm, addr, &ptl); if (!pte) goto out; - retval = insert_page_into_pte_locked(vma, pte, addr, page, prot); + retval = insert_page_into_pte_locked(vma, pte, addr, page, prot, + mkwrite); pte_unmap_unlock(pte, ptl); out: return retval; @@ -2090,7 +2119,7 @@ static int insert_page_in_batch_locked(struct vm_area_struct *vma, pte_t *pte, err = validate_page_before_insert(vma, page); if (err) return err; - return insert_page_into_pte_locked(vma, pte, addr, page, prot); + return insert_page_into_pte_locked(vma, pte, addr, page, prot, false); } /* insert_pages() amortizes the cost of spinlock operations @@ -2226,7 +2255,7 @@ int vm_insert_page(struct vm_area_struct *vma, unsigned long addr, BUG_ON(vma->vm_flags & VM_PFNMAP); vm_flags_set(vma, VM_MIXEDMAP); } - return insert_page(vma, addr, page, vma->vm_page_prot); + return insert_page(vma, addr, page, vma->vm_page_prot, false); } EXPORT_SYMBOL(vm_insert_page); @@ -2506,7 +2535,7 @@ static vm_fault_t __vm_insert_mixed(struct vm_area_struct *vma, * result in pfn_t_has_page() == false. */ page = pfn_to_page(pfn_t_to_pfn(pfn)); - err = insert_page(vma, addr, page, pgprot); + err = insert_page(vma, addr, page, pgprot, mkwrite); } else { return insert_pfn(vma, addr, pfn, pgprot, mkwrite); } From patchwork Fri Nov 22 01:40:34 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alistair Popple X-Patchwork-Id: 13882594 Received: from NAM12-DM6-obe.outbound.protection.outlook.com (mail-dm6nam12on2085.outbound.protection.outlook.com [40.107.243.85]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 8206418C323; Fri, 22 Nov 2024 01:42:12 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=fail smtp.client-ip=40.107.243.85 ARC-Seal: i=2; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1732239735; cv=fail; b=PZTVlYSjz0PEYEmShLW/gJRVsRGRVKofR2Y9p05cwz3aRSTNlki9E3B/eRjl4AEiCxFwpEsuK1pvkAfGIzH9kOj0XHzCDDY7P8JsBD9WqQcEK6piLnq4DEw1p4kxCvCVki8NFZti4OoffAxS7gwL8yIxyF5t2Afar1oozgTzuzM= ARC-Message-Signature: i=2; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1732239735; c=relaxed/simple; bh=lSkXN8Z4HuKhsG5WdO44s/H/V+lBNpFvuagXfPuo0zQ=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: Content-Type:MIME-Version; b=B1Q4or0uL5jyp30XQixT4O6Xx6WOda5iz672Nw8deXK2NKq/EmEmgJAJLGHvN40xxZd8XSkkYi+EFNOnad7vDGuxoTAKs229E2HH3PaYZt6Qmv93qGNpyxXYRJPGIyu/gsbTiOoDPBWlq3aJs/i4eaRcu2bQ2JXEGQnTRMeDhBA= ARC-Authentication-Results: i=2; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=nvidia.com; spf=fail smtp.mailfrom=nvidia.com; dkim=pass (2048-bit key) header.d=Nvidia.com header.i=@Nvidia.com header.b=sivYZ0cL; arc=fail smtp.client-ip=40.107.243.85 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=nvidia.com Authentication-Results: smtp.subspace.kernel.org; spf=fail smtp.mailfrom=nvidia.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=Nvidia.com header.i=@Nvidia.com header.b="sivYZ0cL" ARC-Seal: i=1; a=rsa-sha256; s=arcselector10001; d=microsoft.com; cv=none; b=mPPBevnEoNffsLybWW7CKruZ9NfXq99PLCYnIjt4hIrq0R67QDyBNwPglwYxNHvtFFna1EVA/VpoLRR+XPE9Uj6vMWvRs30t4h91TMkebhf2pL+wPOxSQcSvqZkEs59y+hNu2QSFOeU5gwg1YoKEO8ZR/h1P8NcpgzPEQ/SVd0Jmze09AxTtjWtoHRcXqkj39oxCu4I4b9/7cS/UYiv3dOBbb7emNKwv8J4MhYb6t910pha7zhy1qUfkt/uKI4npNgP3DDdqvp+cEgI1KA1F0kyD3E5OuFjLXBs4IuGFMN1JBLalaNjEK2Ez9lwhNrpECCgtU83o+egrZuabldHCcw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector10001; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=slSOif5ow0EtzQN1HVQyBi+KJnic5NqcIaNTXj2ctuU=; b=Uf48QFkqLnWEsrSaI4WskHjK4XQ1cs4P5CH8O8f4uD0YaEw1GWlbHmx6d8XyQVGr56IyFVg8D0YPNE2hZyQCwuMEat9AXfMCbClqObnYaJp/BBsPJrg9sRWR+Wna06lgG9Mm4hKHBMBaT2zpVARQ8uEfrXvrfy54z/kiLutnF6jR0mE1ZsOTAIEWfBKnj6tWvINSWqxOzoIee9S58yLElRdkAQsVBoR46vifBEXd6z46oKZnnnB0+OLKskp0Fk1vxkJrU2qfy0sbodiGhqavXrBkAMDt8+fAQ4oruM5fk7unL4fSe0ok1CDS4N1CLjd5tIA3tSywyO//4CevQCcYnA== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=nvidia.com; dmarc=pass action=none header.from=nvidia.com; dkim=pass header.d=nvidia.com; arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=Nvidia.com; s=selector2; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=slSOif5ow0EtzQN1HVQyBi+KJnic5NqcIaNTXj2ctuU=; b=sivYZ0cLEW0VJ65t8v22+a0VaN4zoHpkdw2BzxXr/sBqZsnsp8OPrwsa0//GtnSd1kJ2bFp9Op0AvqXmQfFen7ExDofSMF6Vz8dyrxwUrFYzE7FLG6Rrp3zOSD6rTqIC45jN2zsi+3a/TBTQ8tgua6kBK8fcnEeVIfSOG9ac4j3tKQ2rez//za/QLStTsOdzqd6FFhTyjqt90MWap6ptACQiIKQVdFTt066Qsyl804DvfGU7Ogy/XqYVcW1tmQTPJI3XctC+Kh/Zg46O7+UpoTS8NGgbaQ54wFAH70aN4ryXxGMlFxlpnbMjAsD2uMxKGWMOqXM0/O7BjUDu4q1jDQ== Authentication-Results: dkim=none (message not signed) header.d=none;dmarc=none action=none header.from=nvidia.com; Received: from DS0PR12MB7726.namprd12.prod.outlook.com (2603:10b6:8:130::6) by IA1PR12MB6305.namprd12.prod.outlook.com (2603:10b6:208:3e7::16) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.8182.17; Fri, 22 Nov 2024 01:42:09 +0000 Received: from DS0PR12MB7726.namprd12.prod.outlook.com ([fe80::953f:2f80:90c5:67fe]) by DS0PR12MB7726.namprd12.prod.outlook.com ([fe80::953f:2f80:90c5:67fe%4]) with mapi id 15.20.8182.016; Fri, 22 Nov 2024 01:42:09 +0000 From: Alistair Popple To: dan.j.williams@intel.com, linux-mm@kvack.org Cc: Alistair Popple , lina@asahilina.net, zhang.lyra@gmail.com, gerald.schaefer@linux.ibm.com, vishal.l.verma@intel.com, dave.jiang@intel.com, logang@deltatee.com, bhelgaas@google.com, jack@suse.cz, jgg@ziepe.ca, catalin.marinas@arm.com, will@kernel.org, mpe@ellerman.id.au, npiggin@gmail.com, dave.hansen@linux.intel.com, ira.weiny@intel.com, willy@infradead.org, djwong@kernel.org, tytso@mit.edu, linmiaohe@huawei.com, david@redhat.com, peterx@redhat.com, linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org, linux-arm-kernel@lists.infradead.org, linuxppc-dev@lists.ozlabs.org, nvdimm@lists.linux.dev, linux-cxl@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-ext4@vger.kernel.org, linux-xfs@vger.kernel.org, jhubbard@nvidia.com, hch@lst.de, david@fromorbit.com Subject: [PATCH v3 13/25] mm/memory: Add vmf_insert_page_mkwrite() Date: Fri, 22 Nov 2024 12:40:34 +1100 Message-ID: X-Mailer: git-send-email 2.45.2 In-Reply-To: References: X-ClientProxiedBy: SYCPR01CA0019.ausprd01.prod.outlook.com (2603:10c6:10:31::31) To DS0PR12MB7726.namprd12.prod.outlook.com (2603:10b6:8:130::6) Precedence: bulk X-Mailing-List: linux-xfs@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-MS-PublicTrafficType: Email X-MS-TrafficTypeDiagnostic: DS0PR12MB7726:EE_|IA1PR12MB6305:EE_ X-MS-Office365-Filtering-Correlation-Id: ea0766a4-26e7-49e1-37da-08dd0a96e0fc X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam: BCL:0;ARA:13230040|1800799024|366016|7416014|376014; X-Microsoft-Antispam-Message-Info: UyDP3R7K2+Mjghd/Icg1HPhukIkuHAWp9VAWjuv7B9PHDzi1XptsxaHg00TKXWjc96b4dyTHRLPOBMSXdl2BHyJfa9qPSYuy+D4reAI63y3KDZXNNEauPbSyCMod1qpPnY44Ehe28RPTpmQOxivJfJmY05CRrE3jnKzOCq33saOQ1Ae2VIc3JjxdQMspwkotoHgLz8Dyo/UmJK019ETYGoLzgszSUY6mCvW+LoHNCe1FzpnJUXg7SBH68NEB3U1vJvqyXgtJ0EIXuWqwI9xMDN7UyGGhoO+CwO3nn2PEKfVjcgQ6c4tivMmzEc3dIXsMzdJeBjpB2v9lW6w1jiLUvfL/N/U3Gqo0T06aiEfMLEattTHi8a8vVTkUUwJQzn7FMguLzlKCrec1AESRgA8OKGI8PIsp099AzMC4fjZI7LgBZotWHe6vn2UAbFaOOy792QorK6wI7NUO+4h1bo4JEmYorQLPr/NruE4nFcRoDmwEox8mRm2/A1IVpJv4qURRmkBWWU/gjFuTkqu5po0pJ7gsPh7kHHa6gKru2Z2nyMtI6Bj7zbG6ZEdYmQ+JsuIpPUEQX7vlUvTG8dcMEq5KWW9SV6CHxzlfLw0kNBXGc2s1rUpH90NxHOtP/kwlyQKnJvrF0V8ocupEoSgHXgFlr9eoyzrdQJUiumoOltiP8mAfWJbxVnEPscSxVHlIPDwkBS3Cy54KXBId8BprkfisxLxKpCWqlSub45lz4EwoqSWCiWd8rNl9vu3KjDRFJraSXHAkjeuuuY6WfT5vB1L9QSYWeithBs82+snYBwK7sKDBzROuRrqMZeFMsROLHUwMf0qUHMn1S54KQ+/pslrQprfsc+notiKTzEnJC1Ke+CP57BzKNzXoH669rkhaHRkSYBGA1qyF3ftue/tmpnco4x14HPQr20ipg0WNzvnVeGu9ZseRBOqf2Z+z8jt/v3ZfFMIlsyIW4UOPSVZG42i8R5PLDQD3aYwloQZX6JuFR9XSznOj8EGkz6YwSVqgQlLqq2jeH5LnKMTA6b5wQIxUySTQYONLw5Wej3g3HvVIaFA4YqvpLFKV5Gmzl9SGKmIXSbwDQmRqFq51wJW0i/Ik3xUCVYhqmS6+2LZSKiiuRM5CGtQfe4EOo/TUcWNH7VP9HCIVpGetIT4WNVQOFiyZflEY6HwQ7wEKtfFSsk93rzgq6ue4XrNUuGDY7g8wsB0UITrg0EAs4dj4V1QrN+6bkDB6EuAjsbL7jLKd76bJJgflG3eGqnpuF8cVHmjC2Q4e6e4yX9fnEz2aRlSqItAX2awyNGXH4JDB/qwC1cKFWNqXxUzF+ZbXgG0yefJootXI X-Forefront-Antispam-Report: CIP:255.255.255.255;CTRY:;LANG:en;SCL:1;SRV:;IPV:NLI;SFV:NSPM;H:DS0PR12MB7726.namprd12.prod.outlook.com;PTR:;CAT:NONE;SFS:(13230040)(1800799024)(366016)(7416014)(376014);DIR:OUT;SFP:1101; X-MS-Exchange-AntiSpam-MessageData-ChunkCount: 1 X-MS-Exchange-AntiSpam-MessageData-0: tZi75HfXhdoh8UgqIGfYtGdj5yPVpxIPpOdrhf4mDVEgMou+3bXHlKL2vrTfnbnmBtIZsQI4QYzZ0gGwrL/a7YWjL41vnxqtXviWtq5gJ8IYB/Ek/WkN9tPtbAEK5ICChEResOPtE4R6JYz5KEhllmHvGgcKapLpA+4fPLqkOFP4P6Sn29KldD+m/kz3RwxnOsINQCCUcT4a07tL1JhEPlZ/tb2vWFTmHOlwODvMkGBwxs0hboT8YaCdawRdFCw2pQkQz+3QMyVB36H9Cv2OyrBKre3ineEDio4ZE/YbkpvdAI0MMyGgM6kBEsXF/4WzAZiM02DGb+1IBzj4gXh6V2plc9DvBAwGQHSAOBM6LJs1gjGpHGw0yDhkzUN5UsFYreRWPWG7fPt62e/gEz2I2Ke5UivcwshVuT7tLRv6ySYadx4MpmlLgvMS/I7MAaFOEizP/Z8MTAzElF1uyjQN8Dta21IQpzxWxPNMEJL2R7RhHYURXaVcz/ALdYG4kkP3pYUliA6DV+hdrTVs5aoWBK1mDZdgCYbmdFHzZH7dHZzg3VIcu4QmS8Tez9eHJY9gBR9OP2JWdLD6sGBMzczeNruPRsBRljrDlcVEETMl5ys4s6kYON/sn9yJRY8rVwWKHIsjPnoJRhd6+mcSsebaCqJUgPbRbuQxdUIhesea0CXAwXPKx5FgrtU0TeGEAKFZAtAfXvcykjuFw35WYC55lpr3Y8qt5fPOw3cTEm6yJpOZdLbRsXu44+70IRlFBmj8s+6wUkgucn/0IrCqdqo+bP0KwKNd3YH7TQtphfUaAfhYsVYGMRRDnmPZI8LsGjOZBEIGW/cv2wq2xY1eT180ZWMtPMpVFqEmjJDcXOWPqfxyMIawaqz8lfL4mClVJsPFqNmAPGyrLPMB4S+l0XWJq7W77V3zM//qaNyDDdP+TR50tqCkjGatdW9NqA5YEmOyT9GPQ5oQHpjN+U7BhYNoZv7iTxtwzvIC/bchWQbh0hYWQ32S4OUbWL2BDSAkkrhrvCig6blQbvkqQMgFzrHHaHRkSgd0W18YWrukkmg3YrGnhd4xQ8pXKDWTAc64hdQf0AygM/bsLMID313Okn7u8r9YeVHPAXxWPnN0SAjh7nHJSUOIlyQVvm725xMsZ6JA/Ge8ty2cYS/M5yf8aeAQptURVKxrhWQ/ZZUMMO81vQOzJhjyhPun/CApFf7iPYj7Zk3SaSMQnGNAh3VyRMdq/OdW6lSu1juhXs6d4fdxekmEeKrh4H0qSkOAgYmqYa0eOLTREUUti0SCmFQZQrM74aUMAo24RY3gp/qAY3a5drecasfRYApaR2yUG7KeHTUs8WJSEFTbdygYMIzjgeYuH5/2a0rt2vkgy6NCwTZ9ChsQT98D95yKrUY9Ak0qbwGLCLH0d3kxUkUfcgzh1ttlycVFH3DQIb1a3SAaGdjw4LJVv72PBXz6Q0YceUtEGzR+qdekIr5xWoZG8dzA8ridu58puqelYuy+e/q/4o+iyXPEunxIS/4PAGd3iTmFL55Qcp6cdv5uEmyb2rTb6vuAFcbk5NMqo/UZikC4u7j1Ro6Ac8tLcr4FC5mwNrkr+Xr4 X-OriginatorOrg: Nvidia.com X-MS-Exchange-CrossTenant-Network-Message-Id: ea0766a4-26e7-49e1-37da-08dd0a96e0fc X-MS-Exchange-CrossTenant-AuthSource: DS0PR12MB7726.namprd12.prod.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Internal X-MS-Exchange-CrossTenant-OriginalArrivalTime: 22 Nov 2024 01:42:09.5494 (UTC) X-MS-Exchange-CrossTenant-FromEntityHeader: Hosted X-MS-Exchange-CrossTenant-Id: 43083d15-7273-40c1-b7db-39efd9ccc17a X-MS-Exchange-CrossTenant-MailboxType: HOSTED X-MS-Exchange-CrossTenant-UserPrincipalName: rGy1Oy/tOYPUgi9DpMSfSxDBfp8tIfWwlC6U0U/kjBg+24c6UFZPLZTc98WUkKI//FyzAFHsyYsRTVfTFY7khg== X-MS-Exchange-Transport-CrossTenantHeadersStamped: IA1PR12MB6305 Currently to map a DAX page the DAX driver calls vmf_insert_pfn. This creates a special devmap PTE entry for the pfn but does not take a reference on the underlying struct page for the mapping. This is because DAX page refcounts are treated specially, as indicated by the presence of a devmap entry. To allow DAX page refcounts to be managed the same as normal page refcounts introduce vmf_insert_page_mkwrite(). This will take a reference on the underlying page much the same as vmf_insert_page, except it also permits upgrading an existing mapping to be writable if requested/possible. Signed-off-by: Alistair Popple --- Updates from v2: - Rename function to make not DAX specific - Split the insert_page_into_pte_locked() change into a separate patch. Updates from v1: - Re-arrange code in insert_page_into_pte_locked() based on comments from Jan Kara. - Call mkdrity/mkyoung for the mkwrite case, also suggested by Jan. --- include/linux/mm.h | 2 ++ mm/memory.c | 36 ++++++++++++++++++++++++++++++++++++ 2 files changed, 38 insertions(+) diff --git a/include/linux/mm.h b/include/linux/mm.h index 61fff5d..22c651b 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -3531,6 +3531,8 @@ int vm_map_pages(struct vm_area_struct *vma, struct page **pages, unsigned long num); int vm_map_pages_zero(struct vm_area_struct *vma, struct page **pages, unsigned long num); +vm_fault_t vmf_insert_page_mkwrite(struct vm_fault *vmf, struct page *page, + bool write); vm_fault_t vmf_insert_pfn(struct vm_area_struct *vma, unsigned long addr, unsigned long pfn); vm_fault_t vmf_insert_pfn_prot(struct vm_area_struct *vma, unsigned long addr, diff --git a/mm/memory.c b/mm/memory.c index 323662c..5e04310 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -2548,6 +2548,42 @@ static vm_fault_t __vm_insert_mixed(struct vm_area_struct *vma, return VM_FAULT_NOPAGE; } +vm_fault_t vmf_insert_page_mkwrite(struct vm_fault *vmf, struct page *page, + bool write) +{ + struct vm_area_struct *vma = vmf->vma; + pgprot_t pgprot = vma->vm_page_prot; + unsigned long pfn = page_to_pfn(page); + unsigned long addr = vmf->address; + int err; + + if (addr < vma->vm_start || addr >= vma->vm_end) + return VM_FAULT_SIGBUS; + + track_pfn_insert(vma, &pgprot, pfn_to_pfn_t(pfn)); + + if (!pfn_modify_allowed(pfn, pgprot)) + return VM_FAULT_SIGBUS; + + /* + * We refcount the page normally so make sure pfn_valid is true. + */ + if (!pfn_valid(pfn)) + return VM_FAULT_SIGBUS; + + if (WARN_ON(is_zero_pfn(pfn) && write)) + return VM_FAULT_SIGBUS; + + err = insert_page(vma, addr, page, pgprot, write); + if (err == -ENOMEM) + return VM_FAULT_OOM; + if (err < 0 && err != -EBUSY) + return VM_FAULT_SIGBUS; + + return VM_FAULT_NOPAGE; +} +EXPORT_SYMBOL_GPL(vmf_insert_page_mkwrite); + vm_fault_t vmf_insert_mixed(struct vm_area_struct *vma, unsigned long addr, pfn_t pfn) { From patchwork Fri Nov 22 01:40:35 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alistair Popple X-Patchwork-Id: 13882595 Received: from NAM11-DM6-obe.outbound.protection.outlook.com (mail-dm6nam11on2058.outbound.protection.outlook.com [40.107.223.58]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 03DFD1C7B8D; Fri, 22 Nov 2024 01:42:20 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=fail smtp.client-ip=40.107.223.58 ARC-Seal: i=2; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1732239743; cv=fail; b=Lfvnz5yRJj6X5uy8DVfovEjryUZn7EqH5Olyi3pZJCn6j+yeoOjRQGBueTZUkChjf5kBEEHhTB74fHj0lF4s9wcObgQ9RW/oNDWnXDXfM5rpps9MmjtiYDb8Bwer16evDvc5mU9xTQKmcH8OLH82Wlhew/8dGbhrUUS7OsLvQx8= ARC-Message-Signature: i=2; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1732239743; c=relaxed/simple; bh=iYbq+15QowbWNA0FAmnS6DWJSly6bWTzJuzgjL/9rSU=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: Content-Type:MIME-Version; b=V6y/HV/SyPKnQXQHAAy/+h6ma/eHOJkHuwgBbWyS3V4PY2dJMJ1tT93nxA9srl/l9U6btUO5bOqbYAIn/5SXl0KndfxwChs1oHMtgF1rQKRo16bJjHHTEuxunAaF/sN5iZmnCWQEZtkIzro7wTkcK04igWIzcPVDs8kCoy0a2Tc= ARC-Authentication-Results: i=2; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=nvidia.com; spf=fail smtp.mailfrom=nvidia.com; dkim=pass (2048-bit key) header.d=Nvidia.com header.i=@Nvidia.com header.b=RIk91l6p; arc=fail smtp.client-ip=40.107.223.58 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=nvidia.com Authentication-Results: smtp.subspace.kernel.org; spf=fail smtp.mailfrom=nvidia.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=Nvidia.com header.i=@Nvidia.com header.b="RIk91l6p" ARC-Seal: i=1; a=rsa-sha256; s=arcselector10001; d=microsoft.com; cv=none; b=NEcOHywHW1iyH/flLPPGlboZd1C4EN3p3UAujN20S/iuP9z9V3WPQQJhfjhuqpVLAotXOl7j79hmLmT9nZDwZVKmBFMbE+oAboRhKaq9AwKo6l1mYtWi9+3bAc/7EQyA2UepVVfr7oIJJ43GqxQk92aw6N7DLXTcudgHG+4j7xnVQdbE5/u7QJyfCG9Xy83Hl1CC1C5ePRyA8rS0ztJKrBl0DMcFOe39GoElSo17z10q8B73BM4/eJT5pj2D7leMirNpFCkiuPIBVQl0OKFxOICK/V9pJsgtTqxzOhLEB4w2PVZ26RnIj0FZcdRTxJuwaJaxet8NWtSSxE7Baw3M4w== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector10001; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=mNNDrZGYjuWsY6LGZ+vu93m4DKb6QIsZnqMbyiKmB6k=; b=T6D2YBBz3F7/It+2uny8+i84BJecYT/8+MANPQ+masLy1YKBc2RehuZiIBfoJeVTL9n9M1vA9fkusdUw5EbIbGCGg0jmV/q+RH8+/rXYpAKsMpdT7YdizyL5Gby19s+Z2pc9dYVp8eJ+BeKTipmK0OMqweStcBs2PNmMkzCKbJAq0y6mkxyE6MMf0Nb21NMT5rfsK6z4q+Ju2iq69xMGbN6V5g22VS3KuicPDa5aQ+ESEHnxx4mu3a3fBI1YVlUHkqv9EGjAAj4MOmI1LZq2tNT/lXYyhQcnNl/xnzqXZSAvxOx85/2qVgFP6uqfbpzKH/kZSk5NoldlUXZTyCBepQ== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=nvidia.com; dmarc=pass action=none header.from=nvidia.com; dkim=pass header.d=nvidia.com; arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=Nvidia.com; s=selector2; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=mNNDrZGYjuWsY6LGZ+vu93m4DKb6QIsZnqMbyiKmB6k=; b=RIk91l6placjmqMXs2PnUlPJuNM6gFvlgiLwz+9IzO/6AF2RX6B5h5FsHNQUHlOxtpRJ6smYkK1av1V6wI8aX1FTMJ+GKFr/gxLtN2OuMFYZibwvC6ZoJK3cqcJPcuuycIlayT1PdYWm/03JU0XomkuxRgutqOax5md2sWx+5n7ZkuaSFof0E1cWxai3uvzJDK7wvCi/eVytYLldI+LxoJp9DZ68bELzlF8IJISNbL8FzyS7hSLRxhkl75+rz921hsrGUeM0/+wqnpqhXuM2B42P+hc5XsyUaJjutaWE11r769ctik70p9sMT1Yqa6H4UwqeEwIxGpD/TJ69wNJayg== Authentication-Results: dkim=none (message not signed) header.d=none;dmarc=none action=none header.from=nvidia.com; Received: from DS0PR12MB7726.namprd12.prod.outlook.com (2603:10b6:8:130::6) by IA1PR12MB6305.namprd12.prod.outlook.com (2603:10b6:208:3e7::16) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.8182.17; Fri, 22 Nov 2024 01:42:17 +0000 Received: from DS0PR12MB7726.namprd12.prod.outlook.com ([fe80::953f:2f80:90c5:67fe]) by DS0PR12MB7726.namprd12.prod.outlook.com ([fe80::953f:2f80:90c5:67fe%4]) with mapi id 15.20.8182.016; Fri, 22 Nov 2024 01:42:17 +0000 From: Alistair Popple To: dan.j.williams@intel.com, linux-mm@kvack.org Cc: Alistair Popple , lina@asahilina.net, zhang.lyra@gmail.com, gerald.schaefer@linux.ibm.com, vishal.l.verma@intel.com, dave.jiang@intel.com, logang@deltatee.com, bhelgaas@google.com, jack@suse.cz, jgg@ziepe.ca, catalin.marinas@arm.com, will@kernel.org, mpe@ellerman.id.au, npiggin@gmail.com, dave.hansen@linux.intel.com, ira.weiny@intel.com, willy@infradead.org, djwong@kernel.org, tytso@mit.edu, linmiaohe@huawei.com, david@redhat.com, peterx@redhat.com, linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org, linux-arm-kernel@lists.infradead.org, linuxppc-dev@lists.ozlabs.org, nvdimm@lists.linux.dev, linux-cxl@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-ext4@vger.kernel.org, linux-xfs@vger.kernel.org, jhubbard@nvidia.com, hch@lst.de, david@fromorbit.com Subject: [PATCH v3 14/25] huge_memory: Allow mappings of PUD sized pages Date: Fri, 22 Nov 2024 12:40:35 +1100 Message-ID: X-Mailer: git-send-email 2.45.2 In-Reply-To: References: X-ClientProxiedBy: SY5P300CA0013.AUSP300.PROD.OUTLOOK.COM (2603:10c6:10:1fb::12) To CY8PR12MB7705.namprd12.prod.outlook.com (2603:10b6:930:84::9) Precedence: bulk X-Mailing-List: linux-xfs@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-MS-PublicTrafficType: Email X-MS-TrafficTypeDiagnostic: DS0PR12MB7726:EE_|IA1PR12MB6305:EE_ X-MS-Office365-Filtering-Correlation-Id: d187fabb-62ce-4285-9190-08dd0a96e5c7 X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam: BCL:0;ARA:13230040|1800799024|366016|7416014|376014; X-Microsoft-Antispam-Message-Info: NIxNCHcpauAtobSq2NOaU/Hl3jGS/wnfV3N24Hy8n3TrBWp8L1HWHIx6yn+gfBUcxF3/vC3Bm6QFtYjKae//CCyVCnKkonNkYtH6Ii4dLQfcuiWlibrx4wKDmPU6jB3wWLJA7zJhMqAxkEgXlbsHkT8+W8lQnMJP0f/K2iO8pqcPrDgLEO/NnkTfNMbUFHsXasgcFkZE715vlK5dhAFuPqkoegeFGcO5vQmJ63ySAU7/eclCqIVF3qMpUt2ZDB7MJBjC6shY5VOxtlXNgZJvRfkar5NMY4SNNDT7rn9/eIBxTeiAREQqBWnNdAxOcptg9vlUtQ/YAStv+dC9qYzPYv54qFFe2j18mZFnRIShBlwbfns4h6HBS7y/Z9q2a54TEnuKTKC2r7M542wLt9jJp4Z295KHmvN/ky4xFiHyYnusBFyozOu3zPkJsncZzedwYTfkH/ZiwKpu4b9ZE0SICg8KHaFJp9NvnQly82XNHbIwrSfS2EFN31+NrKzZDjuUCOREkld11u92/rA0zaaWEsCCMrVOT3bcq2SiRWPzqFs2gL2nJMVbuXJXkLeTgVq/fqACjHntUWfYidNT1vcDAV4nUi/MUx6WzNw0cH0zfBO98ha2Vt/RN1gkwP3UhzixBsh8CHSRUde6BDAzSiDqX01EcXhjbIW0M5y7qXkSpEVvE1T1YBohixSWEVct0wJObsStOP7shFuVGR5gnykuLUF5PYbAeCqLI41tUo6UpvHTK+fD3W5PUuojG3uThMqWS3kexVWYb7XvgzqT1QS55mN0d7WzUuEZ6sLJBYMVBJd3nBKlpecb6KfvEJLdPjm1viM0FoW3lKrRw7n6OalZNJab057Y9qTVp4BCAlwi9rSxFUBQJ5XcRrdo0aFTmDPQ4RTcYQr3PZ9Yw2TjfDOGa3C/hgT0c5izUQQhutgVj99vrZfmaW2Uye+6HboRSO9/QN+cj8aCTDh+IcJs4wKEShf7sci1Aorx4i39U3OyxhwrHX6nSqsF4GEDUx5sLmdfRqrTo/n/5UgMZyODHpRN21Zrehp34u5m+7o0dmlVOiCR7KY7DnIooQm+YLxhKO6dfHucBvNoVOvhaxLZ8S76pH954T1P7vwCqROuIsEcSaePOtDf/tU/V+jRqFuQRzoe7rFokWcUhfI2VTSesgkqej1CemvmljJQ/EGH6CMZSZ1YIv5MZ4VtQlz3fXaNg6zZ5wBu2/pxNTLG08Qi9hSUCm0cCcNaG6MLFhE8ybH9Ekuao45uor4Sy/ArS22AvLKekFOsGimHlnNdObk1SyzFecU4X7U5wMdRkajqtxdcQYCEbQcVz4BLECJXqFN+FuOB X-Forefront-Antispam-Report: CIP:255.255.255.255;CTRY:;LANG:en;SCL:1;SRV:;IPV:NLI;SFV:NSPM;H:DS0PR12MB7726.namprd12.prod.outlook.com;PTR:;CAT:NONE;SFS:(13230040)(1800799024)(366016)(7416014)(376014);DIR:OUT;SFP:1101; X-MS-Exchange-AntiSpam-MessageData-ChunkCount: 1 X-MS-Exchange-AntiSpam-MessageData-0: YEQPSyyrPtr2Aaoonp400ocTCOUkegG72KpRGQDiy7HH0iJqRhZf5Dqz5RS4QGoKvwWE7yoAKNuHwxEwqDxusJXwiyHzSzBjPGpjgblTeOj4ptzmgUJQp9keHnBUDoe6o5xPx5O40ECdoMrZ8ovC2vh8RZDfoHD1XYf5WwA94PmUefuiUyCKGVuhcE7O3DnmcqGOr+vo+cFaOFv7VCxOeYt9jcitXUAEutyuJFTPSXGXjHRgfUN9a3DvRs6wWFnfBuDZKrqlu3W5PanKl28YRXM0iLL/829nVlbUoOGw6f3fIfbU7uHaFEOTtxPNgtUrXXp8Et21nctWM/XXSPvOEJIVcAlWzeTFGTA/NMXZGB2F6xp6MKuoI5bRy5asNV7Ki+GFB6248ERlj/+QaldiK0NkNaCIN3afnnytJ4GK+kCwkxdPPgJtEfXKe5mzoVkWwBAkDjUXF5M9jRwn6fkIzEIws4fLAIN3biNM3RDP4pyUoFA1WeM6mpWBxDCxjDCQ1c8Ikg3JcaHGcdpjEQIL3tD+LAOO4IEEO4Unx1oEmB5jhBALvnPM6b3qYb4uVPwoyeX2xKRJrQffc7r4Kupt4+01tYTGvhWPB4C2dhCWqRW2p5X7f3cNRKwn0eFK90c4BuNDuEJOoYO0/yF2TOwojUtkVTDq/KN4Yd0K3SkQWpHy3P+1cr+RfIqxfOMAx4paO3sfWOu62vtAAFR9eCI1OJ+TrJP2VVBOE1FgiMLIfeW18dGnfVZi7ZWGiOgZS22MrCwulxs9Lh+tp42/Pmuovxcjmhm0JUSxTQL8bWsO8Nc6OidtZPiFhcW49ANrnabeGL2PndQM0A0NQq12DqsuKVOWQelfRpTJeGkCcfaYOKhK3eY05jY1IBji4zLruI8gtOeKrXX5E0p9tDlpbgxP6y6tgWWsoT7LAdhSWE2WqatmKuCGd/kXMPglB7jiCFskVcccyT0nRJGVL8B3vpPKkwFCtjOP95pPAjBWSlTk+jUqGannftPVWGDJk985BQY8l+JLAnNg8AORMkfc9Oak4zOgjnSJ+Ltv43O6aElk+Jlh2rySD+p/Ro7aFWj/cYFb9Glg4KSZjStTZGko41wGC8lTxdS23UY2AWvUbQBkM1wdp2oBC7jYV1hb4SMOitbxIRwPCNkFzLy1ryMprikYBdVJJ8AuYwaXQ2DRDjFHzb82c0yJ6eQsjUun/kCk6dNNebAlT/ECUDdYa90AzwQxkj69JbgmK4HUCaXE6Jsm4j72Cw4px/vaMKdilNfAkTNUNT75CcUBRJJggQkLGmJf05P0wb66GKB6UWGp5WQ+FeW0vdMBF4VGkY+fN7yhHqKaAMgG2FpPv3Z2kHsql9C/hNOMzaPrvTeZHHFdZoY8QDZiFrBlxWmq1i5lnl7ocFAz1owJlLA4KPn+aofoAgyEkAoOp2IaEMC2AVW5Y+JaAtrpICMHdhQimSzbu5ZF887icVbhzoSs2IX6XgNtm2l8v+mIP6avhzSnBADG/QXSJOidnVnORbWb5BMaT2UfAztt8fwk9ZK7lFbuIUytBHHKdiKw3eGb1wnIeyILFM+r03dx9bWgfxsWEWINgIJy02gG X-OriginatorOrg: Nvidia.com X-MS-Exchange-CrossTenant-Network-Message-Id: d187fabb-62ce-4285-9190-08dd0a96e5c7 X-MS-Exchange-CrossTenant-AuthSource: CY8PR12MB7705.namprd12.prod.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Internal X-MS-Exchange-CrossTenant-OriginalArrivalTime: 22 Nov 2024 01:42:17.7703 (UTC) X-MS-Exchange-CrossTenant-FromEntityHeader: Hosted X-MS-Exchange-CrossTenant-Id: 43083d15-7273-40c1-b7db-39efd9ccc17a X-MS-Exchange-CrossTenant-MailboxType: HOSTED X-MS-Exchange-CrossTenant-UserPrincipalName: 1UdrdtaWFgD5dZRgAv+FNpK5jy7StFgvuVfE0nq9mteNnrZObmeyLxK4KklWIzReCoEqZ//BHA6jr8NRiKAMNQ== X-MS-Exchange-Transport-CrossTenantHeadersStamped: IA1PR12MB6305 Currently DAX folio/page reference counts are managed differently to normal pages. To allow these to be managed the same as normal pages introduce vmf_insert_folio_pud. This will map the entire PUD-sized folio and take references as it would for a normally mapped page. This is distinct from the current mechanism, vmf_insert_pfn_pud, which simply inserts a special devmap PUD entry into the page table without holding a reference to the page for the mapping. Signed-off-by: Alistair Popple --- include/linux/huge_mm.h | 11 +++++- include/linux/rmap.h | 15 +++++++- mm/huge_memory.c | 96 ++++++++++++++++++++++++++++++++++++------ mm/rmap.c | 49 +++++++++++++++++++++- 4 files changed, 159 insertions(+), 12 deletions(-) diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h index ef5b80e..e804e41 100644 --- a/include/linux/huge_mm.h +++ b/include/linux/huge_mm.h @@ -40,6 +40,7 @@ int change_huge_pmd(struct mmu_gather *tlb, struct vm_area_struct *vma, vm_fault_t vmf_insert_pfn_pmd(struct vm_fault *vmf, pfn_t pfn, bool write); vm_fault_t vmf_insert_pfn_pud(struct vm_fault *vmf, pfn_t pfn, bool write); +vm_fault_t vmf_insert_folio_pud(struct vm_fault *vmf, struct folio *folio, bool write); enum transparent_hugepage_flag { TRANSPARENT_HUGEPAGE_UNSUPPORTED, @@ -468,6 +469,11 @@ static inline bool is_huge_zero_pmd(pmd_t pmd) return pmd_present(pmd) && READ_ONCE(huge_zero_pfn) == pmd_pfn(pmd); } +static inline bool is_huge_zero_pud(pud_t pud) +{ + return false; +} + struct folio *mm_get_huge_zero_folio(struct mm_struct *mm); void mm_put_huge_zero_folio(struct mm_struct *mm); @@ -614,6 +620,11 @@ static inline bool is_huge_zero_pmd(pmd_t pmd) return false; } +static inline bool is_huge_zero_pud(pud_t pud) +{ + return false; +} + static inline void mm_put_huge_zero_folio(struct mm_struct *mm) { return; diff --git a/include/linux/rmap.h b/include/linux/rmap.h index d5e93e4..4a811c5 100644 --- a/include/linux/rmap.h +++ b/include/linux/rmap.h @@ -192,6 +192,7 @@ typedef int __bitwise rmap_t; enum rmap_level { RMAP_LEVEL_PTE = 0, RMAP_LEVEL_PMD, + RMAP_LEVEL_PUD, }; static inline void __folio_rmap_sanity_checks(struct folio *folio, @@ -228,6 +229,14 @@ static inline void __folio_rmap_sanity_checks(struct folio *folio, VM_WARN_ON_FOLIO(folio_nr_pages(folio) != HPAGE_PMD_NR, folio); VM_WARN_ON_FOLIO(nr_pages != HPAGE_PMD_NR, folio); break; + case RMAP_LEVEL_PUD: + /* + * Asume that we are creating * a single "entire" mapping of the + * folio. + */ + VM_WARN_ON_FOLIO(folio_nr_pages(folio) != HPAGE_PUD_NR, folio); + VM_WARN_ON_FOLIO(nr_pages != HPAGE_PUD_NR, folio); + break; default: VM_WARN_ON_ONCE(true); } @@ -251,12 +260,16 @@ void folio_add_file_rmap_ptes(struct folio *, struct page *, int nr_pages, folio_add_file_rmap_ptes(folio, page, 1, vma) void folio_add_file_rmap_pmd(struct folio *, struct page *, struct vm_area_struct *); +void folio_add_file_rmap_pud(struct folio *, struct page *, + struct vm_area_struct *); void folio_remove_rmap_ptes(struct folio *, struct page *, int nr_pages, struct vm_area_struct *); #define folio_remove_rmap_pte(folio, page, vma) \ folio_remove_rmap_ptes(folio, page, 1, vma) void folio_remove_rmap_pmd(struct folio *, struct page *, struct vm_area_struct *); +void folio_remove_rmap_pud(struct folio *, struct page *, + struct vm_area_struct *); void hugetlb_add_anon_rmap(struct folio *, struct vm_area_struct *, unsigned long address, rmap_t flags); @@ -341,6 +354,7 @@ static __always_inline void __folio_dup_file_rmap(struct folio *folio, atomic_add(orig_nr_pages, &folio->_large_mapcount); break; case RMAP_LEVEL_PMD: + case RMAP_LEVEL_PUD: atomic_inc(&folio->_entire_mapcount); atomic_inc(&folio->_large_mapcount); break; @@ -437,6 +451,7 @@ static __always_inline int __folio_try_dup_anon_rmap(struct folio *folio, atomic_add(orig_nr_pages, &folio->_large_mapcount); break; case RMAP_LEVEL_PMD: + case RMAP_LEVEL_PUD: if (PageAnonExclusive(page)) { if (unlikely(maybe_pinned)) return -EBUSY; diff --git a/mm/huge_memory.c b/mm/huge_memory.c index 2fb3288..c51ef3e 100644 --- a/mm/huge_memory.c +++ b/mm/huge_memory.c @@ -1441,19 +1441,17 @@ static void insert_pfn_pud(struct vm_area_struct *vma, unsigned long addr, struct mm_struct *mm = vma->vm_mm; pgprot_t prot = vma->vm_page_prot; pud_t entry; - spinlock_t *ptl; - ptl = pud_lock(mm, pud); if (!pud_none(*pud)) { if (write) { if (WARN_ON_ONCE(pud_pfn(*pud) != pfn_t_to_pfn(pfn))) - goto out_unlock; + return; entry = pud_mkyoung(*pud); entry = maybe_pud_mkwrite(pud_mkdirty(entry), vma); if (pudp_set_access_flags(vma, addr, pud, entry, 1)) update_mmu_cache_pud(vma, addr, pud); } - goto out_unlock; + return; } entry = pud_mkhuge(pfn_t_pud(pfn, prot)); @@ -1467,9 +1465,6 @@ static void insert_pfn_pud(struct vm_area_struct *vma, unsigned long addr, } set_pud_at(mm, addr, pud, entry); update_mmu_cache_pud(vma, addr, pud); - -out_unlock: - spin_unlock(ptl); } /** @@ -1487,6 +1482,7 @@ vm_fault_t vmf_insert_pfn_pud(struct vm_fault *vmf, pfn_t pfn, bool write) unsigned long addr = vmf->address & PUD_MASK; struct vm_area_struct *vma = vmf->vma; pgprot_t pgprot = vma->vm_page_prot; + spinlock_t *ptl; /* * If we had pud_special, we could avoid all these restrictions, @@ -1504,10 +1500,55 @@ vm_fault_t vmf_insert_pfn_pud(struct vm_fault *vmf, pfn_t pfn, bool write) track_pfn_insert(vma, &pgprot, pfn); + ptl = pud_lock(vma->vm_mm, vmf->pud); insert_pfn_pud(vma, addr, vmf->pud, pfn, write); + spin_unlock(ptl); + return VM_FAULT_NOPAGE; } EXPORT_SYMBOL_GPL(vmf_insert_pfn_pud); + +/** + * vmf_insert_folio_pud - insert a pud size folio mapped by a pud entry + * @vmf: Structure describing the fault + * @pfn: pfn of the page to insert + * @write: whether it's a write fault + * + * Return: vm_fault_t value. + */ +vm_fault_t vmf_insert_folio_pud(struct vm_fault *vmf, struct folio *folio, bool write) +{ + struct vm_area_struct *vma = vmf->vma; + unsigned long addr = vmf->address & PUD_MASK; + pfn_t pfn = pfn_to_pfn_t(folio_pfn(folio)); + pud_t *pud = vmf->pud; + pgprot_t prot = vma->vm_page_prot; + struct mm_struct *mm = vma->vm_mm; + spinlock_t *ptl; + struct page *page; + + if (addr < vma->vm_start || addr >= vma->vm_end) + return VM_FAULT_SIGBUS; + + if (WARN_ON_ONCE(folio_order(folio) != PUD_ORDER)) + return VM_FAULT_SIGBUS; + + track_pfn_insert(vma, &prot, pfn); + + ptl = pud_lock(mm, pud); + if (pud_none(*vmf->pud)) { + page = pfn_t_to_page(pfn); + folio = page_folio(page); + folio_get(folio); + folio_add_file_rmap_pud(folio, page, vma); + add_mm_counter(mm, mm_counter_file(folio), HPAGE_PUD_NR); + } + insert_pfn_pud(vma, addr, vmf->pud, pfn, write); + spin_unlock(ptl); + + return VM_FAULT_NOPAGE; +} +EXPORT_SYMBOL_GPL(vmf_insert_folio_pud); #endif /* CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD */ void touch_pmd(struct vm_area_struct *vma, unsigned long addr, @@ -2066,7 +2107,8 @@ int zap_huge_pmd(struct mmu_gather *tlb, struct vm_area_struct *vma, zap_deposited_table(tlb->mm, pmd); spin_unlock(ptl); } else if (is_huge_zero_pmd(orig_pmd)) { - zap_deposited_table(tlb->mm, pmd); + if (!vma_is_dax(vma) || arch_needs_pgtable_deposit()) + zap_deposited_table(tlb->mm, pmd); spin_unlock(ptl); } else { struct folio *folio = NULL; @@ -2554,12 +2596,24 @@ int zap_huge_pud(struct mmu_gather *tlb, struct vm_area_struct *vma, orig_pud = pudp_huge_get_and_clear_full(vma, addr, pud, tlb->fullmm); arch_check_zapped_pud(vma, orig_pud); tlb_remove_pud_tlb_entry(tlb, pud, addr); - if (vma_is_special_huge(vma)) { + if (!vma_is_dax(vma) && vma_is_special_huge(vma)) { spin_unlock(ptl); /* No zero page support yet */ } else { - /* No support for anonymous PUD pages yet */ - BUG(); + struct page *page = NULL; + struct folio *folio; + + /* No support for anonymous PUD pages or migration yet */ + BUG_ON(vma_is_anonymous(vma) || !pud_present(orig_pud)); + + page = pud_page(orig_pud); + folio = page_folio(page); + folio_remove_rmap_pud(folio, page, vma); + VM_BUG_ON_PAGE(!PageHead(page), page); + add_mm_counter(tlb->mm, mm_counter_file(folio), -HPAGE_PUD_NR); + + spin_unlock(ptl); + tlb_remove_page_size(tlb, page, HPAGE_PUD_SIZE); } return 1; } @@ -2567,6 +2621,8 @@ int zap_huge_pud(struct mmu_gather *tlb, struct vm_area_struct *vma, static void __split_huge_pud_locked(struct vm_area_struct *vma, pud_t *pud, unsigned long haddr) { + pud_t old_pud; + VM_BUG_ON(haddr & ~HPAGE_PUD_MASK); VM_BUG_ON_VMA(vma->vm_start > haddr, vma); VM_BUG_ON_VMA(vma->vm_end < haddr + HPAGE_PUD_SIZE, vma); @@ -2574,7 +2630,23 @@ static void __split_huge_pud_locked(struct vm_area_struct *vma, pud_t *pud, count_vm_event(THP_SPLIT_PUD); - pudp_huge_clear_flush(vma, haddr, pud); + old_pud = pudp_huge_clear_flush(vma, haddr, pud); + if (is_huge_zero_pud(old_pud)) + return; + + if (vma_is_dax(vma)) { + struct page *page = pud_page(old_pud); + struct folio *folio = page_folio(page); + + if (!folio_test_dirty(folio) && pud_dirty(old_pud)) + folio_mark_dirty(folio); + if (!folio_test_referenced(folio) && pud_young(old_pud)) + folio_set_referenced(folio); + folio_remove_rmap_pud(folio, page, vma); + folio_put(folio); + add_mm_counter(vma->vm_mm, mm_counter_file(folio), + -HPAGE_PUD_NR); + } } void __split_huge_pud(struct vm_area_struct *vma, pud_t *pud, diff --git a/mm/rmap.c b/mm/rmap.c index a8797d1..84d7ab7 100644 --- a/mm/rmap.c +++ b/mm/rmap.c @@ -1180,6 +1180,7 @@ static __always_inline unsigned int __folio_add_rmap(struct folio *folio, atomic_add(orig_nr_pages, &folio->_large_mapcount); break; case RMAP_LEVEL_PMD: + case RMAP_LEVEL_PUD: first = atomic_inc_and_test(&folio->_entire_mapcount); if (first) { nr = atomic_add_return_relaxed(ENTIRELY_MAPPED, mapped); @@ -1330,6 +1331,13 @@ static __always_inline void __folio_add_anon_rmap(struct folio *folio, case RMAP_LEVEL_PMD: SetPageAnonExclusive(page); break; + case RMAP_LEVEL_PUD: + /* + * Keep the compiler happy, we don't support anonymous + * PUD mappings. + */ + WARN_ON_ONCE(1); + break; } } for (i = 0; i < nr_pages; i++) { @@ -1523,6 +1531,26 @@ void folio_add_file_rmap_pmd(struct folio *folio, struct page *page, #endif } +/** + * folio_add_file_rmap_pud - add a PUD mapping to a page range of a folio + * @folio: The folio to add the mapping to + * @page: The first page to add + * @vma: The vm area in which the mapping is added + * + * The page range of the folio is defined by [page, page + HPAGE_PUD_NR) + * + * The caller needs to hold the page table lock. + */ +void folio_add_file_rmap_pud(struct folio *folio, struct page *page, + struct vm_area_struct *vma) +{ +#ifdef CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD + __folio_add_file_rmap(folio, page, HPAGE_PUD_NR, vma, RMAP_LEVEL_PUD); +#else + WARN_ON_ONCE(true); +#endif +} + static __always_inline void __folio_remove_rmap(struct folio *folio, struct page *page, int nr_pages, struct vm_area_struct *vma, enum rmap_level level) @@ -1552,6 +1580,7 @@ static __always_inline void __folio_remove_rmap(struct folio *folio, partially_mapped = nr && atomic_read(mapped); break; case RMAP_LEVEL_PMD: + case RMAP_LEVEL_PUD: atomic_dec(&folio->_large_mapcount); last = atomic_add_negative(-1, &folio->_entire_mapcount); if (last) { @@ -1632,6 +1661,26 @@ void folio_remove_rmap_pmd(struct folio *folio, struct page *page, #endif } +/** + * folio_remove_rmap_pud - remove a PUD mapping from a page range of a folio + * @folio: The folio to remove the mapping from + * @page: The first page to remove + * @vma: The vm area from which the mapping is removed + * + * The page range of the folio is defined by [page, page + HPAGE_PUD_NR) + * + * The caller needs to hold the page table lock. + */ +void folio_remove_rmap_pud(struct folio *folio, struct page *page, + struct vm_area_struct *vma) +{ +#ifdef CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD + __folio_remove_rmap(folio, page, HPAGE_PUD_NR, vma, RMAP_LEVEL_PUD); +#else + WARN_ON_ONCE(true); +#endif +} + /* * @arg: enum ttu_flags will be passed to this argument */ From patchwork Fri Nov 22 01:40:36 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alistair Popple X-Patchwork-Id: 13882596 Received: from NAM12-MW2-obe.outbound.protection.outlook.com (mail-mw2nam12on2055.outbound.protection.outlook.com [40.107.244.55]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 5A1361C8FBA; Fri, 22 Nov 2024 01:42:25 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=fail smtp.client-ip=40.107.244.55 ARC-Seal: i=2; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1732239747; cv=fail; b=qtnbdXUr0BPQZAAeZk4ITjThuwc75d9nbitfCBRfyeCX9cx7hsbEVmwaBXkZkw4F3mQoEa3evo1To5Tf/huGovl2eNzSA/Hrd1ZqGcVLw1UeGWPLivgF2Fpv8u7vJm2/eDP+LmJP+/A2O3pfVhTlH1LTMQWFq0cfqjDJt9JERr4= ARC-Message-Signature: i=2; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1732239747; c=relaxed/simple; bh=ARAYHn9uzeP/jw6NhBGh1r4J5w2mHxh6aAVuwkBESrk=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: Content-Type:MIME-Version; b=A+Z/OteTPkSPLmrW1GMPWtZMa6m0HwmL29l5mDEI+iIBEf4W9FpknezgHJecf+kaxIJOOYC/6S7ZeoCj0bexI4pDlw3XC8tHmjv9gVgQthEVGAUQtbBze+9L7l6JD398aMPnFf25J5R2byEz2oPI/MYb11kKUHCrHUgptIKJz3E= ARC-Authentication-Results: i=2; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=nvidia.com; spf=fail smtp.mailfrom=nvidia.com; dkim=pass (2048-bit key) header.d=Nvidia.com header.i=@Nvidia.com header.b=BxOBJCTV; arc=fail smtp.client-ip=40.107.244.55 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=nvidia.com Authentication-Results: smtp.subspace.kernel.org; spf=fail smtp.mailfrom=nvidia.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=Nvidia.com header.i=@Nvidia.com header.b="BxOBJCTV" ARC-Seal: i=1; a=rsa-sha256; s=arcselector10001; d=microsoft.com; cv=none; b=c1+oLFGFVihX/esU17CZsGaIiYkiQ6f2FKdtcoTZmSaljmIocIfkbSipQKhPA/NdbO0m+CRedtue9lflIKcsgXcKhzIJHsGweOza3NAjodegkFFNPY7Ydy1daBkc+LzDQ06rPL7r6PQR76zb3PGCa4CQiTb+K3skbDO84E0DSFOmScQO5IOsLqwOkLwmhs3g9uYU3U/aKFCtmslfCmV1pSWUIinT1VZ9EIqGeh6QMku8yCAegTtc+IxzjD7uNNQ539UfTyA8vq8e2jz7D8eUQOmhM1i7xMYKniKWUdVIie9Gt/+AIdhi11KJKVg4FxWG3wsW7sPidb0KF3YJRVATyg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector10001; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=TsiANw4Jc50v9Ghy0wuZyrDvDVHKZeIEwRTiOOU3Uho=; b=Pfr4mSVD+inH6jZMuPi+ms7kkE3XiYlPMgiFIMXUD+8g7EimQps81gyQ0V5MeZifhQfRixzqRo+BdYVTwOgssqADnsycQEUVPb862bNYbsKa3dFCDBx7A4zM2LXqoA4MGP2k4KD73TLvf5WyZcD423xxnTW2VE6LzOGzIvXBh1xtnDewDB146pVEqGuVVQpqckVcmrLy1u3CJs24IqEkrN1lMV2m6ra9F3125Ht2AebWyGrq406eG9IKggcUF6F2IGBlWDmYrQ6U8NO0qmU/6GlHbiJfAmLMkstGKv83l+1Qgpcf8lTrxks/lJAjCBmAdmdmpoHJEoU5KRvfl06EQA== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=nvidia.com; dmarc=pass action=none header.from=nvidia.com; dkim=pass header.d=nvidia.com; arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=Nvidia.com; s=selector2; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=TsiANw4Jc50v9Ghy0wuZyrDvDVHKZeIEwRTiOOU3Uho=; b=BxOBJCTV4kV6Xl2aqVfevU0eBQYQD2YJr2heGPBjmTBSA0ScF+KITKjvlAxoz1l5hOgOZlvKlmkh1po5+chAtkN5x9LWstkP1cRfsSXn4vYsDU4/MbZsUSyvTEhQ+BvHY6xeybmGC8Nq1S2qQ5/puwbVMZDeqX0XrP2BT2mJUdEYrStFO4jH+tBi1PWlTOS+EwYZ7aZACRXy5OJMHzE9e2Tz0Bb1jRnRiMUQe8LpVSz9KweDa0LTy9juWqHqmW4Ad5O/rAJKzvYezdlu2DmUOrHPBa5KbRnBSONlX3LlUo3Lv2osiLyLTAAJYnimJKf6oiRqmQMO6O+hJPlO1Y8Taw== Authentication-Results: dkim=none (message not signed) header.d=none;dmarc=none action=none header.from=nvidia.com; Received: from DS0PR12MB7726.namprd12.prod.outlook.com (2603:10b6:8:130::6) by IA1PR12MB6305.namprd12.prod.outlook.com (2603:10b6:208:3e7::16) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.8182.17; Fri, 22 Nov 2024 01:42:22 +0000 Received: from DS0PR12MB7726.namprd12.prod.outlook.com ([fe80::953f:2f80:90c5:67fe]) by DS0PR12MB7726.namprd12.prod.outlook.com ([fe80::953f:2f80:90c5:67fe%4]) with mapi id 15.20.8182.016; Fri, 22 Nov 2024 01:42:22 +0000 From: Alistair Popple To: dan.j.williams@intel.com, linux-mm@kvack.org Cc: Alistair Popple , lina@asahilina.net, zhang.lyra@gmail.com, gerald.schaefer@linux.ibm.com, vishal.l.verma@intel.com, dave.jiang@intel.com, logang@deltatee.com, bhelgaas@google.com, jack@suse.cz, jgg@ziepe.ca, catalin.marinas@arm.com, will@kernel.org, mpe@ellerman.id.au, npiggin@gmail.com, dave.hansen@linux.intel.com, ira.weiny@intel.com, willy@infradead.org, djwong@kernel.org, tytso@mit.edu, linmiaohe@huawei.com, david@redhat.com, peterx@redhat.com, linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org, linux-arm-kernel@lists.infradead.org, linuxppc-dev@lists.ozlabs.org, nvdimm@lists.linux.dev, linux-cxl@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-ext4@vger.kernel.org, linux-xfs@vger.kernel.org, jhubbard@nvidia.com, hch@lst.de, david@fromorbit.com Subject: [PATCH v3 15/25] huge_memory: Allow mappings of PMD sized pages Date: Fri, 22 Nov 2024 12:40:36 +1100 Message-ID: <3792fee914ca7967b8e481f2e9d72c356f23d44d.1732239628.git-series.apopple@nvidia.com> X-Mailer: git-send-email 2.45.2 In-Reply-To: References: X-ClientProxiedBy: SY0PR01CA0008.ausprd01.prod.outlook.com (2603:10c6:10:1bb::7) To CY8PR12MB7705.namprd12.prod.outlook.com (2603:10b6:930:84::9) Precedence: bulk X-Mailing-List: linux-xfs@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-MS-PublicTrafficType: Email X-MS-TrafficTypeDiagnostic: DS0PR12MB7726:EE_|IA1PR12MB6305:EE_ X-MS-Office365-Filtering-Correlation-Id: ed696b0f-b825-48f8-dbfa-08dd0a96e895 X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam: BCL:0;ARA:13230040|1800799024|366016|7416014|376014; X-Microsoft-Antispam-Message-Info: C5IS1LyYdIRGLxXXfg16lkeo08KbY3/StuAKdzSqXQLCQPZStassz1xpxHc5S8Wn18MaLG+OfUcR7WG0vP3LzUUjXwVPzufbM8tZqzlNPrr/L1G6GgULV1ytcwdcIyP0WCPdYUCUuX+v0CUaFy30DZ2WaiilathoGmbACK61m6I30R6a73VRik0XSzbcnuGK3IWMBwompgMVSILrbdV7mEoelnlC/z7J647gWAo7uJHnCadVOVcbwhTBcsEVQmzrf/74/0D+Sl/fRe0BvcElxqzDncrmeZcC2RZjFtvdStPC+ATWlfdI51sMNOwO/LiNZjp+ZdIkc+xv+GMocHzVBMVmIhv3meig5/Pbiriv+rM7OggU4p+ZnKfPkvneLBhh/80C0ltAbSq9X7RdgSUmkjzgzhAAkAVrVk42lq5ivykE6RzU8Fj8cBU59Wz8izGyH4QpthP0RUBr56V3Ga0ylwDPQ804pkN3hpKBvvz2Oqv04qXZiaOPImTRmLtOlJLpVCURF5r+euOvtLaMmTZbngNukTqepqeIqMZ8l5K4dgjVO7b96J3JEp475SmDMokDgD9CCYpJx7RPL2wal3W8TKuTsrfMBqfQJyzFfb758w/xpuzWt54rm/iIitYb0KNpwuxFcGJXwE1cgJr0lhloPtcS4wktGbWFt66mAePhix3+qG6rraXU9objpqJCHJFC4WqKFRtlrKHBok+JurborcnZP5KLj4suvOn5xvLQQaER8+r+Kd5/rx83VUoOexsgIZ+9Qi7LPasaygKjrqQBJkIZ7xfyVWkrb90eLF7TDdXkesAvgSMxvv6lKj6g+FqSkEvZYel7ufykyw/XxoIZDjl9euE+xNS0AWYCRbqqRM2myAn7OQwoOc2sYl6IAYVqhk7yoEGWMOkTLwnc/yXed67e1JpvHJWJQDSmybMnKy/TLguLydnjI8Z3lo6wT6VgWCphhh9Q7lMACINlLg6/3CTPufH8YObY/wSrN9Da2KMTru/s51SKaDY8nEuFHRb6vRJMy92JIXQW6cLqIXhzizDmAlhxqGjPw75gESBgdrBwAjbJSfr0uTzlyB8hMK+qOuxZcxTTRFTxdrFn6a/cTLfjLwmAlC2vv5af75QnX2KBw8pfCt68npo/RVtwBc+0dYwigRRlAGLPlxKFYRZdbwwHIoE5bqgiz3iHFvAnnYWhPMtKBhJopslNfxSsrt8/j7mrkxiSTUdryBUfZOpVpzYimbz9f2p5iBCkg4Vtx1x4c/UNWdo7HqtMEG8IphvKbb7OvFSOKykI0BVDqlyVOW8Ky6A7U1ULSaHkcvkc9NrP7xhETCP72vTX1cphIhbC X-Forefront-Antispam-Report: CIP:255.255.255.255;CTRY:;LANG:en;SCL:1;SRV:;IPV:NLI;SFV:NSPM;H:DS0PR12MB7726.namprd12.prod.outlook.com;PTR:;CAT:NONE;SFS:(13230040)(1800799024)(366016)(7416014)(376014);DIR:OUT;SFP:1101; X-MS-Exchange-AntiSpam-MessageData-ChunkCount: 1 X-MS-Exchange-AntiSpam-MessageData-0: UHwQfpWVn4/uHLgtR8rohAFqTyd/lV7kapibaNxlfhUzcI1id7PRhA3btBfnga1Ey37KflkzjDejuHG1j2tsIRN3W+naeg+l1kL5XuWcv3PmAlg8KXDsom+g9B0hnhL4+tIxMIxl+9VbeyHQRkT9UGdrwhE3x7HM6dLbsfOnn+GTJi0/Ojvy3BYNCmw4HdKtjgjzMKtrLsvcIOw9tAZy7whgn1cDk/GeY2OCh7OEJBLfVjafSfzdZGrYStkxzoZ6wZ4xXMM6yTD8j5dZjgSBfxCkXqyZz1brXgXOVdCgzrUm/UVygN9xVImr/43wOrtnBoCSfKRPzFqEPdT/Jbwlec55n1KUzOyRraxQWwpFILyEBHerP/YQRMg0uwezY1nJhiGPdzNdO/WiS+TXe5Ue76+NaLF0/80ukbE4+F2g8lIRYWDFCMEiAF4/7Pk/Zual1sr7zpbwLJ7oIemvwOPGW7BptSbGK1d799fFWI7dH/H7WsSAEP0qlxL0r3/Al3uGEg5saPp1NNu9MZAoo0zAREDXIiRIckdYyYdsfdA0ykMHqHHdQy0T387DHPJMZ5mhwTr2w1lMbfkcMBFHRnzBQNP3Lzy9JjkVGaP0zXE8O9QtGWgktu5FGs+95L9BPSGaqaxg+Fkn/1iU/cCuYI8WEXRa0I+ovfTkD1wUb6E+7OyGEix4Xzu7eOA6DiZQm4+584jt8VrFJtXllmpwNrKPc6UD2A4MO3O+SELeRxVxN41x3zpaJg50SWStoS87phvw837fXelPnNEoBZq1P/gFG+LHggfKf07+cv0G3Cml91R4xuFlVngB6Wptsj5RwnHGPDWmLG8fV8Uwc99CgDcDfbWRZob8vFzIO7MBG7BrWj/wZP0vhSvifoyPQIKEaeX4X4m1xDn/7ogbnLTaYc5HK4ebT2bnynw/IrCZGG3hJosVGBAsQjYaKCoEvAKSXZg1suV+i6LHxup7kYbaEXOGLPZsd2oTDAVg9dOgvX9jsO3QQF/EYFsOLw8rWyINdL06gbFlUF7ndb5VG5i0hN6vYZ6UQ1BpycvVVugOczHTR3gkj6CkHxkWu2OwWVeoBHGy4L+S0lFPLirqHyE6v+eGfX13uBuovv3CgIF0oo9AEvQyfwJNI0myAXa4hCqgohHJIrUPXidtynNT0cqFIKhDMaCEzIdXTJcsT91KnJIJ08BS5RlspDXjphl4LGzEQx96KyDr08+yiktOk18qH0JQGrmByFYPJV/Wmxtb15JZrozrJ8Y0vh/6ucLXTiL6Ou0GKVomejy/wmZKoCRWtf+d3kdHlWQ/pFvK7De2jN1vzMryC3R42fRMsh6zk/FaCmYz5hVIs5JN55Jgb5n1vZqJ18dRTbpZ02RDGpeozF3AgfnJvAew/sk8GavX+5YsrGK81onBB87mbkAemBiB3DrJu12ptqMvCoai6+b2ByRYE21+/ARXHPA4uhl5Q+LiWra8l2KAYwpqWp5pLuUk2YQ6lWKQa5nfZ55pz3f7uN114pO1QnVdfX34gy5EQJUfcPghuabm4mhsTKeeB9rQ7W+uqaQ1mtlsbOHzsfMZIeR8qYDxTRrntxSlB8wDsCtjeHVr X-OriginatorOrg: Nvidia.com X-MS-Exchange-CrossTenant-Network-Message-Id: ed696b0f-b825-48f8-dbfa-08dd0a96e895 X-MS-Exchange-CrossTenant-AuthSource: CY8PR12MB7705.namprd12.prod.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Internal X-MS-Exchange-CrossTenant-OriginalArrivalTime: 22 Nov 2024 01:42:22.5003 (UTC) X-MS-Exchange-CrossTenant-FromEntityHeader: Hosted X-MS-Exchange-CrossTenant-Id: 43083d15-7273-40c1-b7db-39efd9ccc17a X-MS-Exchange-CrossTenant-MailboxType: HOSTED X-MS-Exchange-CrossTenant-UserPrincipalName: NDhvDGtTdkTg8Xc3dA5GWv8FB/SsIgyfVi6wl7YldZ+KyRCV3eW+sa1kQc1CaFbMp6CDFkuItOwApBlCtuylpg== X-MS-Exchange-Transport-CrossTenantHeadersStamped: IA1PR12MB6305 Currently DAX folio/page reference counts are managed differently to normal pages. To allow these to be managed the same as normal pages introduce vmf_insert_folio_pmd. This will map the entire PMD-sized folio and take references as it would for a normally mapped page. This is distinct from the current mechanism, vmf_insert_pfn_pmd, which simply inserts a special devmap PMD entry into the page table without holding a reference to the page for the mapping. Signed-off-by: Alistair Popple --- include/linux/huge_mm.h | 1 +- mm/huge_memory.c | 60 +++++++++++++++++++++++++++++++++++------- 2 files changed, 51 insertions(+), 10 deletions(-) diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h index e804e41..120d0ac 100644 --- a/include/linux/huge_mm.h +++ b/include/linux/huge_mm.h @@ -40,6 +40,7 @@ int change_huge_pmd(struct mmu_gather *tlb, struct vm_area_struct *vma, vm_fault_t vmf_insert_pfn_pmd(struct vm_fault *vmf, pfn_t pfn, bool write); vm_fault_t vmf_insert_pfn_pud(struct vm_fault *vmf, pfn_t pfn, bool write); +vm_fault_t vmf_insert_folio_pmd(struct vm_fault *vmf, struct folio *folio, bool write); vm_fault_t vmf_insert_folio_pud(struct vm_fault *vmf, struct folio *folio, bool write); enum transparent_hugepage_flag { diff --git a/mm/huge_memory.c b/mm/huge_memory.c index c51ef3e..b3bf909 100644 --- a/mm/huge_memory.c +++ b/mm/huge_memory.c @@ -1340,14 +1340,12 @@ static void insert_pfn_pmd(struct vm_area_struct *vma, unsigned long addr, { struct mm_struct *mm = vma->vm_mm; pmd_t entry; - spinlock_t *ptl; - ptl = pmd_lock(mm, pmd); if (!pmd_none(*pmd)) { if (write) { if (pmd_pfn(*pmd) != pfn_t_to_pfn(pfn)) { WARN_ON_ONCE(!is_huge_zero_pmd(*pmd)); - goto out_unlock; + return; } entry = pmd_mkyoung(*pmd); entry = maybe_pmd_mkwrite(pmd_mkdirty(entry), vma); @@ -1355,7 +1353,7 @@ static void insert_pfn_pmd(struct vm_area_struct *vma, unsigned long addr, update_mmu_cache_pmd(vma, addr, pmd); } - goto out_unlock; + return; } entry = pmd_mkhuge(pfn_t_pmd(pfn, prot)); @@ -1376,11 +1374,6 @@ static void insert_pfn_pmd(struct vm_area_struct *vma, unsigned long addr, set_pmd_at(mm, addr, pmd, entry); update_mmu_cache_pmd(vma, addr, pmd); - -out_unlock: - spin_unlock(ptl); - if (pgtable) - pte_free(mm, pgtable); } /** @@ -1399,6 +1392,7 @@ vm_fault_t vmf_insert_pfn_pmd(struct vm_fault *vmf, pfn_t pfn, bool write) struct vm_area_struct *vma = vmf->vma; pgprot_t pgprot = vma->vm_page_prot; pgtable_t pgtable = NULL; + spinlock_t *ptl; /* * If we had pmd_special, we could avoid all these restrictions, @@ -1421,12 +1415,58 @@ vm_fault_t vmf_insert_pfn_pmd(struct vm_fault *vmf, pfn_t pfn, bool write) } track_pfn_insert(vma, &pgprot, pfn); - + ptl = pmd_lock(vma->vm_mm, vmf->pmd); insert_pfn_pmd(vma, addr, vmf->pmd, pfn, pgprot, write, pgtable); + spin_unlock(ptl); + if (pgtable) + pte_free(vma->vm_mm, pgtable); + return VM_FAULT_NOPAGE; } EXPORT_SYMBOL_GPL(vmf_insert_pfn_pmd); +vm_fault_t vmf_insert_folio_pmd(struct vm_fault *vmf, struct folio *folio, bool write) +{ + struct vm_area_struct *vma = vmf->vma; + unsigned long addr = vmf->address & PMD_MASK; + pfn_t pfn = pfn_to_pfn_t(folio_pfn(folio)); + struct mm_struct *mm = vma->vm_mm; + spinlock_t *ptl; + pgtable_t pgtable = NULL; + struct page *page; + + if (addr < vma->vm_start || addr >= vma->vm_end) + return VM_FAULT_SIGBUS; + + if (WARN_ON_ONCE(folio_order(folio) != PMD_ORDER)) + return VM_FAULT_SIGBUS; + + if (arch_needs_pgtable_deposit()) { + pgtable = pte_alloc_one(vma->vm_mm); + if (!pgtable) + return VM_FAULT_OOM; + } + + track_pfn_insert(vma, &vma->vm_page_prot, pfn); + + ptl = pmd_lock(mm, vmf->pmd); + if (pmd_none(*vmf->pmd)) { + page = pfn_t_to_page(pfn); + folio = page_folio(page); + folio_get(folio); + folio_add_file_rmap_pmd(folio, page, vma); + add_mm_counter(mm, mm_counter_file(folio), HPAGE_PMD_NR); + } + insert_pfn_pmd(vma, addr, vmf->pmd, pfn, vma->vm_page_prot, + write, pgtable); + spin_unlock(ptl); + if (pgtable) + pte_free(mm, pgtable); + + return VM_FAULT_NOPAGE; +} +EXPORT_SYMBOL_GPL(vmf_insert_folio_pmd); + #ifdef CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD static pud_t maybe_pud_mkwrite(pud_t pud, struct vm_area_struct *vma) { From patchwork Fri Nov 22 01:40:37 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alistair Popple X-Patchwork-Id: 13882597 Received: from NAM11-DM6-obe.outbound.protection.outlook.com (mail-dm6nam11on2085.outbound.protection.outlook.com [40.107.223.85]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id A34701632C0; Fri, 22 Nov 2024 01:42:29 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=fail smtp.client-ip=40.107.223.85 ARC-Seal: i=2; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1732239751; cv=fail; b=N7LpUBmTw/SL4kFL/+gIj4A8Ehh43SfWb8y4GnxVDpdWzmUpnlBF74iaU2hm040YCT4foUiqK3kuvmhc0DH3b5+mqkYfvZl4slAg+sl9MWUzarUSXtCwVqITngXEoLcM+ArxbzgcKfpbFfjj34mhxGFPHeHnlJelwT5OOgHj8Pg= ARC-Message-Signature: i=2; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1732239751; c=relaxed/simple; bh=Fswc3izhyxoJ/wP4SybgRADX7R/aT7dJ0owuGXFfC7A=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: Content-Type:MIME-Version; b=laMS6hoNdCGPHyfhtZBhUtZyhFSXPUiNrR2/wfn1/DFZcpGwu10iBzHqJiwMSoQy1qzKGlsFm/Dt97yYKLqXtfcSFC/IOFnLK3R5E/IGgQonBIh0Neg0jcCTk5q6dk32KxKiSChof/Ld47nZrYgJc6UBAcEP6Yd6liJWkJNyzkA= ARC-Authentication-Results: i=2; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=nvidia.com; spf=fail smtp.mailfrom=nvidia.com; dkim=pass (2048-bit key) header.d=Nvidia.com header.i=@Nvidia.com header.b=nyeLw+Uu; arc=fail smtp.client-ip=40.107.223.85 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=nvidia.com Authentication-Results: smtp.subspace.kernel.org; spf=fail smtp.mailfrom=nvidia.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=Nvidia.com header.i=@Nvidia.com header.b="nyeLw+Uu" ARC-Seal: i=1; a=rsa-sha256; s=arcselector10001; d=microsoft.com; cv=none; b=SEU6HQc46RQXfGb3ygsEt51gFQr4qTHHLjVGHdooVLidcXAanZc6lVZ3oTsuXliI59NEb6RBdqAfKWqQ+joUs0e7u7JtNHj7j2VTMudkV9jHlL6LVf2CCIZHNIheHJsIvEt5NV20rhnwOQCc9zEwKkwPkZo2Itx/AIidRJfhlpkOx19exMT6Tt50/jWEUBLG/j8AYns1lRK60IW8R3HH5HSjmVX1U4LPc1HASh762v7GAvKC0BXqOavw5cfjXN7pmvmJ/EJRVcS2et5TCrvOgTJ84g6iU07T8ZrT8F0q0a05iHo+PRfAFS3xm0X4hjf8FOnHbQwMa+G502opCN5atA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector10001; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=AoOuGryreVKXelZflkuUcYowBxVEssrOVCkLycDDwBg=; b=qNkF7d7LK/BQa5fl/9sQbJLySbNYRPJSO4IfcJuGR5H2zfQKUueer26tZVrmjrO+mTfOaRzXZdTUQ9wyhTp4RcJv9UJWCrCeZP9Ev1fS+AlcJ9BVRZW6C+443denr6PlIyjOxlKROm9ZoqEmzzoSmk6k4ubPLzOK2uhtdeibly3d4IJkACBPL2qG1AX76grVCuRPSAbFWHLNJRQWuWy3iDmg3Sj5x2NDseoZm2B4feViRoudBi8zVjrB17k07H2f45gVHBohEPipZHprCjPC/+1uYmpkT8p9309GmTu+BgProvyFhhZpp5WLQvRb4z3JRo8fE2zh+NLLyiqytMbjVw== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=nvidia.com; dmarc=pass action=none header.from=nvidia.com; dkim=pass header.d=nvidia.com; arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=Nvidia.com; s=selector2; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=AoOuGryreVKXelZflkuUcYowBxVEssrOVCkLycDDwBg=; b=nyeLw+UuEsrBgLZAdJtNw0u7gvZyDWICqMIdnFrVbjTgfv/OMn2Beok52/86LTmk64s5+Lai/oh3dbTu4GiJ08V3CSnIzSMEenRqKKhGqw0M67m0I1ZT2VduJDGAVkRG12mkiNGIJL5ZaeDlt16V+Q+p75zg6/5rqZ7zEmhP6LwkUYvrk/B3ljZdWNctSyljqGJcDXhZriutkN76C/eaH5HSKcK9HIaj4Wk3FyewI3uKhjncFhkW9kjTm6eU3BHBNRwWmKOjcVUzYHi2VNF/u3qU0NKEra6OQpKZK5VnFIUCzqxIisdtaTILj7U92W40M7Fo3g4g1HQJTHFNQEcGPg== Authentication-Results: dkim=none (message not signed) header.d=none;dmarc=none action=none header.from=nvidia.com; Received: from DS0PR12MB7726.namprd12.prod.outlook.com (2603:10b6:8:130::6) by IA1PR12MB6305.namprd12.prod.outlook.com (2603:10b6:208:3e7::16) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.8182.17; Fri, 22 Nov 2024 01:42:27 +0000 Received: from DS0PR12MB7726.namprd12.prod.outlook.com ([fe80::953f:2f80:90c5:67fe]) by DS0PR12MB7726.namprd12.prod.outlook.com ([fe80::953f:2f80:90c5:67fe%4]) with mapi id 15.20.8182.016; Fri, 22 Nov 2024 01:42:27 +0000 From: Alistair Popple To: dan.j.williams@intel.com, linux-mm@kvack.org Cc: Alistair Popple , lina@asahilina.net, zhang.lyra@gmail.com, gerald.schaefer@linux.ibm.com, vishal.l.verma@intel.com, dave.jiang@intel.com, logang@deltatee.com, bhelgaas@google.com, jack@suse.cz, jgg@ziepe.ca, catalin.marinas@arm.com, will@kernel.org, mpe@ellerman.id.au, npiggin@gmail.com, dave.hansen@linux.intel.com, ira.weiny@intel.com, willy@infradead.org, djwong@kernel.org, tytso@mit.edu, linmiaohe@huawei.com, david@redhat.com, peterx@redhat.com, linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org, linux-arm-kernel@lists.infradead.org, linuxppc-dev@lists.ozlabs.org, nvdimm@lists.linux.dev, linux-cxl@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-ext4@vger.kernel.org, linux-xfs@vger.kernel.org, jhubbard@nvidia.com, hch@lst.de, david@fromorbit.com Subject: [PATCH v3 16/25] memremap: Add is_device_dax_page() and is_fsdax_page() helpers Date: Fri, 22 Nov 2024 12:40:37 +1100 Message-ID: <9832173195354f346f2a88244e8ef80dc52997ac.1732239628.git-series.apopple@nvidia.com> X-Mailer: git-send-email 2.45.2 In-Reply-To: References: X-ClientProxiedBy: SYBPR01CA0083.ausprd01.prod.outlook.com (2603:10c6:10:3::23) To DS0PR12MB7726.namprd12.prod.outlook.com (2603:10b6:8:130::6) Precedence: bulk X-Mailing-List: linux-xfs@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-MS-PublicTrafficType: Email X-MS-TrafficTypeDiagnostic: DS0PR12MB7726:EE_|IA1PR12MB6305:EE_ X-MS-Office365-Filtering-Correlation-Id: 839c59db-7b43-46bd-0698-08dd0a96eb7b X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam: BCL:0;ARA:13230040|1800799024|366016|7416014|376014; X-Microsoft-Antispam-Message-Info: wns+OaaIx7Se79dWV+36B0Dm6HEa/shqTJI8YhjHPYjHEHdvjw0igdpLtHKUUWpfUy0sR83jGpYTCw1IaZl0iAicQy2jNDcJmmM7PpO/srQRBDFzlg6ZDS5seplFxsgsIKtFPUrow9vnsM/bIUXZrqv62cxH4OMeCPNgQgfuFwXHZz49RemZWAugeXJxQH1wHgjVIypzo7JaLvLrJdaFqKp8HT9SmNB8WXElVqieDRAf8wYcZSFx4Z1AMAa2rV4CpPYtmdT0UdrBN0Iw8LdBzrTqBGqwGHp8PDKiAdDqC+btRmjFWxtuDk3mZM+uEmkp8UcoR46QswwM58xFkZnV2biBUm8AZT+U3xvj/SbnGfEav12SEV7OVWLJ4MUGlyoty5IFGKplBmsleoeeOHt8R8v6QswATERGRKTUomHChXHl+11HUHMJXbUFSdXN7g2DKyAywLltRxLiVXOVr1NZrYY7eTQ1H68SP00YBdkafakDKOg+LeRBwfvvLxn+lriJPR04FlxdDhylholvp0X9QzGWO7/HizcM2gD5jUGu6hszXe0qakMzdYcfxYsRCkKMoDMc04knckOOeAUMfXAXVyY+1xRU3S8lGQlCAKh4TUwGtHgMq43LTv9PvVsiTgbEhrMuGu5ty1NXWkt9Lx4OiON9WICN3phVHMIrESnSpuXT7Avzmy5szBlTj2Xjw6evsf/BlC5W5wLG1ckV7CR8h/zeiiEdXyw1LC1hKLA84+3u+SusiN0BaA9Mar7FgXeJemH6WVsmNB0pQVzL+h4Rpg9zyelV/89bqAWrUrOjT4AIj58yUI4y5brHtJYmDvof0j+dTCIvAaIFyZU7DFC3/ZDe4RoVDZW9CQTMcQE3PMrgqf4PAPVZtOgWcDVYS5JAbYrNbY8JKYOxDrNfEt/FZrsKAJgiXIkkuskCvTT5mi9TVWlyxTuaUtuCXYARFY8HiHGwdowi95ck9YxQBbUn2+cda3WrQyCrUu3yXiOntXJntapOsnwvhib6gHdql3hVEjfGYgdLM5lHsJDA+5nHi0iz7dBUZuDmciGV5n1UuDSAylJbTwQBhfz4yR0eNZNybsqBEQpGtKu1l3kKfFq3q1QhuI/PiAUkee+hO9MznvJycxk7okdjSvCMCoTPBagi36MqgVMp/FdqZh4OJErcefIax2NPS4wIed/sfNLAOsbRiekYZ0V3TqCIiao5Lfh5TaCrLQfEzOeKyfuJVCjVz6x1H4qxnhITevUhYLSHqyPWCy8jmuEiasmK09Evzz0R3NFltWDOHlNe/ZbJN0gXP5L1Kl+A6Sa5K9j8biM6ZRI5LV+yrxenD+wObs+WQT72 X-Forefront-Antispam-Report: CIP:255.255.255.255;CTRY:;LANG:en;SCL:1;SRV:;IPV:NLI;SFV:NSPM;H:DS0PR12MB7726.namprd12.prod.outlook.com;PTR:;CAT:NONE;SFS:(13230040)(1800799024)(366016)(7416014)(376014);DIR:OUT;SFP:1101; X-MS-Exchange-AntiSpam-MessageData-ChunkCount: 1 X-MS-Exchange-AntiSpam-MessageData-0: c6C/FDhekc6Y6ogJHR+FB5NJSJP0nswtQpShbBu77p1KTwUMzGoIm/2TDGEgJVwcclUqZTVHIJiYcSzPdma1TV2dxgPMvZySgzAe0QKXHC/YYhZaIcFy//SpbYMarqKJ2nyXk/5eUFTFEAFsChrLRAr5jc6p2e2qFmFINaiTaHJYOo4SX9JNloz4+S9PZiZuO4+MYfJvgCAbsCrMCGIYQqSfuzA/ysQedlZexRsskml1v5SioPMmwBLEUOxt7nNP1mVGWpzHAlSUWwfwwMPJVlWJmcxaYgZM0VxzdCuxudPmeoleLdnTkjHwlrHPUuxpI8BYXcyeHMMWdca1Di6Q34wqA/SSVvCKACyFFMuiRLE0/Ghvhuv7N37hH+rvV1Fo1/fA/Rnjpr4iWU8mWR0/87azvHb0+ucUdr6iW7Y6E4i68EWnN4aFZfdOruusRZisw10C4iRelKsPClWm0KdUbpzOD5q18Aon5J6/Esa+RI/gYNewq2E/X/U/0OqSoQYfbj++OqI129rukJqVsjwMmySy29c4ZH3VvyyVEM4OMe+0LEVP1McyLdYwxng1yu2zJHZGFudIl6PbruZJwtIQQ3YmiAD/xaDXQQkbvf6VgbRvPwvHMG22YkOsEkfpR28dxPY/mNDnoe4P9ar//xq8ZF65OsaDZEIe69pkQSCMTVbXm/uzQUCuoKAysO0fWCJdAWokF/Ddrgp8taH6LW5rsvDKu7zo7+//UNTNpGmX1UnPKq8vgMDbHcsVv6dNZW8Bn8uyvjQAOAF1GbPP7M7BOikUQnHA2DdRYXN/2p6zxkfwnkMKn9AcvUZbPY+H1LxxlHv8+uwB3z8JsaH+JT4pjTfzWYXHG0vE8oIiakbDsg9QNBcbC8tXqDsfKCVHvcsfaCHBHcsDlQIN59xKQ96RCO8dGQ53n+tOmbgLoD7ylmmYknSSjOv+dZgbfbHEpo7x8rlj9P2lM50zD5227fkXMdxtYTRBc5ByVd6/phHMBW6DHKFRGBWYY0REco2jlvHPhu7vlz7xDUSeU8spIMIiv6Q+Kr8ndPHqdedj0CgOveESThBtmbGgbjecHstplbBvbKxcRh4IBMsC8lVK94fvmlfrfdPCRJVLVitozH1fyGiLK/qq8KPACT839bEeBgAS0faubpFqoEmd8Kkl9GiDsmnaHWWis/xLVw3SDDxuFjwIPR9wXlzXouoTBWAV9yR4qkkuMWMX82BcRkWXJOAM9JrkKRfcwfrElHFjMZsArwfkQ7toe4PHa86sGBhnospswArbp23GY3c2I7E5VNyDdJGFUJDo3D/+8W8ddFEz4nBXAQ1Y2Q6XXEUCzAWk5ySY0vyf70FiMK0okyLtryLLUoIgqxsxijh8Fhhl4pRhkoqGPBc6l2ZA9VBRy2p+2rNp+dOrZBTOh41pSxiH4piGwJ6fKeL48OG6XxkyEI0mlFK5khH0qrXp60G7Ckj2YbcpQ6i10K8a+bN1FG4cYKThIEywYutrLrv+DGxwdanECjZgLeC3BJRNHBDHa2kmzOsVVr6B4lmXxwBnIl20RQYzoUigEDL/ZMbFKTTaWmckIutIMfOXJg49mbki4JfZqsj5 X-OriginatorOrg: Nvidia.com X-MS-Exchange-CrossTenant-Network-Message-Id: 839c59db-7b43-46bd-0698-08dd0a96eb7b X-MS-Exchange-CrossTenant-AuthSource: DS0PR12MB7726.namprd12.prod.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Internal X-MS-Exchange-CrossTenant-OriginalArrivalTime: 22 Nov 2024 01:42:26.9685 (UTC) X-MS-Exchange-CrossTenant-FromEntityHeader: Hosted X-MS-Exchange-CrossTenant-Id: 43083d15-7273-40c1-b7db-39efd9ccc17a X-MS-Exchange-CrossTenant-MailboxType: HOSTED X-MS-Exchange-CrossTenant-UserPrincipalName: eVjIPKj6LDJY7EueB6RQVHwHdg8eEbz7/peu0sGS45eIyYtmO8No5PP/sx4TpS9K1ZxkSzMwYp60NjCAkcOTug== X-MS-Exchange-Transport-CrossTenantHeadersStamped: IA1PR12MB6305 Add helpers to determine if a page or folio is a device dax or fs dax page or folio. Signed-off-by: Alistair Popple --- include/linux/memremap.h | 22 ++++++++++++++++++++++ 1 file changed, 22 insertions(+) diff --git a/include/linux/memremap.h b/include/linux/memremap.h index 0256a42..f2a8d13 100644 --- a/include/linux/memremap.h +++ b/include/linux/memremap.h @@ -187,6 +187,28 @@ static inline bool folio_is_device_coherent(const struct folio *folio) return is_device_coherent_page(&folio->page); } +static inline bool is_fsdax_page(const struct page *page) +{ + return is_zone_device_page(page) && + page_pgmap(page)->type == MEMORY_DEVICE_FS_DAX; +} + +static inline bool folio_is_fsdax(const struct folio *folio) +{ + return is_fsdax_page(&folio->page); +} + +static inline bool is_device_dax_page(const struct page *page) +{ + return is_zone_device_page(page) && + page_pgmap(page)->type == MEMORY_DEVICE_GENERIC; +} + +static inline bool folio_is_device_dax(const struct folio *folio) +{ + return is_device_dax_page(&folio->page); +} + #ifdef CONFIG_ZONE_DEVICE void zone_device_page_init(struct page *page); void *memremap_pages(struct dev_pagemap *pgmap, int nid); From patchwork Fri Nov 22 01:40:38 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alistair Popple X-Patchwork-Id: 13882598 Received: from NAM12-MW2-obe.outbound.protection.outlook.com (mail-mw2nam12on2045.outbound.protection.outlook.com [40.107.244.45]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 8D01B1DDC29; Fri, 22 Nov 2024 01:42:33 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=fail smtp.client-ip=40.107.244.45 ARC-Seal: i=2; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1732239755; cv=fail; b=fsgAS3wHuYDvdHePJ0IV78y98gPOWVoZE8K1JPBWRF5L2vgQ3QrtgMVIqcc7KgqDRZw3YYF5SeERQL61wDxp1tmfBAfXtkhXqV12vBnANMHRN4uzNH5qu8MbuGbZg+JiSOt54Nl9snWz1PVTxX9V7hXPezVzt6dvTpyRXxIWX1g= ARC-Message-Signature: i=2; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1732239755; c=relaxed/simple; bh=LZd/w+ZZLlPe1nQqUZL3cc1GD/vmP/L5HkOHDxl8r/4=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: Content-Type:MIME-Version; b=gdvWNsTCQgoVhTj+ybf73oYpSNuf0Sv4UP2FCCvz/DfehVvj5cnhFPf1lO2tvcIZ3pmVALqrIP4Gf6naq8Rl+z68M30NdkX0YABe9Gco2BMzu3i7opmVNbguD4pDFoLi0G4P5aYuphYVNxUMUAU8GUGJ5S3WLCwqbixSwgKGRXU= ARC-Authentication-Results: i=2; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=nvidia.com; spf=fail smtp.mailfrom=nvidia.com; dkim=pass (2048-bit key) header.d=Nvidia.com header.i=@Nvidia.com header.b=mqusUnHU; arc=fail smtp.client-ip=40.107.244.45 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=nvidia.com Authentication-Results: smtp.subspace.kernel.org; spf=fail smtp.mailfrom=nvidia.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=Nvidia.com header.i=@Nvidia.com header.b="mqusUnHU" ARC-Seal: i=1; a=rsa-sha256; s=arcselector10001; d=microsoft.com; cv=none; b=cqGpEBBNz7q9eF/x2uKixzWcehAC6bSozS9CtJFIo1Qs2QsZ3f3dfTzbrGNOZqxMI927qMSt04DCG57+t1Gue5MWTtFhHVOR2vovgvmWXVCVmzaCN/OXBwh7lmMZb+eaeAByy54gpukvEAKfv50/GDSSKDwhh1Doj4pvMUH9ZrXSga37Am+W9Uyrc711nWvNkCptPogfhXew394dNPy+28RBkYUsaOMdoffnIH0rX+bYaYteAsWKuIxqMcZ5MruyIKscwnuLf5X0ccMJ/WHPRVC2ia7IQZp3X106vC2A0bOJMuZIbymv0KoQZNZRAnwTR/U0r3RkMO0EGKc8E9M1jQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector10001; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=3WcInd05whaB7j/u3Gvg0cn/MBRX2CW8kXh3eIdb7as=; b=YEIeGpXuTuLdwBH7G0ZBJMe/dhZo1HStBxIxlFhCKxft08cyshpl/euKBFx8Vasbs0zZVWLl3z+zngMy6Yg0JLnQh9BhwaObvfOWjPqFPVFm0k7wfpy8rJZuwzI1PxAZ3B1nJZBYKlM/E3ZaEfk7/BRyak+ON9/U+Hav3zTeXsgJkBPGRmhHFxeiKv93QP7IB9D3qoC/a8hMlFJPwZS2lTH7Slgv5h30iJeeUubMOQKoqtIb8IHWfnXbufIEvdsAoH9BJmHqzW1hXuGKmjSsmLpYrT+Jpu4TgAE2uhmUWkFiXNpC+8ImWk73qHNJ4b9r+lai4PUcK5sBkeXPkG80xQ== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=nvidia.com; dmarc=pass action=none header.from=nvidia.com; dkim=pass header.d=nvidia.com; arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=Nvidia.com; s=selector2; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=3WcInd05whaB7j/u3Gvg0cn/MBRX2CW8kXh3eIdb7as=; b=mqusUnHUYGkkuyhdApUH+MTdm7J4R04BmgUg+BXCbv4k4gzB2+9l7L28umRauYp687qJqBnMmuE1G2rsGPhRC+coEDlI+6YoBcsqONK+DE6/tmKA8Z3cuoo2vtOjorFitPRgcJ0ZynDz4AGRzbhCPF39fqoytvJMGS6N3r3sckvkqrGGNJs2zndLru7WWWw8fzNVQXdPNYN/cL07lBUIyTwFVIOUWLLWrR0bUqdZyIFrVwPcFvQVuSfijlLQZwjaztTCVhJWW2hleQxqq2+5EzM7UrzXibYMHV/L20NaPQYImUSme6GUJlOqBZoZ5DLZhJJArXmXZPWdFJI6evlRbQ== Authentication-Results: dkim=none (message not signed) header.d=none;dmarc=none action=none header.from=nvidia.com; Received: from DS0PR12MB7726.namprd12.prod.outlook.com (2603:10b6:8:130::6) by IA1PR12MB6305.namprd12.prod.outlook.com (2603:10b6:208:3e7::16) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.8182.17; Fri, 22 Nov 2024 01:42:31 +0000 Received: from DS0PR12MB7726.namprd12.prod.outlook.com ([fe80::953f:2f80:90c5:67fe]) by DS0PR12MB7726.namprd12.prod.outlook.com ([fe80::953f:2f80:90c5:67fe%4]) with mapi id 15.20.8182.016; Fri, 22 Nov 2024 01:42:31 +0000 From: Alistair Popple To: dan.j.williams@intel.com, linux-mm@kvack.org Cc: Alistair Popple , lina@asahilina.net, zhang.lyra@gmail.com, gerald.schaefer@linux.ibm.com, vishal.l.verma@intel.com, dave.jiang@intel.com, logang@deltatee.com, bhelgaas@google.com, jack@suse.cz, jgg@ziepe.ca, catalin.marinas@arm.com, will@kernel.org, mpe@ellerman.id.au, npiggin@gmail.com, dave.hansen@linux.intel.com, ira.weiny@intel.com, willy@infradead.org, djwong@kernel.org, tytso@mit.edu, linmiaohe@huawei.com, david@redhat.com, peterx@redhat.com, linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org, linux-arm-kernel@lists.infradead.org, linuxppc-dev@lists.ozlabs.org, nvdimm@lists.linux.dev, linux-cxl@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-ext4@vger.kernel.org, linux-xfs@vger.kernel.org, jhubbard@nvidia.com, hch@lst.de, david@fromorbit.com Subject: [PATCH v3 17/25] gup: Don't allow FOLL_LONGTERM pinning of FS DAX pages Date: Fri, 22 Nov 2024 12:40:38 +1100 Message-ID: X-Mailer: git-send-email 2.45.2 In-Reply-To: References: X-ClientProxiedBy: SY5P282CA0122.AUSP282.PROD.OUTLOOK.COM (2603:10c6:10:209::15) To CY8PR12MB7705.namprd12.prod.outlook.com (2603:10b6:930:84::9) Precedence: bulk X-Mailing-List: linux-xfs@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-MS-PublicTrafficType: Email X-MS-TrafficTypeDiagnostic: DS0PR12MB7726:EE_|IA1PR12MB6305:EE_ X-MS-Office365-Filtering-Correlation-Id: cc17be02-1dd4-4da6-a896-08dd0a96edca X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam: BCL:0;ARA:13230040|1800799024|366016|7416014|376014; X-Microsoft-Antispam-Message-Info: mFLRxNvwuo0pgMsmliqufkXlW1h5OqnwZCwpd+K/lp4zXlCwh7u9h1ROBYKyFkVPF+NkzvHYDU+3TsWm9emr0jo40J7F6M590QKjm6PYVrjUlv97cVQfh0EDAQSQ57Zergp6UxY/GDLow77E0dQWptq6tFnILn+PKwbfpMry/fBODxLSyY7Vrpp8smcdCEb7K1fnZDW4w0OytoAW7zC4rdF/cvu+AWEorHJXbej1WioIqIC7drmyZ6z0pj1/z2b1BawkRp8lIX651gXIDsDfI/ppNGTAe+OxuMOn7ZliRbpmMAnhwGu/3Xv45rN6FbP63dTuwVD0EyvVy8XivJBuNbk/p5iB11aqa0pT+wjXTie7yS1qGrdneUhje+SxYdfaq5KWUf7VOTQ6cLsCIppMUXrIJDPwPDRCoETfKWbKGvS/Zi10zANXll6L3RCimAqd9qDJjpLI5NFIvQcL2NksSy0qt/2q8e3apZjeyo1akbSpoNwCMt/XJH+zdzEUQsomdXg7HlJzkO4LTSyG5DM1ci2aAdD7xXu7c9p01XiLyth1NW8277PVYoQPMP6vQchCWdWHOE+08Bvdf7FvAEKSrr5vRUEBNc9gNUc9SylH/tKzeBPMjh758mheqOx/krC6u8X6fuHlwW78bOGljv+vPX7Y/uC2FiZeRyRw907Zbw/0fcsNzoJtz9weq6KpnXD8AvdR888YYC/kw6TsyYjIPDp5WWBEmUR5YCWeDwWyONvkVfy0+avy7dupAHnj5PGslIYTxjXelaD4Vd+t4lmEXO92+itJ3IRe+/eknF56OxS2dwLEK+UEDiaWAx1YnJIrpKEtvu2Zy7KBLSvBRCqfbjuj9OZuvJLyfH4ek4F97IryuyWe9161RLp5+UrDglCk6oOCbCD+ILqR0ip6u+0OxvH+Qy9rH9SSjKGr+A7zJrn/lE1HDn7UESQh4ei3cA9ft/ZBeMnLIgi6Qn6mGIGuVQxpyONOG6545xCFlpXZv5CQ/twHNsje6voqeNiIJJQ1egVK7JtNHGnguuxehWNHh1yKk/IHDg3rsWSFuOEy5a2//DU7gVIWDTtrO6wQz0mn/KC3IXwL9j3XUmKiqFeK27CrtMnHhBu4J8f282uSDfsxapzFhBujpPZ8z3y4BtxHqROGFPusL2UYEobNWShv/UwjFbM0iVrmRXvdiFi857nPPduU+mud7nDK9QTqBwqSEr5RPMgLKfRfj2IQK57aR7lTpLV/FxU4sKE2uNVKKdHBBGzNLJDwLuFNspP42VK14cQ+4YGFw3yZvxua/3QmCv5D6GkhjaFNuAUJ5qgG2mk6onSBeW69rFmZHsN6NaPu X-Forefront-Antispam-Report: CIP:255.255.255.255;CTRY:;LANG:en;SCL:1;SRV:;IPV:NLI;SFV:NSPM;H:DS0PR12MB7726.namprd12.prod.outlook.com;PTR:;CAT:NONE;SFS:(13230040)(1800799024)(366016)(7416014)(376014);DIR:OUT;SFP:1101; X-MS-Exchange-AntiSpam-MessageData-ChunkCount: 1 X-MS-Exchange-AntiSpam-MessageData-0: DyK+4gvhpgN2sHdB6lKNlA/Kv0W9SFgKuWxEy/a/h/svGlJbU1LQL9+t9wNT4PtHM0Jrk318k5yBzGx2B2kZvFQFxCTQs4vQfYHJCfVynVMv48gwMm7sHhmEUayw8Ygu0yXYlbTrBEr/p+s8WNkyBzHSZIAJ1e9R3q9uJofMAeyYi4T7gH+iP3MNgcIQ6Cx6t4X6qMe+gTTBatD1j6FK8jV7bQw/aWfO0+S8HTFfKBrhv8nRhVRco8o65O/1jyeX5hXI+kSdKGFLheMkDeBXN8JXQ57sLHrajrYzKQRmOH5YNDVH5dhRiz93ryEJLI8mNMwMjPO9uh0QolLxY9YRs1rsNVE+pMWo89Eup7GE962kCgTcQoL4hlWR9puvy2QyQZ4oN3uZmsNkADpk7eqtx+jVUUkEpR1bK9a6KYPf/s/3Y5dcLPH/olU2KioU58TYwGzt/WZGBqUUNmak1wPQVqolly13x9zXakBTEd1SE35M4p3YLnPRomxXYPPM9/5LChxdKTwckGbg+aL1Og6N+Tisuas5mqjHTZEzly1OUgZeYxvYfGRjYjJMdweJc3pyPaYG/dhX95iqyKZrMsC5yKG/0QP08ZN8CJGYPgEEP0rkCl5qq9nW5c6XZ5cAoYQR9jr/8F2AInEaqsg0MRf5Q3bK+hDZC0ds2IvDsvhd0jXZq0IwujD2BKtej4Ez5X1nxyLP/7f4H6kmjMgnCPlm1CGdPAQzdtC4Uv/wAQDEgEJmBJuxpivLfpoaZxV+T1eG9rI5W+rULxqDKC6u+WzatNfQZ5RsEfQZnN/UYsnFOaONDpN7+5fOGj6SaOrscy/7Jfs43C5ocEvRyr9TPoeazPrlJSTn3AfgiugoPgpXCd+9WmPM19tFxZYgwwUqgOyWBBhsLM415sU7B145LFn+Ii2+5DjEe/mLMcEiPgZFd6i3GrMneVRUEdBBB3pHtIv4o/MlYoBqZ8PLoSpYhRjsNGb+7GPR9df6FJogCaslocWYbbXHmS6byois7Kd7S3v3ehXmgaGz69UE3rpFS4oXmANi8qTsJ1rXYFNH9STDKowSwo2Mthj8fuucm5e+Ffqpx0F8Ut83HsIuuW5GIpGsFgj68C4HX0NdNRe76Cox4Dp8KXODWf5OTonuvS6h3en9ESS8Nthi/aN4Rea9Lc1/MX5qHZX27TJD1TQOB3FA/MXnqsFkFsh0XCCEtIbq1mBtUk2LKbnQzEtMG+XkLO2V/Yko1VfT+aFXdAivEhKo+JF2SUy9AYqf8CczHfwwlISDVrr0cseaibfghz4Y3Hg4CcYAJZJnoa+UmqQ1NcUNdsJJVBSEkSe9LL400mtsBFSPqtuFYP5kwG9z1QGk7quvU6vVGKuStxRTmIuJgDfjM2RLk4oEAtSVVDsY94K97EhcFK1hJ5Z8fev8pX6Dbro9a9lQm+OmnufVffJkK6nOoHLqSVXKu9xHyFSTykdkk9DonN1zoRMf6CqtbZ3DrhbcLr3Te+UtbA1asfhNYj6n5IJhJ9te7tJ8s7l2/CsPYggByzkifkLbt1WIlp5qR6KEB+UXvvvBKk0YwQbMZW4C5nMJ6tMeI77RlLH0GTvcthVt X-OriginatorOrg: Nvidia.com X-MS-Exchange-CrossTenant-Network-Message-Id: cc17be02-1dd4-4da6-a896-08dd0a96edca X-MS-Exchange-CrossTenant-AuthSource: CY8PR12MB7705.namprd12.prod.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Internal X-MS-Exchange-CrossTenant-OriginalArrivalTime: 22 Nov 2024 01:42:31.0545 (UTC) X-MS-Exchange-CrossTenant-FromEntityHeader: Hosted X-MS-Exchange-CrossTenant-Id: 43083d15-7273-40c1-b7db-39efd9ccc17a X-MS-Exchange-CrossTenant-MailboxType: HOSTED X-MS-Exchange-CrossTenant-UserPrincipalName: Kg2MinCJzjQ58xxqEuYTq8pUDXDRlXqAyYiqErPqyEsELqH1H9L2sSVe1oEApLc1lQAy29mhHE867SS48raGqQ== X-MS-Exchange-Transport-CrossTenantHeadersStamped: IA1PR12MB6305 Longterm pinning of FS DAX pages should already be disallowed by various pXX_devmap checks. However a future change will cause these checks to be invalid for FS DAX pages so make folio_is_longterm_pinnable() return false for FS DAX pages. Signed-off-by: Alistair Popple Reviewed-by: John Hubbard --- include/linux/mm.h | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/include/linux/mm.h b/include/linux/mm.h index 22c651b..4f9ae37 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -2004,6 +2004,10 @@ static inline bool folio_is_longterm_pinnable(struct folio *folio) if (folio_is_device_coherent(folio)) return false; + /* DAX must also always allow eviction. */ + if (folio_is_fsdax(folio)) + return false; + /* Otherwise, non-movable zone folios can be pinned. */ return !folio_is_zone_movable(folio); From patchwork Fri Nov 22 01:40:39 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alistair Popple X-Patchwork-Id: 13882599 Received: from NAM12-DM6-obe.outbound.protection.outlook.com (mail-dm6nam12on2066.outbound.protection.outlook.com [40.107.243.66]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 5333E1DE3DF; Fri, 22 Nov 2024 01:42:38 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=fail smtp.client-ip=40.107.243.66 ARC-Seal: i=2; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1732239762; cv=fail; b=CMpSvwck1gwBvVarU+9ZEDvSmzi0RGn1LcHy+WsPYotO3coeAokppRQn5+1c1lz7RU7nu3uXePLe+BhW3gTn7LYEg6o2mO3/DGkH3VofBKVW3OlFcpmni/hX163au1pbdxA9+WZkcx/GREeAs0bjBPpA6onb4hcK/n+pv6b68pE= ARC-Message-Signature: i=2; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1732239762; c=relaxed/simple; bh=o7nc+JtoYjTSq11LxdDVJgiY9OXSsLiKJ5yIvewOkXo=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: Content-Type:MIME-Version; b=Tub93b8xYmjH1e+46TkexSULQbzoiZm1eQ9QEAMXyxpTI3ZDDbccEEDmGqAlaCGlYQBB/PGhsQ7sQoPivdkc5yRd4AjMDdSP7/vYbfcbURUMWct2lYifYmr/JkVxCVzEshYHD7KOcO0Iwoy39W3uabtlzTC5kp3O45IMxAKP+cg= ARC-Authentication-Results: i=2; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=nvidia.com; spf=fail smtp.mailfrom=nvidia.com; dkim=pass (2048-bit key) header.d=Nvidia.com header.i=@Nvidia.com header.b=YV2QLICi; arc=fail smtp.client-ip=40.107.243.66 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=nvidia.com Authentication-Results: smtp.subspace.kernel.org; spf=fail smtp.mailfrom=nvidia.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=Nvidia.com header.i=@Nvidia.com header.b="YV2QLICi" ARC-Seal: i=1; a=rsa-sha256; s=arcselector10001; d=microsoft.com; cv=none; b=MSo+WYEJ2IG4rGbiVtY9GPgeQG3xwrckWYGUlOBjcLz/Qpc76riBMg6i3Ug5DPS1WLle7RAaUfHJ5scUUylP9VnQULGQRHgiDZpzuRHnbQlIykyzspIutfOKN0UpITXMUnu3Gh0/BI+RYddOlPXFwNTpcZj4XfaljI+TyUGdwP6wXoKEAcq0Gr0EZ1jyT8LXnovqeaP//hFDfYAjAJD81+ahzwvvjPeSmwRTQhBp81XMm+aVEo1hjiL5bqZU8AJVOPRaYEYDxjiyUEPPSgpnT0BnW+2qQ9xlQIndTh+BcbPFXPgtrlLVhoPqHCqbXRrThW+3Wa+eBp8zxzGG/WRkjw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector10001; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=DHbT0IBV5ti+9abv0ddG6PJRpIdYB3WlmdHzEuvfbG0=; b=I/c+GN+O5a+Hy1xtg4oVcc4WJ6xR483dZiYla4U7fVCYi/ruy1UQlLy76VRdPP75q4LKBKNYMGqqMxxBd7yju12EPorqIXL6rB0Gsgs5fTTwVl3F9lmh9ILeSRg6pAXFtIjWKryyohCBXl0Ci5/nVHrOEmhZVzylpwMLwln9mhf+ncxj5p7/APA3EesxuF5LNfGsqof7LsI6H1WakftliQ4FEHz3ZjgN4CFAyi8eVmyFGs/oSzd+yM3AI4c0nnYwOONk6+wY4kSCCT6ikuwv6B70BMrcAFBaXrEKkKmY+J8oj5dNcJ8+Ewnn4ezYiFALrEd9YHfQ8Sdbma7m4bsiwA== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=nvidia.com; dmarc=pass action=none header.from=nvidia.com; dkim=pass header.d=nvidia.com; arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=Nvidia.com; s=selector2; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=DHbT0IBV5ti+9abv0ddG6PJRpIdYB3WlmdHzEuvfbG0=; b=YV2QLICit//RchfhDym7W/banCqlk1Km3jc/WnEKxPjH5XQlMluYFLgbaegrn+F8Su4/Xxtp9PqhSX88BqINp1NoJUQgGxeHueMXXCHvYc0R6MpAN4UHaQSYZfmE/krTkf2YOje6PP8DUlOl8m+Cl4U8boYgvRYqiBLm8VCRrL9z4in9zKsW3lKCn4CVdti6XirahzO8oAe+gHSHpo8tGPC7SwoZzcfRu8jBUNSskNBsgJm00ByWyRU/nhdV5rClmjdTEX6Jo+tPPHD7ptcYKvFmDlx73+PHgkg+pLyMF7J+z6WI9MI6ecrUk2lPNC492hIbeV438MqBzvlFJwkOtg== Authentication-Results: dkim=none (message not signed) header.d=none;dmarc=none action=none header.from=nvidia.com; Received: from DS0PR12MB7726.namprd12.prod.outlook.com (2603:10b6:8:130::6) by IA1PR12MB6305.namprd12.prod.outlook.com (2603:10b6:208:3e7::16) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.8182.17; Fri, 22 Nov 2024 01:42:36 +0000 Received: from DS0PR12MB7726.namprd12.prod.outlook.com ([fe80::953f:2f80:90c5:67fe]) by DS0PR12MB7726.namprd12.prod.outlook.com ([fe80::953f:2f80:90c5:67fe%4]) with mapi id 15.20.8182.016; Fri, 22 Nov 2024 01:42:36 +0000 From: Alistair Popple To: dan.j.williams@intel.com, linux-mm@kvack.org Cc: Alistair Popple , lina@asahilina.net, zhang.lyra@gmail.com, gerald.schaefer@linux.ibm.com, vishal.l.verma@intel.com, dave.jiang@intel.com, logang@deltatee.com, bhelgaas@google.com, jack@suse.cz, jgg@ziepe.ca, catalin.marinas@arm.com, will@kernel.org, mpe@ellerman.id.au, npiggin@gmail.com, dave.hansen@linux.intel.com, ira.weiny@intel.com, willy@infradead.org, djwong@kernel.org, tytso@mit.edu, linmiaohe@huawei.com, david@redhat.com, peterx@redhat.com, linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org, linux-arm-kernel@lists.infradead.org, linuxppc-dev@lists.ozlabs.org, nvdimm@lists.linux.dev, linux-cxl@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-ext4@vger.kernel.org, linux-xfs@vger.kernel.org, jhubbard@nvidia.com, hch@lst.de, david@fromorbit.com Subject: [PATCH v3 18/25] proc/task_mmu: Ignore ZONE_DEVICE pages Date: Fri, 22 Nov 2024 12:40:39 +1100 Message-ID: <790644f0c5acd7c413dae08514d4122142967b57.1732239628.git-series.apopple@nvidia.com> X-Mailer: git-send-email 2.45.2 In-Reply-To: References: X-ClientProxiedBy: SY6PR01CA0152.ausprd01.prod.outlook.com (2603:10c6:10:1ba::16) To DS0PR12MB7726.namprd12.prod.outlook.com (2603:10b6:8:130::6) Precedence: bulk X-Mailing-List: linux-xfs@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-MS-PublicTrafficType: Email X-MS-TrafficTypeDiagnostic: DS0PR12MB7726:EE_|IA1PR12MB6305:EE_ X-MS-Office365-Filtering-Correlation-Id: 8a1cb113-3359-46e8-a7b8-08dd0a96f0f5 X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam: BCL:0;ARA:13230040|1800799024|366016|7416014|376014; X-Microsoft-Antispam-Message-Info: Nx7Z0v7/GDN/GQJaiVe0OsKPcZokfq2XNEKpNXWy5he9kUQ6jIZU2lXSqbYmfr0uO7Pb7Fz/GW81E98vcc5SWW2A1aKa6BX7clICfzoWET/Vcp9sknhyoRnDC62f1c+LwnwQ2NzRZw57llXuLzuQMJknN7B5UhzO70jD6AY42gRMVCj2wSXYxRzYi0BEznJLfnaF2j8/eN05hDRU7XzSV4uBdkiYUp3clwNrCO0a72SWzbCxrhvxNkJ5AcSGFtp17ZQ+txwRm6jHLE8Gn8j0vOUgfMRBIrT2zDIHHhE6tSLIpIVLmx/dSbn+iaUzMlYWz6ey2XrmzCU1PJkFVF6HRo3Ka+Ka/FDK2rNavIax9KQ8vAtRTbbNSY8gfvfAvhUVoxltPBQ7NLQ5EgpUpEUh/E8AfJ+1oYRizHzLq6kgn5v16WadA/kORsfedM2E22fXPsiNu5h3Z0xy8TfJW7nZEA8bCXUX9z1IAcDxO0kn47amMCXWqyhjenP+epdzEUuHUA7dzmuePfkXtR2DQODmMTHVGOccVvPgdK4pwh2y4zXquSRJsVV37+HIdLkquLwOJk1HXFIf7OtuLeBZu/Ddx1ZhGXm+fk4gJXfM3VxO5IhXQs0jeYU+zNN6aMd+TWOAgzG7f+moL1VSDuqaCG4rw2/HeQGSPUAlt5nXQegoxIb96qHoPVaaBGLT0Ey85zktauI2X0BcJJjuuoN0QdXEemo0kiTb3RbvUI55TkwGcsWSB8nzovEe+QoDcrHIxx2afCqd/ozjsGPoLrmabzeSgxPaM7veUUzIeyWa5DEBWRGtiT8wTpzzbF+cZtG0Ei0+1vLS5lbueShrqxy7fBXG7KKkNLB5YWJncyTlTspo+T7BU2GDWUDDqmbXcW269QPTRsbW/aabJ5vGaEFANl1WGEX1f3r3KXHSWSsxic06o6Bt/cya7vC4QiKiNa2CZFANei2Z9uFwCBGjPbpZsXSY53P6esrdJjWVg1INvb8L45Vec+XjijPwicngkzVXZ8F2VrepRb0Fa30QOj/Eo8rnoHJw0Ze3M7XlF6bmoA44mX4jMd0lrwSSG9E9wNAIerTFp1XrYnGQMUUT3oca9t3AnCK/61HYNO357R9nZesw5b05kx8LSqa7S3VJRlI96NCoXvG7q+GUHNBTRIBPDsrpkab07iHTEfK8yMzVfQl+kumdHnJROUtMXtSju4r+HOiU7I3T7M/Cg0wG83yU/s4lhGRBQSsMyX41HIdNbVsbgg0dCMw7PxbRLrQ5FX3xS4qtuqmkfrZW7BQAMd9YrFvYw8GaP1Ugrrok37K49GyX3odEsEKrD/45F8pHTuC9qLbi X-Forefront-Antispam-Report: CIP:255.255.255.255;CTRY:;LANG:en;SCL:1;SRV:;IPV:NLI;SFV:NSPM;H:DS0PR12MB7726.namprd12.prod.outlook.com;PTR:;CAT:NONE;SFS:(13230040)(1800799024)(366016)(7416014)(376014);DIR:OUT;SFP:1101; X-MS-Exchange-AntiSpam-MessageData-ChunkCount: 1 X-MS-Exchange-AntiSpam-MessageData-0: /fhSoXrQ5W2FzrDq5Vv85PAV579j8x6dftC3ODtciE4i96ODz+9DNXnBEAH6pvHYz/ZCiWQgb/+wUAlR6VNYSyctrK6lzIJJr8Pq/JqUww+WhUGv7YFDixUfoqXxeREqGyu1UB/RQC71nBWzF1VKFnMG3UyFxTAJ+9Gol6Tti3ZROmdF71EKNjqTtl2ljBCryib0j5y9VjE1wfQ5mY3wmOJYYzGrgnDzE2jgJgAgWXIHM+2tM8R1rSYrx2u/DjHvhCGxp36Gzv+xABWJom1IN25JSY3TxNsMjMSwdTxyjhP4SLWzdr8Tjj/8EAiCUOtmUnV9fveLv/IpnjQJOu8oooRywVxOnZDZSy/cElmd8C26mtYKYEVXCNAUUuXa0LnyKlYS0R1cKVjjNNMtfIMzBGGQwzr1oW4YpZHzvfdNmjmHWf44RQSFJfsjGCBPe3qOCpRZN6/sbanaLvJHtozSibHfGTgAya4yRXmuysPWtRIz7d6QJetcfcLSb0naFq/0TvPrRHf27j2leEu1xm/Md7HT9wWQpB2hBGWBPQADJl1RPDuk7YuAs2plEkM1LT+OwUTeOkV3eXyeJDSuR7KTWj1otBemjeF3RiIMEI39083WkW2obBP0zQbojzP85m4rpTe5BTQVS6gIBD24Noe9aSxffQwL2tM9YcoQZWZfBG9AFoocNypNy6hHQ5rIUTCBfm7ojE6pP5BF+pQvEjhGzsTNrbnCqDHWpI+NIeC+CmZRtsKWVnVBQqHoBWcAK11zK24NU28XYWk14jxIw+mW5DGBD9zEWbokBPNtJ9o9DH0GIVI7XCk620UGBRSfH49n9qwRW/hKTjGCnScxFI55+xk+kgP7cE6Bmq8VGJs5AaTGex50PP3YeJG89umcfKwwKUCnBvjLjq8Hz1d9J46rDNGQRzG/GrXpxktbNji576wDbUYmurylvwAM2aMS4EnsdfiuKsQowgN2lczx1VUB9f2Xbk6a7Zt6d8kpzXrxOqdYVyYWog34Uyidz0a7KPCOBMuZnnu1zkZkEFCD5peAz2PuS1b2IKlgiViORZNOXz7RrPflBdhvdQQpcTzXRxV3DF8T+GIVi8G0KjfMTwe2sXFD36sEeHU38tR4FTsH3/P6/WfgdyUjp8XaC5DgUYtsSEXSg0wPimjCkgJzUIki8BbzUVH6Xr3suL8rZkr2b5P8sg19rUeqkqh5RC+zwUvNRbFjsbX94/dj5jTHRkwv/tA9WLe0CIerQy5pATqbCwbdTbOkC5qKUL2OanRDjnImlimoDyoEJfhhUcbVqOWCZcQH49/5jNXw6L0RxJevz/kmiXC7odgLuIlbaMXlRnaYfIqE58tUV+ApI6C+6+2aQRMzZhnqXIi5Yyit0RrQ9sxocS3523jUZX542zQrm5kSjJJwKrGDm/tdRLh9Jtyt77J/jPO4oyS1IK3MwuwKvwi8vnZ1I1Kowc/ZT7EMtn8DpKoddTr7WZ4242OYYFfoS2vSWPil9Bs0//6Ay3HGQywh3ypLAkOpEEBOsBZluz3QhNkR1HGF1UgMPotJYfUXzDPTQjTHkUYtBiXeqLTTUo5UYgtAaIUWmYG/ieN1QyZY X-OriginatorOrg: Nvidia.com X-MS-Exchange-CrossTenant-Network-Message-Id: 8a1cb113-3359-46e8-a7b8-08dd0a96f0f5 X-MS-Exchange-CrossTenant-AuthSource: DS0PR12MB7726.namprd12.prod.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Internal X-MS-Exchange-CrossTenant-OriginalArrivalTime: 22 Nov 2024 01:42:36.3524 (UTC) X-MS-Exchange-CrossTenant-FromEntityHeader: Hosted X-MS-Exchange-CrossTenant-Id: 43083d15-7273-40c1-b7db-39efd9ccc17a X-MS-Exchange-CrossTenant-MailboxType: HOSTED X-MS-Exchange-CrossTenant-UserPrincipalName: ZOWXPQ9TA6lNmW5b6ZFVDr3wRcKccLG2ZFvVAdkcfhzyg9GWNi6zS3zJfevcBq/nlfK8lK7WU69+2aBhpMYdmw== X-MS-Exchange-Transport-CrossTenantHeadersStamped: IA1PR12MB6305 The procfs mmu files such as smaps currently ignore device dax and fs dax pages because these pages are considered special. To maintain existing behaviour once these pages are treated as normal pages and returned from vm_normal_page() add tests to explicitly skip them. Signed-off-by: Alistair Popple --- fs/proc/task_mmu.c | 18 ++++++++++++++---- 1 file changed, 14 insertions(+), 4 deletions(-) diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c index e52bd96..374e976 100644 --- a/fs/proc/task_mmu.c +++ b/fs/proc/task_mmu.c @@ -801,6 +801,8 @@ static void smaps_pte_entry(pte_t *pte, unsigned long addr, if (pte_present(ptent)) { page = vm_normal_page(vma, addr, ptent); + if (page && (is_device_dax_page(page) || is_fsdax_page(page))) + page = NULL; young = pte_young(ptent); dirty = pte_dirty(ptent); present = true; @@ -849,6 +851,8 @@ static void smaps_pmd_entry(pmd_t *pmd, unsigned long addr, if (pmd_present(*pmd)) { page = vm_normal_page_pmd(vma, addr, *pmd); + if (page && (is_device_dax_page(page) || is_fsdax_page(page))) + page = NULL; present = true; } else if (unlikely(thp_migration_supported() && is_swap_pmd(*pmd))) { swp_entry_t entry = pmd_to_swp_entry(*pmd); @@ -1378,7 +1382,7 @@ static inline bool pte_is_pinned(struct vm_area_struct *vma, unsigned long addr, if (likely(!test_bit(MMF_HAS_PINNED, &vma->vm_mm->flags))) return false; folio = vm_normal_folio(vma, addr, pte); - if (!folio) + if (!folio || folio_is_device_dax(folio) || folio_is_fsdax(folio)) return false; return folio_maybe_dma_pinned(folio); } @@ -1703,6 +1707,8 @@ static pagemap_entry_t pte_to_pagemap_entry(struct pagemapread *pm, frame = pte_pfn(pte); flags |= PM_PRESENT; page = vm_normal_page(vma, addr, pte); + if (page && (is_device_dax_page(page) || is_fsdax_page(page))) + page = NULL; if (pte_soft_dirty(pte)) flags |= PM_SOFT_DIRTY; if (pte_uffd_wp(pte)) @@ -2089,7 +2095,9 @@ static unsigned long pagemap_page_category(struct pagemap_scan_private *p, if (p->masks_of_interest & PAGE_IS_FILE) { page = vm_normal_page(vma, addr, pte); - if (page && !PageAnon(page)) + if (page && !PageAnon(page) && + !is_device_dax_page(page) && + !is_fsdax_page(page)) categories |= PAGE_IS_FILE; } @@ -2151,7 +2159,9 @@ static unsigned long pagemap_thp_category(struct pagemap_scan_private *p, if (p->masks_of_interest & PAGE_IS_FILE) { page = vm_normal_page_pmd(vma, addr, pmd); - if (page && !PageAnon(page)) + if (page && !PageAnon(page) && + !is_device_dax_page(page) && + !is_fsdax_page(page)) categories |= PAGE_IS_FILE; } @@ -2912,7 +2922,7 @@ static struct page *can_gather_numa_stats_pmd(pmd_t pmd, return NULL; page = vm_normal_page_pmd(vma, addr, pmd); - if (!page) + if (!page || is_device_dax_page(page) || is_fsdax_page(page)) return NULL; if (PageReserved(page)) From patchwork Fri Nov 22 01:40:40 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alistair Popple X-Patchwork-Id: 13882600 Received: from NAM11-DM6-obe.outbound.protection.outlook.com (mail-dm6nam11on2061.outbound.protection.outlook.com [40.107.223.61]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 9A2101DE4DB; Fri, 22 Nov 2024 01:42:45 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=fail smtp.client-ip=40.107.223.61 ARC-Seal: i=2; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1732239767; cv=fail; b=ne44d8DlTdEciZGjOVXwyhbPpTYzxQpyZAW7f+sHTBUPNSWh/MQWooMnvO2Rcx0MGYxsvYXMNd/evDs5UQs+QJvZwEYkCgtn77OVjFuicyoTl83nF5N/3FTydVxDYC2EaX2sII32wdJBpRS99cINvBofT6qNlL/KPG1V/Lkd/AI= ARC-Message-Signature: i=2; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1732239767; c=relaxed/simple; bh=uuDYOYtLeYImPLZGtxb1CM6J6IIMGRWkau+AJyJsks0=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: Content-Type:MIME-Version; b=akoVrHOVG7ixN5rL3fISrMNWHjJ2rBMb3liI5NdOasz1XApr7uGQSuwAUEvuBRAxHlpndHgJF0DhkKtwsLfgpBC0hjG8r5UEQ4Q89IyEFwcy9B3X51KO+Hhxq8pyzzegNSGe/NvEz9sPeeE/PjyJtoGRbfu5lSRsvj2UVRWKlXA= ARC-Authentication-Results: i=2; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=nvidia.com; spf=fail smtp.mailfrom=nvidia.com; dkim=pass (2048-bit key) header.d=Nvidia.com header.i=@Nvidia.com header.b=c3VUCeoj; arc=fail smtp.client-ip=40.107.223.61 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=nvidia.com Authentication-Results: smtp.subspace.kernel.org; spf=fail smtp.mailfrom=nvidia.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=Nvidia.com header.i=@Nvidia.com header.b="c3VUCeoj" ARC-Seal: i=1; a=rsa-sha256; s=arcselector10001; d=microsoft.com; cv=none; b=vA4CRiuZuT9v5tT/N3Q8qIXkPC/Yx1pUAsWaoPZtzbmKfP6Jc4ZL9yNYjAQvF6fgeqXmNA5N6N9m4ywSodcEUUZ6KtW7VRWsHXQj6iaWYjOVnsSh5hjSjumgUofcIYNJFEgODJwP+SPFVNI4va5TGBmgv3bFKII+QTNzw63k6mGauhlL1sOG6MT16ugtns+tL8rz8/wsGLDJfBZcQ6LXWK//REaF2U7QK8v9FhYNfUThOJeWVj5rJNZtN1702ipqJhn2oFNWSomY0b7D/y3B3AkXA+MfjIA4V1ZaNM+TEzl5JEIX9WHwFjvyaWb/jNlD3MmtmC1OH6C7WVoxuH09NQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector10001; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=e2xU6oSD+FHRjg+/4Fy6dbYJdX7wRP3wZtGIxoMZTGc=; b=XanYBd/DxP2D4LruCV/z6N5pDOUdcs9coREPv37BAboIpdTRCSjeA1u0SW37509Px3Xbeq79W47q9elJSbM5lUeL30daNNixgeflrcIOl0xlute+CuXel146Su/IiUZ1VQz5vh+mCkKZ6gPcUTlrgUqKhmA/qeei5VTsdk5qWDf0w67ZjdUvWa2/eWA1yUI/vJSfl6hORrWSa/FYbUY+ZwIeDet11T3NLtE17lTi4Y4Xu6mdInn4jbKQBBIpg6Km64HiLDo5Ol6Vupivw/+0PHDt2J45RHlyvpKf2QoRfbCaB3WkujuFFYEWoSjLJGqumYnJnL68ndjMdJ9AASJbIA== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=nvidia.com; dmarc=pass action=none header.from=nvidia.com; dkim=pass header.d=nvidia.com; arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=Nvidia.com; s=selector2; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=e2xU6oSD+FHRjg+/4Fy6dbYJdX7wRP3wZtGIxoMZTGc=; b=c3VUCeoj6yF99BOq7WkrhuCX0rlQMoq+qEnuZCC3vH6QQQyj0+10iCT4URlVl/ouiaz1TYjDRyuHMj7n+5DY79AP5ofhOSq+qmVSvSG3F6bbiMNjl4GvMoYnsQKSEGC9WX9AY79a6q9UrNG6y2NviQL/1PtwLLgtL6TChwNgLhiCCdEYGSkj6IZA+RvX9Dzu+8rIC0rJo0aLUS9GcXdEwv19tT1a9w0ICymLwn27ul+iUcRZfDboHn7CsVxufWIeB/BgRNm5d+JDujySBZvzqoIMZdRgBQ0eJzjLU81gx1isJSBOXUwHYJhz6TVJsQTwYUgzoBXTo1rHV3P90d0JdA== Authentication-Results: dkim=none (message not signed) header.d=none;dmarc=none action=none header.from=nvidia.com; Received: from DS0PR12MB7726.namprd12.prod.outlook.com (2603:10b6:8:130::6) by BY5PR12MB4131.namprd12.prod.outlook.com (2603:10b6:a03:212::13) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.8158.24; Fri, 22 Nov 2024 01:42:42 +0000 Received: from DS0PR12MB7726.namprd12.prod.outlook.com ([fe80::953f:2f80:90c5:67fe]) by DS0PR12MB7726.namprd12.prod.outlook.com ([fe80::953f:2f80:90c5:67fe%4]) with mapi id 15.20.8182.016; Fri, 22 Nov 2024 01:42:41 +0000 From: Alistair Popple To: dan.j.williams@intel.com, linux-mm@kvack.org Cc: Alistair Popple , lina@asahilina.net, zhang.lyra@gmail.com, gerald.schaefer@linux.ibm.com, vishal.l.verma@intel.com, dave.jiang@intel.com, logang@deltatee.com, bhelgaas@google.com, jack@suse.cz, jgg@ziepe.ca, catalin.marinas@arm.com, will@kernel.org, mpe@ellerman.id.au, npiggin@gmail.com, dave.hansen@linux.intel.com, ira.weiny@intel.com, willy@infradead.org, djwong@kernel.org, tytso@mit.edu, linmiaohe@huawei.com, david@redhat.com, peterx@redhat.com, linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org, linux-arm-kernel@lists.infradead.org, linuxppc-dev@lists.ozlabs.org, nvdimm@lists.linux.dev, linux-cxl@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-ext4@vger.kernel.org, linux-xfs@vger.kernel.org, jhubbard@nvidia.com, hch@lst.de, david@fromorbit.com Subject: [PATCH v3 19/25] memcontrol-v1: Ignore ZONE_DEVICE pages Date: Fri, 22 Nov 2024 12:40:40 +1100 Message-ID: <86b1894c81725688cc54afc8019ebb3cb3b6dae4.1732239628.git-series.apopple@nvidia.com> X-Mailer: git-send-email 2.45.2 In-Reply-To: References: X-ClientProxiedBy: SY6PR01CA0134.ausprd01.prod.outlook.com (2603:10c6:10:1b9::12) To DS0PR12MB7726.namprd12.prod.outlook.com (2603:10b6:8:130::6) Precedence: bulk X-Mailing-List: linux-xfs@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-MS-PublicTrafficType: Email X-MS-TrafficTypeDiagnostic: DS0PR12MB7726:EE_|BY5PR12MB4131:EE_ X-MS-Office365-Filtering-Correlation-Id: a0400356-4ae3-4c76-9b27-08dd0a96f3ff X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam: BCL:0;ARA:13230040|376014|7416014|1800799024|366016; X-Microsoft-Antispam-Message-Info: BgTh3UJqMWcTWWQHMqyDtb11ZYC5LrHScWl00Z5dzqzn5Te8Ye/A281z6SIftYRtJESDk5IVKroP3dZNIyhalN/Z7holHo9JCKJKdNWSOQxROudA3+xTzEXrOcgxB1j147C17d5c/4bLnYerqIQJ2N/2f+yJbtG4nyXY+pkhbQWLYUyyWJJTjpyPyQ3vGz6NQNXWvE/1tMNStLHirNC1zdguN8v9e2tDTiQsoouqH9i2lkCNGDmCxGAtNt3nRoMIH98+yn5Mu9vh8vY0cQPbgdLFdkZj8dIfW49u236inM3V2A3NTxLoBuaKE0zyYC4OX/ptfJKm+Z9guDHj2qGTUgT9mwILmtJ+HPDffzrtvGFs94hKOhsU10gFL8q1zYeSFyj3ieOLa16tQkY2l1SYYNGECpi6T9unJwau9MuSmL7QBo++KaARY7cRQ/biPK2vSGiyuxKNhVeMxoCCgsQYbUD0t9FlSXN8xxPHj3HRHfxd4Md5wHnh+LPy/qX5iKezAHOjdP7cFGioKVoNV5Fi8TrPVVwiXXMDfYN3fBQlL7nM0et9/U6Vn+HxUMEbskGThVCQWeesGx2yFcHz0x+ka4EbGQ33b1d/ckQHje0rrBH2v0kFxKXSr9SqTFqddD5wA8F7lRQs1/XttQd1OgONFvmm03rASg3ydaOWDwW3pUy89Oja75oLNG/c+CMB3lqxQVxShtVF/ibM4w7d/Xg5C6cL38kmDWagD3lNqB2BB1heHxclWRfCYdUMzWNcOD2FG03ICAS8UNB2U572MXcKmAKf2Drsl1oUN/xBHPNwgID6l76gbKuFlksRlpJWHzeU7FFJ/yR0ArhDQgmkLeSOhDFbZygPj+t2D8Kr3dXi2TKKAaD/Mc2xg9iQMVG+fYqwqj+2aTUJ2QGC6SeWORMFjNXSinnNe0LKJRkINv6hn+ACGKRdyw/8N04fbL2gtr6tpdzDNpR1+N2zUdFwu3DSb/K454dIqAFmG5pOWowXUx1FeSgjhlWzuSALGYh7p5GbyJVUGRMYEWCefwuHulKX5nFvE6NCHgL5RC90oaIsT4pRFY8VLNNXoJHqyxy4GCe34t+DDK6OKkAzWqn+TRwMXrnl3952jRlsi/0IGzCuwA1GgihlSYPjjX/AopGCSfHHcCId3n7FGEwUM9gUOkfAqizkgvEdKiHEHELKuRi8Otirp5MDlJVmIXAqArAZiWNELghIYSH+WAWz8U7F6p8YKrHmBC4hncQgCyOVvPjEZN9JqiEcaCTnBY/mwYDhRkdZ6uQuaRILSSqao9Zgkx85GY6SVuXe4pRlwu/TfUP2BEe04+e6Iab+Aq1k+te7POM7 X-Forefront-Antispam-Report: CIP:255.255.255.255;CTRY:;LANG:en;SCL:1;SRV:;IPV:NLI;SFV:NSPM;H:DS0PR12MB7726.namprd12.prod.outlook.com;PTR:;CAT:NONE;SFS:(13230040)(376014)(7416014)(1800799024)(366016);DIR:OUT;SFP:1101; X-MS-Exchange-AntiSpam-MessageData-ChunkCount: 1 X-MS-Exchange-AntiSpam-MessageData-0: gG+P7NKv1NBMcLXVSZFeSqIEzlZzrtsJwVSMNPc3SgFsqNKJkYAvpm0jtdKtg1gGL0SXjJuxEcqC0re+1IAijwxOtL4oLNd4I3/GgKy0hL5YejWATZDKBu2LNJG8ZJsgiY4M1QEXaJrj7HFpxz/o5XVGNp/ua/AsBf3bziGYuGnmaAaDfWfRJifWV3vgG2quHIJBFdrajBk+JgYMZxCUz2WFsmhok2NaLRnzbS9iQoPQ4Wqe5F96OJgjEIJL+DG5MK56nCxupQkKOdiCLExj8zJp/2UQQ7RdvIiGcgmkOqbQl9RUe82V9/GFrs8H/e7j88rTT4dUj4KlkEdphwieULNcgUrDI1xIpTVleiTwMlQCgqO06rwr0tzHUwLgDq86Qd7DTu8Dq7jVhqpbvgo4xFZKQHeaGW0T1rG7PiKAQL6OFLC3O81zqctvy0j/Dg2vvtRB01fN359adXdv3bRMiq71BqnRs9U05GtMLqzVuuuwALVA6zfJMYa+dYJHs+P0k6B5KnU5kVbRHV7luZUvI84iOcGfOOpa8mPzkArKYS0zkkoE6Wssuqgeoerr4+0ce4ov1n63o4nBDk6iPpZQ9bmxgxowy6nGIRfzNyodlpOoNgUEIT9T/m55g62aVi2yU40AzFbyLZY0Z90RC9H31Jv6IHFZyGt7ZLoWQ31jpwXTUSdb6mdFMUrN4ULyXt4SpM9vv68bp+rXT7cjT2ByaghsHTJH9iLw7EOJlMg+G5sRWnjkLttgXuUwDRY5cEo3NA3/JL+FuSV7bAT5HJjJLGOya141SpCfacd7U2viWXaEL33icFAuJZDjvqYvruyjJEBe0+NSR5Q+MYSE0Te7zh5jOVXHXuEKyDw3o3IZ9w2IQhW0WRuNh7WIkZdL87CD2KSTwLHHoYovi9gPyFMoWPMMpBdGmxjRWhVqhFT6z+nMXo7weZu1JwXXDCOOw0qBB+0TxQEmgEGuOf27MA1cQY/hkOp2tvCTOgq+7ZZWLNg5kYOQfXmRH8BRAW6X9+1242WMn0Vap7yct62zVBXZbw9JmELCO3QDYQ2RcHdwCi+t0+V+wMV81H6DwfP6XSncXa7OeZEmO1r9QYwz3VaslshjdWMOXiJZnI94nbA540P72HPhHuUTXjD+cIuVmFb4llPQ/Npi9fgq6q8z0055djhc4NAS6eyg4Wcad1P2CgRu7o6FXfxGXt5OP9TRzwtNKq95XPVcp5O/sd+BuJuu3POXK8Ch+OLq1qaw/k12zW3HfKmy9PSuZV0atWKsjZ2sytrlM5kc0LSJHXMyFEg9vVRypTmf8LGISPkSZ36l/34cLZzMsegufucz7GYz28DVcQ4W7YlHS2n9TFQlML8maq0jOh6I83xScwuALPJgVWqTHvVGFAz4Oi4TCWEyyQ9WmJy1iwNy8Qm2R2gjaFrofmhYbW3/CEwNrTG2u7Mp2gavofbRG02nrjCSr1fO9/xaTCb+I2V0R3nCrF5hfNHsKBSbANIv8TOb0KztYJ9K+x/h0+5AoGUvnXuZAx+/2PuSPA76B07zbIs1P/fV7IPSeO+iAAf0gjTPZRH8OGsmkh5Rzx7TMBpF2XrRmOSBNsiz X-OriginatorOrg: Nvidia.com X-MS-Exchange-CrossTenant-Network-Message-Id: a0400356-4ae3-4c76-9b27-08dd0a96f3ff X-MS-Exchange-CrossTenant-AuthSource: DS0PR12MB7726.namprd12.prod.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Internal X-MS-Exchange-CrossTenant-OriginalArrivalTime: 22 Nov 2024 01:42:41.2651 (UTC) X-MS-Exchange-CrossTenant-FromEntityHeader: Hosted X-MS-Exchange-CrossTenant-Id: 43083d15-7273-40c1-b7db-39efd9ccc17a X-MS-Exchange-CrossTenant-MailboxType: HOSTED X-MS-Exchange-CrossTenant-UserPrincipalName: RJnoaEdfL5nvpmzDUnjqmrC4/KNz1GaFsgxUDhSTMgcAKYYgN2R/jHcWvWEtBmQdN03aJABY4mS4f01AbBYs/g== X-MS-Exchange-Transport-CrossTenantHeadersStamped: BY5PR12MB4131 memcontrol currently ignores device dax and fs dax pages because these pages are considered special. To maintain existing behaviour once these pages are treated as normal pages and returned from vm_normal_page() add a test to explicitly skip charging them. Signed-off-by: Alistair Popple --- mm/memcontrol-v1.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/mm/memcontrol-v1.c b/mm/memcontrol-v1.c index 81d8819..b10a095 100644 --- a/mm/memcontrol-v1.c +++ b/mm/memcontrol-v1.c @@ -667,7 +667,7 @@ static struct page *mc_handle_present_pte(struct vm_area_struct *vma, { struct page *page = vm_normal_page(vma, addr, ptent); - if (!page) + if (!page || is_device_dax_page(page) || is_fsdax_page(page)) return NULL; if (PageAnon(page)) { if (!(mc.flags & MOVE_ANON)) From patchwork Fri Nov 22 01:40:41 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alistair Popple X-Patchwork-Id: 13882601 Received: from NAM12-DM6-obe.outbound.protection.outlook.com (mail-dm6nam12on2085.outbound.protection.outlook.com [40.107.243.85]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 6141D1DE4FE; Fri, 22 Nov 2024 01:42:48 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=fail smtp.client-ip=40.107.243.85 ARC-Seal: i=2; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1732239769; cv=fail; b=Q9AqoOaShcebBDmHpMJ2nUOCNA+mifUl0GHYhf8G6/RXbGuSNNyEiWCKVNuk65w9PoBpo7D1WDYoyxoRfBl6WMrMIxqgeDm0mJjmL/NwJTPC2UFzd6iiJ7AoQvUG/SAF2SOBOMTBQ0iZkJH9wLkmn1pVbLWsv3M5YnO36zq7i6k= ARC-Message-Signature: i=2; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1732239769; c=relaxed/simple; bh=TL3YpTkyvb0TrsNdadVavaPZVXqghI3XosqqFtixP2w=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: Content-Type:MIME-Version; b=fgUxg3Zdjm3UofdT6Yv0EyqK9JMohQaCYeJSjnAMVwS1uV5OIPlwsOb/JZ2daVyp0jxpk+F6qEeUs0aMeVTDeq+yOap0giGHwvkKHVodpltY2jZKPWra54ff6lTPQTdmDnnVTuiagGsBS+eJ3vryM6eCl/VMYUouxHntJ9Am3bM= ARC-Authentication-Results: i=2; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=nvidia.com; spf=fail smtp.mailfrom=nvidia.com; dkim=pass (2048-bit key) header.d=Nvidia.com header.i=@Nvidia.com header.b=FLBGCStn; arc=fail smtp.client-ip=40.107.243.85 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=nvidia.com Authentication-Results: smtp.subspace.kernel.org; spf=fail smtp.mailfrom=nvidia.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=Nvidia.com header.i=@Nvidia.com header.b="FLBGCStn" ARC-Seal: i=1; a=rsa-sha256; s=arcselector10001; d=microsoft.com; cv=none; b=PgUmX/DVdqcIn30PYD6vuZuFZI2UsPq4xfQgGT7nNQS84DK/7pH2E/+8KjMsUhKP6cygPBSmNeIfpF2cp+CKSg3xmBO7jP/0Lq9Qx7AQQqPKptuKaInyJ8ApZtQETOpAvElCjxIA45gEZXXRl9TWo0JdsHFxAaLB58mAC3iXEDinRJcYCfabv17aVx5AAjBnww/bfohXDmQBIVxY1BnEc8U6RM0lwFtG+rf3x84QB3Um5H1lvH0Hokrk3+Iv24sh/QuPc+6VZI0tmfUhkzy6QLYou06wfv4ksmGfQWqbLh5G+UcR9osVCre0lzelFvWXch/LpZ+RLakld0zxU9cIPw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector10001; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=7Ie2UfWwGOlQGgWf9J0Trgw3+Mxxvp09CYeo6TMsj7w=; b=fZeLGvFdGgxG/D5ygjADVYrUKtTFbQXhRTf3RuOUer450Pa6Sj4UuWz5V9gaDaSOKEcKGFxwyz1fEn5X+nBsd5XGSpDvRF0SDTGUAAW7WV3H24jrwrSKaXT1FHcz5tPEy8h8FLefuaR6f+a796LOIn9YYB4yiNhPnS4b3rAfc+ZoGiIk0fbda92FxTE4MuhLlwQdy88U34z4IHqBduLoFXKFu7GA/Y/1F4DM9SZjL6uMZ1evW56pneXlnVwWOpL3vZECTZLT8UBJ6re0I9UioD+vzrAhwlpHgT3DLZWulQ5fBexQr0O/OpyeEL4PljdlujXQ6zBX535zL3oqFyYJkw== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=nvidia.com; dmarc=pass action=none header.from=nvidia.com; dkim=pass header.d=nvidia.com; arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=Nvidia.com; s=selector2; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=7Ie2UfWwGOlQGgWf9J0Trgw3+Mxxvp09CYeo6TMsj7w=; b=FLBGCStnKRvqHLUczft/RorWlDtwirw2wYyg/3V9o22N+dhgfDPEhg/sRlVrcozqX0WCjc4+rV7QqAJfC5vVGOQUBWYs6Vdu6hfhDrW2A6xcVfkp2YYywaKCyZuADvVGS0/X2WXmSrrvdFl+b07Dt62FsiVGPcicz8ZCB14rT8JSByYq9CoTWdxaxbhbbcYDrDfLRPNu7IjchelKScf/9YL1RC1XcEL0n4dU6o6JjXqAXHEuCQdIjuhuJvSJTjq+pHnYTV3Awfpcr0Oxm8Yp9Y1csii5FqKYCXQxkI9PnmgLg1N0gBKMXcFO8JX/wk9uXCyiKWykL/aEYk6crxK9nQ== Authentication-Results: dkim=none (message not signed) header.d=none;dmarc=none action=none header.from=nvidia.com; Received: from DS0PR12MB7726.namprd12.prod.outlook.com (2603:10b6:8:130::6) by BY5PR12MB4131.namprd12.prod.outlook.com (2603:10b6:a03:212::13) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.8158.24; Fri, 22 Nov 2024 01:42:45 +0000 Received: from DS0PR12MB7726.namprd12.prod.outlook.com ([fe80::953f:2f80:90c5:67fe]) by DS0PR12MB7726.namprd12.prod.outlook.com ([fe80::953f:2f80:90c5:67fe%4]) with mapi id 15.20.8182.016; Fri, 22 Nov 2024 01:42:45 +0000 From: Alistair Popple To: dan.j.williams@intel.com, linux-mm@kvack.org Cc: Alistair Popple , lina@asahilina.net, zhang.lyra@gmail.com, gerald.schaefer@linux.ibm.com, vishal.l.verma@intel.com, dave.jiang@intel.com, logang@deltatee.com, bhelgaas@google.com, jack@suse.cz, jgg@ziepe.ca, catalin.marinas@arm.com, will@kernel.org, mpe@ellerman.id.au, npiggin@gmail.com, dave.hansen@linux.intel.com, ira.weiny@intel.com, willy@infradead.org, djwong@kernel.org, tytso@mit.edu, linmiaohe@huawei.com, david@redhat.com, peterx@redhat.com, linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org, linux-arm-kernel@lists.infradead.org, linuxppc-dev@lists.ozlabs.org, nvdimm@lists.linux.dev, linux-cxl@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-ext4@vger.kernel.org, linux-xfs@vger.kernel.org, jhubbard@nvidia.com, hch@lst.de, david@fromorbit.com Subject: [PATCH v3 20/25] mm/mlock: Skip ZONE_DEVICE PMDs during mlock Date: Fri, 22 Nov 2024 12:40:41 +1100 Message-ID: <0fc7bd0c1f7aefcc5959fcb2a1bfd5a64e022d57.1732239628.git-series.apopple@nvidia.com> X-Mailer: git-send-email 2.45.2 In-Reply-To: References: X-ClientProxiedBy: SY6PR01CA0131.ausprd01.prod.outlook.com (2603:10c6:10:1b9::7) To DS0PR12MB7726.namprd12.prod.outlook.com (2603:10b6:8:130::6) Precedence: bulk X-Mailing-List: linux-xfs@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-MS-PublicTrafficType: Email X-MS-TrafficTypeDiagnostic: DS0PR12MB7726:EE_|BY5PR12MB4131:EE_ X-MS-Office365-Filtering-Correlation-Id: 4f782ee2-3bed-4b2b-6020-08dd0a96f681 X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam: BCL:0;ARA:13230040|376014|7416014|1800799024|366016; X-Microsoft-Antispam-Message-Info: VZvWrXcGhNKiNbJGQJpvjpoi0c6zD4a8i8ZULLHWFVG6xBQoySPjOKeyzbFPXfupxtgo3rEPGxAewS1WriJJ/SNfSrxILkihjmN+HijxPIccnUuOYIGS9eiFbwAqCC9aMpgmEp+nddFpBBbiT5YV7fTOOuOCOPmQ0bDqDX3STB1BkkZKHMJMYApamUNZvswCSdjaHR+ReQJ5YtCs9e1GBb/WvuRg0gQMNb0hKF+3VOiGeJPahkJqzlNNMZRKoDO5BWBy7tFJh/7mp2AqVEsGusxTEsSbais14dgPsLTe9WZGCW5hsnn9xu1uJK8kObci5b1iK5rdY7UL0TiXcLReZ2sLzv+44QuE4GgmePthOc4fY+W3CnQypaog571BAjbrNIMogqBVp5MmJmHUMEISOKs6Z96XL1JWhHh+MxD8NOePqZodGGWEYMpBkCeYEi8oVum/LVzc3JGyO4RVtn31HIZ0ETPPzj6mNG1j8vjT8Td2h89D8R+iwNI3494s9x9UEZUIE5Srxdkiap+KUBdRarNcQ51J1tWdENCNFt81FV/EecuwF9KQlctiWQ02w4IQ9eDbqJoFfbtKz6Qw0vep6BtsBffVphCoEvZSUHDe750LLX/r6922KvRKZm4d7eJpCzOe5aekHysgV1ATDhcUZ4Sb2Ok/PDnoZDEgqlXTW+yXod5yO7XFfjaaf2gFWzAwrOXUu7A8dnuRgrY911dexOMcMWaE0dCq5Vto/mK+Ri5ZAwY/xc57XtmUJ17pftIy75E1/4HodWGU6CZJepcQ9pmI9eD1FEkPZcVNaIKwAB4YGHAGSk4zpU8o8o017WQBwxDmT2bzIHkwAmyIjWy0dduvzGPfwWT5CUJrEHeyB1ENzz0/1fVgUyu7beosAA6V4qPbVlReKJHVYsvpYXA1YBEq/sbsVqZwQ3fi37+pomUY/1o3fIRW60jEDJyofTYhk9c9TVowOlqWZHuJpXFKmWi/M6DeRFVYJu01MdKe8AaDxyPK7M/vlYeeejxa2P0RPDOFWB2toPgh+NVCh7dZuc+Vk5aDbIjiX4BX6fYddeq7gyzGqf2b9VkCmcG2AYU8hDKZmkz0aQe9XNHkTcesIgvyz1dI0lpotzXxrq5x9gVJL8Aeda+Ba1fvXdTTvksWTfJ4Trqdr14xV2fHy30r5DHbBaa10aWrybJplP8QPN/vRQyi/tOYzu5F7w3fBR/VpjcPITcQCYVO8sPx/r3CNPkQsxmHzaFCtM6OD1bPY6n+J6pJeYqLUvGwRTsfZTwujG3jbp+x/OhX3M7wtDtw097qjaD+xM7b8gKbkYImuIXuUodQU6ryAVPmUkB1h1vY X-Forefront-Antispam-Report: CIP:255.255.255.255;CTRY:;LANG:en;SCL:1;SRV:;IPV:NLI;SFV:NSPM;H:DS0PR12MB7726.namprd12.prod.outlook.com;PTR:;CAT:NONE;SFS:(13230040)(376014)(7416014)(1800799024)(366016);DIR:OUT;SFP:1101; X-MS-Exchange-AntiSpam-MessageData-ChunkCount: 1 X-MS-Exchange-AntiSpam-MessageData-0: k6LNpSfqA3Vxz/i4L9MDNT8Yxcuxgoxf0Y90r8+SSfj2uJYNF7arRgldAlcXML2F+GVm9/viTMd/v1e21HOkE2BVjIahI/Tye2aLxHX7gDNIjDPLIDMZ7P1RtncC68yTj0YHmKMT9eNZdHXeVRwboMtNDTyujkm0E0gUuYxnr4BnQGYI7csGSdmlDBasn6ggLwmhFfKMSrgCg0N7VYN3ALLkerghjRHs4hyj7LCFzQeRbsjA3SpeW8V5HZNRYC0y9w7qTFN4almRvpd2+q5pccR8ZD4UbMS5PJH8z2zUJkFN+cl31x9/ZCleP2w7c3UkaIQ3mGyJBL2Hi8j2yLbswm864EuiOKH9bTyyfY+L1faw1ifzlK1ZgZoaVnrxsH6kL+tyrof5EsX514n6TmL1wHQDoKxHQ1JRAZZnbR4rmwkOc+cZHSnRwzS9NtdEuj28dOGfNVUPP/m++zLyS/m4xHkmhu7+UrY+2IpAUPGiveSbl2emSCDMbPniQxmeB1JqaFFCWlEya7bcrptweZ/kG5ll7DlSmqu+17TGuby/4e5rLTq9bjH/uQIzF0eSvqpwLKBsnv/THGkPORpeOQYMWu7xKtXRMULztHBYVgU9mVGMHetgiw7j8/tO3AV0ZBoeCKT49yctmC/AGW6oPR2XBmUD7PFVZSfwPOQF04AZYp4306TAJWCRLKnswhetmX16oWUejRwk5CJgv8QhqafDUTXEbuSjOPaZMNQuAAPP6D8dSCxYL2MWOQj86DgGGO24s8x+h94Nyl/JJsMHjpO3cShhgO7MyHRDbhsU7mdl8fKqR0i6TNP550cdVLPUIqL8c1L4jZN7fdLO+HaQwZRANibtmRkldSAspvrXqquiThhMqsetvKTZzAYUhoQqgRHhVCh1NzYtGNgUz6JNMIQabd64GFc+p3LbD7lN5DbH+sAvcIqUZBynEhFRbZMkpUh+H0O7EY0kBTgClZch4CtW053iim+N3JUHCbbe6lT406b2FsWoog0rBv5bXdnNOW56374vctOfUHTevxQQzRwUjsEc75Y5fyXl41L3qX6FBaHKkrwZwcMpSjqed4P48RVyCvTKWuf6Qjho9YEJzp5S2b1tjr99SUgo1UoKMySr3re/4mMVTOEnLFq7oX7QBv7fQ42pyre89ScmZyJhQ0Bk7moxOSeFrmocmVpBiL+A3zojOOhbhqNQmNBNtyrBsDE+KZc/GIaEIPV3h/GEgpwGo7eig3IGsvYFKT7xK4lqzP65u/NZwJbDXMDIEXrXs0iR7PujhLQdZeDeecpmDIqgCf9FITFOXtxluV/OtbC4HlAyZcFoU/0gVleGyuwKdz8S0ezH9SP1owT2QZzGxYF1zWqdGmo/keax0q2ilcCmaSV0f4VWZI/NFVW1/bVMCa4Gr1LaLBUzzyrcj62Sk3au0CIlBUBQnlYPRDTZEXKPEORcCugVihEdXxtxIjgmxXtx1z/wTE/uqTlcH/YSuJRaXnGnhCwy7KqQWd3bCn0r6ZXTCZ6vvYQtzTOpxZRMQpndrjlN/y3jnuuFEqpEfVCfjx4PYZVNQEfJDjuql1fs3kLLpfoWUcJk2rqIK1K3FLeb X-OriginatorOrg: Nvidia.com X-MS-Exchange-CrossTenant-Network-Message-Id: 4f782ee2-3bed-4b2b-6020-08dd0a96f681 X-MS-Exchange-CrossTenant-AuthSource: DS0PR12MB7726.namprd12.prod.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Internal X-MS-Exchange-CrossTenant-OriginalArrivalTime: 22 Nov 2024 01:42:45.5109 (UTC) X-MS-Exchange-CrossTenant-FromEntityHeader: Hosted X-MS-Exchange-CrossTenant-Id: 43083d15-7273-40c1-b7db-39efd9ccc17a X-MS-Exchange-CrossTenant-MailboxType: HOSTED X-MS-Exchange-CrossTenant-UserPrincipalName: z4UXWC02nP81RMjXliDt/cKEj1mEb3zFNTztGu01lZo2GZTcLdt/3auaumYKgoqnfzI0njuybDqO30gpgqp84g== X-MS-Exchange-Transport-CrossTenantHeadersStamped: BY5PR12MB4131 At present mlock skips ptes mapping ZONE_DEVICE pages. A future change to remove pmd_devmap will allow pmd_trans_huge_lock() to return ZONE_DEVICE folios so make sure we continue to skip those. Signed-off-by: Alistair Popple --- mm/mlock.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/mm/mlock.c b/mm/mlock.c index e3e3dc2..567cf64 100644 --- a/mm/mlock.c +++ b/mm/mlock.c @@ -368,6 +368,8 @@ static int mlock_pte_range(pmd_t *pmd, unsigned long addr, if (is_huge_zero_pmd(*pmd)) goto out; folio = pmd_folio(*pmd); + if (folio_is_zone_device(folio)) + goto out; if (vma->vm_flags & VM_LOCKED) mlock_folio(folio); else From patchwork Fri Nov 22 01:40:42 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alistair Popple X-Patchwork-Id: 13882602 Received: from NAM11-BN8-obe.outbound.protection.outlook.com (mail-bn8nam11on2048.outbound.protection.outlook.com [40.107.236.48]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id F208A1DE4FE; Fri, 22 Nov 2024 01:42:54 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=fail smtp.client-ip=40.107.236.48 ARC-Seal: i=2; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1732239778; cv=fail; b=JllXBASXJNdE06w6YzXZzCnG/GfIb2tg4o+FHDBKQKuHcb+0aLV/ugQ98dr7FgSN8MOlL5RwEC85lgRY7IxBjO01C6RnEdLENUgl32YK7oeMHy4xeiiLuYBRtvKUdFhrhfMswsVksNPXoZKh5N4HNkmzu6/CN7FirJ/gvxEfWv8= ARC-Message-Signature: i=2; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1732239778; c=relaxed/simple; bh=IXrwvdSUmsSy1KSIN7Qq/Z6stvtQxYeoKfH4FLsWF7s=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: Content-Type:MIME-Version; b=AUrYZ+ozU2qX9yGwxVd0+JGHjTKjWrwkDQHxdoimMVpDaJ1xjHUqxE/wfHWv72F1qkkwivZievqbXInhTVDPdRIlecCLvBCT/GUm8JdhZx+SYnmneMa+aorZEsNEySbmzmfrrtFxjJ7JHEdM1ofRCUdxWEsfYE8KmTCItel9e2o= ARC-Authentication-Results: i=2; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=nvidia.com; spf=fail smtp.mailfrom=nvidia.com; dkim=pass (2048-bit key) header.d=Nvidia.com header.i=@Nvidia.com header.b=qt21ePJo; arc=fail smtp.client-ip=40.107.236.48 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=nvidia.com Authentication-Results: smtp.subspace.kernel.org; spf=fail smtp.mailfrom=nvidia.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=Nvidia.com header.i=@Nvidia.com header.b="qt21ePJo" ARC-Seal: i=1; a=rsa-sha256; s=arcselector10001; d=microsoft.com; cv=none; b=ctsLsdqO04xQD9WErOHJZxc89R1LIHblsKGGpOcIFQ/fOxAkZidEtiIMGc5XkezqvVpY2j6m2iEfkdDK/gtfeo32lEc6PhbyrcJIMcTW4hzO2t7CHIXuCCMVjjYpcIizR0oolqpWsWA0wzZ5gzzGuXi5lXtG3v47TwOT9UzS5GFG/K97Bx+4nasou+unGABsCw28fT/NeSFoNh/0JnyNBTsmm4a3hxM9tmF58a56YSAmhgU+fVIccfIdoDmp82i5s3GQOdEKuw5oPlRpVr1+AR59/jvtwFOzrs6oReob14yx6Ya7Jl38wTp6FaMOB2saheTyfXI3XcdWUmzUk+PgVQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector10001; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=rWDu2JtnIVZqou/gv8q22HEsDaWS10l08SaCQCPDMDE=; b=BMZPRPocE9vacVyfbHVimsc253FaX0h0o8/2Uct0dUGcdDg7ew2dmnCTf2A5Uq3qN9PH0xEtDgUMeORbLYG0GQx+eg0A2NdCLrNqb2N6b+lME6MD/TIrCLXtIt4r/rK85H6lgMWpssGz8ytFbNULbj1I8Qz2tK5tFAmoYNieelAYENFKYp12J7CmTN0uC+i9aMdqiCJAfOcI0n5+cE0+8/fza280/3E18Sb6+EwLHkI/a0ucPc/ib4sqJ+pz0g8YvGKFRldHgXGt9Qgc6v4AAcuSjd72K8tyPWzp7daGVaX79nsSi2E1woVaLvHJG8Xj/Y6faKkugydR9CidtabCcg== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=nvidia.com; dmarc=pass action=none header.from=nvidia.com; dkim=pass header.d=nvidia.com; arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=Nvidia.com; s=selector2; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=rWDu2JtnIVZqou/gv8q22HEsDaWS10l08SaCQCPDMDE=; b=qt21ePJooRIXgU7CtPmEpGcZ1DtQnnG6ogZpqbAM98NzmMjWFoUowMI1oWrrcfy06+5Gv9gjAPQe5+6OuWxqd4lJdnSgfw97O6Tqj8borj9AohDAeEqpaPcCUOYHwaoLNwzjMRhSzpMqps9V8ULcfFjIsVHFUtUEN73W0xY9Q2rNhgSq7JW/fWH1dWDLsMX8NuqjW+O3qBkngsr+qjtyGVRDWXFofQTvI17ospZkBDRKOMFswfa2cAlBqSXxOSWarFoFInyFyIfBt9zBoJ0MT7SCVvvjLkMNDxS2PhDSVv4PZSVlm+x3G7IU1jUm/R2j4B8xJNM8HtKJB80izAh7vw== Authentication-Results: dkim=none (message not signed) header.d=none;dmarc=none action=none header.from=nvidia.com; Received: from DS0PR12MB7726.namprd12.prod.outlook.com (2603:10b6:8:130::6) by BY5PR12MB4131.namprd12.prod.outlook.com (2603:10b6:a03:212::13) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.8158.24; Fri, 22 Nov 2024 01:42:50 +0000 Received: from DS0PR12MB7726.namprd12.prod.outlook.com ([fe80::953f:2f80:90c5:67fe]) by DS0PR12MB7726.namprd12.prod.outlook.com ([fe80::953f:2f80:90c5:67fe%4]) with mapi id 15.20.8182.016; Fri, 22 Nov 2024 01:42:50 +0000 From: Alistair Popple To: dan.j.williams@intel.com, linux-mm@kvack.org Cc: Alistair Popple , lina@asahilina.net, zhang.lyra@gmail.com, gerald.schaefer@linux.ibm.com, vishal.l.verma@intel.com, dave.jiang@intel.com, logang@deltatee.com, bhelgaas@google.com, jack@suse.cz, jgg@ziepe.ca, catalin.marinas@arm.com, will@kernel.org, mpe@ellerman.id.au, npiggin@gmail.com, dave.hansen@linux.intel.com, ira.weiny@intel.com, willy@infradead.org, djwong@kernel.org, tytso@mit.edu, linmiaohe@huawei.com, david@redhat.com, peterx@redhat.com, linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org, linux-arm-kernel@lists.infradead.org, linuxppc-dev@lists.ozlabs.org, nvdimm@lists.linux.dev, linux-cxl@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-ext4@vger.kernel.org, linux-xfs@vger.kernel.org, jhubbard@nvidia.com, hch@lst.de, david@fromorbit.com Subject: [PATCH v3 21/25] fs/dax: Properly refcount fs dax pages Date: Fri, 22 Nov 2024 12:40:42 +1100 Message-ID: <39a896451e59b735f205e34da5510ead5e4cd47d.1732239628.git-series.apopple@nvidia.com> X-Mailer: git-send-email 2.45.2 In-Reply-To: References: X-ClientProxiedBy: SYBPR01CA0180.ausprd01.prod.outlook.com (2603:10c6:10:52::24) To DS0PR12MB7726.namprd12.prod.outlook.com (2603:10b6:8:130::6) Precedence: bulk X-Mailing-List: linux-xfs@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-MS-PublicTrafficType: Email X-MS-TrafficTypeDiagnostic: DS0PR12MB7726:EE_|BY5PR12MB4131:EE_ X-MS-Office365-Filtering-Correlation-Id: 2bcaa455-2f1b-43a3-a188-08dd0a96f92d X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam: BCL:0;ARA:13230040|376014|7416014|1800799024|366016; X-Microsoft-Antispam-Message-Info: V5rJZFCo4tGiDm7QWP2qdNE2PawGz0p7Nk5BKhVIYPAyag3pUTV0qq9nprarV7W0fUqonStnsq1sUG4biLSb5IexJMFuslcGhaLvuJ55//Oh7m1K2+9mhWH8Yl794XIQJ81pwHDj5KJIcNqoPPKlSLVrnwEHxd/M4aKiqVL/cbKtsxXv17wzmrqvmRyYdpCigH13uMxtlKfRJJ6XaV2GRnz9SFsOIPYjS3qipLVLV4SehPKXEe+CLlux++7dsfa1mFEP+bhVmeVEQ07y7/BWxzH36A7YSjQGN63Ro+ymmLpflBvA3KFCcprwurzGsrx0/mFTtDsY4QXNXw1AH+6rhjh/8T9kjnhlT1Sja8SfaX2Z8ypKgMhCUbstVnM8sgei5+HXiRDmWdylbcs5tHGpCqyi+UmRBKXQErGvOSr+cMith1XSN8XJkOIZugRooyFO2yVlbQ70++Oyn4TAXQSBhY3Fdjc8FMHd0MaUQbUjVLnhQKDshW4rfedgkajZJioL3PucHIMbGnLVBiv/TBOS/Fq8zO+16/Jjs5hjvpuU7Cyq9CBFZq52JOuJdEe0z0amkJmBYq6G92QeJ+OlO0YSG7Dd9AMqYad7veYL1/aLXLYP8c61kPdc8U7izE3HyI3z9U7FUspuyrMzsFNXCmRCaGtFNM2pOwcnoZlXJvK9H75E4XX4UklNwjOEpXn1fzL0OY3mrD4tkDo1nm9gOr+WO/Ip78bYidawMoi8FYztAMAHgKyCa0khpE4nd7KDu1A956hDa8rL2E+gRv1WJOqAXSYphhASEzNKdRQ9bhoB/RKTn9dZMAz6noLX/BUFRXGkm8DruYedtRtr2bpDXkYgWm6RcFZ+ltk3+FMoJGoKw74Vmaw3o07K+/3KCsjH0CAkXaYT3bXTtE7ra1uf64DNIHwzd90zLnTQf2ZxinFKD5yvSkKj0ryHP+REntTFykhZVXgOnlGanJdPARcWjfdNkgInwWo3TwViPqt87itV2FgX0DcbaBQoeIHp+NRNrmmGb7Yp9AuW8MQRvOST+n5sohVFWCWU15e6ghqqtruWhPRCYe5pC2ZloAR1HfKvejAJgFIVFcD+fDAwTI5/qK4lvj0VL2d4bmOsk3+Qa5rWm9hySkCfTmz35RBYYdHbDjiJXHfYkoCg1aSISi25GnNYEUIWCQgnu1hGT3HICwq/msOk/OuwALusPz/0KQ951qmCdBZh4bf1KFEjdongqn9FTy7VWNrTYUHvLMSRB9bqhoB1GBvyAsPs1sdqb9xaZHSuNJAj+z4fVS63rKRRuCSM+H2rwbXQc992WlZWQy6gLjAkCXPBuWEZvOoNco86EEzk X-Forefront-Antispam-Report: CIP:255.255.255.255;CTRY:;LANG:en;SCL:1;SRV:;IPV:NLI;SFV:NSPM;H:DS0PR12MB7726.namprd12.prod.outlook.com;PTR:;CAT:NONE;SFS:(13230040)(376014)(7416014)(1800799024)(366016);DIR:OUT;SFP:1101; X-MS-Exchange-AntiSpam-MessageData-ChunkCount: 1 X-MS-Exchange-AntiSpam-MessageData-0: ZcfS3ODF1O7QsdHS6yihE8PHYTd6hdDT8RzDABNoARibCPcadufNcCOpJbH59qwhqLj9fxjDgB4wsA9fR/7w7PilxjoJkbezj20R9quCVbqJsH2rAGmEWRLunuLFOQC+YnTTitV8f+2f4+wK/tNdi6uJhOzzE3XU0x1IicMYj4dvdPPGKaOLe9Uk0qZA6HXDmSZeB6+nSOCruxX2xxaIEVWY2/43Jz7U0tE32PZi2J3w5jGyLA94hcRt/xLOVMdZbvWE5KW+2V7rUFlRZWI7UiwLbjEdYVfJiHRcRM8gRXFFhqoLYrlqzTsy5v118lw65Dktic5yqP4IZLKn6avGiLkBfjL29d2Cevghrc286hJjVwOMgrmm9CCCKbzDMtTajn/VQSTHfzobk+zX6PaI7SV0/LGAQaeBb2WHMVsiedN+Qsdtn8zAnGGt2dgmzg7mYhFJer6FAHdaYMyUARuvbzro/FoPi8RS/KB1oZlsrB+Cl4HwhhHBSb6qbrzBe0xiTOTsnnAMpYIA2nug4XOA3kc/lcfWFalg+uxDphEEqJyXRe3DwXTfMzzKq0nEAf9pVr339Gae8CouR5LEMakeNCIKO2DEIoBThVuzpS3IzrZFqr+LkJFMP+xbn33g4xkbtGVFqe7wiZ688pRkCT++sCneX2FEZR/HXqxhSt43lQQc31fQuA20acYHqnpTWlUDKPLLo9Fp6i6kLBubHW4oRw1JA9yx/IPiaQPWj6DV8CUMwmg4LMyvkvXbLL6Gusnf7dU0LIbQmghU9ygdWuGrDPR1jvWE+BLe7sDjBFYXE1mlWFmD0tFYAb0YNNjPCAIXeVrLCNIQi1NWbXzmQw98iBczJaj7sJDeTgVzH58hQgX+NLbjugt7I2b6yiWWNaJWRJ+jw1OJTo4NbrxXis1BlZRkA4oeUZRlZdvu1T8hzmuqX3Trmnc7BAO2uM9pzE5MTltLd5AxvurfbNGubFBPcIG21lplfkiJ1NC6xevIpoa5aGZ9K8TTEvXOpondyPuGeQ/SenJuSC+5TGF0zLneWtZ6i+OZmk3/8LXdddfHUKr3uPYvTmiDLW0sV9nmeM362dq7pwPqZkPvMKuE8uLH2MQE6E4s0nrYYQmWO/ZW8zfIOQ8FiOVTuOsEs8PGcloFKL70JkN/oGKSNjO70KqfY0rtLBLNCQ1du0M0EsVCdQ6biUzlhjlJUhSMOof72K3QgvUeK8XZVAXeKbJ1pWu0jewA4wkbkqRr2SJvjRj6HbzM4lmChaWimfWv1nMOO1sLEvK9Nvf5l/A2r1vsWzMcM/hMMtQ1TiFp/zioKyfl+z9OJpz0rf9hqkAG92tBRffyoKzZR/px6SbvWlo/8BK9/mYAY7tgfKpjdL3z/CCc58UTuQkI6SEvGgtY+eqDsgq8nj/ectcQtTsBu0lKqAehTlMe0e6ol9cQCyi78YBIJ+9QhD7pS5l85NA1Lw35EW+c56hemnRPG7zTmUr1RId0PwyXKcghMlpfJYD8sv3UspCrj0UnJBzafIZNM40c8LcxXOSg0O1orT5PiCtMx4oLLIGo9TNRuUwrjHZ9jDfdA8eEUAMctkvcJleS4p7s2Q3v X-OriginatorOrg: Nvidia.com X-MS-Exchange-CrossTenant-Network-Message-Id: 2bcaa455-2f1b-43a3-a188-08dd0a96f92d X-MS-Exchange-CrossTenant-AuthSource: DS0PR12MB7726.namprd12.prod.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Internal X-MS-Exchange-CrossTenant-OriginalArrivalTime: 22 Nov 2024 01:42:50.1915 (UTC) X-MS-Exchange-CrossTenant-FromEntityHeader: Hosted X-MS-Exchange-CrossTenant-Id: 43083d15-7273-40c1-b7db-39efd9ccc17a X-MS-Exchange-CrossTenant-MailboxType: HOSTED X-MS-Exchange-CrossTenant-UserPrincipalName: GIEVgE4gjYG2kd7T/b3VkEcuTvOa5nBQx8vxH9Ht5P0bCYOFHJFBQpu3TMIw+e/MiGm7ye4CuMBi+qP+VpPzeA== X-MS-Exchange-Transport-CrossTenantHeadersStamped: BY5PR12MB4131 Currently fs dax pages are considered free when the refcount drops to one and their refcounts are not increased when mapped via PTEs or decreased when unmapped. This requires special logic in mm paths to detect that these pages should not be properly refcounted, and to detect when the refcount drops to one instead of zero. On the other hand get_user_pages(), etc. will properly refcount fs dax pages by taking a reference and dropping it when the page is unpinned. Tracking this special behaviour requires extra PTE bits (eg. pte_devmap) and introduces rules that are potentially confusing and specific to FS DAX pages. To fix this, and to possibly allow removal of the special PTE bits in future, convert the fs dax page refcounts to be zero based and instead take a reference on the page each time it is mapped as is currently the case for normal pages. This may also allow a future clean-up to remove the pgmap refcounting that is currently done in mm/gup.c. Signed-off-by: Alistair Popple --- Changes since v2: Based on some questions from Dan I attempted to have the FS DAX page cache (ie. address space) hold a reference to the folio whilst it was mapped. However I came to the strong conclusion that this was not the right thing to do. If the page refcount == 0 it means the page is: 1. not mapped into user-space 2. not subject to other access via DMA/GUP/etc. Ie. From the core MM perspective the page is not in use. The fact a page may or may not be present in one or more address space mappings is irrelevant for core MM. It just means the page is still in use or valid from the file system perspective, and it's a responsiblity of the file system to remove these mappings if the pfn mapping becomes invalid (along with first making sure the MM state, ie. page->refcount, is idle). So we shouldn't be trying to track that lifetime with MM refcounts. Doing so just makes DMA-idle tracking more complex because there is now another thing (one or more address spaces) which can hold references on a page. And FS DAX can't even keep track of all the address spaces which might contain a reference to the page in the XFS/reflink case anyway. We could do this if we made file systems invalidate all address space mappings prior to calling dax_break_layouts(), but that isn't currently neccessary and would lead to increased faults just so we could do some superfluous refcounting which the file system already does. I have however put the page sharing checks and WARN_ON's back which also turned out to be useful for figuring out when to re-initialising a folio. --- drivers/nvdimm/pmem.c | 4 +- fs/dax.c | 212 +++++++++++++++++++++++----------------- fs/fuse/virtio_fs.c | 3 +- fs/xfs/xfs_inode.c | 2 +- include/linux/dax.h | 6 +- include/linux/mm.h | 27 +----- include/linux/mm_types.h | 5 +- mm/gup.c | 9 +-- mm/huge_memory.c | 6 +- mm/internal.h | 2 +- mm/memory-failure.c | 6 +- mm/memory.c | 6 +- mm/memremap.c | 47 ++++----- mm/mm_init.c | 9 +-- mm/swap.c | 2 +- 15 files changed, 181 insertions(+), 165 deletions(-) diff --git a/drivers/nvdimm/pmem.c b/drivers/nvdimm/pmem.c index 210fb77..451cd0f 100644 --- a/drivers/nvdimm/pmem.c +++ b/drivers/nvdimm/pmem.c @@ -513,7 +513,7 @@ static int pmem_attach_disk(struct device *dev, pmem->disk = disk; pmem->pgmap.owner = pmem; - pmem->pfn_flags = PFN_DEV; + pmem->pfn_flags = 0; if (is_nd_pfn(dev)) { pmem->pgmap.type = MEMORY_DEVICE_FS_DAX; pmem->pgmap.ops = &fsdax_pagemap_ops; @@ -522,7 +522,6 @@ static int pmem_attach_disk(struct device *dev, pmem->data_offset = le64_to_cpu(pfn_sb->dataoff); pmem->pfn_pad = resource_size(res) - range_len(&pmem->pgmap.range); - pmem->pfn_flags |= PFN_MAP; bb_range = pmem->pgmap.range; bb_range.start += pmem->data_offset; } else if (pmem_should_map_pages(dev)) { @@ -532,7 +531,6 @@ static int pmem_attach_disk(struct device *dev, pmem->pgmap.type = MEMORY_DEVICE_FS_DAX; pmem->pgmap.ops = &fsdax_pagemap_ops; addr = devm_memremap_pages(dev, &pmem->pgmap); - pmem->pfn_flags |= PFN_MAP; bb_range = pmem->pgmap.range; } else { addr = devm_memremap(dev, pmem->phys_addr, diff --git a/fs/dax.c b/fs/dax.c index d193846..45d04ca 100644 --- a/fs/dax.c +++ b/fs/dax.c @@ -71,6 +71,11 @@ static unsigned long dax_to_pfn(void *entry) return xa_to_value(entry) >> DAX_SHIFT; } +static struct folio *dax_to_folio(void *entry) +{ + return page_folio(pfn_to_page(dax_to_pfn(entry))); +} + static void *dax_make_entry(pfn_t pfn, unsigned long flags) { return xa_mk_value(flags | (pfn_t_to_pfn(pfn) << DAX_SHIFT)); @@ -338,44 +343,88 @@ static unsigned long dax_entry_size(void *entry) return PAGE_SIZE; } -static unsigned long dax_end_pfn(void *entry) -{ - return dax_to_pfn(entry) + dax_entry_size(entry) / PAGE_SIZE; -} - -/* - * Iterate through all mapped pfns represented by an entry, i.e. skip - * 'empty' and 'zero' entries. - */ -#define for_each_mapped_pfn(entry, pfn) \ - for (pfn = dax_to_pfn(entry); \ - pfn < dax_end_pfn(entry); pfn++) - /* * A DAX page is considered shared if it has no mapping set and ->share (which * shares the ->index field) is non-zero. Note this may return false even if the * page if shared between multiple files but has not yet actually been mapped * into multiple address spaces. */ -static inline bool dax_page_is_shared(struct page *page) +static inline bool dax_folio_is_shared(struct folio *folio) { - return !page->mapping && page->share; + return !folio->mapping && folio->share; } /* - * Increase the page share refcount, warning if the page is not marked as shared. + * Increase the folio share refcount, warning if the folio is not marked as shared. */ -static inline void dax_page_share_get(struct page *page) +static inline void dax_folio_share_get(void *entry) { - WARN_ON_ONCE(!page->share); - WARN_ON_ONCE(page->mapping); - page->share++; + struct folio *folio = dax_to_folio(entry); + + WARN_ON_ONCE(!folio->share); + WARN_ON_ONCE(folio->mapping); + WARN_ON_ONCE(dax_entry_order(entry) != folio_order(folio)); + folio->share++; +} + +static inline unsigned long dax_folio_share_put(struct folio *folio) +{ + unsigned long ref; + + if (!dax_folio_is_shared(folio)) + ref = 0; + else + ref = --folio->share; + + WARN_ON_ONCE(ref < 0); + if (!ref) { + folio->mapping = NULL; + if (folio_order(folio)) { + struct dev_pagemap *pgmap = page_pgmap(&folio->page); + unsigned int order = folio_order(folio); + unsigned int i; + + for (i = 0; i < (1UL << order); i++) { + struct page *page = folio_page(folio, i); + + ClearPageHead(page); + clear_compound_head(page); + + /* + * Reset pgmap which was over-written by + * prep_compound_page(). + */ + page_folio(page)->pgmap = pgmap; + + /* Make sure this isn't set to TAIL_MAPPING */ + page->mapping = NULL; + page->share = 0; + WARN_ON_ONCE(page_ref_count(page)); + } + } + } + + return ref; } -static inline unsigned long dax_page_share_put(struct page *page) +static void dax_device_folio_init(void *entry) { - WARN_ON_ONCE(!page->share); - return --page->share; + struct folio *folio = dax_to_folio(entry); + int order = dax_entry_order(entry); + + /* + * Folio should have been split back to order-0 pages in + * dax_folio_share_put() when they were removed from their + * final mapping. + */ + WARN_ON_ONCE(folio_order(folio)); + + if (order > 0) { + prep_compound_page(&folio->page, order); + if (order > 1) + INIT_LIST_HEAD(&folio->_deferred_list); + WARN_ON_ONCE(folio_ref_count(folio)); + } } /* @@ -388,72 +437,58 @@ static inline unsigned long dax_page_share_put(struct page *page) * dax_holder_operations. */ static void dax_associate_entry(void *entry, struct address_space *mapping, - struct vm_area_struct *vma, unsigned long address, bool shared) + struct vm_area_struct *vma, unsigned long address, bool shared) { - unsigned long size = dax_entry_size(entry), pfn, index; - int i = 0; + unsigned long size = dax_entry_size(entry), index; + struct folio *folio = dax_to_folio(entry); if (IS_ENABLED(CONFIG_FS_DAX_LIMITED)) return; index = linear_page_index(vma, address & ~(size - 1)); - for_each_mapped_pfn(entry, pfn) { - struct page *page = pfn_to_page(pfn); - - if (shared && page->mapping && page->share) { - if (page->mapping) { - page->mapping = NULL; + if (shared && (folio->mapping || dax_folio_is_shared(folio))) { + if (folio->mapping) { + folio->mapping = NULL; - /* - * Page has already been mapped into one address - * space so set the share count. - */ - page->share = 1; - } - - dax_page_share_get(page); - } else { - WARN_ON_ONCE(page->mapping); - page->mapping = mapping; - page->index = index + i++; + /* + * folio has already been mapped into one address + * space so set the share count. + */ + folio->share = 1; } + + dax_folio_share_get(entry); + } else { + WARN_ON_ONCE(folio->mapping); + dax_device_folio_init(entry); + folio = dax_to_folio(entry); + folio->mapping = mapping; + folio->index = index; } } static void dax_disassociate_entry(void *entry, struct address_space *mapping, - bool trunc) + bool trunc) { - unsigned long pfn; + struct folio *folio = dax_to_folio(entry); if (IS_ENABLED(CONFIG_FS_DAX_LIMITED)) return; - for_each_mapped_pfn(entry, pfn) { - struct page *page = pfn_to_page(pfn); - - WARN_ON_ONCE(trunc && page_ref_count(page) > 1); - if (dax_page_is_shared(page)) { - /* keep the shared flag if this page is still shared */ - if (dax_page_share_put(page) > 0) - continue; - } else - WARN_ON_ONCE(page->mapping && page->mapping != mapping); - page->mapping = NULL; - page->index = 0; - } + dax_folio_share_put(folio); } static struct page *dax_busy_page(void *entry) { - unsigned long pfn; + struct folio *folio = dax_to_folio(entry); - for_each_mapped_pfn(entry, pfn) { - struct page *page = pfn_to_page(pfn); + if (dax_is_zero_entry(entry) || dax_is_empty_entry(entry)) + return NULL; - if (page_ref_count(page) > 1) - return page; - } - return NULL; + if (folio_ref_count(folio) - folio_mapcount(folio)) + return &folio->page; + else + return NULL; } /** @@ -786,7 +821,7 @@ struct page *dax_layout_busy_page(struct address_space *mapping) EXPORT_SYMBOL_GPL(dax_layout_busy_page); static int __dax_invalidate_entry(struct address_space *mapping, - pgoff_t index, bool trunc) + pgoff_t index, bool trunc) { XA_STATE(xas, &mapping->i_pages, index); int ret = 0; @@ -891,7 +926,7 @@ static int wait_page_idle(struct page *page, void (cb)(struct inode *), struct inode *inode) { - return ___wait_var_event(page, page_ref_count(page) == 1, + return ___wait_var_event(page, page_ref_count(page) == 0, TASK_INTERRUPTIBLE, 0, 0, cb(inode)); } @@ -899,7 +934,7 @@ static void wait_page_idle_uninterruptible(struct page *page, void (cb)(struct inode *), struct inode *inode) { - ___wait_var_event(page, page_ref_count(page) == 1, + ___wait_var_event(page, page_ref_count(page) == 0, TASK_UNINTERRUPTIBLE, 0, 0, cb(inode)); } @@ -941,7 +976,8 @@ void dax_break_mapping_uninterruptible(struct inode *inode, wait_page_idle_uninterruptible(page, cb, inode); } while (true); - dax_delete_mapping_range(inode->i_mapping, 0, LLONG_MAX); + if (!page) + dax_delete_mapping_range(inode->i_mapping, 0, LLONG_MAX); } /* @@ -1026,8 +1062,10 @@ static void *dax_insert_entry(struct xa_state *xas, struct vm_fault *vmf, void *old; dax_disassociate_entry(entry, mapping, false); - dax_associate_entry(new_entry, mapping, vmf->vma, vmf->address, - shared); + if (!(flags & DAX_ZERO_PAGE)) + dax_associate_entry(new_entry, mapping, vmf->vma, vmf->address, + shared); + /* * Only swap our new entry into the page cache if the current * entry is a zero page or an empty entry. If a normal PTE or @@ -1215,9 +1253,7 @@ static int dax_iomap_direct_access(const struct iomap *iomap, loff_t pos, goto out; if (pfn_t_to_pfn(*pfnp) & (PHYS_PFN(size)-1)) goto out; - /* For larger pages we need devmap */ - if (length > 1 && !pfn_t_devmap(*pfnp)) - goto out; + rc = 0; out_check_addr: @@ -1324,7 +1360,7 @@ static vm_fault_t dax_load_hole(struct xa_state *xas, struct vm_fault *vmf, *entry = dax_insert_entry(xas, vmf, iter, *entry, pfn, DAX_ZERO_PAGE); - ret = vmf_insert_mixed(vmf->vma, vaddr, pfn); + ret = vmf_insert_page_mkwrite(vmf, pfn_t_to_page(pfn), false); trace_dax_load_hole(inode, vmf, ret); return ret; } @@ -1784,7 +1820,8 @@ static vm_fault_t dax_fault_iter(struct vm_fault *vmf, loff_t pos = (loff_t)xas->xa_index << PAGE_SHIFT; bool write = iter->flags & IOMAP_WRITE; unsigned long entry_flags = pmd ? DAX_PMD : 0; - int err = 0; + struct folio *folio; + int ret, err = 0; pfn_t pfn; void *kaddr; @@ -1816,17 +1853,18 @@ static vm_fault_t dax_fault_iter(struct vm_fault *vmf, return dax_fault_return(err); } + folio = dax_to_folio(*entry); if (dax_fault_is_synchronous(iter, vmf->vma)) return dax_fault_synchronous_pfnp(pfnp, pfn); - /* insert PMD pfn */ + folio_ref_inc(folio); if (pmd) - return vmf_insert_pfn_pmd(vmf, pfn, write); + ret = vmf_insert_folio_pmd(vmf, pfn_folio(pfn_t_to_pfn(pfn)), write); + else + ret = vmf_insert_page_mkwrite(vmf, pfn_t_to_page(pfn), write); + folio_put(folio); - /* insert PTE pfn */ - if (write) - return vmf_insert_mixed_mkwrite(vmf->vma, vmf->address, pfn); - return vmf_insert_mixed(vmf->vma, vmf->address, pfn); + return ret; } static vm_fault_t dax_iomap_pte_fault(struct vm_fault *vmf, pfn_t *pfnp, @@ -2065,6 +2103,7 @@ dax_insert_pfn_mkwrite(struct vm_fault *vmf, pfn_t pfn, unsigned int order) { struct address_space *mapping = vmf->vma->vm_file->f_mapping; XA_STATE_ORDER(xas, &mapping->i_pages, vmf->pgoff, order); + struct folio *folio; void *entry; vm_fault_t ret; @@ -2082,14 +2121,17 @@ dax_insert_pfn_mkwrite(struct vm_fault *vmf, pfn_t pfn, unsigned int order) xas_set_mark(&xas, PAGECACHE_TAG_DIRTY); dax_lock_entry(&xas, entry); xas_unlock_irq(&xas); + folio = pfn_folio(pfn_t_to_pfn(pfn)); + folio_ref_inc(folio); if (order == 0) - ret = vmf_insert_mixed_mkwrite(vmf->vma, vmf->address, pfn); + ret = vmf_insert_page_mkwrite(vmf, &folio->page, true); #ifdef CONFIG_FS_DAX_PMD else if (order == PMD_ORDER) - ret = vmf_insert_pfn_pmd(vmf, pfn, FAULT_FLAG_WRITE); + ret = vmf_insert_folio_pmd(vmf, folio, FAULT_FLAG_WRITE); #endif else ret = VM_FAULT_FALLBACK; + folio_put(folio); dax_unlock_entry(&xas, entry); trace_dax_insert_pfn_mkwrite(mapping->host, vmf, ret); return ret; diff --git a/fs/fuse/virtio_fs.c b/fs/fuse/virtio_fs.c index 6404a18..ae69deb 100644 --- a/fs/fuse/virtio_fs.c +++ b/fs/fuse/virtio_fs.c @@ -1011,8 +1011,7 @@ static long virtio_fs_direct_access(struct dax_device *dax_dev, pgoff_t pgoff, if (kaddr) *kaddr = fs->window_kaddr + offset; if (pfn) - *pfn = phys_to_pfn_t(fs->window_phys_addr + offset, - PFN_DEV | PFN_MAP); + *pfn = phys_to_pfn_t(fs->window_phys_addr + offset, 0); return nr_pages > max_nr_pages ? max_nr_pages : nr_pages; } diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c index bdb335c..53ced36 100644 --- a/fs/xfs/xfs_inode.c +++ b/fs/xfs/xfs_inode.c @@ -2729,7 +2729,7 @@ xfs_mmaplock_two_inodes_and_break_dax_layout( * for this nested lock case. */ page = dax_layout_busy_page(VFS_I(ip2)->i_mapping); - if (page && page_ref_count(page) != 1) { + if (page && page_ref_count(page) != 0) { xfs_iunlock(ip2, XFS_MMAPLOCK_EXCL); xfs_iunlock(ip1, XFS_MMAPLOCK_EXCL); goto again; diff --git a/include/linux/dax.h b/include/linux/dax.h index 53a7482..7779e6f 100644 --- a/include/linux/dax.h +++ b/include/linux/dax.h @@ -217,8 +217,12 @@ static inline int dax_wait_page_idle(struct page *page, void (cb)(struct inode *), struct inode *inode) { - return ___wait_var_event(page, page_ref_count(page) == 1, + int ret; + + ret = ___wait_var_event(page, !page_ref_count(page), TASK_INTERRUPTIBLE, 0, 0, cb(inode)); + + return ret; } #if IS_ENABLED(CONFIG_DAX) diff --git a/include/linux/mm.h b/include/linux/mm.h index 4f9ae37..052f234 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -1085,6 +1085,8 @@ int vma_is_stack_for_current(struct vm_area_struct *vma); struct mmu_gather; struct inode; +extern void prep_compound_page(struct page *page, unsigned int order); + /* * compound_order() can be called without holding a reference, which means * that niceties like page_folio() don't work. These callers should be @@ -1408,25 +1410,6 @@ vm_fault_t finish_fault(struct vm_fault *vmf); * back into memory. */ -#if defined(CONFIG_ZONE_DEVICE) && defined(CONFIG_FS_DAX) -DECLARE_STATIC_KEY_FALSE(devmap_managed_key); - -bool __put_devmap_managed_folio_refs(struct folio *folio, int refs); -static inline bool put_devmap_managed_folio_refs(struct folio *folio, int refs) -{ - if (!static_branch_unlikely(&devmap_managed_key)) - return false; - if (!folio_is_zone_device(folio)) - return false; - return __put_devmap_managed_folio_refs(folio, refs); -} -#else /* CONFIG_ZONE_DEVICE && CONFIG_FS_DAX */ -static inline bool put_devmap_managed_folio_refs(struct folio *folio, int refs) -{ - return false; -} -#endif /* CONFIG_ZONE_DEVICE && CONFIG_FS_DAX */ - /* 127: arbitrary random number, small enough to assemble well */ #define folio_ref_zero_or_close_to_overflow(folio) \ ((unsigned int) folio_ref_count(folio) + 127u <= 127u) @@ -1541,12 +1524,6 @@ static inline void put_page(struct page *page) { struct folio *folio = page_folio(page); - /* - * For some devmap managed pages we need to catch refcount transition - * from 2 to 1: - */ - if (put_devmap_managed_folio_refs(folio, 1)) - return; folio_put(folio); } diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h index 209e00a..01c4a58 100644 --- a/include/linux/mm_types.h +++ b/include/linux/mm_types.h @@ -344,7 +344,10 @@ struct folio { struct dev_pagemap *pgmap; }; struct address_space *mapping; - pgoff_t index; + union { + pgoff_t index; + unsigned long share; + }; union { void *private; swp_entry_t swap; diff --git a/mm/gup.c b/mm/gup.c index cef8bff..ff5f35b 100644 --- a/mm/gup.c +++ b/mm/gup.c @@ -91,8 +91,7 @@ static inline struct folio *try_get_folio(struct page *page, int refs) * belongs to this folio. */ if (unlikely(page_folio(page) != folio)) { - if (!put_devmap_managed_folio_refs(folio, refs)) - folio_put_refs(folio, refs); + folio_put_refs(folio, refs); goto retry; } @@ -111,8 +110,7 @@ static void gup_put_folio(struct folio *folio, int refs, unsigned int flags) refs *= GUP_PIN_COUNTING_BIAS; } - if (!put_devmap_managed_folio_refs(folio, refs)) - folio_put_refs(folio, refs); + folio_put_refs(folio, refs); } /** @@ -556,8 +554,7 @@ static struct folio *try_grab_folio_fast(struct page *page, int refs, */ if (unlikely((flags & FOLL_LONGTERM) && !folio_is_longterm_pinnable(folio))) { - if (!put_devmap_managed_folio_refs(folio, refs)) - folio_put_refs(folio, refs); + folio_put_refs(folio, refs); return NULL; } diff --git a/mm/huge_memory.c b/mm/huge_memory.c index b3bf909..0ac0e54 100644 --- a/mm/huge_memory.c +++ b/mm/huge_memory.c @@ -2142,7 +2142,7 @@ int zap_huge_pmd(struct mmu_gather *tlb, struct vm_area_struct *vma, tlb->fullmm); arch_check_zapped_pmd(vma, orig_pmd); tlb_remove_pmd_tlb_entry(tlb, pmd, addr); - if (vma_is_special_huge(vma)) { + if (!vma_is_dax(vma) && vma_is_special_huge(vma)) { if (arch_needs_pgtable_deposit()) zap_deposited_table(tlb->mm, pmd); spin_unlock(ptl); @@ -2786,13 +2786,15 @@ static void __split_huge_pmd_locked(struct vm_area_struct *vma, pmd_t *pmd, */ if (arch_needs_pgtable_deposit()) zap_deposited_table(mm, pmd); - if (vma_is_special_huge(vma)) + if (!vma_is_dax(vma) && vma_is_special_huge(vma)) return; if (unlikely(is_pmd_migration_entry(old_pmd))) { swp_entry_t entry; entry = pmd_to_swp_entry(old_pmd); folio = pfn_swap_entry_folio(entry); + } else if (is_huge_zero_pmd(old_pmd)) { + return; } else { page = pmd_page(old_pmd); folio = page_folio(page); diff --git a/mm/internal.h b/mm/internal.h index 93083bb..50ff42f 100644 --- a/mm/internal.h +++ b/mm/internal.h @@ -687,8 +687,6 @@ static inline void prep_compound_tail(struct page *head, int tail_idx) set_page_private(p, 0); } -extern void prep_compound_page(struct page *page, unsigned int order); - extern void post_alloc_hook(struct page *page, unsigned int order, gfp_t gfp_flags); extern bool free_pages_prepare(struct page *page, unsigned int order); diff --git a/mm/memory-failure.c b/mm/memory-failure.c index 96ce31e..80dd2a7 100644 --- a/mm/memory-failure.c +++ b/mm/memory-failure.c @@ -419,18 +419,18 @@ static unsigned long dev_pagemap_mapping_shift(struct vm_area_struct *vma, pud = pud_offset(p4d, address); if (!pud_present(*pud)) return 0; - if (pud_devmap(*pud)) + if (pud_trans_huge(*pud)) return PUD_SHIFT; pmd = pmd_offset(pud, address); if (!pmd_present(*pmd)) return 0; - if (pmd_devmap(*pmd)) + if (pmd_trans_huge(*pmd)) return PMD_SHIFT; pte = pte_offset_map(pmd, address); if (!pte) return 0; ptent = ptep_get(pte); - if (pte_present(ptent) && pte_devmap(ptent)) + if (pte_present(ptent)) ret = PAGE_SHIFT; pte_unmap(pte); return ret; diff --git a/mm/memory.c b/mm/memory.c index 5e04310..1c91dd4 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -3768,13 +3768,15 @@ static vm_fault_t do_wp_page(struct vm_fault *vmf) if (vma->vm_flags & (VM_SHARED | VM_MAYSHARE)) { /* * VM_MIXEDMAP !pfn_valid() case, or VM_SOFTDIRTY clear on a - * VM_PFNMAP VMA. + * VM_PFNMAP VMA. FS DAX also wants ops->pfn_mkwrite called. * * We should not cow pages in a shared writeable mapping. * Just mark the pages writable and/or call ops->pfn_mkwrite. */ - if (!vmf->page) + if (!vmf->page || is_fsdax_page(vmf->page)) { + vmf->page = NULL; return wp_pfn_shared(vmf); + } return wp_page_shared(vmf, folio); } diff --git a/mm/memremap.c b/mm/memremap.c index 68099af..9a8879b 100644 --- a/mm/memremap.c +++ b/mm/memremap.c @@ -458,8 +458,13 @@ EXPORT_SYMBOL_GPL(get_dev_pagemap); void free_zone_device_folio(struct folio *folio) { - if (WARN_ON_ONCE(!folio->pgmap->ops || - !folio->pgmap->ops->page_free)) + struct dev_pagemap *pgmap = folio->pgmap; + + if (WARN_ON_ONCE(!pgmap->ops)) + return; + + if (WARN_ON_ONCE(pgmap->type != MEMORY_DEVICE_FS_DAX && + !pgmap->ops->page_free)) return; mem_cgroup_uncharge(folio); @@ -484,26 +489,36 @@ void free_zone_device_folio(struct folio *folio) * For other types of ZONE_DEVICE pages, migration is either * handled differently or not done at all, so there is no need * to clear folio->mapping. + * + * FS DAX pages clear the mapping when the folio->share count hits + * zero which indicating the page has been removed from the file + * system mapping. */ - folio->mapping = NULL; - folio->pgmap->ops->page_free(folio_page(folio, 0)); + if (pgmap->type != MEMORY_DEVICE_FS_DAX) + folio->mapping = NULL; - switch (folio->pgmap->type) { + switch (pgmap->type) { case MEMORY_DEVICE_PRIVATE: case MEMORY_DEVICE_COHERENT: - put_dev_pagemap(folio->pgmap); + pgmap->ops->page_free(folio_page(folio, 0)); + put_dev_pagemap(pgmap); break; - case MEMORY_DEVICE_FS_DAX: case MEMORY_DEVICE_GENERIC: /* * Reset the refcount to 1 to prepare for handing out the page * again. */ + pgmap->ops->page_free(folio_page(folio, 0)); folio_set_count(folio, 1); break; + case MEMORY_DEVICE_FS_DAX: + wake_up_var(&folio->page); + break; + case MEMORY_DEVICE_PCI_P2PDMA: + pgmap->ops->page_free(folio_page(folio, 0)); break; } } @@ -519,21 +534,3 @@ void zone_device_page_init(struct page *page) lock_page(page); } EXPORT_SYMBOL_GPL(zone_device_page_init); - -#ifdef CONFIG_FS_DAX -bool __put_devmap_managed_folio_refs(struct folio *folio, int refs) -{ - if (folio->pgmap->type != MEMORY_DEVICE_FS_DAX) - return false; - - /* - * fsdax page refcounts are 1-based, rather than 0-based: if - * refcount is 1, then the page is free and the refcount is - * stable because nobody holds a reference on the page. - */ - if (folio_ref_sub_return(folio, refs) == 1) - wake_up_var(&folio->_refcount); - return true; -} -EXPORT_SYMBOL(__put_devmap_managed_folio_refs); -#endif /* CONFIG_FS_DAX */ diff --git a/mm/mm_init.c b/mm/mm_init.c index 3d0611e..3c32190 100644 --- a/mm/mm_init.c +++ b/mm/mm_init.c @@ -1015,23 +1015,22 @@ static void __ref __init_zone_device_page(struct page *page, unsigned long pfn, } /* - * ZONE_DEVICE pages other than MEMORY_TYPE_GENERIC and - * MEMORY_TYPE_FS_DAX pages are released directly to the driver page - * allocator which will set the page count to 1 when allocating the - * page. + * ZONE_DEVICE pages other than MEMORY_TYPE_GENERIC are released + * directly to the driver page allocator which will set the page count + * to 1 when allocating the page. * * MEMORY_TYPE_GENERIC and MEMORY_TYPE_FS_DAX pages automatically have * their refcount reset to one whenever they are freed (ie. after * their refcount drops to 0). */ switch (pgmap->type) { + case MEMORY_DEVICE_FS_DAX: case MEMORY_DEVICE_PRIVATE: case MEMORY_DEVICE_COHERENT: case MEMORY_DEVICE_PCI_P2PDMA: set_page_count(page, 0); break; - case MEMORY_DEVICE_FS_DAX: case MEMORY_DEVICE_GENERIC: break; } diff --git a/mm/swap.c b/mm/swap.c index 835bdf3..bb06b19 100644 --- a/mm/swap.c +++ b/mm/swap.c @@ -969,8 +969,6 @@ void folios_put_refs(struct folio_batch *folios, unsigned int *refs) unlock_page_lruvec_irqrestore(lruvec, flags); lruvec = NULL; } - if (put_devmap_managed_folio_refs(folio, nr_refs)) - continue; if (folio_ref_sub_and_test(folio, nr_refs)) free_zone_device_folio(folio); continue; From patchwork Fri Nov 22 01:40:43 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alistair Popple X-Patchwork-Id: 13882603 Received: from NAM11-BN8-obe.outbound.protection.outlook.com (mail-bn8nam11on2048.outbound.protection.outlook.com [40.107.236.48]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 429091DEFE2; Fri, 22 Nov 2024 01:42:58 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=fail smtp.client-ip=40.107.236.48 ARC-Seal: i=2; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1732239781; cv=fail; b=LLTYBPAUXzc7GFGAiXwBd3xnbY6Yd7egAkNNclbhBDr1z3BcaGVMhMdABhgC6yEkJjB8uQYT/UKQcznp46lUJsnV8dXNusqOI9BDAwP1WYA52ZPBjoJWQG2Fadg+v6KYgIEA15TlsyyoanVDLciw4PH1aVHzmBqSHqtXyT/no08= ARC-Message-Signature: i=2; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1732239781; c=relaxed/simple; bh=bVzgWyQa4w+3Tu64z6XFGDevakKWyWcLsWA7cirN12I=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: Content-Type:MIME-Version; b=LVRON9XvS9GhlefKge8catzraZwmR6XSTr92yJkuVuIdc0aKwMd+TfOR3uNk8qunYK/p8mwL+YzVYV4QvePcOD0P19dyGRqsEL/h0xGHMp79tEOciLvZBzoOLMF/u51dsGFloxVEoaGYHYV7jdm0+Xyh9h3K/Tt8zkgRCtV311w= ARC-Authentication-Results: i=2; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=nvidia.com; spf=fail smtp.mailfrom=nvidia.com; dkim=pass (2048-bit key) header.d=Nvidia.com header.i=@Nvidia.com header.b=gzKEsHdp; arc=fail smtp.client-ip=40.107.236.48 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=nvidia.com Authentication-Results: smtp.subspace.kernel.org; spf=fail smtp.mailfrom=nvidia.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=Nvidia.com header.i=@Nvidia.com header.b="gzKEsHdp" ARC-Seal: i=1; a=rsa-sha256; s=arcselector10001; d=microsoft.com; cv=none; b=weBVc+GFy9Z2a6njO1QBsRcTg5NrzYqRqkifilR4dxf5l3F8/hCRAy/O5XxTnbkK4I63f68TDz/FN7ACapsar88Uhsr21WDOh89fTd8CL0/pO6trGykNzPFc5oNNxOoZV6npDqIS4RXXV2JPM0mLnhrIwHwxcjJniS3XFuW2JWXe260tAlqGyVgEc9OcfkgDOf1zrQTMLuWlRmXSPwznVfR7G8N8ju7gMsNqmv5nwZdwi8T6ecJlqF347A0+Nc2SkMe5QDO55kwtCcWfxswXKcU1OS4OnNi00TpED7IpzaicedS3hukeqvpcTvNWO1GmcoXQ6sqwsTVcoPqEUYpG7g== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector10001; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=+RK8HcctKWrWGjHAxyN8k6xlAbP24zFsv0TFbZlW19A=; b=av4+KqFMohbx3u6nPn9rm4qbV4xY1PNoLcWOZX2uodQ3pM95AV81Rg+AKZj6GPTIiIUxTxUQh08O6seR6CZye2TS2Q8Ppk8UtX5Eh/eSqA8wCjP5+2UReQZjw73QqUN4RBrJShkl0RknbwYsxg2ntwCLykGLgjvhhfiqvhw4XVKmiz5SwRMzRObJDWvoJnlGX3mE/2eIJv/bEgZFHu+hh94pRo7mlO7Zf+8m1C683HGkZELUVqPYJE1PGgmqA21av4sDA8AWPCjv0ES8IP4jU8EcykoHozuBJKbCGNQjynRnMSbSRtUzGX79g/CVnkHbu0YJ58k/NNJ9oAAZrlU9HQ== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=nvidia.com; dmarc=pass action=none header.from=nvidia.com; dkim=pass header.d=nvidia.com; arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=Nvidia.com; s=selector2; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=+RK8HcctKWrWGjHAxyN8k6xlAbP24zFsv0TFbZlW19A=; b=gzKEsHdplPpPAw7PXbI3+wJoS0Ng80yS85rNLBmNO6KtDDdoz86XNyKZ2rzE01NjOkKeQ+GzVzzg0j60OqUecxSTLSUbf8oFvq63nvtYyv52RR95fnSht0c3Oz+rSAmtjVjny/SznVM+rOoWJbB165I2CSnYfXbFfMrixf36uBHuZ/HZ5fgzw5mo4PUuQqAi1o35fbk8HSSB1IXYu+9HtBxkPu/Iof0uKpjr+0xqFrw3ueQ6n7qBK2E8H8/VSjISdYLnxvVcgH4Hw080gd17687jDmKzSat2lCKgrk4W2HcV/OYE6FSs/7qT9Z7qvM7rLd9OzU9foOKLI2ng9Xknzg== Authentication-Results: dkim=none (message not signed) header.d=none;dmarc=none action=none header.from=nvidia.com; Received: from DS0PR12MB7726.namprd12.prod.outlook.com (2603:10b6:8:130::6) by BY5PR12MB4131.namprd12.prod.outlook.com (2603:10b6:a03:212::13) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.8158.24; Fri, 22 Nov 2024 01:42:55 +0000 Received: from DS0PR12MB7726.namprd12.prod.outlook.com ([fe80::953f:2f80:90c5:67fe]) by DS0PR12MB7726.namprd12.prod.outlook.com ([fe80::953f:2f80:90c5:67fe%4]) with mapi id 15.20.8182.016; Fri, 22 Nov 2024 01:42:55 +0000 From: Alistair Popple To: dan.j.williams@intel.com, linux-mm@kvack.org Cc: Alistair Popple , lina@asahilina.net, zhang.lyra@gmail.com, gerald.schaefer@linux.ibm.com, vishal.l.verma@intel.com, dave.jiang@intel.com, logang@deltatee.com, bhelgaas@google.com, jack@suse.cz, jgg@ziepe.ca, catalin.marinas@arm.com, will@kernel.org, mpe@ellerman.id.au, npiggin@gmail.com, dave.hansen@linux.intel.com, ira.weiny@intel.com, willy@infradead.org, djwong@kernel.org, tytso@mit.edu, linmiaohe@huawei.com, david@redhat.com, peterx@redhat.com, linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org, linux-arm-kernel@lists.infradead.org, linuxppc-dev@lists.ozlabs.org, nvdimm@lists.linux.dev, linux-cxl@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-ext4@vger.kernel.org, linux-xfs@vger.kernel.org, jhubbard@nvidia.com, hch@lst.de, david@fromorbit.com Subject: [PATCH v3 22/25] device/dax: Properly refcount device dax pages when mapping Date: Fri, 22 Nov 2024 12:40:43 +1100 Message-ID: <5a3f120dd11d92123835c8565aa9c59e44f10f9c.1732239628.git-series.apopple@nvidia.com> X-Mailer: git-send-email 2.45.2 In-Reply-To: References: X-ClientProxiedBy: SYBPR01CA0181.ausprd01.prod.outlook.com (2603:10c6:10:52::25) To DS0PR12MB7726.namprd12.prod.outlook.com (2603:10b6:8:130::6) Precedence: bulk X-Mailing-List: linux-xfs@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-MS-PublicTrafficType: Email X-MS-TrafficTypeDiagnostic: DS0PR12MB7726:EE_|BY5PR12MB4131:EE_ X-MS-Office365-Filtering-Correlation-Id: 507e2c92-7f8b-4735-a445-08dd0a96fc85 X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam: BCL:0;ARA:13230040|376014|7416014|1800799024|366016; X-Microsoft-Antispam-Message-Info: xbhDw791yL54vbSjLKKuqGEPKGjjmSqvxQHP9hFdjBau71cxGBu1MlEMzVtHTo+fPc71yJi8BMnLpDNPyMdZZdPDenJ2gtBGwsRWLTuHoVgLs3oqIAf4BOx8d+Z9qBE7ocz2YW/aDvt/Wz2dmfwSmpE996K2IBUJWxIHFLc90BcPTRXyS+xvxgcrzPpp9Sa+fNmd846v7cCi+5XsSW30Y8VfRvPapcxZ4anuS/Khc2d59u9VqDi9FQZATJV+P95V3c5f1YxQTogSHpUuQDGpsq/B9k+5KcH3GToxEqaVNAZY+3Kxrgg3ibwoAJQaUFZzkyAGpkAlM+/7mIP0pleu31EKOppHViM9Hc/EZSOKChTcM8gXg6AQrsluzzrIaQ6TKvvRjvuWmX3sybnSy4pbu7SpZVlpJ/twgCdTn7bwFzKU8Z2i78L3QpiYQ25X9ZSHMwTrXVt7/WKYsv4IiFCVuh4vYiOmB3tSLd8/2+K1c2YELZBxvy/hFnrXzGXeBflx8sF5qeQtYX94q4fVmSJCuJO+/5H/yEaFHhLgigVBaE7xH6av4l/DCF7ldDUTtc4B49Ti5kXKR+/Ae0Gz8xRi03RQYD/lqdWEpmay0V88c5dblpOpExZFnBJACMuDhgQ7bJoQJhIhomUa7YBQVjYRI9vFrpfVC5AHcnHqAYOB06e3H/A0M3PjOzZSMqmb5qoOBPd2LPmiTq0tTSyaymZ0/4kQvIPOA8yOmwTGjaYJr7AZK5TPXylRH6gfYdpRFpBxXga4ZE2FSz6FkjDzJRr+ryjYDOz1D+MESnM2/65XuM3CszyV6vHcZ5Rh55edF/zw0kwXryDYLlYukm8o4damlQU792W2tgHP2nrSnejacDOIIzu2ZplPEfMzx3RmEXow30/HET80jyeEBxfKIoufAmAX/r7dAJZzaVbxEIY+4FJLc4Nxz5Azi9IcMTWbHafcAOlOI3DbnYqv4vEa+cRI354LyzXyfOPT0vCMYssmToet71zoFaRzpzM8GMnJAWb5mikL14RPIfPpaG/jkqHAJ0y+lxa6hWBCvDEzJgP0u3nOyId9aBJdsdkOV22bbwUnoQP2tF5ILKvVCF2MBp/ehp1yPK9oUYQj6L32HC7G/X/2j+3rxRt+Fvay9FX7DV+9Z5OEgwv5hdYH51oJcGi9+wwuIR0bQg1SnkD/zz/1xAzxX0Dcgi/9l6dFXMWxhwLJdhqlYPpOtRsR7XBmGCIZxjFfd+PeZpDTon9vf5O8ZlCM5bAVYMO8Bch+rHhyH22G7b8TRFIObZ+42h8eSWa2DIs3MhyZNkGPmVzRqIw7IXcOwtSdF4RDXJ2jLTsqhuKp X-Forefront-Antispam-Report: CIP:255.255.255.255;CTRY:;LANG:en;SCL:1;SRV:;IPV:NLI;SFV:NSPM;H:DS0PR12MB7726.namprd12.prod.outlook.com;PTR:;CAT:NONE;SFS:(13230040)(376014)(7416014)(1800799024)(366016);DIR:OUT;SFP:1101; X-MS-Exchange-AntiSpam-MessageData-ChunkCount: 1 X-MS-Exchange-AntiSpam-MessageData-0: pysUss7gV89SjQKS9PZYYfjgBqcTzLrLZLDM7JO+wMDxVDU20+ORo7RVaAqH5GqpXVZoPnr54KXmhPC9a9Iw4NUzffo+wAiusFj1qyvfF1FBYMrM5KZdmmNHz5c7PxQk5I0sgV9wKXfPBpdxZ0j5j2EgLcdmDEBJl8CpzraVOScjut07BW+1hbPGbI5y8XbcTSXc2JsJfWJNHb61DRcEuPQk+/y2e22yE/uVop/GLhc3GP5ulRlAjVSsNZBBeg34eQ+uOlKCTtDCXtSx0ageOUMfQ7ddwzCX3GJ09YpeqmqHRMWJoyigQpVegRDVze7DCH+cNeER3/0mrB2eBKfrj4LETCOAbi7WLzJ6I2RhPtBwcmDBkNjeGddvPqjFZq5ZX2E5tw1N9kth+4/oP8GaFtR3LU2XEP+ULkfaUYuJQuTD0ARrNTnplxxHpfwGaSgJ2q5q6neQG0WnibKxw+GE7goOG11FrN8gAgZMMacbiUG6RH4dJ5vmG7EgfKAuc/7lVzd4v6u8tYNMBp02eBLqLT2EZThlj7czDzMIai3zH8DjVOrvoHNvKr2CrvdIZ0fN08qnWK0UYFw3I1M7Rx3HYAwXQhEo8ESn5WETxoCu0n7wCwqpAEawTA6MI+z8xV07+BA7xXW28uc3hytfz8B5gqim/xePDXfH/6LE7DvU8oPpB/pF7ln0WpTawAKTvt1PFRW2/GAqiBfnof5cJCH0OEjfpzLRGR8J4AO/N4oLltwufoip6MtMxremYMhOkIhYbIPN59nKzKy0wFMvZWtRTj9a6g0vpdOC/HDyPX7eDkAoNj5xk6kahFBEgdOMXdhslfHCtQfYndKtLdONvvQFWoogWZuacVo61YwQ7nml9mDLBdoUPyp6ciXJfJ+tCENDctdLtavNVkms6d+8EKQwRYoZJxq+c5w1BY9iS1Mh7m7O6mxPsHSVbxAFcFxLGfSASZOiju6+rnk0I6mVC0iUDYDev/rCQBO21cqql1y+FCUMl5eOIGzOF/WGSBP1luTrM83SEH3jCpGjFYVIFneBFeG/mymYgp/PgLK3AI/ljw0Vlb3aGu1MI3MPiOjgHND5FpyovOc9Ik4sznNympQ6sChP/LjN8IYUEtaQqeSMeRAI+0Za7aa6KmdQVZkV1fxnuv2c7vTiusG/e/TRNZJrVTYGIM+/sFSYqc8ayxsxHNYhE0Lh3Rwmsv17tpB+IP/ELIVgdHikck5HWYjQ7evuIqtbewuCjAzZ/Xt9WJ0+D6wVBIL/Lfo/qtqJJn3NxTPHOcu861RGXwUtbhoqg/WF6KEZqhUyb7LN/sE82haZSdIHhPSC37KLQDSL0936ww88sLu2pACDL2Rd54IC2zy1+2Xvo31zSeBJmw3r6jjVqEAU+uf4p2pGFeVipYeNl1YSGPuhLQ4yx8iyJKzb0INlTSZA+qgpKahv4tWvAC/obfyhtxxlCYZqQHvq/Rfl7AI7QfVYHu2RSnYJI49wHfm9/Li1gMUYbtgfQaYHUj4G814ZSS9Cb98IjCZtDdwedbJhHoatmh8WNbnMaJXQsOcUTNluEdwexoh2E4b1a1/L2GNy9emWRRmWSofLaVdj0q0O X-OriginatorOrg: Nvidia.com X-MS-Exchange-CrossTenant-Network-Message-Id: 507e2c92-7f8b-4735-a445-08dd0a96fc85 X-MS-Exchange-CrossTenant-AuthSource: DS0PR12MB7726.namprd12.prod.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Internal X-MS-Exchange-CrossTenant-OriginalArrivalTime: 22 Nov 2024 01:42:55.7591 (UTC) X-MS-Exchange-CrossTenant-FromEntityHeader: Hosted X-MS-Exchange-CrossTenant-Id: 43083d15-7273-40c1-b7db-39efd9ccc17a X-MS-Exchange-CrossTenant-MailboxType: HOSTED X-MS-Exchange-CrossTenant-UserPrincipalName: aEhpBGb1QIrbfogxmU6SPSwcqXdpK8dXgtDhvWROm/tO70pIGQd6k5cyG8a0oFGIRuaFWWxR3UKYKblQ2Wmu3g== X-MS-Exchange-Transport-CrossTenantHeadersStamped: BY5PR12MB4131 Device DAX pages are currently not reference counted when mapped, instead relying on the devmap PTE bit to ensure mapping code will not get/put references. This requires special handling in various page table walkers, particularly GUP, to manage references on the underlying pgmap to ensure the pages remain valid. However there is no reason these pages can't be refcounted properly at map time. Doning so eliminates the need for the devmap PTE bit, freeing up a precious PTE bit. It also simplifies GUP as it no longer needs to manage the special pgmap references and can instead just treat the pages normally as defined by vm_normal_page(). Signed-off-by: Alistair Popple --- drivers/dax/device.c | 15 +++++++++------ mm/memremap.c | 13 ++++++------- 2 files changed, 15 insertions(+), 13 deletions(-) diff --git a/drivers/dax/device.c b/drivers/dax/device.c index 6d74e62..fd22dbf 100644 --- a/drivers/dax/device.c +++ b/drivers/dax/device.c @@ -126,11 +126,12 @@ static vm_fault_t __dev_dax_pte_fault(struct dev_dax *dev_dax, return VM_FAULT_SIGBUS; } - pfn = phys_to_pfn_t(phys, PFN_DEV|PFN_MAP); + pfn = phys_to_pfn_t(phys, 0); dax_set_mapping(vmf, pfn, fault_size); - return vmf_insert_mixed(vmf->vma, vmf->address, pfn); + return vmf_insert_page_mkwrite(vmf, pfn_t_to_page(pfn), + vmf->flags & FAULT_FLAG_WRITE); } static vm_fault_t __dev_dax_pmd_fault(struct dev_dax *dev_dax, @@ -169,11 +170,12 @@ static vm_fault_t __dev_dax_pmd_fault(struct dev_dax *dev_dax, return VM_FAULT_SIGBUS; } - pfn = phys_to_pfn_t(phys, PFN_DEV|PFN_MAP); + pfn = phys_to_pfn_t(phys, 0); dax_set_mapping(vmf, pfn, fault_size); - return vmf_insert_pfn_pmd(vmf, pfn, vmf->flags & FAULT_FLAG_WRITE); + return vmf_insert_folio_pmd(vmf, page_folio(pfn_t_to_page(pfn)), + vmf->flags & FAULT_FLAG_WRITE); } #ifdef CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD @@ -214,11 +216,12 @@ static vm_fault_t __dev_dax_pud_fault(struct dev_dax *dev_dax, return VM_FAULT_SIGBUS; } - pfn = phys_to_pfn_t(phys, PFN_DEV|PFN_MAP); + pfn = phys_to_pfn_t(phys, 0); dax_set_mapping(vmf, pfn, fault_size); - return vmf_insert_pfn_pud(vmf, pfn, vmf->flags & FAULT_FLAG_WRITE); + return vmf_insert_folio_pud(vmf, page_folio(pfn_t_to_page(pfn)), + vmf->flags & FAULT_FLAG_WRITE); } #else static vm_fault_t __dev_dax_pud_fault(struct dev_dax *dev_dax, diff --git a/mm/memremap.c b/mm/memremap.c index 9a8879b..532a52a 100644 --- a/mm/memremap.c +++ b/mm/memremap.c @@ -460,11 +460,10 @@ void free_zone_device_folio(struct folio *folio) { struct dev_pagemap *pgmap = folio->pgmap; - if (WARN_ON_ONCE(!pgmap->ops)) - return; - - if (WARN_ON_ONCE(pgmap->type != MEMORY_DEVICE_FS_DAX && - !pgmap->ops->page_free)) + if (WARN_ON_ONCE((!pgmap->ops && + pgmap->type != MEMORY_DEVICE_GENERIC) || + (pgmap->ops && !pgmap->ops->page_free && + pgmap->type != MEMORY_DEVICE_FS_DAX))) return; mem_cgroup_uncharge(folio); @@ -494,7 +493,8 @@ void free_zone_device_folio(struct folio *folio) * zero which indicating the page has been removed from the file * system mapping. */ - if (pgmap->type != MEMORY_DEVICE_FS_DAX) + if (pgmap->type != MEMORY_DEVICE_FS_DAX && + pgmap->type != MEMORY_DEVICE_GENERIC) folio->mapping = NULL; switch (pgmap->type) { @@ -509,7 +509,6 @@ void free_zone_device_folio(struct folio *folio) * Reset the refcount to 1 to prepare for handing out the page * again. */ - pgmap->ops->page_free(folio_page(folio, 0)); folio_set_count(folio, 1); break; From patchwork Fri Nov 22 01:40:44 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alistair Popple X-Patchwork-Id: 13882604 Received: from NAM11-BN8-obe.outbound.protection.outlook.com (mail-bn8nam11on2065.outbound.protection.outlook.com [40.107.236.65]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id E53E9166F26; Fri, 22 Nov 2024 01:43:04 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=fail smtp.client-ip=40.107.236.65 ARC-Seal: i=2; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1732239787; cv=fail; b=aQXb84+G4AG2r+Y5Z5Fi2u9UP7bFyAOmqIMDJUQ3p6zlQN6Roa0AjgUR1hlFGdE+frDL9OGEcb56byGW4pQHQWenYEuN6lBWGaZi+/DoI3RbQN39/lIXx5e2Dpvwvk2vz34pngnUFGo+QsKIF93KXBFrS8qjVS5q/V8ftRcyGL8= ARC-Message-Signature: i=2; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1732239787; c=relaxed/simple; bh=XP9HpLHr+Nd8du63p13KutA8CRYDdhWKoF658wqFbw4=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: Content-Type:MIME-Version; b=bCOfulx2Gdeio5f79b1hjnhIjVf4IRL/c2vCRBNTYCW9EagOtWdDTN+RFhPzy/1+ssDXxZ1DxvsUa6T9WzgVJ26pLq3kUWbkiqp8zFIcnCRCvFqWKqoRowxHNfVOd7xiLxv7qYufBHbVnWfYVhZ4f70nhIieKazWWVkjdH7MnMM= ARC-Authentication-Results: i=2; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=nvidia.com; spf=fail smtp.mailfrom=nvidia.com; dkim=pass (2048-bit key) header.d=Nvidia.com header.i=@Nvidia.com header.b=pcJ5hcnW; arc=fail smtp.client-ip=40.107.236.65 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=nvidia.com Authentication-Results: smtp.subspace.kernel.org; spf=fail smtp.mailfrom=nvidia.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=Nvidia.com header.i=@Nvidia.com header.b="pcJ5hcnW" ARC-Seal: i=1; a=rsa-sha256; s=arcselector10001; d=microsoft.com; cv=none; b=Jm/GXTOCuQgYfEa75vCqU+2oC+1cb0jUyyCuMKQtk6BFRawnw10UPTuHeaKKaMbDJyNowozN1XnFhvEnoBDt0SfjLpF/OmLw6VHqYnuiuR2mfNI/RYDKHEatmp53RZ0sK+4UhcWj2zn0YweRCVvQ8cYxrMrp2V/H6FM97ERqhMM7Oriyvuur+xLu3Hm1vE/oh2UiOGMEZSAsxwPJhwsxh86oBOqb/qFlfw7yiyigWGlbI1ZzRVLSP2N2xGeOPea6bd1LvsDDPlPk6rHGt5g6yexHO19NkFH1FutQNkSOuFPC/lmi2ZM6xYJZmD14EVDP/IAoPeTIAbdUhLdOgjDTKQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector10001; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=jPibUqGmQzqgJ/QPEuHkh/JxRvNHxLjxZ/YeAz1/+mw=; b=r2S0xtwSAd2UMJP1Y/JM5hjXMyHtp+oK3qX0u4TSQATN6c4+DoyL68Te3nQmPBdugPArvajcnVcCcQmimClqK+iXKpTlSwlBxJNtteel4mAzbeoUIVafMOuvtFm6QpJapShRCHz0MAM0X0zKhjy0ZXOn9a0ri3w6zDMw0rB2JGcJE+yOv5UjvGKnke5bQHHN/lYGAQkpE/0NotwnbarWpwjewIN9KaJMFgJNjf9/PlKuIjYld62pbzmDJjyM0dNP2xuGRfJiUOWU7iNsi3fHV8Wubb+O/62zzbZwlbmeFjB84kMUUHTL1g4zzKjeymtrrgNjoI2Kw8is5ILL3HwKPA== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=nvidia.com; dmarc=pass action=none header.from=nvidia.com; dkim=pass header.d=nvidia.com; arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=Nvidia.com; s=selector2; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=jPibUqGmQzqgJ/QPEuHkh/JxRvNHxLjxZ/YeAz1/+mw=; b=pcJ5hcnWeICDyRMzAwkcH5sCtCnGm50jnkuMMQAhE4S+OtqoxIrjvd2mvTiFYVakvUT0Wb4hOKLApnE9ZPqV5OUIweaAK0clo9pxuLdUpFZIlM82DsISgG+9yYbDoIu+hnQGZuGtM2A9DXwmKBxPKAkC5xvfle8bCQwBlNfqXF1PqzSF6UpT0UPbA775xtllugqW8mb2u6rD/H37MGida8w971zlBORjmk3T64pWg0KMnugqfWNaiFlGTPacaq64Zvg5LVY5eJIr7Y9y3V2Vaw/9iVAtdxHmlQ9Y7d0Zgiwi1DixrnbuQtI9XxMRBJzGSayNvSPk5uc1T+SjC1PM1g== Authentication-Results: dkim=none (message not signed) header.d=none;dmarc=none action=none header.from=nvidia.com; Received: from DS0PR12MB7726.namprd12.prod.outlook.com (2603:10b6:8:130::6) by BY5PR12MB4131.namprd12.prod.outlook.com (2603:10b6:a03:212::13) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.8158.24; Fri, 22 Nov 2024 01:43:00 +0000 Received: from DS0PR12MB7726.namprd12.prod.outlook.com ([fe80::953f:2f80:90c5:67fe]) by DS0PR12MB7726.namprd12.prod.outlook.com ([fe80::953f:2f80:90c5:67fe%4]) with mapi id 15.20.8182.016; Fri, 22 Nov 2024 01:43:00 +0000 From: Alistair Popple To: dan.j.williams@intel.com, linux-mm@kvack.org Cc: Alistair Popple , lina@asahilina.net, zhang.lyra@gmail.com, gerald.schaefer@linux.ibm.com, vishal.l.verma@intel.com, dave.jiang@intel.com, logang@deltatee.com, bhelgaas@google.com, jack@suse.cz, jgg@ziepe.ca, catalin.marinas@arm.com, will@kernel.org, mpe@ellerman.id.au, npiggin@gmail.com, dave.hansen@linux.intel.com, ira.weiny@intel.com, willy@infradead.org, djwong@kernel.org, tytso@mit.edu, linmiaohe@huawei.com, david@redhat.com, peterx@redhat.com, linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org, linux-arm-kernel@lists.infradead.org, linuxppc-dev@lists.ozlabs.org, nvdimm@lists.linux.dev, linux-cxl@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-ext4@vger.kernel.org, linux-xfs@vger.kernel.org, jhubbard@nvidia.com, hch@lst.de, david@fromorbit.com Subject: [PATCH v3 23/25] mm: Remove pXX_devmap callers Date: Fri, 22 Nov 2024 12:40:44 +1100 Message-ID: <11c205d96a5e9161c3a8e8df5a1c44b0122b2187.1732239628.git-series.apopple@nvidia.com> X-Mailer: git-send-email 2.45.2 In-Reply-To: References: X-ClientProxiedBy: SY6PR01CA0098.ausprd01.prod.outlook.com (2603:10c6:10:111::13) To DS0PR12MB7726.namprd12.prod.outlook.com (2603:10b6:8:130::6) Precedence: bulk X-Mailing-List: linux-xfs@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-MS-PublicTrafficType: Email X-MS-TrafficTypeDiagnostic: DS0PR12MB7726:EE_|BY5PR12MB4131:EE_ X-MS-Office365-Filtering-Correlation-Id: 98c03fe6-2b1e-4524-d66f-08dd0a96ff30 X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam: BCL:0;ARA:13230040|376014|7416014|1800799024|366016; X-Microsoft-Antispam-Message-Info: qcUp7Iaaj7j5fS+5rfhD2P/8BMoc4Zagk5+6YT0tNSOdrvPETLHiRIZJkVri0Dqvmmx8DaQvERkCa59O+sX83sMQ+epWn5YA4SUfAgZGGRzfwI737t+Tx0AxxVH5pbndvZy7YOLEk2fCpUMAJjmVojsbMxwxOgFblXuOUIVxRQELFIjAR/XKycx7+S+LwdK5qrEKJLrcUB8w/8TYRQ3f1Fb6dbFHblgROz246okbTp0pDZ1Qv+F0hvXx7NT9pIfjj+rM20H3eBZAQnUgp7egJHaWHToesvBPpw3FO/oHk+1Ss9tTyYIh5YFeI+4YoBbzrqC1GruZ/0j/HZxZjiGz7uUU9NBW7MkKKw1dnIRS17o8uafA6EE9sOXHjiiQ2L1zamA9ACJORaKPbGKoGK3lZPUh2HcNv7uefObI3VmsNKL3YvYrNYTEfVeM7gg7Ys0ojTyt+/ol+zqH25v4tA/0lZOF2XHVvBcAcml048uijK7p/LJTUgNSjeKNdQhnuYekJiksRSMY4WDOZdvAjyUcTNpWerlLed4Jf/q5OHTMAOHJSD8POWbwBsUSw+GKjG2oD6anzKHcHICZHkf8CWrnn+JcRYhMs5nheXIYR4u5s+LQJszXIjHgfJAn20d7mzLsBAEDYkgeoNdBPBVG4uoV0X/J8svO1g40aPwEYJ7d/hWUo0QZek/1J5nbEGXExjCTYf+mzK5bpXJmuToeC/FwV4wdrCblhtw747Z+wI4OrPaII6gN/lLVY0ykW1CGhy3b7cKQUp9RpavraW+97t4sERNfMg8/Rku/jbIrbzB93gGx/sPLfZGcm4c6FtLhFf0UKck1kbmAbQsYCb/+DJ4R+e25RoWS4uiskODr4CWdnLu+0IiE6lOLSXeD/nToD8mRdZtp23ew8aUYtIU1Y7hnsEsjQSzInhGNiC+0ox7GXDJiUMmr6JY4YVA5r2x7agNG5dIB63w7NJleFxqbM5wGk6hxUKfitvnaZgEwv2N/edv65LZdwXuPPu942bki4+nhaSiAL928tq70iY064fsuknjwkzMHBCG3SQ8zPrjUH1s0n+NWbIuFJa4e2C8axpLlz9uQS1kcWVWn8CjE8MhQMMwQrZ8GHt0q6YrPtBz04mJA24fBkzHMTW1BIAuq+AENNgs2uNe13cIpUycVsMWgEgTYtZK6l7F+fwy4j0Qr8Nf+5lbNxik2RovhqtW21N+vwWJg8lcWjIPDM18juoorhw9K7yBIJKGnsgYzVyWWlpPAb9AMOh0tDBtydUOWU1p/4FUr2gLNoGLyrWv9yOotXGxkLaeVUhyM8D861z+O7OKNye5vI4LXH6qME+hZCgg4 X-Forefront-Antispam-Report: CIP:255.255.255.255;CTRY:;LANG:en;SCL:1;SRV:;IPV:NLI;SFV:NSPM;H:DS0PR12MB7726.namprd12.prod.outlook.com;PTR:;CAT:NONE;SFS:(13230040)(376014)(7416014)(1800799024)(366016);DIR:OUT;SFP:1101; X-MS-Exchange-AntiSpam-MessageData-ChunkCount: 1 X-MS-Exchange-AntiSpam-MessageData-0: mX+FUmG3LzTi87Vkci/4dAvYyBaSyGzFX2xKDgExuGwsMU4tuf6VPi1Jla/F03x5ORsploxsbHVo0wSe1KI2PN1blXH/lZQ0xMnUGZMFl9NliDZpR7MJgtUDoL14m6Z+MiZ1eiGMSo/oDOrxyMRJnxjkgy8I36uw3o09b1sZyRugHtL1lFDSH8+hBCQPMMRPdUWw8mtGfC78jKhj3ilvHPSRYeWMHuSoixFMQAnYre7L8/xSA/EGW/Ky1b59X96pVI4azsotRQF69NlSkUSYKkbajspT72awmdB4F2L+of1joDgMJsnA6ZEd6+VbOpEj8uTh5HqUfxACiJ904Z+YPkKuDynzGWlXU5UY6jlG3UvcA41y/jWGaFrB5uBM0fWtlF1tjSmZ4UzZ2bSIkikOKgLexTMwQnCa816gENpunDBfxn692RnsBCpEnyKUXKGb/2ru9TTXYhXQ/ZlINg4SsOCydbMngoZsFaCP9P+JuTLKsRnTz3rqYH9tT4YSW0K/dem/dNLbreP+vzd6FChxOdz+YzA0Msb+GsmB1GE+/zVWaVBYGZf2J1NHOpSGOJ6i9zm4S0U2kGnUAkTAMs6htUPShePivy58X0HduFk71v1Oe/O0EGNpjp1Z3Gd5R5OX/kQVbugD9/dOcUUHvH5zHw6ShAIxrbOzTJ84m3IWskZhLIMzqlwpiFl3TNyXYh98FGJ9Nuio/tQ56qHisll/bgeTukNLBn0HB4MmK1xEnizr1rNLA330fp/s4n6DhYo22MFQYNsj80Wy0ZPCikFVttL8lkKm6LNOXjxbIjhcuIjOaSO5b3n8SX6mdDEYvP65D+xUGL1ZlUnXyQs9TUKE+YgLELgb/zOk9sh4yJjQX6bpauQ3nc+CnRnD27+TW2NmV4uOIu8r4vIRsV0b1nAHIo2VUxHxuKcDz4rNccPF4SxzNbCNpPi4YDmZWLwrWQVcBBwg9vKtYR/UWrFIh44XiYpOv8dsyvJSzwLfcs1+mrY535Ys8H+g9mAaMBIDx/jt9SsYpyglYDJG/xi5XNweDkq8UydMFfMGl3LJFZbsb0iyNrZ9o+s613zoPTrlbebRMRjzikPw2n0BBV0ZWJUc0zWN5DFQKVxAglUPxYNciuArETDW8Bz0LRmcD4hF8WSPcVGnHMTl6FkGUtxli9pAscDnJgUgegUbzMxhnTrdAnq/w6/HvCVFZMEft0RYMNp1WeXUyN/v7ISbyjlU+OwTwrBgCv4c65TRBBCaBVMElaFH+9EqAYSomJ39Pb0NaKSTXV86HHClJOePUeom3rQRtUqe/N2muJHVdqfCSHwHwUexcqSX19QxhDjPQOYcI0wmhgKpllf3EnWRjPlmlCH4dMirbbSyrEEwhNXEdp8MBuM7J3T/rTTkZsVY5X8CwGPOi6vdnwftAOTyspmepiOdkWigTesVkUjRbgI02Y0sH2osJvuQ4btOClWIg6akjiOXajG/eGLNmc8p9VRI8CNy0x60mlhx+YsSPHlwyj87CmGhXWz3Ttj+UEamEvxQkCNxX8FBcae8aS6OFq1DQlFDOScNdKEkTmeg5MurSEO1uAg7oYNfyQlLsg9XpJiR5ISW X-OriginatorOrg: Nvidia.com X-MS-Exchange-CrossTenant-Network-Message-Id: 98c03fe6-2b1e-4524-d66f-08dd0a96ff30 X-MS-Exchange-CrossTenant-AuthSource: DS0PR12MB7726.namprd12.prod.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Internal X-MS-Exchange-CrossTenant-OriginalArrivalTime: 22 Nov 2024 01:43:00.2603 (UTC) X-MS-Exchange-CrossTenant-FromEntityHeader: Hosted X-MS-Exchange-CrossTenant-Id: 43083d15-7273-40c1-b7db-39efd9ccc17a X-MS-Exchange-CrossTenant-MailboxType: HOSTED X-MS-Exchange-CrossTenant-UserPrincipalName: upLkhdncCW8FUj3YjwvLpmGVtg1PyuTUmzFszAnLj39q6k07BnMDTstErmpxV9chjkankS0WqJpNZMVFdcEKig== X-MS-Exchange-Transport-CrossTenantHeadersStamped: BY5PR12MB4131 The devmap PTE special bit was used to detect mappings of FS DAX pages. This tracking was required to ensure the generic mm did not manipulate the page reference counts as FS DAX implemented it's own reference counting scheme. Now that FS DAX pages have their references counted the same way as normal pages this tracking is no longer needed and can be removed. Almost all existing uses of pmd_devmap() are paired with a check of pmd_trans_huge(). As pmd_trans_huge() now returns true for FS DAX pages dropping the check in these cases doesn't change anything. However care needs to be taken because pmd_trans_huge() also checks that a page is not an FS DAX page. This is dealt with either by checking !vma_is_dax() or relying on the fact that the page pointer was obtained from a page list. This is possible because zone device pages cannot appear in any page list due to sharing page->lru with page->pgmap. Signed-off-by: Alistair Popple --- arch/powerpc/mm/book3s64/hash_pgtable.c | 3 +- arch/powerpc/mm/book3s64/pgtable.c | 8 +- arch/powerpc/mm/book3s64/radix_pgtable.c | 5 +- arch/powerpc/mm/pgtable.c | 2 +- fs/dax.c | 5 +- fs/userfaultfd.c | 2 +- include/linux/huge_mm.h | 10 +- include/linux/pgtable.h | 2 +- mm/gup.c | 162 +------------------------ mm/hmm.c | 7 +- mm/huge_memory.c | 71 +---------- mm/khugepaged.c | 2 +- mm/mapping_dirty_helpers.c | 4 +- mm/memory.c | 35 +---- mm/migrate_device.c | 2 +- mm/mprotect.c | 2 +- mm/mremap.c | 5 +- mm/page_vma_mapped.c | 5 +- mm/pagewalk.c | 8 +- mm/pgtable-generic.c | 7 +- mm/userfaultfd.c | 5 +- mm/vmscan.c | 5 +- 22 files changed, 61 insertions(+), 296 deletions(-) diff --git a/arch/powerpc/mm/book3s64/hash_pgtable.c b/arch/powerpc/mm/book3s64/hash_pgtable.c index 988948d..82d3117 100644 --- a/arch/powerpc/mm/book3s64/hash_pgtable.c +++ b/arch/powerpc/mm/book3s64/hash_pgtable.c @@ -195,7 +195,7 @@ unsigned long hash__pmd_hugepage_update(struct mm_struct *mm, unsigned long addr unsigned long old; #ifdef CONFIG_DEBUG_VM - WARN_ON(!hash__pmd_trans_huge(*pmdp) && !pmd_devmap(*pmdp)); + WARN_ON(!hash__pmd_trans_huge(*pmdp)); assert_spin_locked(pmd_lockptr(mm, pmdp)); #endif @@ -227,7 +227,6 @@ pmd_t hash__pmdp_collapse_flush(struct vm_area_struct *vma, unsigned long addres VM_BUG_ON(address & ~HPAGE_PMD_MASK); VM_BUG_ON(pmd_trans_huge(*pmdp)); - VM_BUG_ON(pmd_devmap(*pmdp)); pmd = *pmdp; pmd_clear(pmdp); diff --git a/arch/powerpc/mm/book3s64/pgtable.c b/arch/powerpc/mm/book3s64/pgtable.c index 5a4a753..fd1971f 100644 --- a/arch/powerpc/mm/book3s64/pgtable.c +++ b/arch/powerpc/mm/book3s64/pgtable.c @@ -50,7 +50,7 @@ int pmdp_set_access_flags(struct vm_area_struct *vma, unsigned long address, { int changed; #ifdef CONFIG_DEBUG_VM - WARN_ON(!pmd_trans_huge(*pmdp) && !pmd_devmap(*pmdp)); + WARN_ON(!pmd_trans_huge(*pmdp)); assert_spin_locked(pmd_lockptr(vma->vm_mm, pmdp)); #endif changed = !pmd_same(*(pmdp), entry); @@ -70,7 +70,6 @@ int pudp_set_access_flags(struct vm_area_struct *vma, unsigned long address, { int changed; #ifdef CONFIG_DEBUG_VM - WARN_ON(!pud_devmap(*pudp)); assert_spin_locked(pud_lockptr(vma->vm_mm, pudp)); #endif changed = !pud_same(*(pudp), entry); @@ -193,7 +192,7 @@ pmd_t pmdp_huge_get_and_clear_full(struct vm_area_struct *vma, pmd_t pmd; VM_BUG_ON(addr & ~HPAGE_PMD_MASK); VM_BUG_ON((pmd_present(*pmdp) && !pmd_trans_huge(*pmdp) && - !pmd_devmap(*pmdp)) || !pmd_present(*pmdp)); + !pmd_present(*pmdp)); pmd = pmdp_huge_get_and_clear(vma->vm_mm, addr, pmdp); /* * if it not a fullmm flush, then we can possibly end up converting @@ -211,8 +210,7 @@ pud_t pudp_huge_get_and_clear_full(struct vm_area_struct *vma, pud_t pud; VM_BUG_ON(addr & ~HPAGE_PMD_MASK); - VM_BUG_ON((pud_present(*pudp) && !pud_devmap(*pudp)) || - !pud_present(*pudp)); + VM_BUG_ON(!pud_present(*pudp)); pud = pudp_huge_get_and_clear(vma->vm_mm, addr, pudp); /* * if it not a fullmm flush, then we can possibly end up converting diff --git a/arch/powerpc/mm/book3s64/radix_pgtable.c b/arch/powerpc/mm/book3s64/radix_pgtable.c index b0d9270..78907b6 100644 --- a/arch/powerpc/mm/book3s64/radix_pgtable.c +++ b/arch/powerpc/mm/book3s64/radix_pgtable.c @@ -1424,7 +1424,7 @@ unsigned long radix__pmd_hugepage_update(struct mm_struct *mm, unsigned long add unsigned long old; #ifdef CONFIG_DEBUG_VM - WARN_ON(!radix__pmd_trans_huge(*pmdp) && !pmd_devmap(*pmdp)); + WARN_ON(!radix__pmd_trans_huge(*pmdp)); assert_spin_locked(pmd_lockptr(mm, pmdp)); #endif @@ -1441,7 +1441,7 @@ unsigned long radix__pud_hugepage_update(struct mm_struct *mm, unsigned long add unsigned long old; #ifdef CONFIG_DEBUG_VM - WARN_ON(!pud_devmap(*pudp)); + WARN_ON(!pud_trans_huge(*pudp)); assert_spin_locked(pud_lockptr(mm, pudp)); #endif @@ -1459,7 +1459,6 @@ pmd_t radix__pmdp_collapse_flush(struct vm_area_struct *vma, unsigned long addre VM_BUG_ON(address & ~HPAGE_PMD_MASK); VM_BUG_ON(radix__pmd_trans_huge(*pmdp)); - VM_BUG_ON(pmd_devmap(*pmdp)); /* * khugepaged calls this for normal pmd */ diff --git a/arch/powerpc/mm/pgtable.c b/arch/powerpc/mm/pgtable.c index 7316396..c8cba4d 100644 --- a/arch/powerpc/mm/pgtable.c +++ b/arch/powerpc/mm/pgtable.c @@ -509,7 +509,7 @@ pte_t *__find_linux_pte(pgd_t *pgdir, unsigned long ea, return NULL; #endif - if (pmd_trans_huge(pmd) || pmd_devmap(pmd)) { + if (pmd_trans_huge(pmd)) { if (is_thp) *is_thp = true; ret_pte = (pte_t *)pmdp; diff --git a/fs/dax.c b/fs/dax.c index 45d04ca..0169b35 100644 --- a/fs/dax.c +++ b/fs/dax.c @@ -1908,7 +1908,7 @@ static vm_fault_t dax_iomap_pte_fault(struct vm_fault *vmf, pfn_t *pfnp, * the PTE we need to set up. If so just return and the fault will be * retried. */ - if (pmd_trans_huge(*vmf->pmd) || pmd_devmap(*vmf->pmd)) { + if (pmd_trans_huge(*vmf->pmd)) { ret = VM_FAULT_NOPAGE; goto unlock_entry; } @@ -2029,8 +2029,7 @@ static vm_fault_t dax_iomap_pmd_fault(struct vm_fault *vmf, pfn_t *pfnp, * the PMD we need to set up. If so just return and the fault will be * retried. */ - if (!pmd_none(*vmf->pmd) && !pmd_trans_huge(*vmf->pmd) && - !pmd_devmap(*vmf->pmd)) { + if (!pmd_none(*vmf->pmd) && !pmd_trans_huge(*vmf->pmd)) { ret = 0; goto unlock_entry; } diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c index 68cdd89..1c90913 100644 --- a/fs/userfaultfd.c +++ b/fs/userfaultfd.c @@ -304,7 +304,7 @@ static inline bool userfaultfd_must_wait(struct userfaultfd_ctx *ctx, goto out; ret = false; - if (!pmd_present(_pmd) || pmd_devmap(_pmd)) + if (!pmd_present(_pmd) || vma_is_dax(vmf->vma)) goto out; if (pmd_trans_huge(_pmd)) { diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h index 120d0ac..1710070 100644 --- a/include/linux/huge_mm.h +++ b/include/linux/huge_mm.h @@ -378,8 +378,7 @@ void __split_huge_pmd(struct vm_area_struct *vma, pmd_t *pmd, #define split_huge_pmd(__vma, __pmd, __address) \ do { \ pmd_t *____pmd = (__pmd); \ - if (is_swap_pmd(*____pmd) || pmd_trans_huge(*____pmd) \ - || pmd_devmap(*____pmd)) \ + if (is_swap_pmd(*____pmd) || pmd_trans_huge(*____pmd)) \ __split_huge_pmd(__vma, __pmd, __address, \ false, NULL); \ } while (0) @@ -405,8 +404,7 @@ change_huge_pud(struct mmu_gather *tlb, struct vm_area_struct *vma, #define split_huge_pud(__vma, __pud, __address) \ do { \ pud_t *____pud = (__pud); \ - if (pud_trans_huge(*____pud) \ - || pud_devmap(*____pud)) \ + if (pud_trans_huge(*____pud)) \ __split_huge_pud(__vma, __pud, __address); \ } while (0) @@ -429,7 +427,7 @@ static inline int is_swap_pmd(pmd_t pmd) static inline spinlock_t *pmd_trans_huge_lock(pmd_t *pmd, struct vm_area_struct *vma) { - if (is_swap_pmd(*pmd) || pmd_trans_huge(*pmd) || pmd_devmap(*pmd)) + if (is_swap_pmd(*pmd) || pmd_trans_huge(*pmd)) return __pmd_trans_huge_lock(pmd, vma); else return NULL; @@ -437,7 +435,7 @@ static inline spinlock_t *pmd_trans_huge_lock(pmd_t *pmd, static inline spinlock_t *pud_trans_huge_lock(pud_t *pud, struct vm_area_struct *vma) { - if (pud_trans_huge(*pud) || pud_devmap(*pud)) + if (pud_trans_huge(*pud)) return __pud_trans_huge_lock(pud, vma); else return NULL; diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h index e8b2ac6..fa661b2 100644 --- a/include/linux/pgtable.h +++ b/include/linux/pgtable.h @@ -1645,7 +1645,7 @@ static inline int pud_trans_unstable(pud_t *pud) defined(CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD) pud_t pudval = READ_ONCE(*pud); - if (pud_none(pudval) || pud_trans_huge(pudval) || pud_devmap(pudval)) + if (pud_none(pudval) || pud_trans_huge(pudval)) return 1; if (unlikely(pud_bad(pudval))) { pud_clear_bad(pud); diff --git a/mm/gup.c b/mm/gup.c index ff5f35b..bb12871 100644 --- a/mm/gup.c +++ b/mm/gup.c @@ -629,31 +629,9 @@ static struct page *follow_huge_pud(struct vm_area_struct *vma, return NULL; pfn += (addr & ~PUD_MASK) >> PAGE_SHIFT; - - if (IS_ENABLED(CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD) && - pud_devmap(pud)) { - /* - * device mapped pages can only be returned if the caller - * will manage the page reference count. - * - * At least one of FOLL_GET | FOLL_PIN must be set, so - * assert that here: - */ - if (!(flags & (FOLL_GET | FOLL_PIN))) - return ERR_PTR(-EEXIST); - - if (flags & FOLL_TOUCH) - touch_pud(vma, addr, pudp, flags & FOLL_WRITE); - - ctx->pgmap = get_dev_pagemap(pfn, ctx->pgmap); - if (!ctx->pgmap) - return ERR_PTR(-EFAULT); - } - page = pfn_to_page(pfn); - if (!pud_devmap(pud) && !pud_write(pud) && - gup_must_unshare(vma, flags, page)) + if (!pud_write(pud) && gup_must_unshare(vma, flags, page)) return ERR_PTR(-EMLINK); ret = try_grab_folio(page_folio(page), 1, flags); @@ -852,8 +830,7 @@ static struct page *follow_page_pte(struct vm_area_struct *vma, page = vm_normal_page(vma, address, pte); /* - * We only care about anon pages in can_follow_write_pte() and don't - * have to worry about pte_devmap() because they are never anon. + * We only care about anon pages in can_follow_write_pte(). */ if ((flags & FOLL_WRITE) && !can_follow_write_pte(pte, page, vma, flags)) { @@ -861,18 +838,7 @@ static struct page *follow_page_pte(struct vm_area_struct *vma, goto out; } - if (!page && pte_devmap(pte) && (flags & (FOLL_GET | FOLL_PIN))) { - /* - * Only return device mapping pages in the FOLL_GET or FOLL_PIN - * case since they are only valid while holding the pgmap - * reference. - */ - *pgmap = get_dev_pagemap(pte_pfn(pte), *pgmap); - if (*pgmap) - page = pte_page(pte); - else - goto no_page; - } else if (unlikely(!page)) { + if (unlikely(!page)) { if (flags & FOLL_DUMP) { /* Avoid special (like zero) pages in core dumps */ page = ERR_PTR(-EFAULT); @@ -954,14 +920,6 @@ static struct page *follow_pmd_mask(struct vm_area_struct *vma, return no_page_table(vma, flags, address); if (!pmd_present(pmdval)) return no_page_table(vma, flags, address); - if (pmd_devmap(pmdval)) { - ptl = pmd_lock(mm, pmd); - page = follow_devmap_pmd(vma, address, pmd, flags, &ctx->pgmap); - spin_unlock(ptl); - if (page) - return page; - return no_page_table(vma, flags, address); - } if (likely(!pmd_leaf(pmdval))) return follow_page_pte(vma, address, pmd, flags, &ctx->pgmap); @@ -2843,7 +2801,7 @@ static int gup_fast_pte_range(pmd_t pmd, pmd_t *pmdp, unsigned long addr, int *nr) { struct dev_pagemap *pgmap = NULL; - int nr_start = *nr, ret = 0; + int ret = 0; pte_t *ptep, *ptem; ptem = ptep = pte_offset_map(&pmd, addr); @@ -2867,16 +2825,7 @@ static int gup_fast_pte_range(pmd_t pmd, pmd_t *pmdp, unsigned long addr, if (!pte_access_permitted(pte, flags & FOLL_WRITE)) goto pte_unmap; - if (pte_devmap(pte)) { - if (unlikely(flags & FOLL_LONGTERM)) - goto pte_unmap; - - pgmap = get_dev_pagemap(pte_pfn(pte), pgmap); - if (unlikely(!pgmap)) { - gup_fast_undo_dev_pagemap(nr, nr_start, flags, pages); - goto pte_unmap; - } - } else if (pte_special(pte)) + if (pte_special(pte)) goto pte_unmap; VM_BUG_ON(!pfn_valid(pte_pfn(pte))); @@ -2947,91 +2896,6 @@ static int gup_fast_pte_range(pmd_t pmd, pmd_t *pmdp, unsigned long addr, } #endif /* CONFIG_ARCH_HAS_PTE_SPECIAL */ -#if defined(CONFIG_ARCH_HAS_PTE_DEVMAP) && defined(CONFIG_TRANSPARENT_HUGEPAGE) -static int gup_fast_devmap_leaf(unsigned long pfn, unsigned long addr, - unsigned long end, unsigned int flags, struct page **pages, int *nr) -{ - int nr_start = *nr; - struct dev_pagemap *pgmap = NULL; - - do { - struct folio *folio; - struct page *page = pfn_to_page(pfn); - - pgmap = get_dev_pagemap(pfn, pgmap); - if (unlikely(!pgmap)) { - gup_fast_undo_dev_pagemap(nr, nr_start, flags, pages); - break; - } - - folio = try_grab_folio_fast(page, 1, flags); - if (!folio) { - gup_fast_undo_dev_pagemap(nr, nr_start, flags, pages); - break; - } - folio_set_referenced(folio); - pages[*nr] = page; - (*nr)++; - pfn++; - } while (addr += PAGE_SIZE, addr != end); - - put_dev_pagemap(pgmap); - return addr == end; -} - -static int gup_fast_devmap_pmd_leaf(pmd_t orig, pmd_t *pmdp, unsigned long addr, - unsigned long end, unsigned int flags, struct page **pages, - int *nr) -{ - unsigned long fault_pfn; - int nr_start = *nr; - - fault_pfn = pmd_pfn(orig) + ((addr & ~PMD_MASK) >> PAGE_SHIFT); - if (!gup_fast_devmap_leaf(fault_pfn, addr, end, flags, pages, nr)) - return 0; - - if (unlikely(pmd_val(orig) != pmd_val(*pmdp))) { - gup_fast_undo_dev_pagemap(nr, nr_start, flags, pages); - return 0; - } - return 1; -} - -static int gup_fast_devmap_pud_leaf(pud_t orig, pud_t *pudp, unsigned long addr, - unsigned long end, unsigned int flags, struct page **pages, - int *nr) -{ - unsigned long fault_pfn; - int nr_start = *nr; - - fault_pfn = pud_pfn(orig) + ((addr & ~PUD_MASK) >> PAGE_SHIFT); - if (!gup_fast_devmap_leaf(fault_pfn, addr, end, flags, pages, nr)) - return 0; - - if (unlikely(pud_val(orig) != pud_val(*pudp))) { - gup_fast_undo_dev_pagemap(nr, nr_start, flags, pages); - return 0; - } - return 1; -} -#else -static int gup_fast_devmap_pmd_leaf(pmd_t orig, pmd_t *pmdp, unsigned long addr, - unsigned long end, unsigned int flags, struct page **pages, - int *nr) -{ - BUILD_BUG(); - return 0; -} - -static int gup_fast_devmap_pud_leaf(pud_t pud, pud_t *pudp, unsigned long addr, - unsigned long end, unsigned int flags, struct page **pages, - int *nr) -{ - BUILD_BUG(); - return 0; -} -#endif - static int gup_fast_pmd_leaf(pmd_t orig, pmd_t *pmdp, unsigned long addr, unsigned long end, unsigned int flags, struct page **pages, int *nr) @@ -3046,13 +2910,6 @@ static int gup_fast_pmd_leaf(pmd_t orig, pmd_t *pmdp, unsigned long addr, if (pmd_special(orig)) return 0; - if (pmd_devmap(orig)) { - if (unlikely(flags & FOLL_LONGTERM)) - return 0; - return gup_fast_devmap_pmd_leaf(orig, pmdp, addr, end, flags, - pages, nr); - } - page = pmd_page(orig); refs = record_subpages(page, PMD_SIZE, addr, end, pages + *nr); @@ -3093,13 +2950,6 @@ static int gup_fast_pud_leaf(pud_t orig, pud_t *pudp, unsigned long addr, if (pud_special(orig)) return 0; - if (pud_devmap(orig)) { - if (unlikely(flags & FOLL_LONGTERM)) - return 0; - return gup_fast_devmap_pud_leaf(orig, pudp, addr, end, flags, - pages, nr); - } - page = pud_page(orig); refs = record_subpages(page, PUD_SIZE, addr, end, pages + *nr); @@ -3138,8 +2988,6 @@ static int gup_fast_pgd_leaf(pgd_t orig, pgd_t *pgdp, unsigned long addr, if (!pgd_access_permitted(orig, flags & FOLL_WRITE)) return 0; - BUILD_BUG_ON(pgd_devmap(orig)); - page = pgd_page(orig); refs = record_subpages(page, PGDIR_SIZE, addr, end, pages + *nr); diff --git a/mm/hmm.c b/mm/hmm.c index 082f7b7..285578e 100644 --- a/mm/hmm.c +++ b/mm/hmm.c @@ -298,7 +298,6 @@ static int hmm_vma_handle_pte(struct mm_walk *walk, unsigned long addr, * fall through and treat it like a normal page. */ if (!vm_normal_page(walk->vma, addr, pte) && - !pte_devmap(pte) && !is_zero_pfn(pte_pfn(pte))) { if (hmm_pte_need_fault(hmm_vma_walk, pfn_req_flags, 0)) { pte_unmap(ptep); @@ -351,7 +350,7 @@ static int hmm_vma_walk_pmd(pmd_t *pmdp, return hmm_pfns_fill(start, end, range, HMM_PFN_ERROR); } - if (pmd_devmap(pmd) || pmd_trans_huge(pmd)) { + if (pmd_trans_huge(pmd)) { /* * No need to take pmd_lock here, even if some other thread * is splitting the huge pmd we will get that event through @@ -362,7 +361,7 @@ static int hmm_vma_walk_pmd(pmd_t *pmdp, * values. */ pmd = pmdp_get_lockless(pmdp); - if (!pmd_devmap(pmd) && !pmd_trans_huge(pmd)) + if (!pmd_trans_huge(pmd)) goto again; return hmm_vma_handle_pmd(walk, addr, end, hmm_pfns, pmd); @@ -429,7 +428,7 @@ static int hmm_vma_walk_pud(pud_t *pudp, unsigned long start, unsigned long end, return hmm_vma_walk_hole(start, end, -1, walk); } - if (pud_leaf(pud) && pud_devmap(pud)) { + if (pud_leaf(pud) && vma_is_dax(walk->vma)) { unsigned long i, npages, pfn; unsigned int required_fault; unsigned long *hmm_pfns; diff --git a/mm/huge_memory.c b/mm/huge_memory.c index 0ac0e54..900a596 100644 --- a/mm/huge_memory.c +++ b/mm/huge_memory.c @@ -1357,10 +1357,7 @@ static void insert_pfn_pmd(struct vm_area_struct *vma, unsigned long addr, } entry = pmd_mkhuge(pfn_t_pmd(pfn, prot)); - if (pfn_t_devmap(pfn)) - entry = pmd_mkdevmap(entry); - else - entry = pmd_mkspecial(entry); + entry = pmd_mkspecial(entry); if (write) { entry = pmd_mkyoung(pmd_mkdirty(entry)); entry = maybe_pmd_mkwrite(entry, vma); @@ -1399,8 +1396,6 @@ vm_fault_t vmf_insert_pfn_pmd(struct vm_fault *vmf, pfn_t pfn, bool write) * but we need to be consistent with PTEs and architectures that * can't support a 'special' bit. */ - BUG_ON(!(vma->vm_flags & (VM_PFNMAP|VM_MIXEDMAP)) && - !pfn_t_devmap(pfn)); BUG_ON((vma->vm_flags & (VM_PFNMAP|VM_MIXEDMAP)) == (VM_PFNMAP|VM_MIXEDMAP)); BUG_ON((vma->vm_flags & VM_PFNMAP) && is_cow_mapping(vma->vm_flags)); @@ -1495,10 +1490,7 @@ static void insert_pfn_pud(struct vm_area_struct *vma, unsigned long addr, } entry = pud_mkhuge(pfn_t_pud(pfn, prot)); - if (pfn_t_devmap(pfn)) - entry = pud_mkdevmap(entry); - else - entry = pud_mkspecial(entry); + entry = pud_mkspecial(entry); if (write) { entry = pud_mkyoung(pud_mkdirty(entry)); entry = maybe_pud_mkwrite(entry, vma); @@ -1529,8 +1521,6 @@ vm_fault_t vmf_insert_pfn_pud(struct vm_fault *vmf, pfn_t pfn, bool write) * but we need to be consistent with PTEs and architectures that * can't support a 'special' bit. */ - BUG_ON(!(vma->vm_flags & (VM_PFNMAP|VM_MIXEDMAP)) && - !pfn_t_devmap(pfn)); BUG_ON((vma->vm_flags & (VM_PFNMAP|VM_MIXEDMAP)) == (VM_PFNMAP|VM_MIXEDMAP)); BUG_ON((vma->vm_flags & VM_PFNMAP) && is_cow_mapping(vma->vm_flags)); @@ -1604,46 +1594,6 @@ void touch_pmd(struct vm_area_struct *vma, unsigned long addr, update_mmu_cache_pmd(vma, addr, pmd); } -struct page *follow_devmap_pmd(struct vm_area_struct *vma, unsigned long addr, - pmd_t *pmd, int flags, struct dev_pagemap **pgmap) -{ - unsigned long pfn = pmd_pfn(*pmd); - struct mm_struct *mm = vma->vm_mm; - struct page *page; - int ret; - - assert_spin_locked(pmd_lockptr(mm, pmd)); - - if (flags & FOLL_WRITE && !pmd_write(*pmd)) - return NULL; - - if (pmd_present(*pmd) && pmd_devmap(*pmd)) - /* pass */; - else - return NULL; - - if (flags & FOLL_TOUCH) - touch_pmd(vma, addr, pmd, flags & FOLL_WRITE); - - /* - * device mapped pages can only be returned if the - * caller will manage the page reference count. - */ - if (!(flags & (FOLL_GET | FOLL_PIN))) - return ERR_PTR(-EEXIST); - - pfn += (addr & ~PMD_MASK) >> PAGE_SHIFT; - *pgmap = get_dev_pagemap(pfn, *pgmap); - if (!*pgmap) - return ERR_PTR(-EFAULT); - page = pfn_to_page(pfn); - ret = try_grab_folio(page_folio(page), 1, flags); - if (ret) - page = ERR_PTR(ret); - - return page; -} - int copy_huge_pmd(struct mm_struct *dst_mm, struct mm_struct *src_mm, pmd_t *dst_pmd, pmd_t *src_pmd, unsigned long addr, struct vm_area_struct *dst_vma, struct vm_area_struct *src_vma) @@ -1795,7 +1745,7 @@ int copy_huge_pud(struct mm_struct *dst_mm, struct mm_struct *src_mm, ret = -EAGAIN; pud = *src_pud; - if (unlikely(!pud_trans_huge(pud) && !pud_devmap(pud))) + if (unlikely(!pud_trans_huge(pud))) goto out_unlock; /* @@ -2598,8 +2548,7 @@ spinlock_t *__pmd_trans_huge_lock(pmd_t *pmd, struct vm_area_struct *vma) { spinlock_t *ptl; ptl = pmd_lock(vma->vm_mm, pmd); - if (likely(is_swap_pmd(*pmd) || pmd_trans_huge(*pmd) || - pmd_devmap(*pmd))) + if (likely(is_swap_pmd(*pmd) || pmd_trans_huge(*pmd))) return ptl; spin_unlock(ptl); return NULL; @@ -2616,7 +2565,7 @@ spinlock_t *__pud_trans_huge_lock(pud_t *pud, struct vm_area_struct *vma) spinlock_t *ptl; ptl = pud_lock(vma->vm_mm, pud); - if (likely(pud_trans_huge(*pud) || pud_devmap(*pud))) + if (likely(pud_trans_huge(*pud))) return ptl; spin_unlock(ptl); return NULL; @@ -2666,7 +2615,7 @@ static void __split_huge_pud_locked(struct vm_area_struct *vma, pud_t *pud, VM_BUG_ON(haddr & ~HPAGE_PUD_MASK); VM_BUG_ON_VMA(vma->vm_start > haddr, vma); VM_BUG_ON_VMA(vma->vm_end < haddr + HPAGE_PUD_SIZE, vma); - VM_BUG_ON(!pud_trans_huge(*pud) && !pud_devmap(*pud)); + VM_BUG_ON(!pud_trans_huge(*pud)); count_vm_event(THP_SPLIT_PUD); @@ -2700,7 +2649,7 @@ void __split_huge_pud(struct vm_area_struct *vma, pud_t *pud, (address & HPAGE_PUD_MASK) + HPAGE_PUD_SIZE); mmu_notifier_invalidate_range_start(&range); ptl = pud_lock(vma->vm_mm, pud); - if (unlikely(!pud_trans_huge(*pud) && !pud_devmap(*pud))) + if (unlikely(!pud_trans_huge(*pud))) goto out; __split_huge_pud_locked(vma, pud, range.start); @@ -2773,8 +2722,7 @@ static void __split_huge_pmd_locked(struct vm_area_struct *vma, pmd_t *pmd, VM_BUG_ON(haddr & ~HPAGE_PMD_MASK); VM_BUG_ON_VMA(vma->vm_start > haddr, vma); VM_BUG_ON_VMA(vma->vm_end < haddr + HPAGE_PMD_SIZE, vma); - VM_BUG_ON(!is_pmd_migration_entry(*pmd) && !pmd_trans_huge(*pmd) - && !pmd_devmap(*pmd)); + VM_BUG_ON(!is_pmd_migration_entry(*pmd) && !pmd_trans_huge(*pmd)); count_vm_event(THP_SPLIT_PMD); @@ -2991,8 +2939,7 @@ void split_huge_pmd_locked(struct vm_area_struct *vma, unsigned long address, * require a folio to check the PMD against. Otherwise, there * is a risk of replacing the wrong folio. */ - if (pmd_trans_huge(*pmd) || pmd_devmap(*pmd) || - is_pmd_migration_entry(*pmd)) { + if (pmd_trans_huge(*pmd) || is_pmd_migration_entry(*pmd)) { if (folio && folio != pmd_folio(*pmd)) return; __split_huge_pmd_locked(vma, pmd, address, freeze); diff --git a/mm/khugepaged.c b/mm/khugepaged.c index b538c3d..f832882 100644 --- a/mm/khugepaged.c +++ b/mm/khugepaged.c @@ -961,8 +961,6 @@ static int find_pmd_or_thp_or_none(struct mm_struct *mm, return SCAN_PMD_NULL; if (pmd_trans_huge(pmde)) return SCAN_PMD_MAPPED; - if (pmd_devmap(pmde)) - return SCAN_PMD_NULL; if (pmd_bad(pmde)) return SCAN_PMD_NULL; return SCAN_SUCCEED; diff --git a/mm/mapping_dirty_helpers.c b/mm/mapping_dirty_helpers.c index 2f8829b..208b428 100644 --- a/mm/mapping_dirty_helpers.c +++ b/mm/mapping_dirty_helpers.c @@ -129,7 +129,7 @@ static int wp_clean_pmd_entry(pmd_t *pmd, unsigned long addr, unsigned long end, pmd_t pmdval = pmdp_get_lockless(pmd); /* Do not split a huge pmd, present or migrated */ - if (pmd_trans_huge(pmdval) || pmd_devmap(pmdval)) { + if (pmd_trans_huge(pmdval)) { WARN_ON(pmd_write(pmdval) || pmd_dirty(pmdval)); walk->action = ACTION_CONTINUE; } @@ -152,7 +152,7 @@ static int wp_clean_pud_entry(pud_t *pud, unsigned long addr, unsigned long end, pud_t pudval = READ_ONCE(*pud); /* Do not split a huge pud */ - if (pud_trans_huge(pudval) || pud_devmap(pudval)) { + if (pud_trans_huge(pudval)) { WARN_ON(pud_write(pudval) || pud_dirty(pudval)); walk->action = ACTION_CONTINUE; } diff --git a/mm/memory.c b/mm/memory.c index 1c91dd4..012c82c 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -604,16 +604,6 @@ struct page *vm_normal_page(struct vm_area_struct *vma, unsigned long addr, return NULL; if (is_zero_pfn(pfn)) return NULL; - if (pte_devmap(pte)) - /* - * NOTE: New users of ZONE_DEVICE will not set pte_devmap() - * and will have refcounts incremented on their struct pages - * when they are inserted into PTEs, thus they are safe to - * return here. Legacy ZONE_DEVICE pages that set pte_devmap() - * do not have refcounts. Example of legacy ZONE_DEVICE is - * MEMORY_DEVICE_FS_DAX type in pmem or virtio_fs drivers. - */ - return NULL; print_bad_pte(vma, addr, pte, NULL); return NULL; @@ -691,8 +681,6 @@ struct page *vm_normal_page_pmd(struct vm_area_struct *vma, unsigned long addr, } } - if (pmd_devmap(pmd)) - return NULL; if (is_huge_zero_pmd(pmd)) return NULL; if (unlikely(pfn > highest_memmap_pfn)) @@ -1238,8 +1226,7 @@ copy_pmd_range(struct vm_area_struct *dst_vma, struct vm_area_struct *src_vma, src_pmd = pmd_offset(src_pud, addr); do { next = pmd_addr_end(addr, end); - if (is_swap_pmd(*src_pmd) || pmd_trans_huge(*src_pmd) - || pmd_devmap(*src_pmd)) { + if (is_swap_pmd(*src_pmd) || pmd_trans_huge(*src_pmd)) { int err; VM_BUG_ON_VMA(next-addr != HPAGE_PMD_SIZE, src_vma); err = copy_huge_pmd(dst_mm, src_mm, dst_pmd, src_pmd, @@ -1275,7 +1262,7 @@ copy_pud_range(struct vm_area_struct *dst_vma, struct vm_area_struct *src_vma, src_pud = pud_offset(src_p4d, addr); do { next = pud_addr_end(addr, end); - if (pud_trans_huge(*src_pud) || pud_devmap(*src_pud)) { + if (pud_trans_huge(*src_pud)) { int err; VM_BUG_ON_VMA(next-addr != HPAGE_PUD_SIZE, src_vma); @@ -1713,7 +1700,7 @@ static inline unsigned long zap_pmd_range(struct mmu_gather *tlb, pmd = pmd_offset(pud, addr); do { next = pmd_addr_end(addr, end); - if (is_swap_pmd(*pmd) || pmd_trans_huge(*pmd) || pmd_devmap(*pmd)) { + if (is_swap_pmd(*pmd) || pmd_trans_huge(*pmd)) { if (next - addr != HPAGE_PMD_SIZE) __split_huge_pmd(vma, pmd, addr, false, NULL); else if (zap_huge_pmd(tlb, vma, pmd, addr)) { @@ -1755,7 +1742,7 @@ static inline unsigned long zap_pud_range(struct mmu_gather *tlb, pud = pud_offset(p4d, addr); do { next = pud_addr_end(addr, end); - if (pud_trans_huge(*pud) || pud_devmap(*pud)) { + if (pud_trans_huge(*pud)) { if (next - addr != HPAGE_PUD_SIZE) { mmap_assert_locked(tlb->mm); split_huge_pud(vma, pud, addr); @@ -2378,10 +2365,7 @@ static vm_fault_t insert_pfn(struct vm_area_struct *vma, unsigned long addr, } /* Ok, finally just insert the thing.. */ - if (pfn_t_devmap(pfn)) - entry = pte_mkdevmap(pfn_t_pte(pfn, prot)); - else - entry = pte_mkspecial(pfn_t_pte(pfn, prot)); + entry = pte_mkspecial(pfn_t_pte(pfn, prot)); if (mkwrite) { entry = pte_mkyoung(entry); @@ -2492,8 +2476,6 @@ static bool vm_mixed_ok(struct vm_area_struct *vma, pfn_t pfn, bool mkwrite) /* these checks mirror the abort conditions in vm_normal_page */ if (vma->vm_flags & VM_MIXEDMAP) return true; - if (pfn_t_devmap(pfn)) - return true; if (pfn_t_special(pfn)) return true; if (is_zero_pfn(pfn_t_to_pfn(pfn))) @@ -2525,8 +2507,7 @@ static vm_fault_t __vm_insert_mixed(struct vm_area_struct *vma, * than insert_pfn). If a zero_pfn were inserted into a VM_MIXEDMAP * without pte special, it would there be refcounted as a normal page. */ - if (!IS_ENABLED(CONFIG_ARCH_HAS_PTE_SPECIAL) && - !pfn_t_devmap(pfn) && pfn_t_valid(pfn)) { + if (!IS_ENABLED(CONFIG_ARCH_HAS_PTE_SPECIAL) && pfn_t_valid(pfn)) { struct page *page; /* @@ -5907,7 +5888,7 @@ static vm_fault_t __handle_mm_fault(struct vm_area_struct *vma, pud_t orig_pud = *vmf.pud; barrier(); - if (pud_trans_huge(orig_pud) || pud_devmap(orig_pud)) { + if (pud_trans_huge(orig_pud)) { /* * TODO once we support anonymous PUDs: NUMA case and @@ -5948,7 +5929,7 @@ static vm_fault_t __handle_mm_fault(struct vm_area_struct *vma, pmd_migration_entry_wait(mm, vmf.pmd); return 0; } - if (pmd_trans_huge(vmf.orig_pmd) || pmd_devmap(vmf.orig_pmd)) { + if (pmd_trans_huge(vmf.orig_pmd)) { if (pmd_protnone(vmf.orig_pmd) && vma_is_accessible(vma)) return do_huge_pmd_numa_page(&vmf); diff --git a/mm/migrate_device.c b/mm/migrate_device.c index 2209070..a721e0d 100644 --- a/mm/migrate_device.c +++ b/mm/migrate_device.c @@ -599,7 +599,7 @@ static void migrate_vma_insert_page(struct migrate_vma *migrate, pmdp = pmd_alloc(mm, pudp, addr); if (!pmdp) goto abort; - if (pmd_trans_huge(*pmdp) || pmd_devmap(*pmdp)) + if (pmd_trans_huge(*pmdp)) goto abort; if (pte_alloc(mm, pmdp)) goto abort; diff --git a/mm/mprotect.c b/mm/mprotect.c index 0c5d6d0..e6b721d 100644 --- a/mm/mprotect.c +++ b/mm/mprotect.c @@ -382,7 +382,7 @@ static inline long change_pmd_range(struct mmu_gather *tlb, goto next; _pmd = pmdp_get_lockless(pmd); - if (is_swap_pmd(_pmd) || pmd_trans_huge(_pmd) || pmd_devmap(_pmd)) { + if (is_swap_pmd(_pmd) || pmd_trans_huge(_pmd)) { if ((next - addr != HPAGE_PMD_SIZE) || pgtable_split_needed(vma, cp_flags)) { __split_huge_pmd(vma, pmd, addr, false, NULL); diff --git a/mm/mremap.c b/mm/mremap.c index dda09e9..d3c14a9 100644 --- a/mm/mremap.c +++ b/mm/mremap.c @@ -594,7 +594,7 @@ unsigned long move_page_tables(struct vm_area_struct *vma, new_pud = alloc_new_pud(vma->vm_mm, vma, new_addr); if (!new_pud) break; - if (pud_trans_huge(*old_pud) || pud_devmap(*old_pud)) { + if (pud_trans_huge(*old_pud)) { if (extent == HPAGE_PUD_SIZE) { move_pgt_entry(HPAGE_PUD, vma, old_addr, new_addr, old_pud, new_pud, need_rmap_locks); @@ -616,8 +616,7 @@ unsigned long move_page_tables(struct vm_area_struct *vma, if (!new_pmd) break; again: - if (is_swap_pmd(*old_pmd) || pmd_trans_huge(*old_pmd) || - pmd_devmap(*old_pmd)) { + if (is_swap_pmd(*old_pmd) || pmd_trans_huge(*old_pmd)) { if (extent == HPAGE_PMD_SIZE && move_pgt_entry(HPAGE_PMD, vma, old_addr, new_addr, old_pmd, new_pmd, need_rmap_locks)) diff --git a/mm/page_vma_mapped.c b/mm/page_vma_mapped.c index ae5cc42..77da636 100644 --- a/mm/page_vma_mapped.c +++ b/mm/page_vma_mapped.c @@ -235,8 +235,7 @@ bool page_vma_mapped_walk(struct page_vma_mapped_walk *pvmw) */ pmde = pmdp_get_lockless(pvmw->pmd); - if (pmd_trans_huge(pmde) || is_pmd_migration_entry(pmde) || - (pmd_present(pmde) && pmd_devmap(pmde))) { + if (pmd_trans_huge(pmde) || is_pmd_migration_entry(pmde)) { pvmw->ptl = pmd_lock(mm, pvmw->pmd); pmde = *pvmw->pmd; if (!pmd_present(pmde)) { @@ -251,7 +250,7 @@ bool page_vma_mapped_walk(struct page_vma_mapped_walk *pvmw) return not_found(pvmw); return true; } - if (likely(pmd_trans_huge(pmde) || pmd_devmap(pmde))) { + if (likely(pmd_trans_huge(pmde))) { if (pvmw->flags & PVMW_MIGRATION) return not_found(pvmw); if (!check_pmd(pmd_pfn(pmde), pvmw)) diff --git a/mm/pagewalk.c b/mm/pagewalk.c index 461ea3b..576ff14 100644 --- a/mm/pagewalk.c +++ b/mm/pagewalk.c @@ -753,7 +753,7 @@ struct folio *folio_walk_start(struct folio_walk *fw, fw->pudp = pudp; fw->pud = pud; - if (!pud_present(pud) || pud_devmap(pud) || pud_special(pud)) { + if (!pud_present(pud) || pud_special(pud)) { spin_unlock(ptl); goto not_found; } else if (!pud_leaf(pud)) { @@ -765,6 +765,12 @@ struct folio *folio_walk_start(struct folio_walk *fw, * support PUD mappings in VM_PFNMAP|VM_MIXEDMAP VMAs. */ page = pud_page(pud); + + if (is_device_dax_page(page)) { + spin_unlock(ptl); + goto not_found; + } + goto found; } diff --git a/mm/pgtable-generic.c b/mm/pgtable-generic.c index a78a4ad..093c435 100644 --- a/mm/pgtable-generic.c +++ b/mm/pgtable-generic.c @@ -139,8 +139,7 @@ pmd_t pmdp_huge_clear_flush(struct vm_area_struct *vma, unsigned long address, { pmd_t pmd; VM_BUG_ON(address & ~HPAGE_PMD_MASK); - VM_BUG_ON(pmd_present(*pmdp) && !pmd_trans_huge(*pmdp) && - !pmd_devmap(*pmdp)); + VM_BUG_ON(pmd_present(*pmdp) && !pmd_trans_huge(*pmdp)); pmd = pmdp_huge_get_and_clear(vma->vm_mm, address, pmdp); flush_pmd_tlb_range(vma, address, address + HPAGE_PMD_SIZE); return pmd; @@ -153,7 +152,7 @@ pud_t pudp_huge_clear_flush(struct vm_area_struct *vma, unsigned long address, pud_t pud; VM_BUG_ON(address & ~HPAGE_PUD_MASK); - VM_BUG_ON(!pud_trans_huge(*pudp) && !pud_devmap(*pudp)); + VM_BUG_ON(!pud_trans_huge(*pudp)); pud = pudp_huge_get_and_clear(vma->vm_mm, address, pudp); flush_pud_tlb_range(vma, address, address + HPAGE_PUD_SIZE); return pud; @@ -293,7 +292,7 @@ pte_t *__pte_offset_map(pmd_t *pmd, unsigned long addr, pmd_t *pmdvalp) *pmdvalp = pmdval; if (unlikely(pmd_none(pmdval) || is_pmd_migration_entry(pmdval))) goto nomap; - if (unlikely(pmd_trans_huge(pmdval) || pmd_devmap(pmdval))) + if (unlikely(pmd_trans_huge(pmdval))) goto nomap; if (unlikely(pmd_bad(pmdval))) { pmd_clear_bad(pmd); diff --git a/mm/userfaultfd.c b/mm/userfaultfd.c index ce13c40..9d39f8e 100644 --- a/mm/userfaultfd.c +++ b/mm/userfaultfd.c @@ -798,8 +798,7 @@ static __always_inline ssize_t mfill_atomic(struct userfaultfd_ctx *ctx, * (This includes the case where the PMD used to be THP and * changed back to none after __pte_alloc().) */ - if (unlikely(!pmd_present(dst_pmdval) || pmd_trans_huge(dst_pmdval) || - pmd_devmap(dst_pmdval))) { + if (unlikely(!pmd_present(dst_pmdval) || pmd_trans_huge(dst_pmdval))) { err = -EEXIST; break; } @@ -1682,7 +1681,7 @@ ssize_t move_pages(struct userfaultfd_ctx *ctx, unsigned long dst_start, ptl = pmd_trans_huge_lock(src_pmd, src_vma); if (ptl) { - if (pmd_devmap(*src_pmd)) { + if (vma_is_dax(src_vma)) { spin_unlock(ptl); err = -ENOENT; break; diff --git a/mm/vmscan.c b/mm/vmscan.c index eb4e844..34ff5a6 100644 --- a/mm/vmscan.c +++ b/mm/vmscan.c @@ -3303,7 +3303,7 @@ static unsigned long get_pte_pfn(pte_t pte, struct vm_area_struct *vma, unsigned if (!pte_present(pte) || is_zero_pfn(pfn)) return -1; - if (WARN_ON_ONCE(pte_devmap(pte) || pte_special(pte))) + if (WARN_ON_ONCE(pte_special(pte))) return -1; if (WARN_ON_ONCE(!pfn_valid(pfn))) @@ -3321,9 +3321,6 @@ static unsigned long get_pmd_pfn(pmd_t pmd, struct vm_area_struct *vma, unsigned if (!pmd_present(pmd) || is_huge_zero_pmd(pmd)) return -1; - if (WARN_ON_ONCE(pmd_devmap(pmd))) - return -1; - if (WARN_ON_ONCE(!pfn_valid(pfn))) return -1; From patchwork Fri Nov 22 01:40:45 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alistair Popple X-Patchwork-Id: 13882605 Received: from NAM12-DM6-obe.outbound.protection.outlook.com (mail-dm6nam12on2071.outbound.protection.outlook.com [40.107.243.71]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 3B457168483; Fri, 22 Nov 2024 01:43:08 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=fail smtp.client-ip=40.107.243.71 ARC-Seal: i=2; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1732239791; cv=fail; b=NFxpojHL1rS2ncoOtb+AGu1LXNUUmjqU4+083QgV3cHxSnziT7nCx4dJ4KKHo6MijcXd8LjW2yF2NP/S6YgRsuAIe2LVIeTkJS523UE0zsyR/GDQ/x8X1WXb0ukjRvdXxM1dan9FoIHZnDgVrSz1P1DuvmMk8enBmTGVtu9UfV8= ARC-Message-Signature: i=2; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1732239791; c=relaxed/simple; bh=EbBLhX82VnDygAgn7lcB/AlEHQk5g0SDfbY9SI6srvQ=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: Content-Type:MIME-Version; b=Kl1f2MRJ9kJ5GhJTMiRPW1Sr9rRfAPatkJySjfKzD1eGY0sGF1IDEzWFWOOxxX/alls3U1LZi2eXzH3/yPyYMDwZlJzJrSd8TkY8YA/wiw+DKH9h/L8dsa2sxb9sojrJDVK5JSMcOR4ijXu+0WLGWERTAXYLayLWfFuaZDIyKqo= ARC-Authentication-Results: i=2; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=nvidia.com; spf=fail smtp.mailfrom=nvidia.com; dkim=pass (2048-bit key) header.d=Nvidia.com header.i=@Nvidia.com header.b=eh75kw0W; arc=fail smtp.client-ip=40.107.243.71 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=nvidia.com Authentication-Results: smtp.subspace.kernel.org; spf=fail smtp.mailfrom=nvidia.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=Nvidia.com header.i=@Nvidia.com header.b="eh75kw0W" ARC-Seal: i=1; a=rsa-sha256; s=arcselector10001; d=microsoft.com; cv=none; b=tnoYf/B/xkIv1ENLsqgtDwJCqT3+eZ3tStkQGSDZDpzlk7ODBryvUJds2R2byoZwhOftLFYfouDTJVGeEZ47vNmbff0yCOnIZWjZ29n5CokPUtu7iAhwdufh/QTAe+FvSP5P2pFkWWW/KF06gl3aQm8XRxXufadL+uL6Zgwv7xa4yzDwtQivZetSIJqEyrh+iAleNBtgMAc6Io+lSuGClshFomZJm/tlXEUKK6tPoZ8i1bV0dYmNXDZIgDXDFKOvN27m27T9+E/YjA1Q6yLPneYOcb6RgRIa4TcGTKW+DQfPV0QDwrelYSjMowjAgRlwIA/akQV6GNqzRXMALLhkMQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector10001; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=NJ03nN5uTvXFVD/nas8UfoAyyB+o+5aCpG21hhkqAN4=; b=gMBiIBbAJC5ZqPQS2dK4K4DNDeFFbqtItjUcRFSLWHuVmE27NVotlsfzqXkMMNfqsgJeZxMw8U7U9gn/AKTMZfoLMtHqXWo6Sk0DYSqz10zGSKm6Go+2vIJInrs7HuNVvbjcLIRb3YAZi5o7MZBXYTQHl7gzLvKcEJdQH9J5U0FnNmOpMzE2qtqgmA8cvfctWounVvzEoX8NvZGSqWWwRyQgJRGM0A7fVkibzwepL4hZmEvAEPjJiMAWwYByZE4QxaU4KnWG4CRIZYLk/h+J3j+NHWg81z7CR5ioTcXop+1s/Hr+V1SU8UaOiG1AxEVYeH64WshgUAt9JKPtfceSyA== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=nvidia.com; dmarc=pass action=none header.from=nvidia.com; dkim=pass header.d=nvidia.com; arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=Nvidia.com; s=selector2; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=NJ03nN5uTvXFVD/nas8UfoAyyB+o+5aCpG21hhkqAN4=; b=eh75kw0Wj/Q7j9AHfYCgUT57lCYNi+8IJP0sYdjFg9BuzBAar9yoF/Mx65MsTV2j/wC87yLZ5mh/13/ubq9hrWr1JC2NcU78rnRPsWx+UYenGn8a+S0JfX2R9oF48dktISQAl5h6NSuen4450WtwHwIvUPVU+R8cRnh7mtEvFN6J1PBqiXjHzaPf8z1dXlvPCBgsAHYmc3h5fHmvCBEz8FX2uyKuGL0XFi9VD14zxYnnuG9pWHcCdUaDgo5QvDgo9tjUpHqCI1XH17JWs/FsyZfCf2TAW4TVrRDtat5Sf7Np6kvqzok9Cn9uB6gZt4NkdmJagCp2qEsFGAulZhzfIw== Authentication-Results: dkim=none (message not signed) header.d=none;dmarc=none action=none header.from=nvidia.com; Received: from DS0PR12MB7726.namprd12.prod.outlook.com (2603:10b6:8:130::6) by BY5PR12MB4131.namprd12.prod.outlook.com (2603:10b6:a03:212::13) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.8158.24; Fri, 22 Nov 2024 01:43:05 +0000 Received: from DS0PR12MB7726.namprd12.prod.outlook.com ([fe80::953f:2f80:90c5:67fe]) by DS0PR12MB7726.namprd12.prod.outlook.com ([fe80::953f:2f80:90c5:67fe%4]) with mapi id 15.20.8182.016; Fri, 22 Nov 2024 01:43:04 +0000 From: Alistair Popple To: dan.j.williams@intel.com, linux-mm@kvack.org Cc: Alistair Popple , lina@asahilina.net, zhang.lyra@gmail.com, gerald.schaefer@linux.ibm.com, vishal.l.verma@intel.com, dave.jiang@intel.com, logang@deltatee.com, bhelgaas@google.com, jack@suse.cz, jgg@ziepe.ca, catalin.marinas@arm.com, will@kernel.org, mpe@ellerman.id.au, npiggin@gmail.com, dave.hansen@linux.intel.com, ira.weiny@intel.com, willy@infradead.org, djwong@kernel.org, tytso@mit.edu, linmiaohe@huawei.com, david@redhat.com, peterx@redhat.com, linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org, linux-arm-kernel@lists.infradead.org, linuxppc-dev@lists.ozlabs.org, nvdimm@lists.linux.dev, linux-cxl@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-ext4@vger.kernel.org, linux-xfs@vger.kernel.org, jhubbard@nvidia.com, hch@lst.de, david@fromorbit.com Subject: [PATCH v3 24/25] mm: Remove devmap related functions and page table bits Date: Fri, 22 Nov 2024 12:40:45 +1100 Message-ID: <4c978d98e9082062e3d6d1195630501a62ee1d6a.1732239628.git-series.apopple@nvidia.com> X-Mailer: git-send-email 2.45.2 In-Reply-To: References: X-ClientProxiedBy: SYCP282CA0002.AUSP282.PROD.OUTLOOK.COM (2603:10c6:10:80::14) To DS0PR12MB7726.namprd12.prod.outlook.com (2603:10b6:8:130::6) Precedence: bulk X-Mailing-List: linux-xfs@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-MS-PublicTrafficType: Email X-MS-TrafficTypeDiagnostic: DS0PR12MB7726:EE_|BY5PR12MB4131:EE_ X-MS-Office365-Filtering-Correlation-Id: 51f97d95-fdf3-40ea-9517-08dd0a9701ed X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam: BCL:0;ARA:13230040|376014|7416014|1800799024|366016; X-Microsoft-Antispam-Message-Info: WsiKLD6+0uJ+7RWxqdG38d34iPTHvR2DjjGfNoVopCof1B+YfiCSNBUIPLWa7mdVmiAo3uyPuNNHs6Yg+i0AC6yiZ6bCDwRzVb/NRK1ITzaMorIz3z2nhlhE4N+FhLxiLTKg8l+FF26F6eT0cq4yXnHeg/xjS4+L3WLNcEYZqxC4yP4fGhB5+mW8Qb0H4Rj6XRG98MzCe7dUInjlUu2uPfhG+6GVK7wJiwrGCn5cnM/RDA8xVjFiXzesdp2UgVuB+u3LZl1ojas7mu2SGB+CXU9Ste7Qx5LqnuTiVnrMKLQGN0dIyRxw1uLLiUAMWV5kRo2uw2JWHKkb0xjq4AyohH9zeXRRfwwk6L4sm4VpnXBULngtdv7Q3KqQLyWPA/JffvQwBuxFiTh6LkC5Fm049OtBkhGziF1WuaWue4yB0pcU2N+bjYuIC862hWfPH9IwpuYntB12Q1PC/7I5qbtFBhWkQZlwi3e+e03WPrHUQugzyQl9BRZ4sCiFl5O0zrzmsIlm1gw5svCADEbR7Mg+Op/xl/W33D5Tjjrj/a4oyZPwt8Gteg1TE6aArD7bWqXJkoXQtRJrYggpG+gWjbV847TsjXquuEDf1Otv5mDtQR5k3Y/mNPtL3QwqqdRrFg6QgmlRdkdGXVd5HGbCbwgpJarwGq1CSp90QHL/wkJoh19/k7nxIzRLtgTK+EhcfgONbq5ghCIRTrx8kvux2UlsVoWtuu45x60uCRXjsdge5BMKcfvSbEQoEaVt/9rz2QW0CfcOGH5mKuQWLPih4mddEUYLiQeQprJocTl0LX3OXO8KvO4Lslgb9eizkL0zYu2bXl+P9X4PXpRcMdWB+4tMuduy0FSJHhSo082khIeB9a3k83nRiUEVs/3+nO9h/hFuqcqSDP3guKyNIuRU/+v3x54Vptirc+iTwvbKruMsoS2DmTqsZ1Kczpxa9P9HgQvb5Qc5JF3+HMt4o7XQCCp/St3NE6hOQGFSlGJ1I175UTe2MtZTtIgeaByoulb0mAPTbkHxzjoJD5YSRtdjJAYxKkpDavVLkyR5uVI6GoGq84Qr9CY9Oylg0qM2pRpouE4/AdNljT36yiTCOeuCHwgQgO1l81xYkon1C4I/mr6ZUnBxnSqUDmnr1EC0DDT6nTsPEuM3RqAOIUWT1pdsGli9Awk4fffujBpN8Wo/0u1nd5ud6nrBdJSPtIpsp53tJdYQjE2Oy9bWIWV/JfWFwOy3+gXPSyeZoT70/qjIlRZm3V8KzZ4JHjvFWZv6wme0CYHTK1aO7UktZJyNHpR2llyVFnkMlMXI8omceVJJNNAo8vQ= X-Forefront-Antispam-Report: CIP:255.255.255.255;CTRY:;LANG:en;SCL:1;SRV:;IPV:NLI;SFV:NSPM;H:DS0PR12MB7726.namprd12.prod.outlook.com;PTR:;CAT:NONE;SFS:(13230040)(376014)(7416014)(1800799024)(366016);DIR:OUT;SFP:1101; X-MS-Exchange-AntiSpam-MessageData-ChunkCount: 1 X-MS-Exchange-AntiSpam-MessageData-0: 0nDgjCJlptlfr8LdZrtPHKX9dabqksp6Zm0q3uqPyE6dgpVwb66OjLQCOwIoQGIa+ZWP3RwQbNccy1zTRwlCQ+iaSaf/8f8GergF7IR9T2f2/oKNsmpCRUViGT8x9Yh5apq5HQhrx8MUBQkCelXHolTWUS2sSSFx8HlAzpGXHoNY6szHJmbSUCg8LpVxbsnp+Fzy6PJeOp9R9lz7WmXuxhs6v2JKJrAQ743dFfyaj6n/cEogTT0ODVDwL/XaWd027GDHcGIF7mkOTB6gxKuumyrT/cInMxJYW8ZBba9zqB3pM/d36QhrENtUXMp3L65j4hdqMwbHqXUj8HhEbRgVDUsqwfkBapgYDDKWZTzdA59EpWbFiLZTyuGetDgndXQ2VwMmRR3ZR433YjTG/EpmQ1NbQytTLhv/1tj9FpmTzFOQb/koQ8W6QA493FML+mh6H5aHLYGRWGEglM/siY/4lcJw8YhkLflE9IwBmpkjNd6TMAyya6hFugSAg09GsuxPLMTDK/dAy+W998s0KbIW3DYjxpc1F8HmaO67XES0AgOhy75D6HmaLIXzvTucTE1DgzXKCcnhCuAEuuYoS9nfS8QWuZcNoB9HKjlVZ7SGvbaBx5dwJHXKCUk+ZSBDl8ojaQMS9QbhXXMlcdU5S11kwVQdALWvA0w/Rz45xjg1On7v834p3jypU82VxPFcjUjXpdUIjDTRaGUDV4yh/9V2nEYLPDUNS7uJHgV4i6xmbLwyxt2CBB8Jf6HVVS8e2lsnkHSpEWLWauYm95T168y9CjMZ908OSgl6ISGS9BwRn57zhVzwg3DxphKKXC7GjHHWyjMhQrKeABg7VO4c40N3MQIpYCxOCwtCwUrjy0MMAe9TE/J8m+2MSWxRiTXcEWHatX+TFSMsD1azlWC9xgpAAkXxCbZZDdHBB0VLY0u77J+juGtNac8xUzPUCz5bGG12CTSLhz4Nmt7XyxiyT7Pdi5p0r+kzByFcBzTo4QR+T4h2RhbOEnJyiB270NwZqvn7aS2GKA5+Nqx+XXdGnOHZlCJoQv1up6i0kH8dC72spUAQrO8/YFhShHp2TZ6jJpVAm2BJvr1WIVMf+NV3gS2qusc3S8K7B3NOXMTc8OjlG18D+y2as5T+uIKM/SqfRocT7SSNn9bKUFHoNvMBC94dy+zG0ixMFTnVljjB8DVhU0Wpbeb0lhvo7Y7i6QtB1aa0ojcAonfYMaQqgwVWN/3qyVrC4ikAp2VAS4+sOt4GCnBl3PuWzJjjamIypKtaNHs282lSFB/4vlJWJLwNFU28/d3YBCq66kE+0QdZ9uJgRuWZtf9Q2Rznyhj9e+gmHM08ZFwm+7Uvvygp3+RQyyHcDaQEH0A25lXc9DKA/mJqxwzUPOWinuw0xr26LsmDY9V1N0t52YWomQR5G9zcXnpoFp/qfLuUkilGdpuCnrlJsWwXjxlgPxZ8LUnyVh6P2yvOup1IdmDfwvaxztL8nb87hHURkHAI3baEWcmeaWTYXvCYzS406ByExw9Egfhpm2mWZrdyfm2oKMHRqxHyqPyagsXLRlcybfounB8HkV3HS+xfv9L0FWbBOoZ3bv+YkJ77 X-OriginatorOrg: Nvidia.com X-MS-Exchange-CrossTenant-Network-Message-Id: 51f97d95-fdf3-40ea-9517-08dd0a9701ed X-MS-Exchange-CrossTenant-AuthSource: DS0PR12MB7726.namprd12.prod.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Internal X-MS-Exchange-CrossTenant-OriginalArrivalTime: 22 Nov 2024 01:43:04.8316 (UTC) X-MS-Exchange-CrossTenant-FromEntityHeader: Hosted X-MS-Exchange-CrossTenant-Id: 43083d15-7273-40c1-b7db-39efd9ccc17a X-MS-Exchange-CrossTenant-MailboxType: HOSTED X-MS-Exchange-CrossTenant-UserPrincipalName: +BDpavGxlHHDL7Ricjy8e6GUn5Rfxe0yAHc9exGMqGzwfXTCAHoO/54aMN4RIY5WSkr7DYXP/amAxdKCfCWOSw== X-MS-Exchange-Transport-CrossTenantHeadersStamped: BY5PR12MB4131 Now that DAX and all other reference counts to ZONE_DEVICE pages are managed normally there is no need for the special devmap PTE/PMD/PUD page table bits. So drop all references to these, freeing up a software defined page table bit on architectures supporting it. Signed-off-by: Alistair Popple Acked-by: Will Deacon # arm64 --- Documentation/mm/arch_pgtable_helpers.rst | 6 +-- arch/arm64/Kconfig | 1 +- arch/arm64/include/asm/pgtable-prot.h | 1 +- arch/arm64/include/asm/pgtable.h | 24 +-------- arch/powerpc/Kconfig | 1 +- arch/powerpc/include/asm/book3s/64/hash-4k.h | 6 +-- arch/powerpc/include/asm/book3s/64/hash-64k.h | 7 +-- arch/powerpc/include/asm/book3s/64/pgtable.h | 52 +------------------ arch/powerpc/include/asm/book3s/64/radix.h | 14 +----- arch/x86/Kconfig | 1 +- arch/x86/include/asm/pgtable.h | 51 +----------------- arch/x86/include/asm/pgtable_types.h | 5 +-- include/linux/mm.h | 7 +-- include/linux/pfn_t.h | 20 +------- include/linux/pgtable.h | 19 +------ mm/Kconfig | 4 +- mm/debug_vm_pgtable.c | 59 +-------------------- mm/hmm.c | 3 +- 18 files changed, 11 insertions(+), 270 deletions(-) diff --git a/Documentation/mm/arch_pgtable_helpers.rst b/Documentation/mm/arch_pgtable_helpers.rst index af24516..c88c7fa 100644 --- a/Documentation/mm/arch_pgtable_helpers.rst +++ b/Documentation/mm/arch_pgtable_helpers.rst @@ -30,8 +30,6 @@ PTE Page Table Helpers +---------------------------+--------------------------------------------------+ | pte_protnone | Tests a PROT_NONE PTE | +---------------------------+--------------------------------------------------+ -| pte_devmap | Tests a ZONE_DEVICE mapped PTE | -+---------------------------+--------------------------------------------------+ | pte_soft_dirty | Tests a soft dirty PTE | +---------------------------+--------------------------------------------------+ | pte_swp_soft_dirty | Tests a soft dirty swapped PTE | @@ -104,8 +102,6 @@ PMD Page Table Helpers +---------------------------+--------------------------------------------------+ | pmd_protnone | Tests a PROT_NONE PMD | +---------------------------+--------------------------------------------------+ -| pmd_devmap | Tests a ZONE_DEVICE mapped PMD | -+---------------------------+--------------------------------------------------+ | pmd_soft_dirty | Tests a soft dirty PMD | +---------------------------+--------------------------------------------------+ | pmd_swp_soft_dirty | Tests a soft dirty swapped PMD | @@ -177,8 +173,6 @@ PUD Page Table Helpers +---------------------------+--------------------------------------------------+ | pud_write | Tests a writable PUD | +---------------------------+--------------------------------------------------+ -| pud_devmap | Tests a ZONE_DEVICE mapped PUD | -+---------------------------+--------------------------------------------------+ | pud_mkyoung | Creates a young PUD | +---------------------------+--------------------------------------------------+ | pud_mkold | Creates an old PUD | diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig index fd9df6d..1d90ab9 100644 --- a/arch/arm64/Kconfig +++ b/arch/arm64/Kconfig @@ -38,7 +38,6 @@ config ARM64 select ARCH_HAS_MEM_ENCRYPT select ARCH_HAS_NMI_SAFE_THIS_CPU_OPS select ARCH_HAS_NON_OVERLAPPING_ADDRESS_SPACE - select ARCH_HAS_PTE_DEVMAP select ARCH_HAS_PTE_SPECIAL select ARCH_HAS_HW_PTE_YOUNG select ARCH_HAS_SETUP_DMA_OPS diff --git a/arch/arm64/include/asm/pgtable-prot.h b/arch/arm64/include/asm/pgtable-prot.h index 2a11d0c..f7e12d6 100644 --- a/arch/arm64/include/asm/pgtable-prot.h +++ b/arch/arm64/include/asm/pgtable-prot.h @@ -17,7 +17,6 @@ #define PTE_SWP_EXCLUSIVE (_AT(pteval_t, 1) << 2) /* only for swp ptes */ #define PTE_DIRTY (_AT(pteval_t, 1) << 55) #define PTE_SPECIAL (_AT(pteval_t, 1) << 56) -#define PTE_DEVMAP (_AT(pteval_t, 1) << 57) /* * PTE_PRESENT_INVALID=1 & PTE_VALID=0 indicates that the pte's fields should be diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h index c329ea0..07a252b 100644 --- a/arch/arm64/include/asm/pgtable.h +++ b/arch/arm64/include/asm/pgtable.h @@ -108,7 +108,6 @@ static inline pteval_t __phys_to_pte_val(phys_addr_t phys) #define pte_user(pte) (!!(pte_val(pte) & PTE_USER)) #define pte_user_exec(pte) (!(pte_val(pte) & PTE_UXN)) #define pte_cont(pte) (!!(pte_val(pte) & PTE_CONT)) -#define pte_devmap(pte) (!!(pte_val(pte) & PTE_DEVMAP)) #define pte_tagged(pte) ((pte_val(pte) & PTE_ATTRINDX_MASK) == \ PTE_ATTRINDX(MT_NORMAL_TAGGED)) @@ -291,11 +290,6 @@ static inline pmd_t pmd_mkcont(pmd_t pmd) return __pmd(pmd_val(pmd) | PMD_SECT_CONT); } -static inline pte_t pte_mkdevmap(pte_t pte) -{ - return set_pte_bit(pte, __pgprot(PTE_DEVMAP | PTE_SPECIAL)); -} - #ifdef CONFIG_HAVE_ARCH_USERFAULTFD_WP static inline int pte_uffd_wp(pte_t pte) { @@ -593,14 +587,6 @@ static inline int pmd_trans_huge(pmd_t pmd) #define pmd_mkhuge(pmd) (__pmd(pmd_val(pmd) & ~PMD_TABLE_BIT)) -#ifdef CONFIG_TRANSPARENT_HUGEPAGE -#define pmd_devmap(pmd) pte_devmap(pmd_pte(pmd)) -#endif -static inline pmd_t pmd_mkdevmap(pmd_t pmd) -{ - return pte_pmd(set_pte_bit(pmd_pte(pmd), __pgprot(PTE_DEVMAP))); -} - #ifdef CONFIG_ARCH_SUPPORTS_PMD_PFNMAP #define pmd_special(pte) (!!((pmd_val(pte) & PTE_SPECIAL))) static inline pmd_t pmd_mkspecial(pmd_t pmd) @@ -1190,16 +1176,6 @@ static inline int pmdp_set_access_flags(struct vm_area_struct *vma, return __ptep_set_access_flags(vma, address, (pte_t *)pmdp, pmd_pte(entry), dirty); } - -static inline int pud_devmap(pud_t pud) -{ - return 0; -} - -static inline int pgd_devmap(pgd_t pgd) -{ - return 0; -} #endif #ifdef CONFIG_PAGE_TABLE_CHECK diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig index 8094a01..c50136f 100644 --- a/arch/powerpc/Kconfig +++ b/arch/powerpc/Kconfig @@ -145,7 +145,6 @@ config PPC select ARCH_HAS_NON_OVERLAPPING_ADDRESS_SPACE select ARCH_HAS_PHYS_TO_DMA select ARCH_HAS_PMEM_API - select ARCH_HAS_PTE_DEVMAP if PPC_BOOK3S_64 select ARCH_HAS_PTE_SPECIAL select ARCH_HAS_SCALED_CPUTIME if VIRT_CPU_ACCOUNTING_NATIVE && PPC_BOOK3S_64 select ARCH_HAS_SET_MEMORY diff --git a/arch/powerpc/include/asm/book3s/64/hash-4k.h b/arch/powerpc/include/asm/book3s/64/hash-4k.h index c3efaca..b0546d3 100644 --- a/arch/powerpc/include/asm/book3s/64/hash-4k.h +++ b/arch/powerpc/include/asm/book3s/64/hash-4k.h @@ -160,12 +160,6 @@ extern pmd_t hash__pmdp_huge_get_and_clear(struct mm_struct *mm, extern int hash__has_transparent_hugepage(void); #endif -static inline pmd_t hash__pmd_mkdevmap(pmd_t pmd) -{ - BUG(); - return pmd; -} - #endif /* !__ASSEMBLY__ */ #endif /* _ASM_POWERPC_BOOK3S_64_HASH_4K_H */ diff --git a/arch/powerpc/include/asm/book3s/64/hash-64k.h b/arch/powerpc/include/asm/book3s/64/hash-64k.h index 0bf6fd0..0fb5b7d 100644 --- a/arch/powerpc/include/asm/book3s/64/hash-64k.h +++ b/arch/powerpc/include/asm/book3s/64/hash-64k.h @@ -259,7 +259,7 @@ static inline void mark_hpte_slot_valid(unsigned char *hpte_slot_array, */ static inline int hash__pmd_trans_huge(pmd_t pmd) { - return !!((pmd_val(pmd) & (_PAGE_PTE | H_PAGE_THP_HUGE | _PAGE_DEVMAP)) == + return !!((pmd_val(pmd) & (_PAGE_PTE | H_PAGE_THP_HUGE)) == (_PAGE_PTE | H_PAGE_THP_HUGE)); } @@ -281,11 +281,6 @@ extern pmd_t hash__pmdp_huge_get_and_clear(struct mm_struct *mm, extern int hash__has_transparent_hugepage(void); #endif /* CONFIG_TRANSPARENT_HUGEPAGE */ -static inline pmd_t hash__pmd_mkdevmap(pmd_t pmd) -{ - return __pmd(pmd_val(pmd) | (_PAGE_PTE | H_PAGE_THP_HUGE | _PAGE_DEVMAP)); -} - #endif /* __ASSEMBLY__ */ #endif /* _ASM_POWERPC_BOOK3S_64_HASH_64K_H */ diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h b/arch/powerpc/include/asm/book3s/64/pgtable.h index 6d98e6f..bda0649 100644 --- a/arch/powerpc/include/asm/book3s/64/pgtable.h +++ b/arch/powerpc/include/asm/book3s/64/pgtable.h @@ -88,7 +88,6 @@ #define _PAGE_SOFT_DIRTY _RPAGE_SW3 /* software: software dirty tracking */ #define _PAGE_SPECIAL _RPAGE_SW2 /* software: special page */ -#define _PAGE_DEVMAP _RPAGE_SW1 /* software: ZONE_DEVICE page */ /* * Drivers request for cache inhibited pte mapping using _PAGE_NO_CACHE @@ -109,7 +108,7 @@ */ #define _HPAGE_CHG_MASK (PTE_RPN_MASK | _PAGE_HPTEFLAGS | _PAGE_DIRTY | \ _PAGE_ACCESSED | H_PAGE_THP_HUGE | _PAGE_PTE | \ - _PAGE_SOFT_DIRTY | _PAGE_DEVMAP) + _PAGE_SOFT_DIRTY) /* * user access blocked by key */ @@ -123,7 +122,7 @@ */ #define _PAGE_CHG_MASK (PTE_RPN_MASK | _PAGE_HPTEFLAGS | _PAGE_DIRTY | \ _PAGE_ACCESSED | _PAGE_SPECIAL | _PAGE_PTE | \ - _PAGE_SOFT_DIRTY | _PAGE_DEVMAP) + _PAGE_SOFT_DIRTY) /* * We define 2 sets of base prot bits, one for basic pages (ie, @@ -609,24 +608,6 @@ static inline pte_t pte_mkhuge(pte_t pte) return pte; } -static inline pte_t pte_mkdevmap(pte_t pte) -{ - return __pte_raw(pte_raw(pte) | cpu_to_be64(_PAGE_SPECIAL | _PAGE_DEVMAP)); -} - -/* - * This is potentially called with a pmd as the argument, in which case it's not - * safe to check _PAGE_DEVMAP unless we also confirm that _PAGE_PTE is set. - * That's because the bit we use for _PAGE_DEVMAP is not reserved for software - * use in page directory entries (ie. non-ptes). - */ -static inline int pte_devmap(pte_t pte) -{ - __be64 mask = cpu_to_be64(_PAGE_DEVMAP | _PAGE_PTE); - - return (pte_raw(pte) & mask) == mask; -} - static inline pte_t pte_modify(pte_t pte, pgprot_t newprot) { /* FIXME!! check whether this need to be a conditional */ @@ -1380,35 +1361,6 @@ static inline bool arch_needs_pgtable_deposit(void) } extern void serialize_against_pte_lookup(struct mm_struct *mm); - -static inline pmd_t pmd_mkdevmap(pmd_t pmd) -{ - if (radix_enabled()) - return radix__pmd_mkdevmap(pmd); - return hash__pmd_mkdevmap(pmd); -} - -static inline pud_t pud_mkdevmap(pud_t pud) -{ - if (radix_enabled()) - return radix__pud_mkdevmap(pud); - BUG(); - return pud; -} - -static inline int pmd_devmap(pmd_t pmd) -{ - return pte_devmap(pmd_pte(pmd)); -} - -static inline int pud_devmap(pud_t pud) -{ - return pte_devmap(pud_pte(pud)); -} - -static inline int pgd_devmap(pgd_t pgd) -{ - return 0; } #endif /* CONFIG_TRANSPARENT_HUGEPAGE */ diff --git a/arch/powerpc/include/asm/book3s/64/radix.h b/arch/powerpc/include/asm/book3s/64/radix.h index 8f55ff7..df23a82 100644 --- a/arch/powerpc/include/asm/book3s/64/radix.h +++ b/arch/powerpc/include/asm/book3s/64/radix.h @@ -264,7 +264,7 @@ static inline int radix__p4d_bad(p4d_t p4d) static inline int radix__pmd_trans_huge(pmd_t pmd) { - return (pmd_val(pmd) & (_PAGE_PTE | _PAGE_DEVMAP)) == _PAGE_PTE; + return (pmd_val(pmd) & _PAGE_PTE) == _PAGE_PTE; } static inline pmd_t radix__pmd_mkhuge(pmd_t pmd) @@ -274,7 +274,7 @@ static inline pmd_t radix__pmd_mkhuge(pmd_t pmd) static inline int radix__pud_trans_huge(pud_t pud) { - return (pud_val(pud) & (_PAGE_PTE | _PAGE_DEVMAP)) == _PAGE_PTE; + return (pud_val(pud) & _PAGE_PTE) == _PAGE_PTE; } static inline pud_t radix__pud_mkhuge(pud_t pud) @@ -315,16 +315,6 @@ static inline int radix__has_transparent_pud_hugepage(void) } #endif -static inline pmd_t radix__pmd_mkdevmap(pmd_t pmd) -{ - return __pmd(pmd_val(pmd) | (_PAGE_PTE | _PAGE_DEVMAP)); -} - -static inline pud_t radix__pud_mkdevmap(pud_t pud) -{ - return __pud(pud_val(pud) | (_PAGE_PTE | _PAGE_DEVMAP)); -} - struct vmem_altmap; struct dev_pagemap; extern int __meminit radix__vmemmap_create_mapping(unsigned long start, diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig index 16354df..6a1a06f 100644 --- a/arch/x86/Kconfig +++ b/arch/x86/Kconfig @@ -93,7 +93,6 @@ config X86 select ARCH_HAS_NMI_SAFE_THIS_CPU_OPS select ARCH_HAS_NON_OVERLAPPING_ADDRESS_SPACE select ARCH_HAS_PMEM_API if X86_64 - select ARCH_HAS_PTE_DEVMAP if X86_64 select ARCH_HAS_PTE_SPECIAL select ARCH_HAS_HW_PTE_YOUNG select ARCH_HAS_NONLEAF_PMD_YOUNG if PGTABLE_LEVELS > 2 diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h index 4c2d080..23a993a 100644 --- a/arch/x86/include/asm/pgtable.h +++ b/arch/x86/include/asm/pgtable.h @@ -308,16 +308,15 @@ static inline bool pmd_leaf(pmd_t pte) } #ifdef CONFIG_TRANSPARENT_HUGEPAGE -/* NOTE: when predicate huge page, consider also pmd_devmap, or use pmd_leaf */ static inline int pmd_trans_huge(pmd_t pmd) { - return (pmd_val(pmd) & (_PAGE_PSE|_PAGE_DEVMAP)) == _PAGE_PSE; + return (pmd_val(pmd) & _PAGE_PSE) == _PAGE_PSE; } #ifdef CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD static inline int pud_trans_huge(pud_t pud) { - return (pud_val(pud) & (_PAGE_PSE|_PAGE_DEVMAP)) == _PAGE_PSE; + return (pud_val(pud) & _PAGE_PSE) == _PAGE_PSE; } #endif @@ -327,24 +326,6 @@ static inline int has_transparent_hugepage(void) return boot_cpu_has(X86_FEATURE_PSE); } -#ifdef CONFIG_ARCH_HAS_PTE_DEVMAP -static inline int pmd_devmap(pmd_t pmd) -{ - return !!(pmd_val(pmd) & _PAGE_DEVMAP); -} - -#ifdef CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD -static inline int pud_devmap(pud_t pud) -{ - return !!(pud_val(pud) & _PAGE_DEVMAP); -} -#else -static inline int pud_devmap(pud_t pud) -{ - return 0; -} -#endif - #ifdef CONFIG_ARCH_SUPPORTS_PMD_PFNMAP static inline bool pmd_special(pmd_t pmd) { @@ -368,12 +349,6 @@ static inline pud_t pud_mkspecial(pud_t pud) return pud_set_flags(pud, _PAGE_SPECIAL); } #endif /* CONFIG_ARCH_SUPPORTS_PUD_PFNMAP */ - -static inline int pgd_devmap(pgd_t pgd) -{ - return 0; -} -#endif #endif /* CONFIG_TRANSPARENT_HUGEPAGE */ static inline pte_t pte_set_flags(pte_t pte, pteval_t set) @@ -534,11 +509,6 @@ static inline pte_t pte_mkspecial(pte_t pte) return pte_set_flags(pte, _PAGE_SPECIAL); } -static inline pte_t pte_mkdevmap(pte_t pte) -{ - return pte_set_flags(pte, _PAGE_SPECIAL|_PAGE_DEVMAP); -} - /* See comments above mksaveddirty_shift() */ static inline pmd_t pmd_mksaveddirty(pmd_t pmd) { @@ -610,11 +580,6 @@ static inline pmd_t pmd_mkwrite_shstk(pmd_t pmd) return pmd_set_flags(pmd, _PAGE_DIRTY); } -static inline pmd_t pmd_mkdevmap(pmd_t pmd) -{ - return pmd_set_flags(pmd, _PAGE_DEVMAP); -} - static inline pmd_t pmd_mkhuge(pmd_t pmd) { return pmd_set_flags(pmd, _PAGE_PSE); @@ -680,11 +645,6 @@ static inline pud_t pud_mkdirty(pud_t pud) return pud_mksaveddirty(pud); } -static inline pud_t pud_mkdevmap(pud_t pud) -{ - return pud_set_flags(pud, _PAGE_DEVMAP); -} - static inline pud_t pud_mkhuge(pud_t pud) { return pud_set_flags(pud, _PAGE_PSE); @@ -1012,13 +972,6 @@ static inline int pte_present(pte_t a) return pte_flags(a) & (_PAGE_PRESENT | _PAGE_PROTNONE); } -#ifdef CONFIG_ARCH_HAS_PTE_DEVMAP -static inline int pte_devmap(pte_t a) -{ - return (pte_flags(a) & _PAGE_DEVMAP) == _PAGE_DEVMAP; -} -#endif - #define pte_accessible pte_accessible static inline bool pte_accessible(struct mm_struct *mm, pte_t a) { diff --git a/arch/x86/include/asm/pgtable_types.h b/arch/x86/include/asm/pgtable_types.h index 6f82e75..4a13b76 100644 --- a/arch/x86/include/asm/pgtable_types.h +++ b/arch/x86/include/asm/pgtable_types.h @@ -33,7 +33,6 @@ #define _PAGE_BIT_CPA_TEST _PAGE_BIT_SOFTW1 #define _PAGE_BIT_UFFD_WP _PAGE_BIT_SOFTW2 /* userfaultfd wrprotected */ #define _PAGE_BIT_SOFT_DIRTY _PAGE_BIT_SOFTW3 /* software dirty tracking */ -#define _PAGE_BIT_DEVMAP _PAGE_BIT_SOFTW4 #ifdef CONFIG_X86_64 #define _PAGE_BIT_SAVED_DIRTY _PAGE_BIT_SOFTW5 /* Saved Dirty bit */ @@ -117,11 +116,9 @@ #if defined(CONFIG_X86_64) || defined(CONFIG_X86_PAE) #define _PAGE_NX (_AT(pteval_t, 1) << _PAGE_BIT_NX) -#define _PAGE_DEVMAP (_AT(u64, 1) << _PAGE_BIT_DEVMAP) #define _PAGE_SOFTW4 (_AT(pteval_t, 1) << _PAGE_BIT_SOFTW4) #else #define _PAGE_NX (_AT(pteval_t, 0)) -#define _PAGE_DEVMAP (_AT(pteval_t, 0)) #define _PAGE_SOFTW4 (_AT(pteval_t, 0)) #endif @@ -148,7 +145,7 @@ #define _COMMON_PAGE_CHG_MASK (PTE_PFN_MASK | _PAGE_PCD | _PAGE_PWT | \ _PAGE_SPECIAL | _PAGE_ACCESSED | \ _PAGE_DIRTY_BITS | _PAGE_SOFT_DIRTY | \ - _PAGE_DEVMAP | _PAGE_CC | _PAGE_UFFD_WP) + _PAGE_CC | _PAGE_UFFD_WP) #define _PAGE_CHG_MASK (_COMMON_PAGE_CHG_MASK | _PAGE_PAT) #define _HPAGE_CHG_MASK (_COMMON_PAGE_CHG_MASK | _PAGE_PSE | _PAGE_PAT_LARGE) diff --git a/include/linux/mm.h b/include/linux/mm.h index 052f234..600a1a8 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -2692,13 +2692,6 @@ static inline pud_t pud_mkspecial(pud_t pud) } #endif /* CONFIG_ARCH_SUPPORTS_PUD_PFNMAP */ -#ifndef CONFIG_ARCH_HAS_PTE_DEVMAP -static inline int pte_devmap(pte_t pte) -{ - return 0; -} -#endif - extern pte_t *__get_locked_pte(struct mm_struct *mm, unsigned long addr, spinlock_t **ptl); static inline pte_t *get_locked_pte(struct mm_struct *mm, unsigned long addr, diff --git a/include/linux/pfn_t.h b/include/linux/pfn_t.h index 2d91482..0100ad8 100644 --- a/include/linux/pfn_t.h +++ b/include/linux/pfn_t.h @@ -97,26 +97,6 @@ static inline pud_t pfn_t_pud(pfn_t pfn, pgprot_t pgprot) #endif #endif -#ifdef CONFIG_ARCH_HAS_PTE_DEVMAP -static inline bool pfn_t_devmap(pfn_t pfn) -{ - const u64 flags = PFN_DEV|PFN_MAP; - - return (pfn.val & flags) == flags; -} -#else -static inline bool pfn_t_devmap(pfn_t pfn) -{ - return false; -} -pte_t pte_mkdevmap(pte_t pte); -pmd_t pmd_mkdevmap(pmd_t pmd); -#if defined(CONFIG_TRANSPARENT_HUGEPAGE) && \ - defined(CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD) -pud_t pud_mkdevmap(pud_t pud); -#endif -#endif /* CONFIG_ARCH_HAS_PTE_DEVMAP */ - #ifdef CONFIG_ARCH_HAS_PTE_SPECIAL static inline bool pfn_t_special(pfn_t pfn) { diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h index fa661b2..9406de1 100644 --- a/include/linux/pgtable.h +++ b/include/linux/pgtable.h @@ -1616,21 +1616,6 @@ static inline int pud_write(pud_t pud) } #endif /* pud_write */ -#if !defined(CONFIG_ARCH_HAS_PTE_DEVMAP) || !defined(CONFIG_TRANSPARENT_HUGEPAGE) -static inline int pmd_devmap(pmd_t pmd) -{ - return 0; -} -static inline int pud_devmap(pud_t pud) -{ - return 0; -} -static inline int pgd_devmap(pgd_t pgd) -{ - return 0; -} -#endif - #if !defined(CONFIG_TRANSPARENT_HUGEPAGE) || \ !defined(CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD) static inline int pud_trans_huge(pud_t pud) @@ -1885,8 +1870,8 @@ typedef unsigned int pgtbl_mod_mask; * - It should contain a huge PFN, which points to a huge page larger than * PAGE_SIZE of the platform. The PFN format isn't important here. * - * - It should cover all kinds of huge mappings (e.g., pXd_trans_huge(), - * pXd_devmap(), or hugetlb mappings). + * - It should cover all kinds of huge mappings (i.e. pXd_trans_huge() + * or hugetlb mappings). */ #ifndef pgd_leaf #define pgd_leaf(x) false diff --git a/mm/Kconfig b/mm/Kconfig index 4c9f5ea..06f939b 100644 --- a/mm/Kconfig +++ b/mm/Kconfig @@ -1044,9 +1044,6 @@ config ARCH_HAS_CURRENT_STACK_POINTER register alias named "current_stack_pointer", this config can be selected. -config ARCH_HAS_PTE_DEVMAP - bool - config ARCH_HAS_ZONE_DMA_SET bool @@ -1064,7 +1061,6 @@ config ZONE_DEVICE depends on MEMORY_HOTPLUG depends on MEMORY_HOTREMOVE depends on SPARSEMEM_VMEMMAP - depends on ARCH_HAS_PTE_DEVMAP select XARRAY_MULTI help diff --git a/mm/debug_vm_pgtable.c b/mm/debug_vm_pgtable.c index bc748f7..cf5ff92 100644 --- a/mm/debug_vm_pgtable.c +++ b/mm/debug_vm_pgtable.c @@ -348,12 +348,6 @@ static void __init pud_advanced_tests(struct pgtable_debug_args *args) vaddr &= HPAGE_PUD_MASK; pud = pfn_pud(args->pud_pfn, args->page_prot); - /* - * Some architectures have debug checks to make sure - * huge pud mapping are only found with devmap entries - * For now test with only devmap entries. - */ - pud = pud_mkdevmap(pud); set_pud_at(args->mm, vaddr, args->pudp, pud); flush_dcache_page(page); pudp_set_wrprotect(args->mm, vaddr, args->pudp); @@ -366,7 +360,6 @@ static void __init pud_advanced_tests(struct pgtable_debug_args *args) WARN_ON(!pud_none(pud)); #endif /* __PAGETABLE_PMD_FOLDED */ pud = pfn_pud(args->pud_pfn, args->page_prot); - pud = pud_mkdevmap(pud); pud = pud_wrprotect(pud); pud = pud_mkclean(pud); set_pud_at(args->mm, vaddr, args->pudp, pud); @@ -384,7 +377,6 @@ static void __init pud_advanced_tests(struct pgtable_debug_args *args) #endif /* __PAGETABLE_PMD_FOLDED */ pud = pfn_pud(args->pud_pfn, args->page_prot); - pud = pud_mkdevmap(pud); pud = pud_mkyoung(pud); set_pud_at(args->mm, vaddr, args->pudp, pud); flush_dcache_page(page); @@ -693,53 +685,6 @@ static void __init pmd_protnone_tests(struct pgtable_debug_args *args) static void __init pmd_protnone_tests(struct pgtable_debug_args *args) { } #endif /* CONFIG_TRANSPARENT_HUGEPAGE */ -#ifdef CONFIG_ARCH_HAS_PTE_DEVMAP -static void __init pte_devmap_tests(struct pgtable_debug_args *args) -{ - pte_t pte = pfn_pte(args->fixed_pte_pfn, args->page_prot); - - pr_debug("Validating PTE devmap\n"); - WARN_ON(!pte_devmap(pte_mkdevmap(pte))); -} - -#ifdef CONFIG_TRANSPARENT_HUGEPAGE -static void __init pmd_devmap_tests(struct pgtable_debug_args *args) -{ - pmd_t pmd; - - if (!has_transparent_hugepage()) - return; - - pr_debug("Validating PMD devmap\n"); - pmd = pfn_pmd(args->fixed_pmd_pfn, args->page_prot); - WARN_ON(!pmd_devmap(pmd_mkdevmap(pmd))); -} - -#ifdef CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD -static void __init pud_devmap_tests(struct pgtable_debug_args *args) -{ - pud_t pud; - - if (!has_transparent_pud_hugepage()) - return; - - pr_debug("Validating PUD devmap\n"); - pud = pfn_pud(args->fixed_pud_pfn, args->page_prot); - WARN_ON(!pud_devmap(pud_mkdevmap(pud))); -} -#else /* !CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD */ -static void __init pud_devmap_tests(struct pgtable_debug_args *args) { } -#endif /* CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD */ -#else /* CONFIG_TRANSPARENT_HUGEPAGE */ -static void __init pmd_devmap_tests(struct pgtable_debug_args *args) { } -static void __init pud_devmap_tests(struct pgtable_debug_args *args) { } -#endif /* CONFIG_TRANSPARENT_HUGEPAGE */ -#else -static void __init pte_devmap_tests(struct pgtable_debug_args *args) { } -static void __init pmd_devmap_tests(struct pgtable_debug_args *args) { } -static void __init pud_devmap_tests(struct pgtable_debug_args *args) { } -#endif /* CONFIG_ARCH_HAS_PTE_DEVMAP */ - static void __init pte_soft_dirty_tests(struct pgtable_debug_args *args) { pte_t pte = pfn_pte(args->fixed_pte_pfn, args->page_prot); @@ -1341,10 +1286,6 @@ static int __init debug_vm_pgtable(void) pte_protnone_tests(&args); pmd_protnone_tests(&args); - pte_devmap_tests(&args); - pmd_devmap_tests(&args); - pud_devmap_tests(&args); - pte_soft_dirty_tests(&args); pmd_soft_dirty_tests(&args); pte_swap_soft_dirty_tests(&args); diff --git a/mm/hmm.c b/mm/hmm.c index 285578e..2a12879 100644 --- a/mm/hmm.c +++ b/mm/hmm.c @@ -395,8 +395,7 @@ static int hmm_vma_walk_pmd(pmd_t *pmdp, return 0; } -#if defined(CONFIG_ARCH_HAS_PTE_DEVMAP) && \ - defined(CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD) +#if defined(CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD) static inline unsigned long pud_to_hmm_pfn_flags(struct hmm_range *range, pud_t pud) { From patchwork Fri Nov 22 01:40:46 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alistair Popple X-Patchwork-Id: 13882606 Received: from NAM12-DM6-obe.outbound.protection.outlook.com (mail-dm6nam12on2071.outbound.protection.outlook.com [40.107.243.71]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id F1860169AE4; Fri, 22 Nov 2024 01:43:11 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=fail smtp.client-ip=40.107.243.71 ARC-Seal: i=2; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1732239793; cv=fail; b=RVfVqB5ri/MTbJziCggyvqSaYHZxV1cgxh1+7Ufb4HATQG7zVXy/KjPUyQ0ltIWshU2Djs0v9y9xSD4vKGVPL8hkrwHa9KSecP7am1UNfw2B5ugfpl97hC1ZQaxWloJJU/QOSR589u7Pu0/7EQGqgOHptIUkt+7uv7t0aM8v2ww= ARC-Message-Signature: i=2; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1732239793; c=relaxed/simple; bh=LvWJTKncrqEz9V3CVjZ/oNdrwzKuo3VO7pA+GM2GRKU=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: Content-Type:MIME-Version; b=iQpwDB85Msk2p22+m8hTGkUr61duH0Rze963s6hnVD+1MDoZiMQovLuaN7Lvf0982J6rkDM/z3NhnQ9TnPGoZQeCDCIgc71RBeqw666tnXAlMzz6x8thSCvsaGC0EJ0bGQzP1BxQ1BV7nYhOe9JYg7U/efPIecFyV43FD4zVlpY= ARC-Authentication-Results: i=2; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=nvidia.com; spf=fail smtp.mailfrom=nvidia.com; dkim=pass (2048-bit key) header.d=Nvidia.com header.i=@Nvidia.com header.b=p9HGgjvY; arc=fail smtp.client-ip=40.107.243.71 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=nvidia.com Authentication-Results: smtp.subspace.kernel.org; spf=fail smtp.mailfrom=nvidia.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=Nvidia.com header.i=@Nvidia.com header.b="p9HGgjvY" ARC-Seal: i=1; a=rsa-sha256; s=arcselector10001; d=microsoft.com; cv=none; b=icbA9pL/B28q+U6ofgArMls7koavVNYDRvwDlzy0eVmsPvx4gFzg1FJYRHS+iNrVEUpyePQkqOhLQ66SYWWONpsPEpcis+ytiBJ6hQPMFVKqZvvtpbheahDgCV5uf8OhGYzCF4SAppoQnuxsnwApmwdQUGMHVWZgxIiYX4NfTZnaS6I5lghbs7a1mRHebEc6LRBHHiPMuVTfI7rB8nx0Vaj1eICs7gbfgJdt4+Bf1l6IWoDE3hLzjph1ascTA7aZLvuuVMfpuU7BoRYMGrbslDYf1uz0P80cHL7mTeQ0zeCzBaddx86OQCFRH03aUqhtQB89/VlFfbfZ+3NE0PLRHw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector10001; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=jD/illaP0OIVmmCg6mzj5RVo8vUE1Tl9cr1eNsTS2Pc=; b=ltwGYHFXlrTiEtfHTggpR9LGfJ4rK5klgbr8ILi0WfaCY9GYyNkneL9gFju2dVlNOeVU5VV16UWVvXv4ZYLnB8a7bWhJseXyyK8HMYMjaW5OC1swff4Q6ry2sc2HccZMJs4NVzKSmLHg0WobOe1XOu99fm9awBx/vP6AcNF556HoUO1aY/p7DuISqRfEjRn6sL6OUxe0l52xMqd0gR4qx5uppKXho1BUCPI5o2TLniBOKnMhYXVyS4BJjayCoX42IJn21DEawKeNc4kTYdxELOXVzWXHqCrR8fHWPEoRqwKE3W0JzY0cWgt6UbpTI9K0JRF4Wq8vXuu/Q8YiN/Ds9A== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=nvidia.com; dmarc=pass action=none header.from=nvidia.com; dkim=pass header.d=nvidia.com; arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=Nvidia.com; s=selector2; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=jD/illaP0OIVmmCg6mzj5RVo8vUE1Tl9cr1eNsTS2Pc=; b=p9HGgjvY6yo6EwioMWZKxQA/sAfvshQErH4jekym8hymDnDgxrY5AMWB82tDcbUciByFO21jaho/mRllbcZjcclU+dLNpaMnZk1MbCIfiYzsMyD0bvUaata+v258rBMR7Vi5kPQsbwoDapOws/v2/5mYZjNuXTeQHjfryjNITtAhaey7zP8Xu2H9kiMFQD8bpvcbQ3yH1Ueosfts0nqgPy9rkMRkFD5ecppAK/h8eoe8buTK1hiErgdbCPSjnr2Eazm+zIHeAMVBSC7vO4Xp0iIaxcaXqAKLDQ+S+F1/OmcPSdW7vUq1uF9+9x+Y16F/dsBhKMIn77/6PaJqnIgdaw== Authentication-Results: dkim=none (message not signed) header.d=none;dmarc=none action=none header.from=nvidia.com; Received: from DS0PR12MB7726.namprd12.prod.outlook.com (2603:10b6:8:130::6) by BY5PR12MB4131.namprd12.prod.outlook.com (2603:10b6:a03:212::13) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.8158.24; Fri, 22 Nov 2024 01:43:09 +0000 Received: from DS0PR12MB7726.namprd12.prod.outlook.com ([fe80::953f:2f80:90c5:67fe]) by DS0PR12MB7726.namprd12.prod.outlook.com ([fe80::953f:2f80:90c5:67fe%4]) with mapi id 15.20.8182.016; Fri, 22 Nov 2024 01:43:09 +0000 From: Alistair Popple To: dan.j.williams@intel.com, linux-mm@kvack.org Cc: Alistair Popple , lina@asahilina.net, zhang.lyra@gmail.com, gerald.schaefer@linux.ibm.com, vishal.l.verma@intel.com, dave.jiang@intel.com, logang@deltatee.com, bhelgaas@google.com, jack@suse.cz, jgg@ziepe.ca, catalin.marinas@arm.com, will@kernel.org, mpe@ellerman.id.au, npiggin@gmail.com, dave.hansen@linux.intel.com, ira.weiny@intel.com, willy@infradead.org, djwong@kernel.org, tytso@mit.edu, linmiaohe@huawei.com, david@redhat.com, peterx@redhat.com, linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org, linux-arm-kernel@lists.infradead.org, linuxppc-dev@lists.ozlabs.org, nvdimm@lists.linux.dev, linux-cxl@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-ext4@vger.kernel.org, linux-xfs@vger.kernel.org, jhubbard@nvidia.com, hch@lst.de, david@fromorbit.com Subject: [PATCH v3 25/25] Revert "riscv: mm: Add support for ZONE_DEVICE" Date: Fri, 22 Nov 2024 12:40:46 +1100 Message-ID: X-Mailer: git-send-email 2.45.2 In-Reply-To: References: X-ClientProxiedBy: SY2PR01CA0038.ausprd01.prod.outlook.com (2603:10c6:1:15::26) To DS0PR12MB7726.namprd12.prod.outlook.com (2603:10b6:8:130::6) Precedence: bulk X-Mailing-List: linux-xfs@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-MS-PublicTrafficType: Email X-MS-TrafficTypeDiagnostic: DS0PR12MB7726:EE_|BY5PR12MB4131:EE_ X-MS-Office365-Filtering-Correlation-Id: 02028ee6-2fd1-4714-1731-08dd0a97049b X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam: BCL:0;ARA:13230040|376014|7416014|1800799024|366016; X-Microsoft-Antispam-Message-Info: 35hYQCfm4jU7I4LEDTDao9VtBktIBFxZw2KyX/aAxf+OrS4SQy0QiPR19p2ZWjjpK5nwqKpFAodh/XqGCTTlaFL2EOsENSjthoBninSOeSwiNnHW8J9wO6y+jo8+JZd3+mOSnuy67fpL1nV/W+uIra3avtDZLeqxelT1KsLg67sSEREScPeE8d5E7fwa5JJD4NjqEkjITK162X097aXFqeYpkEYwPpCkMIK4MbE3NiWyha5+NDovbA7Do5gPKFzORrWc0LJ5i67CGBDynMnVKTqwIkHeWv62Wl5K+V+SjqlVp8inT2ygKobhaNLykNkF5YVllZLJlOXVWzrQyTAFZGTdsWXbhI1DZZNuEIPddBpJBeYp3BUFucL4xXIXEpQBiV0hgObJEWhuM2jm12irm7xSD//Apbq+dTx5VYQ9XZDC9vYVwLhmdAlE89imXO4h0BTEaIBT+KVWpQu2lRS8MxOW/8nL85vRWtRpGk148tW8y99oaJ+2/sE9q2Bjyes8POyHrMFPxD5Yd4xhYxA9/hKmzRMD+wmRWrnpmWTIRrtPnWwkOJNG/55AYOniohnSGOYF7o5JOfZZOz+fOP6ABLTTUHEvDtIbbI2e/NZxbeAmF5uDabiUC8CISFU0LZXHFT1aEyJtMNb9K6jrBsnLQDkWFvHUZdhF8xc90jmhniUXZ1+TjOZrhIL2QnFEZnwPGUOz4q2CzvkmAWwxP6XJkZpW4wjWoe0/5gSk1eYwJ2D9DV3t8w0XUoac1mjt8DNWR9iBFR/fDyuQ7RXyJ5Y2K6BeiaNf9SGkYOFLiNrhWvgAeh8i3TOXTh5RMvROFlKsvbPX4nxmYvZVZezb6oyapiAHXtVErsgyyNnYlohMLNLUanWZXOZu3ycIvkrWceiJNef4mxcohyO4YgLOJ93jfo29y6jCtqJBIcN6P+ZBBYEt0hZvVN2ZtJ2QR+qfH/+rzW5gfNRUkULpAyfdbgBG+gZvwHXmaORoTfkjOl0tDrm1InvbkYi0+Q9zr3q7A2K4yBPEJZJkvMaQxHobKLkPG5veTStjmei41mM7sfgTOy9pQH32alwaMY2Ha7DT1sfhatB+jQE5NIHekl4CNIl44ryC6t1hncqI2qpZah9hNiL53V2McXZIHByaUJsVtN2kXm+Dho4dW//9jUmwYr1HyqoEgef6AE729ACiG0IvXYHGgS9ekam0/IgoA/HWLNgxiqXtutoCWaPyKbA7AORvHxzxxKHVtG3o3XbUN783uholLVWihFXS07Ostt/Lu1easXbj4uXSSy9XjnxqYclqadvQ+IScI/TlP4x7Ek3XF7bV7thW/Ob5xt7QOaUFOANR X-Forefront-Antispam-Report: CIP:255.255.255.255;CTRY:;LANG:en;SCL:1;SRV:;IPV:NLI;SFV:NSPM;H:DS0PR12MB7726.namprd12.prod.outlook.com;PTR:;CAT:NONE;SFS:(13230040)(376014)(7416014)(1800799024)(366016);DIR:OUT;SFP:1101; X-MS-Exchange-AntiSpam-MessageData-ChunkCount: 1 X-MS-Exchange-AntiSpam-MessageData-0: iEYq4pHDworgqm7emTaayj27+d/jtMOZf3cRpeXN3gFIdWBzf8SFdIT82iDD4TKGGSjkbew6gQ/CP2kPVUGc3Cf5TXz30kVclelc5GY4LYLcZjiQHEe15oUAy1tOPUMyKi8Z9OSmXuhEviC2a2xl1+EypMVMQhHA3KTWhSCrLq2otMm5tzh3qDe8Cpspl2NsfBGnG2v+fCA04G+xKDaLOzM5+q0HcwNDBXLg0FheK2xuukFVV9SqgyzQfQra5lCkR2e2nRL9vUU3mhBtU7INYL3lQEgHYzApSRxpOhQ/vvwfAYlc3sHs2IiIZmx4VuWyaDBvlJjn3Emnx+yF+UtIMjdUiWRkRigWwM5jWoM/8LDqzYtM71chFumWNSJNR0sue68XubNnDgMsVqTA/junTVcD5KvJotw6+C0nRT0P7l6nT6dGI8uslFw/SWMp7sVCMC8aZW66rmNlOgJ6TjRJCdgDrLM1OvwcLNwzqTre+K52P2DKb4pzZxzxGhAdr9sWlqhTZ9qA5GiLLBejuW01+64u+/kKVlaqAEQxdEvCPHPLMGJDRauSIu7DqterGLNAQY9JfrsKFejEVRm2LfSR7UmHhZF0x/R+r2LSRBYiyCq3xtYSTXJ6Y2GPcPFF3gzFR+yq+Ulq1lNd3qW+c4leR8Kfojwu9b62HSWHW1PdPfwbZa2yhqg8XMf7QWctVpNomYWYKakuFrkJlUjDS4MCkKP0gg+XEmhaIMtnNSHYXv+kd+5TgbggCjh71+8EXfeP2eNb7BqAad4KtCscsuPNLLoyHGh6FXECBsQoR+AZOKIV8CJKkhGwqXgt0LpqYURxXAhUcyV+EPlRwI7ZTu7PT4g4d8N1u5LcUFwu7UH0kvDMmaFwZH6NUu2RnVSVjA7c6CYZaom417sMc95ESsDTl8OHsojC4QM4t7YjCWQnIjg57xvkEi696NTph64RfBMzQGTEgnreNlMqRP+ESgsNjMpGiz1xm1B1JzJx43gJnuAosLxaPbtY4bG/qFjLsdr7/Qy2R4yjAJTezJ5tIVRCCGUb1IHGNpgIaNRuSNv7sJ2XLv9DhT6ZXcLhExYO5BJHooyrRrbS5jTVuQg/wstOCy8OeYyLMISRMc43AGe1ENrpFOGbS7r4gdE+1zsrkigjpxcJ1DDYmGJt+13t5kfjqfhtXxBmaQEV6Ux6qvZsQMSwHIVeWEVTjH/hFbbyVxouYz5gaG9JDjUuONdbAkZDxxcJaIi3dsB+dUxz6yIALPt7Qtd3CkgbhkPefGrcdSjSj3km9BLzlifIkqerXLdIkFZ5nl0V0jfGeOgwkh20RXiB2ezNUC5eptX612p9o6R1QZ6Gu5cwiwIUYsyLwHdVj8yoUz7fBt+VjUYbCby+n2pmgchwp2ixpu0KHuw57eT6FC6SV9fruaYUJwJegxp7SIWPJsA6S2YvIfqN3ZdCgINuGouDX4B+fqS5XMHLZbzVFrEhTwwAMsxo6QK4sU7EJ1DFl63wAaw+xqttFmViR3AZzdNec1e+xphLcps1AtOieWfxM4JUWI2AWOBlr42QUv0kQCbZXL7UB7EUS5tvhaUW1P25LunlsLhg70+6vZ3M X-OriginatorOrg: Nvidia.com X-MS-Exchange-CrossTenant-Network-Message-Id: 02028ee6-2fd1-4714-1731-08dd0a97049b X-MS-Exchange-CrossTenant-AuthSource: DS0PR12MB7726.namprd12.prod.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Internal X-MS-Exchange-CrossTenant-OriginalArrivalTime: 22 Nov 2024 01:43:09.3279 (UTC) X-MS-Exchange-CrossTenant-FromEntityHeader: Hosted X-MS-Exchange-CrossTenant-Id: 43083d15-7273-40c1-b7db-39efd9ccc17a X-MS-Exchange-CrossTenant-MailboxType: HOSTED X-MS-Exchange-CrossTenant-UserPrincipalName: cxvc/hMc9bPl7/wdi8drZNOC8ETUpk5vSY2WWBsMN1/h9Wvm9qoQ5B8x23IfX41oVmkX2K8PqyQ8mKzhcNtkeg== X-MS-Exchange-Transport-CrossTenantHeadersStamped: BY5PR12MB4131 DEVMAP PTEs are no longer required to support ZONE_DEVICE so remove them. Signed-off-by: Alistair Popple Suggested-by: Chunyan Zhang --- arch/riscv/Kconfig | 1 - arch/riscv/include/asm/pgtable-64.h | 20 -------------------- arch/riscv/include/asm/pgtable-bits.h | 1 - arch/riscv/include/asm/pgtable.h | 17 ----------------- 4 files changed, 39 deletions(-) diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig index 6254594..2475c5b 100644 --- a/arch/riscv/Kconfig +++ b/arch/riscv/Kconfig @@ -40,7 +40,6 @@ config RISCV select ARCH_HAS_NON_OVERLAPPING_ADDRESS_SPACE select ARCH_HAS_PMEM_API select ARCH_HAS_PREPARE_SYNC_CORE_CMD - select ARCH_HAS_PTE_DEVMAP if 64BIT && MMU select ARCH_HAS_PTE_SPECIAL select ARCH_HAS_SET_DIRECT_MAP if MMU select ARCH_HAS_SET_MEMORY if MMU diff --git a/arch/riscv/include/asm/pgtable-64.h b/arch/riscv/include/asm/pgtable-64.h index 0897dd9..8c36a88 100644 --- a/arch/riscv/include/asm/pgtable-64.h +++ b/arch/riscv/include/asm/pgtable-64.h @@ -398,24 +398,4 @@ static inline struct page *pgd_page(pgd_t pgd) #define p4d_offset p4d_offset p4d_t *p4d_offset(pgd_t *pgd, unsigned long address); -#ifdef CONFIG_TRANSPARENT_HUGEPAGE -static inline int pte_devmap(pte_t pte); -static inline pte_t pmd_pte(pmd_t pmd); - -static inline int pmd_devmap(pmd_t pmd) -{ - return pte_devmap(pmd_pte(pmd)); -} - -static inline int pud_devmap(pud_t pud) -{ - return 0; -} - -static inline int pgd_devmap(pgd_t pgd) -{ - return 0; -} -#endif - #endif /* _ASM_RISCV_PGTABLE_64_H */ diff --git a/arch/riscv/include/asm/pgtable-bits.h b/arch/riscv/include/asm/pgtable-bits.h index a8f5205..179bd4a 100644 --- a/arch/riscv/include/asm/pgtable-bits.h +++ b/arch/riscv/include/asm/pgtable-bits.h @@ -19,7 +19,6 @@ #define _PAGE_SOFT (3 << 8) /* Reserved for software */ #define _PAGE_SPECIAL (1 << 8) /* RSW: 0x1 */ -#define _PAGE_DEVMAP (1 << 9) /* RSW, devmap */ #define _PAGE_TABLE _PAGE_PRESENT /* diff --git a/arch/riscv/include/asm/pgtable.h b/arch/riscv/include/asm/pgtable.h index e79f152..93b6571 100644 --- a/arch/riscv/include/asm/pgtable.h +++ b/arch/riscv/include/asm/pgtable.h @@ -399,13 +399,6 @@ static inline int pte_special(pte_t pte) return pte_val(pte) & _PAGE_SPECIAL; } -#ifdef CONFIG_ARCH_HAS_PTE_DEVMAP -static inline int pte_devmap(pte_t pte) -{ - return pte_val(pte) & _PAGE_DEVMAP; -} -#endif - /* static inline pte_t pte_rdprotect(pte_t pte) */ static inline pte_t pte_wrprotect(pte_t pte) @@ -447,11 +440,6 @@ static inline pte_t pte_mkspecial(pte_t pte) return __pte(pte_val(pte) | _PAGE_SPECIAL); } -static inline pte_t pte_mkdevmap(pte_t pte) -{ - return __pte(pte_val(pte) | _PAGE_DEVMAP); -} - static inline pte_t pte_mkhuge(pte_t pte) { return pte; @@ -752,11 +740,6 @@ static inline pmd_t pmd_mkdirty(pmd_t pmd) return pte_pmd(pte_mkdirty(pmd_pte(pmd))); } -static inline pmd_t pmd_mkdevmap(pmd_t pmd) -{ - return pte_pmd(pte_mkdevmap(pmd_pte(pmd))); -} - static inline void set_pmd_at(struct mm_struct *mm, unsigned long addr, pmd_t *pmdp, pmd_t pmd) {