From patchwork Mon Jul 29 11:50:35 2024
X-Patchwork-Submitter: Lorenzo Stoakes
X-Patchwork-Id: 13744779
From: Lorenzo Stoakes
To: Andrew Morton
Cc: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
 linux-mm@kvack.org, "Liam R . Howlett", Vlastimil Babka, Matthew Wilcox,
 Alexander Viro, Christian Brauner, Jan Kara, Eric Biederman, Kees Cook,
 Suren Baghdasaryan, SeongJae Park, Shuah Khan, Brendan Higgins, David Gow,
 Rae Moar
Subject: [PATCH v4 1/7] userfaultfd: move core VMA manipulation logic to mm/userfaultfd.c
Date: Mon, 29 Jul 2024 12:50:35 +0100
Message-ID: <50c3ed995fd81c45876c86304c8a00bf3e396cfd.1722251717.git.lorenzo.stoakes@oracle.com>
X-Mailer: git-send-email 2.45.2
X-Mailing-List: linux-fsdevel@vger.kernel.org

This patch forms part of a patch series intending to separate out VMA logic
and render it testable from userspace, which requires that
core manipulation functions be exposed in an mm/-internal header file.

In order to do this, we must abstract APIs we wish to test, in this instance
functions which ultimately invoke vma_modify().

This patch therefore moves all logic which ultimately invokes vma_modify() to
mm/userfaultfd.c, trying to transfer code at a functional granularity where
possible.

Reviewed-by: Vlastimil Babka
Reviewed-by: Liam R. Howlett
Signed-off-by: Lorenzo Stoakes
Reported-by: Pengfei Xu
Signed-off-by: Lorenzo Stoakes
Acked-by: Vlastimil Babka
---
 fs/userfaultfd.c              | 160 +++-----------------------------
 include/linux/userfaultfd_k.h |  19 ++++
 mm/userfaultfd.c              | 168 ++++++++++++++++++++++++++++++++++
 3 files changed, 198 insertions(+), 149 deletions(-)

diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c
index 27a3e9285fbf..b3ed7207df7e 100644
--- a/fs/userfaultfd.c
+++ b/fs/userfaultfd.c
@@ -104,21 +104,6 @@ bool userfaultfd_wp_unpopulated(struct vm_area_struct *vma)
 	return ctx->features & UFFD_FEATURE_WP_UNPOPULATED;
 }
 
-static void userfaultfd_set_vm_flags(struct vm_area_struct *vma,
-				     vm_flags_t flags)
-{
-	const bool uffd_wp_changed = (vma->vm_flags ^ flags) & VM_UFFD_WP;
-
-	vm_flags_reset(vma, flags);
-	/*
-	 * For shared mappings, we want to enable writenotify while
-	 * userfaultfd-wp is enabled (see vma_wants_writenotify()). We'll simply
-	 * recalculate vma->vm_page_prot whenever userfaultfd-wp changes.
-	 */
-	if ((vma->vm_flags & VM_SHARED) && uffd_wp_changed)
-		vma_set_page_prot(vma);
-}
-
 static int userfaultfd_wake_function(wait_queue_entry_t *wq, unsigned mode,
 				     int wake_flags, void *key)
 {
@@ -615,22 +600,7 @@ static void userfaultfd_event_wait_completion(struct userfaultfd_ctx *ctx,
 	spin_unlock_irq(&ctx->event_wqh.lock);
 
 	if (release_new_ctx) {
-		struct vm_area_struct *vma;
-		struct mm_struct *mm = release_new_ctx->mm;
-		VMA_ITERATOR(vmi, mm, 0);
-
-		/* the various vma->vm_userfaultfd_ctx still points to it */
-		mmap_write_lock(mm);
-		for_each_vma(vmi, vma) {
-			if (vma->vm_userfaultfd_ctx.ctx == release_new_ctx) {
-				vma_start_write(vma);
-				vma->vm_userfaultfd_ctx = NULL_VM_UFFD_CTX;
-				userfaultfd_set_vm_flags(vma,
-							 vma->vm_flags & ~__VM_UFFD_FLAGS);
-			}
-		}
-		mmap_write_unlock(mm);
-
+		userfaultfd_release_new(release_new_ctx);
 		userfaultfd_ctx_put(release_new_ctx);
 	}
 
@@ -662,9 +632,7 @@ int dup_userfaultfd(struct vm_area_struct *vma, struct list_head *fcs)
 		return 0;
 
 	if (!(octx->features & UFFD_FEATURE_EVENT_FORK)) {
-		vma_start_write(vma);
-		vma->vm_userfaultfd_ctx = NULL_VM_UFFD_CTX;
-		userfaultfd_set_vm_flags(vma, vma->vm_flags & ~__VM_UFFD_FLAGS);
+		userfaultfd_reset_ctx(vma);
 		return 0;
 	}
 
@@ -749,9 +717,7 @@ void mremap_userfaultfd_prep(struct vm_area_struct *vma,
 		up_write(&ctx->map_changing_lock);
 	} else {
 		/* Drop uffd context if remap feature not enabled */
-		vma_start_write(vma);
-		vma->vm_userfaultfd_ctx = NULL_VM_UFFD_CTX;
-		userfaultfd_set_vm_flags(vma, vma->vm_flags & ~__VM_UFFD_FLAGS);
+		userfaultfd_reset_ctx(vma);
 	}
 }
 
@@ -870,53 +836,13 @@ static int userfaultfd_release(struct inode *inode, struct file *file)
 {
 	struct userfaultfd_ctx *ctx = file->private_data;
 	struct mm_struct *mm = ctx->mm;
-	struct vm_area_struct *vma, *prev;
 	/* len == 0 means wake all */
 	struct userfaultfd_wake_range range = { .len = 0, };
-	unsigned long new_flags;
-	VMA_ITERATOR(vmi, mm, 0);
 
 	WRITE_ONCE(ctx->released, true);
 
-	if (!mmget_not_zero(mm))
-		goto wakeup;
-
-	/*
-	 * Flush page faults out of all CPUs. NOTE: all page faults
-	 * must be retried without returning VM_FAULT_SIGBUS if
-	 * userfaultfd_ctx_get() succeeds but vma->vma_userfault_ctx
-	 * changes while handle_userfault released the mmap_lock. So
-	 * it's critical that released is set to true (above), before
-	 * taking the mmap_lock for writing.
-	 */
-	mmap_write_lock(mm);
-	prev = NULL;
-	for_each_vma(vmi, vma) {
-		cond_resched();
-		BUG_ON(!!vma->vm_userfaultfd_ctx.ctx ^
-		       !!(vma->vm_flags & __VM_UFFD_FLAGS));
-		if (vma->vm_userfaultfd_ctx.ctx != ctx) {
-			prev = vma;
-			continue;
-		}
-		/* Reset ptes for the whole vma range if wr-protected */
-		if (userfaultfd_wp(vma))
-			uffd_wp_range(vma, vma->vm_start,
-				      vma->vm_end - vma->vm_start, false);
-		new_flags = vma->vm_flags & ~__VM_UFFD_FLAGS;
-		vma = vma_modify_flags_uffd(&vmi, prev, vma, vma->vm_start,
-					    vma->vm_end, new_flags,
-					    NULL_VM_UFFD_CTX);
-
-		vma_start_write(vma);
-		userfaultfd_set_vm_flags(vma, new_flags);
-		vma->vm_userfaultfd_ctx = NULL_VM_UFFD_CTX;
+	userfaultfd_release_all(mm, ctx);
 
-		prev = vma;
-	}
-	mmap_write_unlock(mm);
-	mmput(mm);
-wakeup:
 	/*
 	 * After no new page faults can wait on this fault_*wqh, flush
 	 * the last page faults that may have been already waiting on
@@ -1293,14 +1219,14 @@ static int userfaultfd_register(struct userfaultfd_ctx *ctx,
 				unsigned long arg)
 {
 	struct mm_struct *mm = ctx->mm;
-	struct vm_area_struct *vma, *prev, *cur;
+	struct vm_area_struct *vma, *cur;
 	int ret;
 	struct uffdio_register uffdio_register;
 	struct uffdio_register __user *user_uffdio_register;
-	unsigned long vm_flags, new_flags;
+	unsigned long vm_flags;
 	bool found;
 	bool basic_ioctls;
-	unsigned long start, end, vma_end;
+	unsigned long start, end;
 	struct vma_iterator vmi;
 	bool wp_async = userfaultfd_wp_async_ctx(ctx);
 
@@ -1428,57 +1354,8 @@ static int userfaultfd_register(struct userfaultfd_ctx *ctx,
 	} for_each_vma_range(vmi, cur, end);
 	BUG_ON(!found);
 
-	vma_iter_set(&vmi, start);
-	prev = vma_prev(&vmi);
-	if (vma->vm_start < start)
-		prev = vma;
-
-	ret = 0;
-	for_each_vma_range(vmi, vma, end) {
-		cond_resched();
-
-		BUG_ON(!vma_can_userfault(vma, vm_flags, wp_async));
-		BUG_ON(vma->vm_userfaultfd_ctx.ctx &&
-		       vma->vm_userfaultfd_ctx.ctx != ctx);
-		WARN_ON(!(vma->vm_flags & VM_MAYWRITE));
-
-		/*
-		 * Nothing to do: this vma is already registered into this
-		 * userfaultfd and with the right tracking mode too.
-		 */
-		if (vma->vm_userfaultfd_ctx.ctx == ctx &&
-		    (vma->vm_flags & vm_flags) == vm_flags)
-			goto skip;
-
-		if (vma->vm_start > start)
-			start = vma->vm_start;
-		vma_end = min(end, vma->vm_end);
-
-		new_flags = (vma->vm_flags & ~__VM_UFFD_FLAGS) | vm_flags;
-		vma = vma_modify_flags_uffd(&vmi, prev, vma, start, vma_end,
-					    new_flags,
-					    (struct vm_userfaultfd_ctx){ctx});
-		if (IS_ERR(vma)) {
-			ret = PTR_ERR(vma);
-			break;
-		}
-
-		/*
-		 * In the vma_merge() successful mprotect-like case 8:
-		 * the next vma was merged into the current one and
-		 * the current one has not been updated yet.
-		 */
-		vma_start_write(vma);
-		userfaultfd_set_vm_flags(vma, new_flags);
-		vma->vm_userfaultfd_ctx.ctx = ctx;
-
-		if (is_vm_hugetlb_page(vma) && uffd_disable_huge_pmd_share(vma))
-			hugetlb_unshare_all_pmds(vma);
-
-	skip:
-		prev = vma;
-		start = vma->vm_end;
-	}
+	ret = userfaultfd_register_range(ctx, vma, vm_flags, start, end,
+					 wp_async);
 
 out_unlock:
 	mmap_write_unlock(mm);
@@ -1519,7 +1396,6 @@ static int userfaultfd_unregister(struct userfaultfd_ctx *ctx,
 	struct vm_area_struct *vma, *prev, *cur;
 	int ret;
 	struct uffdio_range uffdio_unregister;
-	unsigned long new_flags;
 	bool found;
 	unsigned long start, end, vma_end;
 	const void __user *buf = (void __user *)arg;
@@ -1622,27 +1498,13 @@ static int userfaultfd_unregister(struct userfaultfd_ctx *ctx,
 			wake_userfault(vma->vm_userfaultfd_ctx.ctx, &range);
 		}
 
-		/* Reset ptes for the whole vma range if wr-protected */
-		if (userfaultfd_wp(vma))
-			uffd_wp_range(vma, start, vma_end - start, false);
-
-		new_flags = vma->vm_flags & ~__VM_UFFD_FLAGS;
-		vma = vma_modify_flags_uffd(&vmi, prev, vma, start, vma_end,
-					    new_flags, NULL_VM_UFFD_CTX);
+		vma = userfaultfd_clear_vma(&vmi, prev, vma,
+					    start, vma_end);
 		if (IS_ERR(vma)) {
 			ret = PTR_ERR(vma);
 			break;
 		}
 
-		/*
-		 * In the vma_merge() successful mprotect-like case 8:
-		 * the next vma was merged into the current one and
-		 * the current one has not been updated yet.
-		 */
-		vma_start_write(vma);
-		userfaultfd_set_vm_flags(vma, new_flags);
-		vma->vm_userfaultfd_ctx = NULL_VM_UFFD_CTX;
-
 	skip:
 		prev = vma;
 		start = vma->vm_end;
diff --git a/include/linux/userfaultfd_k.h b/include/linux/userfaultfd_k.h
index a12bcf042551..9fc6ce15c499 100644
--- a/include/linux/userfaultfd_k.h
+++ b/include/linux/userfaultfd_k.h
@@ -267,6 +267,25 @@ extern void userfaultfd_unmap_complete(struct mm_struct *mm,
 extern bool userfaultfd_wp_unpopulated(struct vm_area_struct *vma);
 extern bool userfaultfd_wp_async(struct vm_area_struct *vma);
 
+void userfaultfd_reset_ctx(struct vm_area_struct *vma);
+
+struct vm_area_struct *userfaultfd_clear_vma(struct vma_iterator *vmi,
+					     struct vm_area_struct *prev,
+					     struct vm_area_struct *vma,
+					     unsigned long start,
+					     unsigned long end);
+
+int userfaultfd_register_range(struct userfaultfd_ctx *ctx,
+			       struct vm_area_struct *vma,
+			       unsigned long vm_flags,
+			       unsigned long start, unsigned long end,
+			       bool wp_async);
+
+void userfaultfd_release_new(struct userfaultfd_ctx *ctx);
+
+void userfaultfd_release_all(struct mm_struct *mm,
+			     struct userfaultfd_ctx *ctx);
+
 #else /* CONFIG_USERFAULTFD */
 
 /* mm helpers */
diff --git a/mm/userfaultfd.c b/mm/userfaultfd.c
index e54e5c8907fa..3b7715ecf292 100644
--- a/mm/userfaultfd.c
+++ b/mm/userfaultfd.c
@@ -1760,3 +1760,171 @@ ssize_t move_pages(struct userfaultfd_ctx *ctx, unsigned long dst_start,
 	VM_WARN_ON(!moved && !err);
 	return moved ? moved : err;
 }
+
+static void userfaultfd_set_vm_flags(struct vm_area_struct *vma,
+				     vm_flags_t flags)
+{
+	const bool uffd_wp_changed = (vma->vm_flags ^ flags) & VM_UFFD_WP;
+
+	vm_flags_reset(vma, flags);
+	/*
+	 * For shared mappings, we want to enable writenotify while
+	 * userfaultfd-wp is enabled (see vma_wants_writenotify()). We'll simply
+	 * recalculate vma->vm_page_prot whenever userfaultfd-wp changes.
+	 */
+	if ((vma->vm_flags & VM_SHARED) && uffd_wp_changed)
+		vma_set_page_prot(vma);
+}
+
+static void userfaultfd_set_ctx(struct vm_area_struct *vma,
+				struct userfaultfd_ctx *ctx,
+				unsigned long flags)
+{
+	vma_start_write(vma);
+	vma->vm_userfaultfd_ctx = (struct vm_userfaultfd_ctx){ctx};
+	userfaultfd_set_vm_flags(vma,
+				 (vma->vm_flags & ~__VM_UFFD_FLAGS) | flags);
+}
+
+void userfaultfd_reset_ctx(struct vm_area_struct *vma)
+{
+	userfaultfd_set_ctx(vma, NULL, 0);
+}
+
+struct vm_area_struct *userfaultfd_clear_vma(struct vma_iterator *vmi,
+					     struct vm_area_struct *prev,
+					     struct vm_area_struct *vma,
+					     unsigned long start,
+					     unsigned long end)
+{
+	struct vm_area_struct *ret;
+
+	/* Reset ptes for the whole vma range if wr-protected */
+	if (userfaultfd_wp(vma))
+		uffd_wp_range(vma, start, end - start, false);
+
+	ret = vma_modify_flags_uffd(vmi, prev, vma, start, end,
+				    vma->vm_flags & ~__VM_UFFD_FLAGS,
+				    NULL_VM_UFFD_CTX);
+
+	/*
+	 * In the vma_merge() successful mprotect-like case 8:
+	 * the next vma was merged into the current one and
+	 * the current one has not been updated yet.
+	 */
+	if (!IS_ERR(ret))
+		userfaultfd_reset_ctx(vma);
+
+	return ret;
+}
+
+/* Assumes mmap write lock taken, and mm_struct pinned. */
+int userfaultfd_register_range(struct userfaultfd_ctx *ctx,
+			       struct vm_area_struct *vma,
+			       unsigned long vm_flags,
+			       unsigned long start, unsigned long end,
+			       bool wp_async)
+{
+	VMA_ITERATOR(vmi, ctx->mm, start);
+	struct vm_area_struct *prev = vma_prev(&vmi);
+	unsigned long vma_end;
+	unsigned long new_flags;
+
+	if (vma->vm_start < start)
+		prev = vma;
+
+	for_each_vma_range(vmi, vma, end) {
+		cond_resched();
+
+		BUG_ON(!vma_can_userfault(vma, vm_flags, wp_async));
+		BUG_ON(vma->vm_userfaultfd_ctx.ctx &&
+		       vma->vm_userfaultfd_ctx.ctx != ctx);
+		WARN_ON(!(vma->vm_flags & VM_MAYWRITE));
+
+		/*
+		 * Nothing to do: this vma is already registered into this
+		 * userfaultfd and with the right tracking mode too.
+		 */
+		if (vma->vm_userfaultfd_ctx.ctx == ctx &&
+		    (vma->vm_flags & vm_flags) == vm_flags)
+			goto skip;
+
+		if (vma->vm_start > start)
+			start = vma->vm_start;
+		vma_end = min(end, vma->vm_end);
+
+		new_flags = (vma->vm_flags & ~__VM_UFFD_FLAGS) | vm_flags;
+		vma = vma_modify_flags_uffd(&vmi, prev, vma, start, vma_end,
+					    new_flags,
+					    (struct vm_userfaultfd_ctx){ctx});
+		if (IS_ERR(vma))
+			return PTR_ERR(vma);
+
+		/*
+		 * In the vma_merge() successful mprotect-like case 8:
+		 * the next vma was merged into the current one and
+		 * the current one has not been updated yet.
+		 */
+		userfaultfd_set_ctx(vma, ctx, vm_flags);
+
+		if (is_vm_hugetlb_page(vma) && uffd_disable_huge_pmd_share(vma))
+			hugetlb_unshare_all_pmds(vma);
+
+skip:
+		prev = vma;
+		start = vma->vm_end;
+	}
+
+	return 0;
+}
+
+void userfaultfd_release_new(struct userfaultfd_ctx *ctx)
+{
+	struct mm_struct *mm = ctx->mm;
+	struct vm_area_struct *vma;
+	VMA_ITERATOR(vmi, mm, 0);
+
+	/* the various vma->vm_userfaultfd_ctx still points to it */
+	mmap_write_lock(mm);
+	for_each_vma(vmi, vma) {
+		if (vma->vm_userfaultfd_ctx.ctx == ctx)
+			userfaultfd_reset_ctx(vma);
+	}
+	mmap_write_unlock(mm);
+}
+
+void userfaultfd_release_all(struct mm_struct *mm,
+			     struct userfaultfd_ctx *ctx)
+{
+	struct vm_area_struct *vma, *prev;
+	VMA_ITERATOR(vmi, mm, 0);
+
+	if (!mmget_not_zero(mm))
+		return;
+
+	/*
+	 * Flush page faults out of all CPUs. NOTE: all page faults
+	 * must be retried without returning VM_FAULT_SIGBUS if
+	 * userfaultfd_ctx_get() succeeds but vma->vma_userfault_ctx
+	 * changes while handle_userfault released the mmap_lock. So
+	 * it's critical that released is set to true (above), before
+	 * taking the mmap_lock for writing.
+	 */
+	mmap_write_lock(mm);
+	prev = NULL;
+	for_each_vma(vmi, vma) {
+		cond_resched();
+		BUG_ON(!!vma->vm_userfaultfd_ctx.ctx ^
+		       !!(vma->vm_flags & __VM_UFFD_FLAGS));
+		if (vma->vm_userfaultfd_ctx.ctx != ctx) {
+			prev = vma;
+			continue;
+		}
+
+		vma = userfaultfd_clear_vma(&vmi, prev, vma,
+					    vma->vm_start, vma->vm_end);
+		prev = vma;
+	}
+	mmap_write_unlock(mm);
+	mmput(mm);
+}

From patchwork Mon Jul 29 11:50:36 2024
X-Patchwork-Submitter: Lorenzo Stoakes
X-Patchwork-Id: 13744777
From: Lorenzo Stoakes
To: Andrew Morton
Cc: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
 linux-mm@kvack.org, "Liam R . Howlett", Vlastimil Babka, Matthew Wilcox,
 Alexander Viro, Christian Brauner, Jan Kara, Eric Biederman, Kees Cook,
 Suren Baghdasaryan, SeongJae Park, Shuah Khan, Brendan Higgins, David Gow,
 Rae Moar
Subject: [PATCH v4 2/7] mm: move vma_modify() and helpers to internal header
Date: Mon, 29 Jul 2024 12:50:36 +0100
Message-ID: <5efde0c6342a8860d5ffc90b415f3989fd8ed0b2.1722251717.git.lorenzo.stoakes@oracle.com>
X-Mailer: git-send-email 2.45.2
X-Mailing-List: linux-fsdevel@vger.kernel.org
iC/O/m0U/UW/1lNQtVKaQN84fLb3dLEK9pW8nfZ/NW9f5//Q4v4YbHiQhhIzDygNpIwDN0xIe3KOcA8Ae1lA4j/7RuWZxmzqLev/Aep5HVIlYO3i6w+8psG6/MPqaWzfOUOPWnzwMgTg0Q9u+tduw0qm0ptz6/K+FptIktdXLq1T2y5mxbwbz9MwlZiRX8rw7DeSxS8M8G2ZxxI5QFcv7xSxre8HIM+hIL0PM6OHUHKg8HyRbtkVwNVriqDTJlhcOB3n8Ts3oT22bFJT6wrDb0idqi42uwPr+WjrZM9OX9bGC1GTXiUlb3q8W8PIW4W8RqBNhD48d/yYyr3Paan0N1EaELC/gZJOkXcq9pFTnJd1GKEVMNzAo5oipNwu8js6Rlpm3RHCbc+TsbmZXq882mNsKNs1Vvb1u34FX4ramkQp5igFM1vGMWbU3NLvDyx8VLUVHoGIbaWhmUEgxRi9RSx7pGivOc20IMTvdg16CYowh8QIfCqxlCnCBmBTXVJakjaGWEX9gnUFdz+m6w05bmx8uQdmwgyV1jsCu2NOtGQV1fbOeKUf+Ex3PCm+ONpMCKODUMPHS3lzZsjvNZjckGIBQaLeXuaNN0NvBxWKdIc= X-OriginatorOrg: oracle.com X-MS-Exchange-CrossTenant-Network-Message-Id: 8f214b83-c4bb-467f-6129-08dcafc4b643 X-MS-Exchange-CrossTenant-AuthSource: SJ0PR10MB5613.namprd10.prod.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Internal X-MS-Exchange-CrossTenant-OriginalArrivalTime: 29 Jul 2024 11:50:58.6862 (UTC) X-MS-Exchange-CrossTenant-FromEntityHeader: Hosted X-MS-Exchange-CrossTenant-Id: 4e2c6054-71cb-48f1-bd6c-3a9705aca71b X-MS-Exchange-CrossTenant-MailboxType: HOSTED X-MS-Exchange-CrossTenant-UserPrincipalName: oPhh/CWH7Q/MMLr0cILYuiC4d0qtb2t55ZEEuCRWUiNPbv0QWKl+A9birYLIJZ7XKdZhY0PKox4sKVM8dR97xcrXIvnbF0HhqyuaYnKQY7g= X-MS-Exchange-Transport-CrossTenantHeadersStamped: SJ0PR10MB4543 X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.293,Aquarius:18.0.1039,Hydra:6.0.680,FMLib:17.12.28.16 definitions=2024-07-29_10,2024-07-26_01,2024-05-17_01 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 mlxscore=0 mlxlogscore=999 suspectscore=0 bulkscore=0 phishscore=0 adultscore=0 malwarescore=0 spamscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2407110000 definitions=main-2407290080 X-Proofpoint-ORIG-GUID: LKyVgv6RPcShMT321Cgj8e3hElCvlGL- X-Proofpoint-GUID: LKyVgv6RPcShMT321Cgj8e3hElCvlGL- These are core VMA manipulation functions which invoke VMA splitting and merging and should not be directly accessed from outside of 
mm/.

Reviewed-by: Vlastimil Babka
Reviewed-by: Liam R. Howlett
Signed-off-by: Lorenzo Stoakes
---
 include/linux/mm.h | 60 ---------------------------------------------
 mm/internal.h      | 61 ++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 61 insertions(+), 60 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index c4b238a20b76..2d519975e9b6 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -3278,66 +3278,6 @@ extern struct vm_area_struct *copy_vma(struct vm_area_struct **,
 	unsigned long addr, unsigned long len, pgoff_t pgoff,
 	bool *need_rmap_locks);
 extern void exit_mmap(struct mm_struct *);
-struct vm_area_struct *vma_modify(struct vma_iterator *vmi,
-				  struct vm_area_struct *prev,
-				  struct vm_area_struct *vma,
-				  unsigned long start, unsigned long end,
-				  unsigned long vm_flags,
-				  struct mempolicy *policy,
-				  struct vm_userfaultfd_ctx uffd_ctx,
-				  struct anon_vma_name *anon_name);
-
-/* We are about to modify the VMA's flags. */
-static inline struct vm_area_struct
-*vma_modify_flags(struct vma_iterator *vmi,
-		  struct vm_area_struct *prev,
-		  struct vm_area_struct *vma,
-		  unsigned long start, unsigned long end,
-		  unsigned long new_flags)
-{
-	return vma_modify(vmi, prev, vma, start, end, new_flags,
-			  vma_policy(vma), vma->vm_userfaultfd_ctx,
-			  anon_vma_name(vma));
-}
-
-/* We are about to modify the VMA's flags and/or anon_name. */
-static inline struct vm_area_struct
-*vma_modify_flags_name(struct vma_iterator *vmi,
-		       struct vm_area_struct *prev,
-		       struct vm_area_struct *vma,
-		       unsigned long start,
-		       unsigned long end,
-		       unsigned long new_flags,
-		       struct anon_vma_name *new_name)
-{
-	return vma_modify(vmi, prev, vma, start, end, new_flags,
-			  vma_policy(vma), vma->vm_userfaultfd_ctx, new_name);
-}
-
-/* We are about to modify the VMA's memory policy. */
-static inline struct vm_area_struct
-*vma_modify_policy(struct vma_iterator *vmi,
-		   struct vm_area_struct *prev,
-		   struct vm_area_struct *vma,
-		   unsigned long start, unsigned long end,
-		   struct mempolicy *new_pol)
-{
-	return vma_modify(vmi, prev, vma, start, end, vma->vm_flags,
-			  new_pol, vma->vm_userfaultfd_ctx, anon_vma_name(vma));
-}
-
-/* We are about to modify the VMA's flags and/or uffd context. */
-static inline struct vm_area_struct
-*vma_modify_flags_uffd(struct vma_iterator *vmi,
-		       struct vm_area_struct *prev,
-		       struct vm_area_struct *vma,
-		       unsigned long start, unsigned long end,
-		       unsigned long new_flags,
-		       struct vm_userfaultfd_ctx new_ctx)
-{
-	return vma_modify(vmi, prev, vma, start, end, new_flags,
-			  vma_policy(vma), new_ctx, anon_vma_name(vma));
-}
 
 static inline int check_data_rlimit(unsigned long rlim,
 				    unsigned long new,
diff --git a/mm/internal.h b/mm/internal.h
index b4d86436565b..81564ce0f9e2 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -1244,6 +1244,67 @@ struct vm_area_struct *vma_merge_extend(struct vma_iterator *vmi,
 					struct vm_area_struct *vma,
 					unsigned long delta);
 
+struct vm_area_struct *vma_modify(struct vma_iterator *vmi,
+				  struct vm_area_struct *prev,
+				  struct vm_area_struct *vma,
+				  unsigned long start, unsigned long end,
+				  unsigned long vm_flags,
+				  struct mempolicy *policy,
+				  struct vm_userfaultfd_ctx uffd_ctx,
+				  struct anon_vma_name *anon_name);
+
+/* We are about to modify the VMA's flags. */
+static inline struct vm_area_struct
+*vma_modify_flags(struct vma_iterator *vmi,
+		  struct vm_area_struct *prev,
+		  struct vm_area_struct *vma,
+		  unsigned long start, unsigned long end,
+		  unsigned long new_flags)
+{
+	return vma_modify(vmi, prev, vma, start, end, new_flags,
+			  vma_policy(vma), vma->vm_userfaultfd_ctx,
+			  anon_vma_name(vma));
+}
+
+/* We are about to modify the VMA's flags and/or anon_name. */
+static inline struct vm_area_struct
+*vma_modify_flags_name(struct vma_iterator *vmi,
+		       struct vm_area_struct *prev,
+		       struct vm_area_struct *vma,
+		       unsigned long start,
+		       unsigned long end,
+		       unsigned long new_flags,
+		       struct anon_vma_name *new_name)
+{
+	return vma_modify(vmi, prev, vma, start, end, new_flags,
+			  vma_policy(vma), vma->vm_userfaultfd_ctx, new_name);
+}
+
+/* We are about to modify the VMA's memory policy. */
+static inline struct vm_area_struct
+*vma_modify_policy(struct vma_iterator *vmi,
+		   struct vm_area_struct *prev,
+		   struct vm_area_struct *vma,
+		   unsigned long start, unsigned long end,
+		   struct mempolicy *new_pol)
+{
+	return vma_modify(vmi, prev, vma, start, end, vma->vm_flags,
+			  new_pol, vma->vm_userfaultfd_ctx, anon_vma_name(vma));
+}
+
+/* We are about to modify the VMA's flags and/or uffd context. */
+static inline struct vm_area_struct
+*vma_modify_flags_uffd(struct vma_iterator *vmi,
+		       struct vm_area_struct *prev,
+		       struct vm_area_struct *vma,
+		       unsigned long start, unsigned long end,
+		       unsigned long new_flags,
+		       struct vm_userfaultfd_ctx new_ctx)
+{
+	return vma_modify(vmi, prev, vma, start, end, new_flags,
+			  vma_policy(vma), new_ctx, anon_vma_name(vma));
+}
+
 enum {
 	/* mark page accessed */
 	FOLL_TOUCH = 1 << 16,

From patchwork Mon Jul 29 11:50:37 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Lorenzo Stoakes
X-Patchwork-Id: 13744776
From: Lorenzo Stoakes
To: Andrew Morton
Cc: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-mm@kvack.org, "Liam R . Howlett", Vlastimil Babka,
	Matthew Wilcox, Alexander Viro, Christian Brauner, Jan Kara,
	Eric Biederman, Kees Cook, Suren Baghdasaryan, SeongJae Park,
	Shuah Khan, Brendan Higgins, David Gow, Rae Moar
Subject: [PATCH v4 3/7] mm: move vma_shrink(), vma_expand() to internal header
Date: Mon, 29 Jul 2024 12:50:37 +0100
Message-ID: <3cfcd9ec433e032a85f636fdc0d7d98fafbd19c5.1722251717.git.lorenzo.stoakes@oracle.com>
X-Mailer: git-send-email 2.45.2
Precedence: bulk
X-Mailing-List: linux-fsdevel@vger.kernel.org

The vma_shrink() and vma_expand() functions are internal VMA manipulation
functions which we ought to abstract for use outside of
memory management code.

To achieve this, we replace shift_arg_pages() in fs/exec.c with an invocation
of a new relocate_vma_down() function implemented in mm/mmap.c, which enables
us to also move move_page_tables() and vma_iter_prev_range() to internal.h.

The purpose of doing this is to isolate key VMA manipulation functions in
order that we can both abstract them and later render them easily testable.

Reviewed-by: Vlastimil Babka
Reviewed-by: Liam R. Howlett
Signed-off-by: Lorenzo Stoakes
---
 fs/exec.c          | 81 ++++------------------------------------------
 include/linux/mm.h | 17 +---------
 mm/internal.h      | 18 +++++++++++
 mm/mmap.c          | 81 ++++++++++++++++++++++++++++++++++++++++++++++
 4 files changed, 106 insertions(+), 91 deletions(-)

diff --git a/fs/exec.c b/fs/exec.c
index a126e3d1cacb..e55efc761947 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -711,80 +711,6 @@ static int copy_strings_kernel(int argc, const char *const *argv,
 
 #ifdef CONFIG_MMU
 
-/*
- * During bprm_mm_init(), we create a temporary stack at STACK_TOP_MAX. Once
- * the binfmt code determines where the new stack should reside, we shift it to
- * its final location. The process proceeds as follows:
- *
- * 1) Use shift to calculate the new vma endpoints.
- * 2) Extend vma to cover both the old and new ranges. This ensures the
- *    arguments passed to subsequent functions are consistent.
- * 3) Move vma's page tables to the new range.
- * 4) Free up any cleared pgd range.
- * 5) Shrink the vma to cover only the new range.
- */
-static int shift_arg_pages(struct vm_area_struct *vma, unsigned long shift)
-{
-	struct mm_struct *mm = vma->vm_mm;
-	unsigned long old_start = vma->vm_start;
-	unsigned long old_end = vma->vm_end;
-	unsigned long length = old_end - old_start;
-	unsigned long new_start = old_start - shift;
-	unsigned long new_end = old_end - shift;
-	VMA_ITERATOR(vmi, mm, new_start);
-	struct vm_area_struct *next;
-	struct mmu_gather tlb;
-
-	BUG_ON(new_start > new_end);
-
-	/*
-	 * ensure there are no vmas between where we want to go
-	 * and where we are
-	 */
-	if (vma != vma_next(&vmi))
-		return -EFAULT;
-
-	vma_iter_prev_range(&vmi);
-	/*
-	 * cover the whole range: [new_start, old_end)
-	 */
-	if (vma_expand(&vmi, vma, new_start, old_end, vma->vm_pgoff, NULL))
-		return -ENOMEM;
-
-	/*
-	 * move the page tables downwards, on failure we rely on
-	 * process cleanup to remove whatever mess we made.
-	 */
-	if (length != move_page_tables(vma, old_start,
-				       vma, new_start, length, false, true))
-		return -ENOMEM;
-
-	lru_add_drain();
-	tlb_gather_mmu(&tlb, mm);
-	next = vma_next(&vmi);
-	if (new_end > old_start) {
-		/*
-		 * when the old and new regions overlap clear from new_end.
-		 */
-		free_pgd_range(&tlb, new_end, old_end, new_end,
-			next ? next->vm_start : USER_PGTABLES_CEILING);
-	} else {
-		/*
-		 * otherwise, clean from old_start; this is done to not touch
-		 * the address space in [new_end, old_start) some architectures
-		 * have constraints on va-space that make this illegal (IA64) -
-		 * for the others its just a little faster.
-		 */
-		free_pgd_range(&tlb, old_start, old_end, new_end,
-			next ? next->vm_start : USER_PGTABLES_CEILING);
-	}
-	tlb_finish_mmu(&tlb);
-
-	vma_prev(&vmi);
-	/* Shrink the vma to just the new range */
-	return vma_shrink(&vmi, vma, new_start, new_end, vma->vm_pgoff);
-}
-
 /*
  * Finalizes the stack vm_area_struct. The flags and permissions are updated,
  * the stack is optionally relocated, and some extra space is added.
@@ -877,7 +803,12 @@ int setup_arg_pages(struct linux_binprm *bprm,
 
 	/* Move stack pages down in memory. */
 	if (stack_shift) {
-		ret = shift_arg_pages(vma, stack_shift);
+		/*
+		 * During bprm_mm_init(), we create a temporary stack at STACK_TOP_MAX. Once
+		 * the binfmt code determines where the new stack should reside, we shift it to
+		 * its final location.
+		 */
+		ret = relocate_vma_down(vma, stack_shift);
 		if (ret)
 			goto out_unlock;
 	}
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 2d519975e9b6..86c9d53657f1 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1005,12 +1005,6 @@ static inline struct vm_area_struct *vma_prev(struct vma_iterator *vmi)
 	return mas_prev(&vmi->mas, 0);
 }
 
-static inline
-struct vm_area_struct *vma_iter_prev_range(struct vma_iterator *vmi)
-{
-	return mas_prev_range(&vmi->mas, 0);
-}
-
 static inline unsigned long vma_iter_addr(struct vma_iterator *vmi)
 {
 	return vmi->mas.index;
@@ -2530,11 +2524,6 @@ int set_page_dirty_lock(struct page *page);
 
 int get_cmdline(struct task_struct *task, char *buffer, int buflen);
 
-extern unsigned long move_page_tables(struct vm_area_struct *vma,
-		unsigned long old_addr, struct vm_area_struct *new_vma,
-		unsigned long new_addr, unsigned long len,
-		bool need_rmap_locks, bool for_stack);
-
 /*
  * Flags used by change_protection(). For now we make it a bitmap so
  * that we can pass in multiple flags just like parameters. However
@@ -3266,11 +3255,6 @@ void anon_vma_interval_tree_verify(struct anon_vma_chain *node);
 
 /* mmap.c */
 extern int __vm_enough_memory(struct mm_struct *mm, long pages, int cap_sys_admin);
-extern int vma_expand(struct vma_iterator *vmi, struct vm_area_struct *vma,
-		      unsigned long start, unsigned long end, pgoff_t pgoff,
-		      struct vm_area_struct *next);
-extern int vma_shrink(struct vma_iterator *vmi, struct vm_area_struct *vma,
-		      unsigned long start, unsigned long end, pgoff_t pgoff);
 extern struct anon_vma *find_mergeable_anon_vma(struct vm_area_struct *);
 extern int insert_vm_struct(struct mm_struct *, struct vm_area_struct *);
 extern void unlink_file_vma(struct vm_area_struct *);
@@ -3278,6 +3262,7 @@ extern struct vm_area_struct *copy_vma(struct vm_area_struct **,
 	unsigned long addr, unsigned long len, pgoff_t pgoff,
 	bool *need_rmap_locks);
 extern void exit_mmap(struct mm_struct *);
+int relocate_vma_down(struct vm_area_struct *vma, unsigned long shift);
 
 static inline int check_data_rlimit(unsigned long rlim,
 				    unsigned long new,
diff --git a/mm/internal.h b/mm/internal.h
index 81564ce0f9e2..a4d0e98ccb97 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -1305,6 +1305,12 @@ static inline struct vm_area_struct
 			  vma_policy(vma), new_ctx, anon_vma_name(vma));
 }
 
+int vma_expand(struct vma_iterator *vmi, struct vm_area_struct *vma,
+	       unsigned long start, unsigned long end, pgoff_t pgoff,
+	       struct vm_area_struct *next);
+int vma_shrink(struct vma_iterator *vmi, struct vm_area_struct *vma,
+	       unsigned long start, unsigned long end, pgoff_t pgoff);
+
 enum {
 	/* mark page accessed */
 	FOLL_TOUCH = 1 << 16,
@@ -1528,6 +1534,12 @@ static inline int vma_iter_store_gfp(struct vma_iterator *vmi,
 	return 0;
 }
 
+static inline
+struct vm_area_struct *vma_iter_prev_range(struct vma_iterator *vmi)
+{
+	return mas_prev_range(&vmi->mas, 0);
+}
+
 /*
  * VMA lock generalization
  */
@@ -1639,4 +1651,10 @@ void unlink_file_vma_batch_init(struct unlink_vma_file_batch *);
 void unlink_file_vma_batch_add(struct unlink_vma_file_batch *, struct vm_area_struct *);
 void unlink_file_vma_batch_final(struct unlink_vma_file_batch *);
 
+/* mremap.c */
+unsigned long move_page_tables(struct vm_area_struct *vma,
+	unsigned long old_addr, struct vm_area_struct *new_vma,
+	unsigned long new_addr, unsigned long len,
+	bool need_rmap_locks, bool for_stack);
+
 #endif /* __MM_INTERNAL_H */
diff --git a/mm/mmap.c b/mm/mmap.c
index d0dfc85b209b..211148ba2831 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -4088,3 +4088,84 @@ static int __meminit init_reserve_notifier(void)
 	return 0;
 }
 subsys_initcall(init_reserve_notifier);
+
+/*
+ * Relocate a VMA downwards by shift bytes. There cannot be any VMAs between
+ * this VMA and its relocated range, which will now reside at [vma->vm_start -
+ * shift, vma->vm_end - shift).
+ *
+ * This function is almost certainly NOT what you want for anything other than
+ * early executable temporary stack relocation.
+ */
+int relocate_vma_down(struct vm_area_struct *vma, unsigned long shift)
+{
+	/*
+	 * The process proceeds as follows:
+	 *
+	 * 1) Use shift to calculate the new vma endpoints.
+	 * 2) Extend vma to cover both the old and new ranges. This ensures the
+	 *    arguments passed to subsequent functions are consistent.
+	 * 3) Move vma's page tables to the new range.
+	 * 4) Free up any cleared pgd range.
+	 * 5) Shrink the vma to cover only the new range.
+ */ + + struct mm_struct *mm = vma->vm_mm; + unsigned long old_start = vma->vm_start; + unsigned long old_end = vma->vm_end; + unsigned long length = old_end - old_start; + unsigned long new_start = old_start - shift; + unsigned long new_end = old_end - shift; + VMA_ITERATOR(vmi, mm, new_start); + struct vm_area_struct *next; + struct mmu_gather tlb; + + BUG_ON(new_start > new_end); + + /* + * ensure there are no vmas between where we want to go + * and where we are + */ + if (vma != vma_next(&vmi)) + return -EFAULT; + + vma_iter_prev_range(&vmi); + /* + * cover the whole range: [new_start, old_end) + */ + if (vma_expand(&vmi, vma, new_start, old_end, vma->vm_pgoff, NULL)) + return -ENOMEM; + + /* + * move the page tables downwards, on failure we rely on + * process cleanup to remove whatever mess we made. + */ + if (length != move_page_tables(vma, old_start, + vma, new_start, length, false, true)) + return -ENOMEM; + + lru_add_drain(); + tlb_gather_mmu(&tlb, mm); + next = vma_next(&vmi); + if (new_end > old_start) { + /* + * when the old and new regions overlap clear from new_end. + */ + free_pgd_range(&tlb, new_end, old_end, new_end, + next ? next->vm_start : USER_PGTABLES_CEILING); + } else { + /* + * otherwise, clean from old_start; this is done to not touch + * the address space in [new_end, old_start) some architectures + * have constraints on va-space that make this illegal (IA64) - + * for the others its just a little faster. + */ + free_pgd_range(&tlb, old_start, old_end, new_end, + next ? 
next->vm_start : USER_PGTABLES_CEILING);
+	}
+	tlb_finish_mmu(&tlb);
+
+	vma_prev(&vmi);
+	/* Shrink the vma to just the new range */
+	return vma_shrink(&vmi, vma, new_start, new_end, vma->vm_pgoff);
+}

From patchwork Mon Jul 29 11:50:38 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Lorenzo Stoakes
X-Patchwork-Id: 13744781
From: Lorenzo Stoakes
To: Andrew Morton
Cc: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, "Liam R. Howlett", Vlastimil Babka, Matthew Wilcox, Alexander Viro, Christian Brauner, Jan Kara, Eric Biederman, Kees Cook, Suren Baghdasaryan, SeongJae Park, Shuah Khan, Brendan Higgins, David Gow, Rae Moar
Subject: [PATCH v4 4/7] mm: move internal core VMA manipulation functions to own file
Date: Mon, 29 Jul 2024 12:50:38 +0100
X-Mailer: git-send-email 2.45.2
Precedence: bulk
X-Mailing-List: linux-fsdevel@vger.kernel.org

This patch introduces vma.c and moves internal core VMA manipulation functions to this file from mmap.c.

This allows us to isolate VMA functionality in a single place such that we can create userspace testing code that invokes this functionality in an environment where we can implement simple unit tests of core functionality.

This patch ensures that core VMA functionality is explicitly marked as such by its presence in mm/vma.h.

It also places the header includes required by vma.c in vma_internal.h, which is simply imported by vma.c. This makes the VMA functionality testable, as userland testing code can simply stub out functionality as required.

Reviewed-by: Vlastimil Babka
Reviewed-by: Liam R. Howlett
Signed-off-by: Lorenzo Stoakes
---
 include/linux/mm.h |   35 -
 mm/Makefile        |    2 +-
 mm/internal.h      |  236 +-----
 mm/mmap.c          | 1980 +++-----------------------------------------
 mm/mmu_notifier.c  |    2 +
 mm/vma.c           | 1766 +++++++++++++++++++++++++++++++++++++++
 mm/vma.h           |  364 ++++++++
 mm/vma_internal.h  |   50 ++
 8 files changed, 2292 insertions(+), 2143 deletions(-)
 create mode 100644 mm/vma.c
 create mode 100644 mm/vma.h
 create mode 100644 mm/vma_internal.h

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 86c9d53657f1..4e6701f48b0c 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1005,21 +1005,6 @@ static inline struct vm_area_struct *vma_prev(struct vma_iterator *vmi)
 	return mas_prev(&vmi->mas, 0);
 }
 
-static inline unsigned long vma_iter_addr(struct vma_iterator *vmi)
-{
-	return vmi->mas.index;
-}
-
-static inline unsigned long vma_iter_end(struct vma_iterator *vmi)
-{
-	return vmi->mas.last + 1;
-}
-static inline int vma_iter_bulk_alloc(struct vma_iterator *vmi,
-				      unsigned long count)
-{
-	return mas_expected_entries(&vmi->mas, count);
-}
-
 static inline int vma_iter_clear_gfp(struct vma_iterator *vmi,
 			unsigned long start, unsigned long end, gfp_t gfp)
 {
@@ -2544,21 +2529,6 @@ int get_cmdline(struct task_struct *task, char *buffer, int buflen);
 #define MM_CP_UFFD_WP_ALL (MM_CP_UFFD_WP | \
 					    MM_CP_UFFD_WP_RESOLVE)
 
-bool vma_needs_dirty_tracking(struct
vm_area_struct *vma); -bool vma_wants_writenotify(struct vm_area_struct *vma, pgprot_t vm_page_prot); -static inline bool vma_wants_manual_pte_write_upgrade(struct vm_area_struct *vma) -{ - /* - * We want to check manually if we can change individual PTEs writable - * if we can't do that automatically for all PTEs in a mapping. For - * private mappings, that's always the case when we have write - * permissions as we properly have to handle COW. - */ - if (vma->vm_flags & VM_SHARED) - return vma_wants_writenotify(vma, vma->vm_page_prot); - return !!(vma->vm_flags & VM_WRITE); - -} bool can_change_pte_writable(struct vm_area_struct *vma, unsigned long addr, pte_t pte); extern long change_protection(struct mmu_gather *tlb, @@ -3255,12 +3225,7 @@ void anon_vma_interval_tree_verify(struct anon_vma_chain *node); /* mmap.c */ extern int __vm_enough_memory(struct mm_struct *mm, long pages, int cap_sys_admin); -extern struct anon_vma *find_mergeable_anon_vma(struct vm_area_struct *); extern int insert_vm_struct(struct mm_struct *, struct vm_area_struct *); -extern void unlink_file_vma(struct vm_area_struct *); -extern struct vm_area_struct *copy_vma(struct vm_area_struct **, - unsigned long addr, unsigned long len, pgoff_t pgoff, - bool *need_rmap_locks); extern void exit_mmap(struct mm_struct *); int relocate_vma_down(struct vm_area_struct *vma, unsigned long shift); diff --git a/mm/Makefile b/mm/Makefile index d2915f8c9dc0..140a22654dde 100644 --- a/mm/Makefile +++ b/mm/Makefile @@ -37,7 +37,7 @@ mmu-y := nommu.o mmu-$(CONFIG_MMU) := highmem.o memory.o mincore.o \ mlock.o mmap.o mmu_gather.o mprotect.o mremap.o \ msync.o page_vma_mapped.o pagewalk.o \ - pgtable-generic.o rmap.o vmalloc.o + pgtable-generic.o rmap.o vmalloc.o vma.o ifdef CONFIG_CROSS_MEMORY_ATTACH diff --git a/mm/internal.h b/mm/internal.h index a4d0e98ccb97..1159b04e76a3 100644 --- a/mm/internal.h +++ b/mm/internal.h @@ -8,13 +8,18 @@ #define __MM_INTERNAL_H #include +#include #include +#include #include 
#include #include #include #include +/* Internal core VMA manipulation functions. */ +#include "vma.h" + struct folio_batch; /* @@ -778,37 +783,6 @@ static inline bool free_area_empty(struct free_area *area, int migratetype) return list_empty(&area->free_list[migratetype]); } -/* - * These three helpers classifies VMAs for virtual memory accounting. - */ - -/* - * Executable code area - executable, not writable, not stack - */ -static inline bool is_exec_mapping(vm_flags_t flags) -{ - return (flags & (VM_EXEC | VM_WRITE | VM_STACK)) == VM_EXEC; -} - -/* - * Stack area (including shadow stacks) - * - * VM_GROWSUP / VM_GROWSDOWN VMAs are always private anonymous: - * do_mmap() forbids all other combinations. - */ -static inline bool is_stack_mapping(vm_flags_t flags) -{ - return ((flags & VM_STACK) == VM_STACK) || (flags & VM_SHADOW_STACK); -} - -/* - * Data area - private, writable, not stack - */ -static inline bool is_data_mapping(vm_flags_t flags) -{ - return (flags & (VM_WRITE | VM_SHARED | VM_STACK)) == VM_WRITE; -} - /* mm/util.c */ struct anon_vma *folio_anon_vma(struct folio *folio); @@ -1237,80 +1211,6 @@ void touch_pud(struct vm_area_struct *vma, unsigned long addr, void touch_pmd(struct vm_area_struct *vma, unsigned long addr, pmd_t *pmd, bool write); -/* - * mm/mmap.c - */ -struct vm_area_struct *vma_merge_extend(struct vma_iterator *vmi, - struct vm_area_struct *vma, - unsigned long delta); - -struct vm_area_struct *vma_modify(struct vma_iterator *vmi, - struct vm_area_struct *prev, - struct vm_area_struct *vma, - unsigned long start, unsigned long end, - unsigned long vm_flags, - struct mempolicy *policy, - struct vm_userfaultfd_ctx uffd_ctx, - struct anon_vma_name *anon_name); - -/* We are about to modify the VMA's flags. 
*/ -static inline struct vm_area_struct -*vma_modify_flags(struct vma_iterator *vmi, - struct vm_area_struct *prev, - struct vm_area_struct *vma, - unsigned long start, unsigned long end, - unsigned long new_flags) -{ - return vma_modify(vmi, prev, vma, start, end, new_flags, - vma_policy(vma), vma->vm_userfaultfd_ctx, - anon_vma_name(vma)); -} - -/* We are about to modify the VMA's flags and/or anon_name. */ -static inline struct vm_area_struct -*vma_modify_flags_name(struct vma_iterator *vmi, - struct vm_area_struct *prev, - struct vm_area_struct *vma, - unsigned long start, - unsigned long end, - unsigned long new_flags, - struct anon_vma_name *new_name) -{ - return vma_modify(vmi, prev, vma, start, end, new_flags, - vma_policy(vma), vma->vm_userfaultfd_ctx, new_name); -} - -/* We are about to modify the VMA's memory policy. */ -static inline struct vm_area_struct -*vma_modify_policy(struct vma_iterator *vmi, - struct vm_area_struct *prev, - struct vm_area_struct *vma, - unsigned long start, unsigned long end, - struct mempolicy *new_pol) -{ - return vma_modify(vmi, prev, vma, start, end, vma->vm_flags, - new_pol, vma->vm_userfaultfd_ctx, anon_vma_name(vma)); -} - -/* We are about to modify the VMA's flags and/or uffd context. 
*/ -static inline struct vm_area_struct -*vma_modify_flags_uffd(struct vma_iterator *vmi, - struct vm_area_struct *prev, - struct vm_area_struct *vma, - unsigned long start, unsigned long end, - unsigned long new_flags, - struct vm_userfaultfd_ctx new_ctx) -{ - return vma_modify(vmi, prev, vma, start, end, new_flags, - vma_policy(vma), new_ctx, anon_vma_name(vma)); -} - -int vma_expand(struct vma_iterator *vmi, struct vm_area_struct *vma, - unsigned long start, unsigned long end, pgoff_t pgoff, - struct vm_area_struct *next); -int vma_shrink(struct vma_iterator *vmi, struct vm_area_struct *vma, - unsigned long start, unsigned long end, pgoff_t pgoff); - enum { /* mark page accessed */ FOLL_TOUCH = 1 << 16, @@ -1437,123 +1337,6 @@ static inline bool pte_needs_soft_dirty_wp(struct vm_area_struct *vma, pte_t pte return vma_soft_dirty_enabled(vma) && !pte_soft_dirty(pte); } -static inline void vma_iter_config(struct vma_iterator *vmi, - unsigned long index, unsigned long last) -{ - __mas_set_range(&vmi->mas, index, last - 1); -} - -static inline void vma_iter_reset(struct vma_iterator *vmi) -{ - mas_reset(&vmi->mas); -} - -static inline -struct vm_area_struct *vma_iter_prev_range_limit(struct vma_iterator *vmi, unsigned long min) -{ - return mas_prev_range(&vmi->mas, min); -} - -static inline -struct vm_area_struct *vma_iter_next_range_limit(struct vma_iterator *vmi, unsigned long max) -{ - return mas_next_range(&vmi->mas, max); -} - -static inline int vma_iter_area_lowest(struct vma_iterator *vmi, unsigned long min, - unsigned long max, unsigned long size) -{ - return mas_empty_area(&vmi->mas, min, max - 1, size); -} - -static inline int vma_iter_area_highest(struct vma_iterator *vmi, unsigned long min, - unsigned long max, unsigned long size) -{ - return mas_empty_area_rev(&vmi->mas, min, max - 1, size); -} - -/* - * VMA Iterator functions shared between nommu and mmap - */ -static inline int vma_iter_prealloc(struct vma_iterator *vmi, - struct vm_area_struct *vma) 
-{ - return mas_preallocate(&vmi->mas, vma, GFP_KERNEL); -} - -static inline void vma_iter_clear(struct vma_iterator *vmi) -{ - mas_store_prealloc(&vmi->mas, NULL); -} - -static inline struct vm_area_struct *vma_iter_load(struct vma_iterator *vmi) -{ - return mas_walk(&vmi->mas); -} - -/* Store a VMA with preallocated memory */ -static inline void vma_iter_store(struct vma_iterator *vmi, - struct vm_area_struct *vma) -{ - -#if defined(CONFIG_DEBUG_VM_MAPLE_TREE) - if (MAS_WARN_ON(&vmi->mas, vmi->mas.status != ma_start && - vmi->mas.index > vma->vm_start)) { - pr_warn("%lx > %lx\n store vma %lx-%lx\n into slot %lx-%lx\n", - vmi->mas.index, vma->vm_start, vma->vm_start, - vma->vm_end, vmi->mas.index, vmi->mas.last); - } - if (MAS_WARN_ON(&vmi->mas, vmi->mas.status != ma_start && - vmi->mas.last < vma->vm_start)) { - pr_warn("%lx < %lx\nstore vma %lx-%lx\ninto slot %lx-%lx\n", - vmi->mas.last, vma->vm_start, vma->vm_start, vma->vm_end, - vmi->mas.index, vmi->mas.last); - } -#endif - - if (vmi->mas.status != ma_start && - ((vmi->mas.index > vma->vm_start) || (vmi->mas.last < vma->vm_start))) - vma_iter_invalidate(vmi); - - __mas_set_range(&vmi->mas, vma->vm_start, vma->vm_end - 1); - mas_store_prealloc(&vmi->mas, vma); -} - -static inline int vma_iter_store_gfp(struct vma_iterator *vmi, - struct vm_area_struct *vma, gfp_t gfp) -{ - if (vmi->mas.status != ma_start && - ((vmi->mas.index > vma->vm_start) || (vmi->mas.last < vma->vm_start))) - vma_iter_invalidate(vmi); - - __mas_set_range(&vmi->mas, vma->vm_start, vma->vm_end - 1); - mas_store_gfp(&vmi->mas, vma, gfp); - if (unlikely(mas_is_err(&vmi->mas))) - return -ENOMEM; - - return 0; -} - -static inline -struct vm_area_struct *vma_iter_prev_range(struct vma_iterator *vmi) -{ - return mas_prev_range(&vmi->mas, 0); -} - -/* - * VMA lock generalization - */ -struct vma_prepare { - struct vm_area_struct *vma; - struct vm_area_struct *adj_next; - struct file *file; - struct address_space *mapping; - struct anon_vma 
*anon_vma; - struct vm_area_struct *insert; - struct vm_area_struct *remove; - struct vm_area_struct *remove2; -}; - void __meminit __init_single_page(struct page *page, unsigned long pfn, unsigned long zone, int nid); @@ -1642,15 +1425,6 @@ static inline void shrinker_debugfs_remove(struct dentry *debugfs_entry, void workingset_update_node(struct xa_node *node); extern struct list_lru shadow_nodes; -struct unlink_vma_file_batch { - int count; - struct vm_area_struct *vmas[8]; -}; - -void unlink_file_vma_batch_init(struct unlink_vma_file_batch *); -void unlink_file_vma_batch_add(struct unlink_vma_file_batch *, struct vm_area_struct *); -void unlink_file_vma_batch_final(struct unlink_vma_file_batch *); - /* mremap.c */ unsigned long move_page_tables(struct vm_area_struct *vma, unsigned long old_addr, struct vm_area_struct *new_vma, diff --git a/mm/mmap.c b/mm/mmap.c index 211148ba2831..4a9c2329b09a 100644 --- a/mm/mmap.c +++ b/mm/mmap.c @@ -76,16 +76,6 @@ int mmap_rnd_compat_bits __read_mostly = CONFIG_ARCH_MMAP_RND_COMPAT_BITS; static bool ignore_rlimit_data; core_param(ignore_rlimit_data, ignore_rlimit_data, bool, 0644); -static void unmap_region(struct mm_struct *mm, struct ma_state *mas, - struct vm_area_struct *vma, struct vm_area_struct *prev, - struct vm_area_struct *next, unsigned long start, - unsigned long end, unsigned long tree_end, bool mm_wr_locked); - -static pgprot_t vm_pgprot_modify(pgprot_t oldprot, unsigned long vm_flags) -{ - return pgprot_modify(oldprot, vm_get_page_prot(vm_flags)); -} - /* Update vma->vm_page_prot to reflect vma->vm_flags. 
*/ void vma_set_page_prot(struct vm_area_struct *vma) { @@ -101,100 +91,6 @@ void vma_set_page_prot(struct vm_area_struct *vma) WRITE_ONCE(vma->vm_page_prot, vm_page_prot); } -/* - * Requires inode->i_mapping->i_mmap_rwsem - */ -static void __remove_shared_vm_struct(struct vm_area_struct *vma, - struct address_space *mapping) -{ - if (vma_is_shared_maywrite(vma)) - mapping_unmap_writable(mapping); - - flush_dcache_mmap_lock(mapping); - vma_interval_tree_remove(vma, &mapping->i_mmap); - flush_dcache_mmap_unlock(mapping); -} - -/* - * Unlink a file-based vm structure from its interval tree, to hide - * vma from rmap and vmtruncate before freeing its page tables. - */ -void unlink_file_vma(struct vm_area_struct *vma) -{ - struct file *file = vma->vm_file; - - if (file) { - struct address_space *mapping = file->f_mapping; - i_mmap_lock_write(mapping); - __remove_shared_vm_struct(vma, mapping); - i_mmap_unlock_write(mapping); - } -} - -void unlink_file_vma_batch_init(struct unlink_vma_file_batch *vb) -{ - vb->count = 0; -} - -static void unlink_file_vma_batch_process(struct unlink_vma_file_batch *vb) -{ - struct address_space *mapping; - int i; - - mapping = vb->vmas[0]->vm_file->f_mapping; - i_mmap_lock_write(mapping); - for (i = 0; i < vb->count; i++) { - VM_WARN_ON_ONCE(vb->vmas[i]->vm_file->f_mapping != mapping); - __remove_shared_vm_struct(vb->vmas[i], mapping); - } - i_mmap_unlock_write(mapping); - - unlink_file_vma_batch_init(vb); -} - -void unlink_file_vma_batch_add(struct unlink_vma_file_batch *vb, - struct vm_area_struct *vma) -{ - if (vma->vm_file == NULL) - return; - - if ((vb->count > 0 && vb->vmas[0]->vm_file != vma->vm_file) || - vb->count == ARRAY_SIZE(vb->vmas)) - unlink_file_vma_batch_process(vb); - - vb->vmas[vb->count] = vma; - vb->count++; -} - -void unlink_file_vma_batch_final(struct unlink_vma_file_batch *vb) -{ - if (vb->count > 0) - unlink_file_vma_batch_process(vb); -} - -/* - * Close a vm structure and free it. 
- */ -static void remove_vma(struct vm_area_struct *vma, bool unreachable) -{ - might_sleep(); - if (vma->vm_ops && vma->vm_ops->close) - vma->vm_ops->close(vma); - if (vma->vm_file) - fput(vma->vm_file); - mpol_put(vma_policy(vma)); - if (unreachable) - __vm_area_free(vma); - else - vm_area_free(vma); -} - -static inline struct vm_area_struct *vma_prev_limit(struct vma_iterator *vmi, - unsigned long min) -{ - return mas_prev(&vmi->mas, min); -} - /* * check_brk_limits() - Use platform specific check of range & verify mlock * limits. @@ -300,891 +196,22 @@ SYSCALL_DEFINE1(brk, unsigned long, brk) if (do_brk_flags(&vmi, brkvma, oldbrk, newbrk - oldbrk, 0) < 0) goto out; - mm->brk = brk; - if (mm->def_flags & VM_LOCKED) - populate = true; - -success: - mmap_write_unlock(mm); -success_unlocked: - userfaultfd_unmap_complete(mm, &uf); - if (populate) - mm_populate(oldbrk, newbrk - oldbrk); - return brk; - -out: - mm->brk = origbrk; - mmap_write_unlock(mm); - return origbrk; -} - -#if defined(CONFIG_DEBUG_VM_MAPLE_TREE) -static void validate_mm(struct mm_struct *mm) -{ - int bug = 0; - int i = 0; - struct vm_area_struct *vma; - VMA_ITERATOR(vmi, mm, 0); - - mt_validate(&mm->mm_mt); - for_each_vma(vmi, vma) { -#ifdef CONFIG_DEBUG_VM_RB - struct anon_vma *anon_vma = vma->anon_vma; - struct anon_vma_chain *avc; -#endif - unsigned long vmi_start, vmi_end; - bool warn = 0; - - vmi_start = vma_iter_addr(&vmi); - vmi_end = vma_iter_end(&vmi); - if (VM_WARN_ON_ONCE_MM(vma->vm_end != vmi_end, mm)) - warn = 1; - - if (VM_WARN_ON_ONCE_MM(vma->vm_start != vmi_start, mm)) - warn = 1; - - if (warn) { - pr_emerg("issue in %s\n", current->comm); - dump_stack(); - dump_vma(vma); - pr_emerg("tree range: %px start %lx end %lx\n", vma, - vmi_start, vmi_end - 1); - vma_iter_dump_tree(&vmi); - } - -#ifdef CONFIG_DEBUG_VM_RB - if (anon_vma) { - anon_vma_lock_read(anon_vma); - list_for_each_entry(avc, &vma->anon_vma_chain, same_vma) - anon_vma_interval_tree_verify(avc); - 
anon_vma_unlock_read(anon_vma); - } -#endif - i++; - } - if (i != mm->map_count) { - pr_emerg("map_count %d vma iterator %d\n", mm->map_count, i); - bug = 1; - } - VM_BUG_ON_MM(bug, mm); -} - -#else /* !CONFIG_DEBUG_VM_MAPLE_TREE */ -#define validate_mm(mm) do { } while (0) -#endif /* CONFIG_DEBUG_VM_MAPLE_TREE */ - -/* - * vma has some anon_vma assigned, and is already inserted on that - * anon_vma's interval trees. - * - * Before updating the vma's vm_start / vm_end / vm_pgoff fields, the - * vma must be removed from the anon_vma's interval trees using - * anon_vma_interval_tree_pre_update_vma(). - * - * After the update, the vma will be reinserted using - * anon_vma_interval_tree_post_update_vma(). - * - * The entire update must be protected by exclusive mmap_lock and by - * the root anon_vma's mutex. - */ -static inline void -anon_vma_interval_tree_pre_update_vma(struct vm_area_struct *vma) -{ - struct anon_vma_chain *avc; - - list_for_each_entry(avc, &vma->anon_vma_chain, same_vma) - anon_vma_interval_tree_remove(avc, &avc->anon_vma->rb_root); -} - -static inline void -anon_vma_interval_tree_post_update_vma(struct vm_area_struct *vma) -{ - struct anon_vma_chain *avc; - - list_for_each_entry(avc, &vma->anon_vma_chain, same_vma) - anon_vma_interval_tree_insert(avc, &avc->anon_vma->rb_root); -} - -static unsigned long count_vma_pages_range(struct mm_struct *mm, - unsigned long addr, unsigned long end) -{ - VMA_ITERATOR(vmi, mm, addr); - struct vm_area_struct *vma; - unsigned long nr_pages = 0; - - for_each_vma_range(vmi, vma, end) { - unsigned long vm_start = max(addr, vma->vm_start); - unsigned long vm_end = min(end, vma->vm_end); - - nr_pages += PHYS_PFN(vm_end - vm_start); - } - - return nr_pages; -} - -static void __vma_link_file(struct vm_area_struct *vma, - struct address_space *mapping) -{ - if (vma_is_shared_maywrite(vma)) - mapping_allow_writable(mapping); - - flush_dcache_mmap_lock(mapping); - vma_interval_tree_insert(vma, &mapping->i_mmap); - 
flush_dcache_mmap_unlock(mapping); -} - -static void vma_link_file(struct vm_area_struct *vma) -{ - struct file *file = vma->vm_file; - struct address_space *mapping; - - if (file) { - mapping = file->f_mapping; - i_mmap_lock_write(mapping); - __vma_link_file(vma, mapping); - i_mmap_unlock_write(mapping); - } -} - -static int vma_link(struct mm_struct *mm, struct vm_area_struct *vma) -{ - VMA_ITERATOR(vmi, mm, 0); - - vma_iter_config(&vmi, vma->vm_start, vma->vm_end); - if (vma_iter_prealloc(&vmi, vma)) - return -ENOMEM; - - vma_start_write(vma); - vma_iter_store(&vmi, vma); - vma_link_file(vma); - mm->map_count++; - validate_mm(mm); - return 0; -} - -/* - * init_multi_vma_prep() - Initializer for struct vma_prepare - * @vp: The vma_prepare struct - * @vma: The vma that will be altered once locked - * @next: The next vma if it is to be adjusted - * @remove: The first vma to be removed - * @remove2: The second vma to be removed - */ -static inline void init_multi_vma_prep(struct vma_prepare *vp, - struct vm_area_struct *vma, struct vm_area_struct *next, - struct vm_area_struct *remove, struct vm_area_struct *remove2) -{ - memset(vp, 0, sizeof(struct vma_prepare)); - vp->vma = vma; - vp->anon_vma = vma->anon_vma; - vp->remove = remove; - vp->remove2 = remove2; - vp->adj_next = next; - if (!vp->anon_vma && next) - vp->anon_vma = next->anon_vma; - - vp->file = vma->vm_file; - if (vp->file) - vp->mapping = vma->vm_file->f_mapping; - -} - -/* - * init_vma_prep() - Initializer wrapper for vma_prepare struct - * @vp: The vma_prepare struct - * @vma: The vma that will be altered once locked - */ -static inline void init_vma_prep(struct vma_prepare *vp, - struct vm_area_struct *vma) -{ - init_multi_vma_prep(vp, vma, NULL, NULL, NULL); -} - - -/* - * vma_prepare() - Helper function for handling locking VMAs prior to altering - * @vp: The initialized vma_prepare struct - */ -static inline void vma_prepare(struct vma_prepare *vp) -{ - if (vp->file) { - uprobe_munmap(vp->vma, 
vp->vma->vm_start, vp->vma->vm_end); - - if (vp->adj_next) - uprobe_munmap(vp->adj_next, vp->adj_next->vm_start, - vp->adj_next->vm_end); - - i_mmap_lock_write(vp->mapping); - if (vp->insert && vp->insert->vm_file) { - /* - * Put into interval tree now, so instantiated pages - * are visible to arm/parisc __flush_dcache_page - * throughout; but we cannot insert into address - * space until vma start or end is updated. - */ - __vma_link_file(vp->insert, - vp->insert->vm_file->f_mapping); - } - } - - if (vp->anon_vma) { - anon_vma_lock_write(vp->anon_vma); - anon_vma_interval_tree_pre_update_vma(vp->vma); - if (vp->adj_next) - anon_vma_interval_tree_pre_update_vma(vp->adj_next); - } - - if (vp->file) { - flush_dcache_mmap_lock(vp->mapping); - vma_interval_tree_remove(vp->vma, &vp->mapping->i_mmap); - if (vp->adj_next) - vma_interval_tree_remove(vp->adj_next, - &vp->mapping->i_mmap); - } - -} - -/* - * vma_complete- Helper function for handling the unlocking after altering VMAs, - * or for inserting a VMA. - * - * @vp: The vma_prepare struct - * @vmi: The vma iterator - * @mm: The mm_struct - */ -static inline void vma_complete(struct vma_prepare *vp, - struct vma_iterator *vmi, struct mm_struct *mm) -{ - if (vp->file) { - if (vp->adj_next) - vma_interval_tree_insert(vp->adj_next, - &vp->mapping->i_mmap); - vma_interval_tree_insert(vp->vma, &vp->mapping->i_mmap); - flush_dcache_mmap_unlock(vp->mapping); - } - - if (vp->remove && vp->file) { - __remove_shared_vm_struct(vp->remove, vp->mapping); - if (vp->remove2) - __remove_shared_vm_struct(vp->remove2, vp->mapping); - } else if (vp->insert) { - /* - * split_vma has split insert from vma, and needs - * us to insert it before dropping the locks - * (it may either follow vma or precede it). 
- */ - vma_iter_store(vmi, vp->insert); - mm->map_count++; - } - - if (vp->anon_vma) { - anon_vma_interval_tree_post_update_vma(vp->vma); - if (vp->adj_next) - anon_vma_interval_tree_post_update_vma(vp->adj_next); - anon_vma_unlock_write(vp->anon_vma); - } - - if (vp->file) { - i_mmap_unlock_write(vp->mapping); - uprobe_mmap(vp->vma); - - if (vp->adj_next) - uprobe_mmap(vp->adj_next); - } - - if (vp->remove) { -again: - vma_mark_detached(vp->remove, true); - if (vp->file) { - uprobe_munmap(vp->remove, vp->remove->vm_start, - vp->remove->vm_end); - fput(vp->file); - } - if (vp->remove->anon_vma) - anon_vma_merge(vp->vma, vp->remove); - mm->map_count--; - mpol_put(vma_policy(vp->remove)); - if (!vp->remove2) - WARN_ON_ONCE(vp->vma->vm_end < vp->remove->vm_end); - vm_area_free(vp->remove); - - /* - * In mprotect's case 6 (see comments on vma_merge), - * we are removing both mid and next vmas - */ - if (vp->remove2) { - vp->remove = vp->remove2; - vp->remove2 = NULL; - goto again; - } - } - if (vp->insert && vp->file) - uprobe_mmap(vp->insert); - validate_mm(mm); -} - -/* - * dup_anon_vma() - Helper function to duplicate anon_vma - * @dst: The destination VMA - * @src: The source VMA - * @dup: Pointer to the destination VMA when successful. - * - * Returns: 0 on success. - */ -static inline int dup_anon_vma(struct vm_area_struct *dst, - struct vm_area_struct *src, struct vm_area_struct **dup) -{ - /* - * Easily overlooked: when mprotect shifts the boundary, make sure the - * expanding vma has anon_vma set if the shrinking vma had, to cover any - * anon pages imported. 
- */ - if (src->anon_vma && !dst->anon_vma) { - int ret; - - vma_assert_write_locked(dst); - dst->anon_vma = src->anon_vma; - ret = anon_vma_clone(dst, src); - if (ret) - return ret; - - *dup = dst; - } - - return 0; -} - -/* - * vma_expand - Expand an existing VMA - * - * @vmi: The vma iterator - * @vma: The vma to expand - * @start: The start of the vma - * @end: The exclusive end of the vma - * @pgoff: The page offset of vma - * @next: The current of next vma. - * - * Expand @vma to @start and @end. Can expand off the start and end. Will - * expand over @next if it's different from @vma and @end == @next->vm_end. - * Checking if the @vma can expand and merge with @next needs to be handled by - * the caller. - * - * Returns: 0 on success - */ -int vma_expand(struct vma_iterator *vmi, struct vm_area_struct *vma, - unsigned long start, unsigned long end, pgoff_t pgoff, - struct vm_area_struct *next) -{ - struct vm_area_struct *anon_dup = NULL; - bool remove_next = false; - struct vma_prepare vp; - - vma_start_write(vma); - if (next && (vma != next) && (end == next->vm_end)) { - int ret; - - remove_next = true; - vma_start_write(next); - ret = dup_anon_vma(vma, next, &anon_dup); - if (ret) - return ret; - } - - init_multi_vma_prep(&vp, vma, NULL, remove_next ? next : NULL, NULL); - /* Not merging but overwriting any part of next is not handled. 
*/ - VM_WARN_ON(next && !vp.remove && - next != vma && end > next->vm_start); - /* Only handles expanding */ - VM_WARN_ON(vma->vm_start < start || vma->vm_end > end); - - /* Note: vma iterator must be pointing to 'start' */ - vma_iter_config(vmi, start, end); - if (vma_iter_prealloc(vmi, vma)) - goto nomem; - - vma_prepare(&vp); - vma_adjust_trans_huge(vma, start, end, 0); - vma_set_range(vma, start, end, pgoff); - vma_iter_store(vmi, vma); - - vma_complete(&vp, vmi, vma->vm_mm); - return 0; - -nomem: - if (anon_dup) - unlink_anon_vmas(anon_dup); - return -ENOMEM; -} - -/* - * vma_shrink() - Reduce an existing VMAs memory area - * @vmi: The vma iterator - * @vma: The VMA to modify - * @start: The new start - * @end: The new end - * - * Returns: 0 on success, -ENOMEM otherwise - */ -int vma_shrink(struct vma_iterator *vmi, struct vm_area_struct *vma, - unsigned long start, unsigned long end, pgoff_t pgoff) -{ - struct vma_prepare vp; - - WARN_ON((vma->vm_start != start) && (vma->vm_end != end)); - - if (vma->vm_start < start) - vma_iter_config(vmi, vma->vm_start, start); - else - vma_iter_config(vmi, end, vma->vm_end); - - if (vma_iter_prealloc(vmi, NULL)) - return -ENOMEM; - - vma_start_write(vma); - - init_vma_prep(&vp, vma); - vma_prepare(&vp); - vma_adjust_trans_huge(vma, start, end, 0); - - vma_iter_clear(vmi); - vma_set_range(vma, start, end, pgoff); - vma_complete(&vp, vmi, vma->vm_mm); - return 0; -} - -/* - * If the vma has a ->close operation then the driver probably needs to release - * per-vma resources, so we don't attempt to merge those if the caller indicates - * the current vma may be removed as part of the merge. 
- */ -static inline bool is_mergeable_vma(struct vm_area_struct *vma, - struct file *file, unsigned long vm_flags, - struct vm_userfaultfd_ctx vm_userfaultfd_ctx, - struct anon_vma_name *anon_name, bool may_remove_vma) -{ - /* - * VM_SOFTDIRTY should not prevent from VMA merging, if we - * match the flags but dirty bit -- the caller should mark - * merged VMA as dirty. If dirty bit won't be excluded from - * comparison, we increase pressure on the memory system forcing - * the kernel to generate new VMAs when old one could be - * extended instead. - */ - if ((vma->vm_flags ^ vm_flags) & ~VM_SOFTDIRTY) - return false; - if (vma->vm_file != file) - return false; - if (may_remove_vma && vma->vm_ops && vma->vm_ops->close) - return false; - if (!is_mergeable_vm_userfaultfd_ctx(vma, vm_userfaultfd_ctx)) - return false; - if (!anon_vma_name_eq(anon_vma_name(vma), anon_name)) - return false; - return true; -} - -static inline bool is_mergeable_anon_vma(struct anon_vma *anon_vma1, - struct anon_vma *anon_vma2, struct vm_area_struct *vma) -{ - /* - * The list_is_singular() test is to avoid merging VMA cloned from - * parents. This can improve scalability caused by anon_vma lock. - */ - if ((!anon_vma1 || !anon_vma2) && (!vma || - list_is_singular(&vma->anon_vma_chain))) - return true; - return anon_vma1 == anon_vma2; -} - -/* - * Return true if we can merge this (vm_flags,anon_vma,file,vm_pgoff) - * in front of (at a lower virtual address and file offset than) the vma. - * - * We cannot merge two vmas if they have differently assigned (non-NULL) - * anon_vmas, nor if same anon_vma is assigned but offsets incompatible. - * - * We don't check here for the merged mmap wrapping around the end of pagecache - * indices (16TB on ia32) because do_mmap() does not permit mmap's which - * wrap, nor mmaps which cover the final page at index -1UL. - * - * We assume the vma may be removed as part of the merge. 
- */ -static bool -can_vma_merge_before(struct vm_area_struct *vma, unsigned long vm_flags, - struct anon_vma *anon_vma, struct file *file, - pgoff_t vm_pgoff, struct vm_userfaultfd_ctx vm_userfaultfd_ctx, - struct anon_vma_name *anon_name) -{ - if (is_mergeable_vma(vma, file, vm_flags, vm_userfaultfd_ctx, anon_name, true) && - is_mergeable_anon_vma(anon_vma, vma->anon_vma, vma)) { - if (vma->vm_pgoff == vm_pgoff) - return true; - } - return false; -} - -/* - * Return true if we can merge this (vm_flags,anon_vma,file,vm_pgoff) - * beyond (at a higher virtual address and file offset than) the vma. - * - * We cannot merge two vmas if they have differently assigned (non-NULL) - * anon_vmas, nor if same anon_vma is assigned but offsets incompatible. - * - * We assume that vma is not removed as part of the merge. - */ -static bool -can_vma_merge_after(struct vm_area_struct *vma, unsigned long vm_flags, - struct anon_vma *anon_vma, struct file *file, - pgoff_t vm_pgoff, struct vm_userfaultfd_ctx vm_userfaultfd_ctx, - struct anon_vma_name *anon_name) -{ - if (is_mergeable_vma(vma, file, vm_flags, vm_userfaultfd_ctx, anon_name, false) && - is_mergeable_anon_vma(anon_vma, vma->anon_vma, vma)) { - pgoff_t vm_pglen; - vm_pglen = vma_pages(vma); - if (vma->vm_pgoff + vm_pglen == vm_pgoff) - return true; - } - return false; -} - -/* - * Given a mapping request (addr,end,vm_flags,file,pgoff,anon_name), - * figure out whether that can be merged with its predecessor or its - * successor. Or both (it neatly fills a hole). - * - * In most cases - when called for mmap, brk or mremap - [addr,end) is - * certain not to be mapped by the time vma_merge is called; but when - * called for mprotect, it is certain to be already mapped (either at - * an offset within prev, or at the start of next), and the flags of - * this area are about to be changed to vm_flags - and the no-change - * case has already been eliminated. 
- * - * The following mprotect cases have to be considered, where **** is - * the area passed down from mprotect_fixup, never extending beyond one - * vma, PPPP is the previous vma, CCCC is a concurrent vma that starts - * at the same address as **** and is of the same or larger span, and - * NNNN the next vma after ****: - * - * **** **** **** - * PPPPPPNNNNNN PPPPPPNNNNNN PPPPPPCCCCCC - * cannot merge might become might become - * PPNNNNNNNNNN PPPPPPPPPPCC - * mmap, brk or case 4 below case 5 below - * mremap move: - * **** **** - * PPPP NNNN PPPPCCCCNNNN - * might become might become - * PPPPPPPPPPPP 1 or PPPPPPPPPPPP 6 or - * PPPPPPPPNNNN 2 or PPPPPPPPNNNN 7 or - * PPPPNNNNNNNN 3 PPPPNNNNNNNN 8 - * - * It is important for case 8 that the vma CCCC overlapping the - * region **** is never going to extended over NNNN. Instead NNNN must - * be extended in region **** and CCCC must be removed. This way in - * all cases where vma_merge succeeds, the moment vma_merge drops the - * rmap_locks, the properties of the merged vma will be already - * correct for the whole merged range. Some of those properties like - * vm_page_prot/vm_flags may be accessed by rmap_walks and they must - * be correct for the whole merged range immediately after the - * rmap_locks are released. Otherwise if NNNN would be removed and - * CCCC would be extended over the NNNN range, remove_migration_ptes - * or other rmap walkers (if working on addresses beyond the "end" - * parameter) may establish ptes with the wrong permissions of CCCC - * instead of the right permissions of NNNN. 
- * - * In the code below: - * PPPP is represented by *prev - * CCCC is represented by *curr or not represented at all (NULL) - * NNNN is represented by *next or not represented at all (NULL) - * **** is not represented - it will be merged and the vma containing the - * area is returned, or the function will return NULL - */ -static struct vm_area_struct -*vma_merge(struct vma_iterator *vmi, struct vm_area_struct *prev, - struct vm_area_struct *src, unsigned long addr, unsigned long end, - unsigned long vm_flags, pgoff_t pgoff, struct mempolicy *policy, - struct vm_userfaultfd_ctx vm_userfaultfd_ctx, - struct anon_vma_name *anon_name) -{ - struct mm_struct *mm = src->vm_mm; - struct anon_vma *anon_vma = src->anon_vma; - struct file *file = src->vm_file; - struct vm_area_struct *curr, *next, *res; - struct vm_area_struct *vma, *adjust, *remove, *remove2; - struct vm_area_struct *anon_dup = NULL; - struct vma_prepare vp; - pgoff_t vma_pgoff; - int err = 0; - bool merge_prev = false; - bool merge_next = false; - bool vma_expanded = false; - unsigned long vma_start = addr; - unsigned long vma_end = end; - pgoff_t pglen = (end - addr) >> PAGE_SHIFT; - long adj_start = 0; - - /* - * We later require that vma->vm_flags == vm_flags, - * so this tests vma->vm_flags & VM_SPECIAL, too. - */ - if (vm_flags & VM_SPECIAL) - return NULL; - - /* Does the input range span an existing VMA? (cases 5 - 8) */ - curr = find_vma_intersection(mm, prev ? prev->vm_end : 0, end); - - if (!curr || /* cases 1 - 4 */ - end == curr->vm_end) /* cases 6 - 8, adjacent VMA */ - next = vma_lookup(mm, end); - else - next = NULL; /* case 5 */ - - if (prev) { - vma_start = prev->vm_start; - vma_pgoff = prev->vm_pgoff; - - /* Can we merge the predecessor? */ - if (addr == prev->vm_end && mpol_equal(vma_policy(prev), policy) - && can_vma_merge_after(prev, vm_flags, anon_vma, file, - pgoff, vm_userfaultfd_ctx, anon_name)) { - merge_prev = true; - vma_prev(vmi); - } - } - - /* Can we merge the successor? 
*/ - if (next && mpol_equal(policy, vma_policy(next)) && - can_vma_merge_before(next, vm_flags, anon_vma, file, pgoff+pglen, - vm_userfaultfd_ctx, anon_name)) { - merge_next = true; - } - - /* Verify some invariant that must be enforced by the caller. */ - VM_WARN_ON(prev && addr <= prev->vm_start); - VM_WARN_ON(curr && (addr != curr->vm_start || end > curr->vm_end)); - VM_WARN_ON(addr >= end); - - if (!merge_prev && !merge_next) - return NULL; /* Not mergeable. */ - - if (merge_prev) - vma_start_write(prev); - - res = vma = prev; - remove = remove2 = adjust = NULL; - - /* Can we merge both the predecessor and the successor? */ - if (merge_prev && merge_next && - is_mergeable_anon_vma(prev->anon_vma, next->anon_vma, NULL)) { - vma_start_write(next); - remove = next; /* case 1 */ - vma_end = next->vm_end; - err = dup_anon_vma(prev, next, &anon_dup); - if (curr) { /* case 6 */ - vma_start_write(curr); - remove = curr; - remove2 = next; - /* - * Note that the dup_anon_vma below cannot overwrite err - * since the first caller would do nothing unless next - * has an anon_vma. 
- */ - if (!next->anon_vma) - err = dup_anon_vma(prev, curr, &anon_dup); - } - } else if (merge_prev) { /* case 2 */ - if (curr) { - vma_start_write(curr); - if (end == curr->vm_end) { /* case 7 */ - /* - * can_vma_merge_after() assumed we would not be - * removing prev vma, so it skipped the check - * for vm_ops->close, but we are removing curr - */ - if (curr->vm_ops && curr->vm_ops->close) - err = -EINVAL; - remove = curr; - } else { /* case 5 */ - adjust = curr; - adj_start = (end - curr->vm_start); - } - if (!err) - err = dup_anon_vma(prev, curr, &anon_dup); - } - } else { /* merge_next */ - vma_start_write(next); - res = next; - if (prev && addr < prev->vm_end) { /* case 4 */ - vma_start_write(prev); - vma_end = addr; - adjust = next; - adj_start = -(prev->vm_end - addr); - err = dup_anon_vma(next, prev, &anon_dup); - } else { - /* - * Note that cases 3 and 8 are the ONLY ones where prev - * is permitted to be (but is not necessarily) NULL. - */ - vma = next; /* case 3 */ - vma_start = addr; - vma_end = next->vm_end; - vma_pgoff = next->vm_pgoff - pglen; - if (curr) { /* case 8 */ - vma_pgoff = curr->vm_pgoff; - vma_start_write(curr); - remove = curr; - err = dup_anon_vma(next, curr, &anon_dup); - } - } - } - - /* Error in anon_vma clone. 
*/ - if (err) - goto anon_vma_fail; - - if (vma_start < vma->vm_start || vma_end > vma->vm_end) - vma_expanded = true; - - if (vma_expanded) { - vma_iter_config(vmi, vma_start, vma_end); - } else { - vma_iter_config(vmi, adjust->vm_start + adj_start, - adjust->vm_end); - } - - if (vma_iter_prealloc(vmi, vma)) - goto prealloc_fail; - - init_multi_vma_prep(&vp, vma, adjust, remove, remove2); - VM_WARN_ON(vp.anon_vma && adjust && adjust->anon_vma && - vp.anon_vma != adjust->anon_vma); - - vma_prepare(&vp); - vma_adjust_trans_huge(vma, vma_start, vma_end, adj_start); - vma_set_range(vma, vma_start, vma_end, vma_pgoff); - - if (vma_expanded) - vma_iter_store(vmi, vma); - - if (adj_start) { - adjust->vm_start += adj_start; - adjust->vm_pgoff += adj_start >> PAGE_SHIFT; - if (adj_start < 0) { - WARN_ON(vma_expanded); - vma_iter_store(vmi, next); - } - } - - vma_complete(&vp, vmi, mm); - khugepaged_enter_vma(res, vm_flags); - return res; - -prealloc_fail: - if (anon_dup) - unlink_anon_vmas(anon_dup); - -anon_vma_fail: - vma_iter_set(vmi, addr); - vma_iter_load(vmi); - return NULL; -} - -/* - * Rough compatibility check to quickly see if it's even worth looking - * at sharing an anon_vma. - * - * They need to have the same vm_file, and the flags can only differ - * in things that mprotect may change. - * - * NOTE! The fact that we share an anon_vma doesn't _have_ to mean that - * we can merge the two vma's. For example, we refuse to merge a vma if - * there is a vm_ops->close() function, because that indicates that the - * driver is doing some kind of reference counting. But that doesn't - * really matter for the anon_vma sharing case. 
- */ -static int anon_vma_compatible(struct vm_area_struct *a, struct vm_area_struct *b) -{ - return a->vm_end == b->vm_start && - mpol_equal(vma_policy(a), vma_policy(b)) && - a->vm_file == b->vm_file && - !((a->vm_flags ^ b->vm_flags) & ~(VM_ACCESS_FLAGS | VM_SOFTDIRTY)) && - b->vm_pgoff == a->vm_pgoff + ((b->vm_start - a->vm_start) >> PAGE_SHIFT); -} - -/* - * Do some basic sanity checking to see if we can re-use the anon_vma - * from 'old'. The 'a'/'b' vma's are in VM order - one of them will be - * the same as 'old', the other will be the new one that is trying - * to share the anon_vma. - * - * NOTE! This runs with mmap_lock held for reading, so it is possible that - * the anon_vma of 'old' is concurrently in the process of being set up - * by another page fault trying to merge _that_. But that's ok: if it - * is being set up, that automatically means that it will be a singleton - * acceptable for merging, so we can do all of this optimistically. But - * we do that READ_ONCE() to make sure that we never re-load the pointer. - * - * IOW: that the "list_is_singular()" test on the anon_vma_chain only - * matters for the 'stable anon_vma' case (ie the thing we want to avoid - * is to return an anon_vma that is "complex" due to having gone through - * a fork). - * - * We also make sure that the two vma's are compatible (adjacent, - * and with the same memory policies). That's all stable, even with just - * a read lock on the mmap_lock. - */ -static struct anon_vma *reusable_anon_vma(struct vm_area_struct *old, struct vm_area_struct *a, struct vm_area_struct *b) -{ - if (anon_vma_compatible(a, b)) { - struct anon_vma *anon_vma = READ_ONCE(old->anon_vma); - - if (anon_vma && list_is_singular(&old->anon_vma_chain)) - return anon_vma; - } - return NULL; -} - -/* - * find_mergeable_anon_vma is used by anon_vma_prepare, to check - * neighbouring vmas for a suitable anon_vma, before it goes off - * to allocate a new anon_vma. 
It checks because a repetitive - * sequence of mprotects and faults may otherwise lead to distinct - * anon_vmas being allocated, preventing vma merge in subsequent - * mprotect. - */ -struct anon_vma *find_mergeable_anon_vma(struct vm_area_struct *vma) -{ - struct anon_vma *anon_vma = NULL; - struct vm_area_struct *prev, *next; - VMA_ITERATOR(vmi, vma->vm_mm, vma->vm_end); - - /* Try next first. */ - next = vma_iter_load(&vmi); - if (next) { - anon_vma = reusable_anon_vma(next, vma, next); - if (anon_vma) - return anon_vma; - } + mm->brk = brk; + if (mm->def_flags & VM_LOCKED) + populate = true; - prev = vma_prev(&vmi); - VM_BUG_ON_VMA(prev != vma, vma); - prev = vma_prev(&vmi); - /* Try prev next. */ - if (prev) - anon_vma = reusable_anon_vma(prev, prev, vma); +success: + mmap_write_unlock(mm); +success_unlocked: + userfaultfd_unmap_complete(mm, &uf); + if (populate) + mm_populate(oldbrk, newbrk - oldbrk); + return brk; - /* - * We might reach here with anon_vma == NULL if we can't find - * any reusable anon_vma. - * There's no absolute need to look only at touching neighbours: - * we could search further afield for "compatible" anon_vmas. - * But it would probably just be a waste of time searching, - * or lead to too many vmas hanging off the same anon_vma. - * We're trying to allow mprotect remerging later on, - * not trying to minimize memory used for anon_vmas. 
- */ - return anon_vma; +out: + mm->brk = origbrk; + mmap_write_unlock(mm); + return origbrk; } /* @@ -1549,85 +576,6 @@ SYSCALL_DEFINE1(old_mmap, struct mmap_arg_struct __user *, arg) } #endif /* __ARCH_WANT_SYS_OLD_MMAP */ -static bool vm_ops_needs_writenotify(const struct vm_operations_struct *vm_ops) -{ - return vm_ops && (vm_ops->page_mkwrite || vm_ops->pfn_mkwrite); -} - -static bool vma_is_shared_writable(struct vm_area_struct *vma) -{ - return (vma->vm_flags & (VM_WRITE | VM_SHARED)) == - (VM_WRITE | VM_SHARED); -} - -static bool vma_fs_can_writeback(struct vm_area_struct *vma) -{ - /* No managed pages to writeback. */ - if (vma->vm_flags & VM_PFNMAP) - return false; - - return vma->vm_file && vma->vm_file->f_mapping && - mapping_can_writeback(vma->vm_file->f_mapping); -} - -/* - * Does this VMA require the underlying folios to have their dirty state - * tracked? - */ -bool vma_needs_dirty_tracking(struct vm_area_struct *vma) -{ - /* Only shared, writable VMAs require dirty tracking. */ - if (!vma_is_shared_writable(vma)) - return false; - - /* Does the filesystem need to be notified? */ - if (vm_ops_needs_writenotify(vma->vm_ops)) - return true; - - /* - * Even if the filesystem doesn't indicate a need for writenotify, if it - * can writeback, dirty tracking is still required. - */ - return vma_fs_can_writeback(vma); -} - -/* - * Some shared mappings will want the pages marked read-only - * to track write events. If so, we'll downgrade vm_page_prot - * to the private version (using protection_map[] without the - * VM_SHARED bit). - */ -bool vma_wants_writenotify(struct vm_area_struct *vma, pgprot_t vm_page_prot) -{ - /* If it was private or non-writable, the write bit is already clear */ - if (!vma_is_shared_writable(vma)) - return false; - - /* The backer wishes to know when pages are first written to? 
*/ - if (vm_ops_needs_writenotify(vma->vm_ops)) - return true; - - /* The open routine did something to the protections that pgprot_modify - * won't preserve? */ - if (pgprot_val(vm_page_prot) != - pgprot_val(vm_pgprot_modify(vm_page_prot, vma->vm_flags))) - return false; - - /* - * Do we need to track softdirty? hugetlb does not support softdirty - * tracking yet. - */ - if (vma_soft_dirty_enabled(vma) && !is_vm_hugetlb_page(vma)) - return true; - - /* Do we need write faults for uffd-wp tracking? */ - if (userfaultfd_wp(vma)) - return true; - - /* Can the mapping track the dirty pages? */ - return vma_fs_can_writeback(vma); -} - /* * We account for memory if it's a private writeable mapping, * not hugepages and VM_NORESERVE wasn't set. @@ -2268,566 +1216,129 @@ int expand_downwards(struct vm_area_struct *vma, unsigned long address) anon_vma_interval_tree_post_update_vma(vma); spin_unlock(&mm->page_table_lock); - perf_event_mmap(vma); - } - } - } - anon_vma_unlock_write(vma->anon_vma); - vma_iter_free(&vmi); - validate_mm(mm); - return error; -} - -/* enforced gap between the expanding stack and other mappings. */ -unsigned long stack_guard_gap = 256UL<close operation then the driver probably needs to release + * per-vma resources, so we don't attempt to merge those if the caller indicates + * the current vma may be removed as part of the merge. + */ +static inline bool is_mergeable_vma(struct vm_area_struct *vma, + struct file *file, unsigned long vm_flags, + struct vm_userfaultfd_ctx vm_userfaultfd_ctx, + struct anon_vma_name *anon_name, bool may_remove_vma) +{ + /* + * VM_SOFTDIRTY should not prevent from VMA merging, if we + * match the flags but dirty bit -- the caller should mark + * merged VMA as dirty. If dirty bit won't be excluded from + * comparison, we increase pressure on the memory system forcing + * the kernel to generate new VMAs when old one could be + * extended instead. 
+ */ + if ((vma->vm_flags ^ vm_flags) & ~VM_SOFTDIRTY) + return false; + if (vma->vm_file != file) + return false; + if (may_remove_vma && vma->vm_ops && vma->vm_ops->close) + return false; + if (!is_mergeable_vm_userfaultfd_ctx(vma, vm_userfaultfd_ctx)) + return false; + if (!anon_vma_name_eq(anon_vma_name(vma), anon_name)) + return false; + return true; +} + +static inline bool is_mergeable_anon_vma(struct anon_vma *anon_vma1, + struct anon_vma *anon_vma2, struct vm_area_struct *vma) +{ + /* + * The list_is_singular() test is to avoid merging VMA cloned from + * parents. This can improve scalability caused by anon_vma lock. + */ + if ((!anon_vma1 || !anon_vma2) && (!vma || + list_is_singular(&vma->anon_vma_chain))) + return true; + return anon_vma1 == anon_vma2; +} + +/* + * init_multi_vma_prep() - Initializer for struct vma_prepare + * @vp: The vma_prepare struct + * @vma: The vma that will be altered once locked + * @next: The next vma if it is to be adjusted + * @remove: The first vma to be removed + * @remove2: The second vma to be removed + */ +static void init_multi_vma_prep(struct vma_prepare *vp, + struct vm_area_struct *vma, + struct vm_area_struct *next, + struct vm_area_struct *remove, + struct vm_area_struct *remove2) +{ + memset(vp, 0, sizeof(struct vma_prepare)); + vp->vma = vma; + vp->anon_vma = vma->anon_vma; + vp->remove = remove; + vp->remove2 = remove2; + vp->adj_next = next; + if (!vp->anon_vma && next) + vp->anon_vma = next->anon_vma; + + vp->file = vma->vm_file; + if (vp->file) + vp->mapping = vma->vm_file->f_mapping; + +} + +/* + * Return true if we can merge this (vm_flags,anon_vma,file,vm_pgoff) + * in front of (at a lower virtual address and file offset than) the vma. + * + * We cannot merge two vmas if they have differently assigned (non-NULL) + * anon_vmas, nor if same anon_vma is assigned but offsets incompatible. 
+ * + * We don't check here for the merged mmap wrapping around the end of pagecache + * indices (16TB on ia32) because do_mmap() does not permit mmap's which + * wrap, nor mmaps which cover the final page at index -1UL. + * + * We assume the vma may be removed as part of the merge. + */ +bool +can_vma_merge_before(struct vm_area_struct *vma, unsigned long vm_flags, + struct anon_vma *anon_vma, struct file *file, + pgoff_t vm_pgoff, struct vm_userfaultfd_ctx vm_userfaultfd_ctx, + struct anon_vma_name *anon_name) +{ + if (is_mergeable_vma(vma, file, vm_flags, vm_userfaultfd_ctx, anon_name, true) && + is_mergeable_anon_vma(anon_vma, vma->anon_vma, vma)) { + if (vma->vm_pgoff == vm_pgoff) + return true; + } + return false; +} + +/* + * Return true if we can merge this (vm_flags,anon_vma,file,vm_pgoff) + * beyond (at a higher virtual address and file offset than) the vma. + * + * We cannot merge two vmas if they have differently assigned (non-NULL) + * anon_vmas, nor if same anon_vma is assigned but offsets incompatible. + * + * We assume that vma is not removed as part of the merge. + */ +bool +can_vma_merge_after(struct vm_area_struct *vma, unsigned long vm_flags, + struct anon_vma *anon_vma, struct file *file, + pgoff_t vm_pgoff, struct vm_userfaultfd_ctx vm_userfaultfd_ctx, + struct anon_vma_name *anon_name) +{ + if (is_mergeable_vma(vma, file, vm_flags, vm_userfaultfd_ctx, anon_name, false) && + is_mergeable_anon_vma(anon_vma, vma->anon_vma, vma)) { + pgoff_t vm_pglen; + + vm_pglen = vma_pages(vma); + if (vma->vm_pgoff + vm_pglen == vm_pgoff) + return true; + } + return false; +} + +/* + * Close a vm structure and free it. 
+ */ +void remove_vma(struct vm_area_struct *vma, bool unreachable) +{ + might_sleep(); + if (vma->vm_ops && vma->vm_ops->close) + vma->vm_ops->close(vma); + if (vma->vm_file) + fput(vma->vm_file); + mpol_put(vma_policy(vma)); + if (unreachable) + __vm_area_free(vma); + else + vm_area_free(vma); +} + +/* + * Get rid of page table information in the indicated region. + * + * Called with the mm semaphore held. + */ +void unmap_region(struct mm_struct *mm, struct ma_state *mas, + struct vm_area_struct *vma, struct vm_area_struct *prev, + struct vm_area_struct *next, unsigned long start, + unsigned long end, unsigned long tree_end, bool mm_wr_locked) +{ + struct mmu_gather tlb; + unsigned long mt_start = mas->index; + + lru_add_drain(); + tlb_gather_mmu(&tlb, mm); + update_hiwater_rss(mm); + unmap_vmas(&tlb, mas, vma, start, end, tree_end, mm_wr_locked); + mas_set(mas, mt_start); + free_pgtables(&tlb, mas, vma, prev ? prev->vm_end : FIRST_USER_ADDRESS, + next ? next->vm_start : USER_PGTABLES_CEILING, + mm_wr_locked); + tlb_finish_mmu(&tlb); +} + +/* + * __split_vma() bypasses sysctl_max_map_count checking. We use this where it + * has already been checked or doesn't make sense to fail. + * VMA Iterator will point to the end VMA. 
+ */ +static int __split_vma(struct vma_iterator *vmi, struct vm_area_struct *vma, + unsigned long addr, int new_below) +{ + struct vma_prepare vp; + struct vm_area_struct *new; + int err; + + WARN_ON(vma->vm_start >= addr); + WARN_ON(vma->vm_end <= addr); + + if (vma->vm_ops && vma->vm_ops->may_split) { + err = vma->vm_ops->may_split(vma, addr); + if (err) + return err; + } + + new = vm_area_dup(vma); + if (!new) + return -ENOMEM; + + if (new_below) { + new->vm_end = addr; + } else { + new->vm_start = addr; + new->vm_pgoff += ((addr - vma->vm_start) >> PAGE_SHIFT); + } + + err = -ENOMEM; + vma_iter_config(vmi, new->vm_start, new->vm_end); + if (vma_iter_prealloc(vmi, new)) + goto out_free_vma; + + err = vma_dup_policy(vma, new); + if (err) + goto out_free_vmi; + + err = anon_vma_clone(new, vma); + if (err) + goto out_free_mpol; + + if (new->vm_file) + get_file(new->vm_file); + + if (new->vm_ops && new->vm_ops->open) + new->vm_ops->open(new); + + vma_start_write(vma); + vma_start_write(new); + + init_vma_prep(&vp, vma); + vp.insert = new; + vma_prepare(&vp); + vma_adjust_trans_huge(vma, vma->vm_start, addr, 0); + + if (new_below) { + vma->vm_start = addr; + vma->vm_pgoff += (addr - new->vm_start) >> PAGE_SHIFT; + } else { + vma->vm_end = addr; + } + + /* vma_complete stores the new vma */ + vma_complete(&vp, vmi, vma->vm_mm); + + /* Success. */ + if (new_below) + vma_next(vmi); + return 0; + +out_free_mpol: + mpol_put(vma_policy(new)); +out_free_vmi: + vma_iter_free(vmi); +out_free_vma: + vm_area_free(new); + return err; +} + +/* + * Split a vma into two pieces at address 'addr', a new vma is allocated + * either for the first part or the tail. 
+ */ +static int split_vma(struct vma_iterator *vmi, struct vm_area_struct *vma, + unsigned long addr, int new_below) +{ + if (vma->vm_mm->map_count >= sysctl_max_map_count) + return -ENOMEM; + + return __split_vma(vmi, vma, addr, new_below); +} + +/* + * Ok - we have the memory areas we should free on a maple tree so release them, + * and do the vma updates. + * + * Called with the mm semaphore held. + */ +static inline void remove_mt(struct mm_struct *mm, struct ma_state *mas) +{ + unsigned long nr_accounted = 0; + struct vm_area_struct *vma; + + /* Update high watermark before we lower total_vm */ + update_hiwater_vm(mm); + mas_for_each(mas, vma, ULONG_MAX) { + long nrpages = vma_pages(vma); + + if (vma->vm_flags & VM_ACCOUNT) + nr_accounted += nrpages; + vm_stat_account(mm, vma->vm_flags, -nrpages); + remove_vma(vma, false); + } + vm_unacct_memory(nr_accounted); +} + +/* + * init_vma_prep() - Initializer wrapper for vma_prepare struct + * @vp: The vma_prepare struct + * @vma: The vma that will be altered once locked + */ +void init_vma_prep(struct vma_prepare *vp, + struct vm_area_struct *vma) +{ + init_multi_vma_prep(vp, vma, NULL, NULL, NULL); +} + +/* + * Requires inode->i_mapping->i_mmap_rwsem + */ +static void __remove_shared_vm_struct(struct vm_area_struct *vma, + struct address_space *mapping) +{ + if (vma_is_shared_maywrite(vma)) + mapping_unmap_writable(mapping); + + flush_dcache_mmap_lock(mapping); + vma_interval_tree_remove(vma, &mapping->i_mmap); + flush_dcache_mmap_unlock(mapping); +} + +/* + * vma has some anon_vma assigned, and is already inserted on that + * anon_vma's interval trees. + * + * Before updating the vma's vm_start / vm_end / vm_pgoff fields, the + * vma must be removed from the anon_vma's interval trees using + * anon_vma_interval_tree_pre_update_vma(). + * + * After the update, the vma will be reinserted using + * anon_vma_interval_tree_post_update_vma(). 
+ * + * The entire update must be protected by exclusive mmap_lock and by + * the root anon_vma's mutex. + */ +void +anon_vma_interval_tree_pre_update_vma(struct vm_area_struct *vma) +{ + struct anon_vma_chain *avc; + + list_for_each_entry(avc, &vma->anon_vma_chain, same_vma) + anon_vma_interval_tree_remove(avc, &avc->anon_vma->rb_root); +} + +void +anon_vma_interval_tree_post_update_vma(struct vm_area_struct *vma) +{ + struct anon_vma_chain *avc; + + list_for_each_entry(avc, &vma->anon_vma_chain, same_vma) + anon_vma_interval_tree_insert(avc, &avc->anon_vma->rb_root); +} + +static void __vma_link_file(struct vm_area_struct *vma, + struct address_space *mapping) +{ + if (vma_is_shared_maywrite(vma)) + mapping_allow_writable(mapping); + + flush_dcache_mmap_lock(mapping); + vma_interval_tree_insert(vma, &mapping->i_mmap); + flush_dcache_mmap_unlock(mapping); +} + +/* + * vma_prepare() - Helper function for handling locking VMAs prior to altering + * @vp: The initialized vma_prepare struct + */ +void vma_prepare(struct vma_prepare *vp) +{ + if (vp->file) { + uprobe_munmap(vp->vma, vp->vma->vm_start, vp->vma->vm_end); + + if (vp->adj_next) + uprobe_munmap(vp->adj_next, vp->adj_next->vm_start, + vp->adj_next->vm_end); + + i_mmap_lock_write(vp->mapping); + if (vp->insert && vp->insert->vm_file) { + /* + * Put into interval tree now, so instantiated pages + * are visible to arm/parisc __flush_dcache_page + * throughout; but we cannot insert into address + * space until vma start or end is updated. 
+ */ + __vma_link_file(vp->insert, + vp->insert->vm_file->f_mapping); + } + } + + if (vp->anon_vma) { + anon_vma_lock_write(vp->anon_vma); + anon_vma_interval_tree_pre_update_vma(vp->vma); + if (vp->adj_next) + anon_vma_interval_tree_pre_update_vma(vp->adj_next); + } + + if (vp->file) { + flush_dcache_mmap_lock(vp->mapping); + vma_interval_tree_remove(vp->vma, &vp->mapping->i_mmap); + if (vp->adj_next) + vma_interval_tree_remove(vp->adj_next, + &vp->mapping->i_mmap); + } + +} + +/* + * dup_anon_vma() - Helper function to duplicate anon_vma + * @dst: The destination VMA + * @src: The source VMA + * @dup: Pointer to the destination VMA when successful. + * + * Returns: 0 on success. + */ +static int dup_anon_vma(struct vm_area_struct *dst, + struct vm_area_struct *src, struct vm_area_struct **dup) +{ + /* + * Easily overlooked: when mprotect shifts the boundary, make sure the + * expanding vma has anon_vma set if the shrinking vma had, to cover any + * anon pages imported. + */ + if (src->anon_vma && !dst->anon_vma) { + int ret; + + vma_assert_write_locked(dst); + dst->anon_vma = src->anon_vma; + ret = anon_vma_clone(dst, src); + if (ret) + return ret; + + *dup = dst; + } + + return 0; +} + +#ifdef CONFIG_DEBUG_VM_MAPLE_TREE +void validate_mm(struct mm_struct *mm) +{ + int bug = 0; + int i = 0; + struct vm_area_struct *vma; + VMA_ITERATOR(vmi, mm, 0); + + mt_validate(&mm->mm_mt); + for_each_vma(vmi, vma) { +#ifdef CONFIG_DEBUG_VM_RB + struct anon_vma *anon_vma = vma->anon_vma; + struct anon_vma_chain *avc; +#endif + unsigned long vmi_start, vmi_end; + bool warn = 0; + + vmi_start = vma_iter_addr(&vmi); + vmi_end = vma_iter_end(&vmi); + if (VM_WARN_ON_ONCE_MM(vma->vm_end != vmi_end, mm)) + warn = 1; + + if (VM_WARN_ON_ONCE_MM(vma->vm_start != vmi_start, mm)) + warn = 1; + + if (warn) { + pr_emerg("issue in %s\n", current->comm); + dump_stack(); + dump_vma(vma); + pr_emerg("tree range: %px start %lx end %lx\n", vma, + vmi_start, vmi_end - 1); + 
vma_iter_dump_tree(&vmi); + } + +#ifdef CONFIG_DEBUG_VM_RB + if (anon_vma) { + anon_vma_lock_read(anon_vma); + list_for_each_entry(avc, &vma->anon_vma_chain, same_vma) + anon_vma_interval_tree_verify(avc); + anon_vma_unlock_read(anon_vma); + } +#endif + i++; + } + if (i != mm->map_count) { + pr_emerg("map_count %d vma iterator %d\n", mm->map_count, i); + bug = 1; + } + VM_BUG_ON_MM(bug, mm); +} +#endif /* CONFIG_DEBUG_VM_MAPLE_TREE */ + +/* + * vma_expand - Expand an existing VMA + * + * @vmi: The vma iterator + * @vma: The vma to expand + * @start: The start of the vma + * @end: The exclusive end of the vma + * @pgoff: The page offset of vma + * @next: The current of next vma. + * + * Expand @vma to @start and @end. Can expand off the start and end. Will + * expand over @next if it's different from @vma and @end == @next->vm_end. + * Checking if the @vma can expand and merge with @next needs to be handled by + * the caller. + * + * Returns: 0 on success + */ +int vma_expand(struct vma_iterator *vmi, struct vm_area_struct *vma, + unsigned long start, unsigned long end, pgoff_t pgoff, + struct vm_area_struct *next) +{ + struct vm_area_struct *anon_dup = NULL; + bool remove_next = false; + struct vma_prepare vp; + + vma_start_write(vma); + if (next && (vma != next) && (end == next->vm_end)) { + int ret; + + remove_next = true; + vma_start_write(next); + ret = dup_anon_vma(vma, next, &anon_dup); + if (ret) + return ret; + } + + init_multi_vma_prep(&vp, vma, NULL, remove_next ? next : NULL, NULL); + /* Not merging but overwriting any part of next is not handled. 
*/ + VM_WARN_ON(next && !vp.remove && + next != vma && end > next->vm_start); + /* Only handles expanding */ + VM_WARN_ON(vma->vm_start < start || vma->vm_end > end); + + /* Note: vma iterator must be pointing to 'start' */ + vma_iter_config(vmi, start, end); + if (vma_iter_prealloc(vmi, vma)) + goto nomem; + + vma_prepare(&vp); + vma_adjust_trans_huge(vma, start, end, 0); + vma_set_range(vma, start, end, pgoff); + vma_iter_store(vmi, vma); + + vma_complete(&vp, vmi, vma->vm_mm); + return 0; + +nomem: + if (anon_dup) + unlink_anon_vmas(anon_dup); + return -ENOMEM; +} + +/* + * vma_shrink() - Reduce an existing VMAs memory area + * @vmi: The vma iterator + * @vma: The VMA to modify + * @start: The new start + * @end: The new end + * + * Returns: 0 on success, -ENOMEM otherwise + */ +int vma_shrink(struct vma_iterator *vmi, struct vm_area_struct *vma, + unsigned long start, unsigned long end, pgoff_t pgoff) +{ + struct vma_prepare vp; + + WARN_ON((vma->vm_start != start) && (vma->vm_end != end)); + + if (vma->vm_start < start) + vma_iter_config(vmi, vma->vm_start, start); + else + vma_iter_config(vmi, end, vma->vm_end); + + if (vma_iter_prealloc(vmi, NULL)) + return -ENOMEM; + + vma_start_write(vma); + + init_vma_prep(&vp, vma); + vma_prepare(&vp); + vma_adjust_trans_huge(vma, start, end, 0); + + vma_iter_clear(vmi); + vma_set_range(vma, start, end, pgoff); + vma_complete(&vp, vmi, vma->vm_mm); + return 0; +} + +/* + * vma_complete- Helper function for handling the unlocking after altering VMAs, + * or for inserting a VMA. 
+ * + * @vp: The vma_prepare struct + * @vmi: The vma iterator + * @mm: The mm_struct + */ +void vma_complete(struct vma_prepare *vp, + struct vma_iterator *vmi, struct mm_struct *mm) +{ + if (vp->file) { + if (vp->adj_next) + vma_interval_tree_insert(vp->adj_next, + &vp->mapping->i_mmap); + vma_interval_tree_insert(vp->vma, &vp->mapping->i_mmap); + flush_dcache_mmap_unlock(vp->mapping); + } + + if (vp->remove && vp->file) { + __remove_shared_vm_struct(vp->remove, vp->mapping); + if (vp->remove2) + __remove_shared_vm_struct(vp->remove2, vp->mapping); + } else if (vp->insert) { + /* + * split_vma has split insert from vma, and needs + * us to insert it before dropping the locks + * (it may either follow vma or precede it). + */ + vma_iter_store(vmi, vp->insert); + mm->map_count++; + } + + if (vp->anon_vma) { + anon_vma_interval_tree_post_update_vma(vp->vma); + if (vp->adj_next) + anon_vma_interval_tree_post_update_vma(vp->adj_next); + anon_vma_unlock_write(vp->anon_vma); + } + + if (vp->file) { + i_mmap_unlock_write(vp->mapping); + uprobe_mmap(vp->vma); + + if (vp->adj_next) + uprobe_mmap(vp->adj_next); + } + + if (vp->remove) { +again: + vma_mark_detached(vp->remove, true); + if (vp->file) { + uprobe_munmap(vp->remove, vp->remove->vm_start, + vp->remove->vm_end); + fput(vp->file); + } + if (vp->remove->anon_vma) + anon_vma_merge(vp->vma, vp->remove); + mm->map_count--; + mpol_put(vma_policy(vp->remove)); + if (!vp->remove2) + WARN_ON_ONCE(vp->vma->vm_end < vp->remove->vm_end); + vm_area_free(vp->remove); + + /* + * In mprotect's case 6 (see comments on vma_merge), + * we are removing both mid and next vmas + */ + if (vp->remove2) { + vp->remove = vp->remove2; + vp->remove2 = NULL; + goto again; + } + } + if (vp->insert && vp->file) + uprobe_mmap(vp->insert); + validate_mm(mm); +} + +/* + * do_vmi_align_munmap() - munmap the aligned region from @start to @end. 
+ * @vmi: The vma iterator + * @vma: The starting vm_area_struct + * @mm: The mm_struct + * @start: The aligned start address to munmap. + * @end: The aligned end address to munmap. + * @uf: The userfaultfd list_head + * @unlock: Set to true to drop the mmap_lock. unlocking only happens on + * success. + * + * Return: 0 on success and drops the lock if so directed, error and leaves the + * lock held otherwise. + */ +int +do_vmi_align_munmap(struct vma_iterator *vmi, struct vm_area_struct *vma, + struct mm_struct *mm, unsigned long start, + unsigned long end, struct list_head *uf, bool unlock) +{ + struct vm_area_struct *prev, *next = NULL; + struct maple_tree mt_detach; + int count = 0; + int error = -ENOMEM; + unsigned long locked_vm = 0; + MA_STATE(mas_detach, &mt_detach, 0, 0); + mt_init_flags(&mt_detach, vmi->mas.tree->ma_flags & MT_FLAGS_LOCK_MASK); + mt_on_stack(mt_detach); + + /* + * If we need to split any vma, do it now to save pain later. + * + * Note: mremap's move_vma VM_ACCOUNT handling assumes a partially + * unmapped vm_area_struct will remain in use: so lower split_vma + * places tmp vma above, and higher split_vma places tmp vma below. + */ + + /* Does it split the first one? */ + if (start > vma->vm_start) { + + /* + * Make sure that map_count on return from munmap() will + * not exceed its limit; but let map_count go just above + * its limit temporarily, to help free resources as expected. + */ + if (end < vma->vm_end && mm->map_count >= sysctl_max_map_count) + goto map_count_exceeded; + + error = __split_vma(vmi, vma, start, 1); + if (error) + goto start_split_failed; + } + + /* + * Detach a range of VMAs from the mm. Using next as a temp variable as + * it is always overwritten. + */ + next = vma; + do { + /* Does it split the end? 
*/ + if (next->vm_end > end) { + error = __split_vma(vmi, next, end, 0); + if (error) + goto end_split_failed; + } + vma_start_write(next); + mas_set(&mas_detach, count); + error = mas_store_gfp(&mas_detach, next, GFP_KERNEL); + if (error) + goto munmap_gather_failed; + vma_mark_detached(next, true); + if (next->vm_flags & VM_LOCKED) + locked_vm += vma_pages(next); + + count++; + if (unlikely(uf)) { + /* + * If userfaultfd_unmap_prep returns an error the vmas + * will remain split, but userland will get a + * highly unexpected error anyway. This is no + * different than the case where the first of the two + * __split_vma fails, but we don't undo the first + * split, even though we could. This is an unlikely enough + * failure that it's not worth optimizing for. + */ + error = userfaultfd_unmap_prep(next, start, end, uf); + + if (error) + goto userfaultfd_error; + } +#ifdef CONFIG_DEBUG_VM_MAPLE_TREE + BUG_ON(next->vm_start < start); + BUG_ON(next->vm_start > end); +#endif + } for_each_vma_range(*vmi, next, end); + +#if defined(CONFIG_DEBUG_VM_MAPLE_TREE) + /* Make sure no VMAs are about to be lost. 
*/ + { + MA_STATE(test, &mt_detach, 0, 0); + struct vm_area_struct *vma_mas, *vma_test; + int test_count = 0; + + vma_iter_set(vmi, start); + rcu_read_lock(); + vma_test = mas_find(&test, count - 1); + for_each_vma_range(*vmi, vma_mas, end) { + BUG_ON(vma_mas != vma_test); + test_count++; + vma_test = mas_next(&test, count - 1); + } + rcu_read_unlock(); + BUG_ON(count != test_count); + } +#endif + + while (vma_iter_addr(vmi) > start) + vma_iter_prev_range(vmi); + + error = vma_iter_clear_gfp(vmi, start, end, GFP_KERNEL); + if (error) + goto clear_tree_failed; + + /* Point of no return */ + mm->locked_vm -= locked_vm; + mm->map_count -= count; + if (unlock) + mmap_write_downgrade(mm); + + prev = vma_iter_prev_range(vmi); + next = vma_next(vmi); + if (next) + vma_iter_prev_range(vmi); + + /* + * We can free page tables without write-locking mmap_lock because VMAs + * were isolated before we downgraded mmap_lock. + */ + mas_set(&mas_detach, 1); + unmap_region(mm, &mas_detach, vma, prev, next, start, end, count, + !unlock); + /* Statistics and freeing VMAs */ + mas_set(&mas_detach, 0); + remove_mt(mm, &mas_detach); + validate_mm(mm); + if (unlock) + mmap_read_unlock(mm); + + __mt_destroy(&mt_detach); + return 0; + +clear_tree_failed: +userfaultfd_error: +munmap_gather_failed: +end_split_failed: + mas_set(&mas_detach, 0); + mas_for_each(&mas_detach, next, end) + vma_mark_detached(next, false); + + __mt_destroy(&mt_detach); +start_split_failed: +map_count_exceeded: + validate_mm(mm); + return error; +} + +/* + * do_vmi_munmap() - munmap a given range. + * @vmi: The vma iterator + * @mm: The mm_struct + * @start: The start address to munmap + * @len: The length of the range to munmap + * @uf: The userfaultfd list_head + * @unlock: set to true if the user wants to drop the mmap_lock on success + * + * This function takes a @mas that is either pointing to the previous VMA or set + * to MA_START and sets it up to remove the mapping(s). 
The @len will be + * aligned and any arch_unmap work will be performed. + * + * Return: 0 on success and drops the lock if so directed, error and leaves the + * lock held otherwise. + */ +int do_vmi_munmap(struct vma_iterator *vmi, struct mm_struct *mm, + unsigned long start, size_t len, struct list_head *uf, + bool unlock) +{ + unsigned long end; + struct vm_area_struct *vma; + + if ((offset_in_page(start)) || start > TASK_SIZE || len > TASK_SIZE-start) + return -EINVAL; + + end = start + PAGE_ALIGN(len); + if (end == start) + return -EINVAL; + + /* + * Check if memory is sealed before arch_unmap. + * Prevent unmapping a sealed VMA. + * can_modify_mm assumes we have acquired the lock on MM. + */ + if (unlikely(!can_modify_mm(mm, start, end))) + return -EPERM; + + /* arch_unmap() might do unmaps itself. */ + arch_unmap(mm, start, end); + + /* Find the first overlapping VMA */ + vma = vma_find(vmi, end); + if (!vma) { + if (unlock) + mmap_write_unlock(mm); + return 0; + } + + return do_vmi_align_munmap(vmi, vma, mm, start, end, uf, unlock); +} + +/* + * Given a mapping request (addr,end,vm_flags,file,pgoff,anon_name), + * figure out whether that can be merged with its predecessor or its + * successor. Or both (it neatly fills a hole). + * + * In most cases - when called for mmap, brk or mremap - [addr,end) is + * certain not to be mapped by the time vma_merge is called; but when + * called for mprotect, it is certain to be already mapped (either at + * an offset within prev, or at the start of next), and the flags of + * this area are about to be changed to vm_flags - and the no-change + * case has already been eliminated. 
+ * + * The following mprotect cases have to be considered, where **** is + * the area passed down from mprotect_fixup, never extending beyond one + * vma, PPPP is the previous vma, CCCC is a concurrent vma that starts + * at the same address as **** and is of the same or larger span, and + * NNNN the next vma after ****: + * + * **** **** **** + * PPPPPPNNNNNN PPPPPPNNNNNN PPPPPPCCCCCC + * cannot merge might become might become + * PPNNNNNNNNNN PPPPPPPPPPCC + * mmap, brk or case 4 below case 5 below + * mremap move: + * **** **** + * PPPP NNNN PPPPCCCCNNNN + * might become might become + * PPPPPPPPPPPP 1 or PPPPPPPPPPPP 6 or + * PPPPPPPPNNNN 2 or PPPPPPPPNNNN 7 or + * PPPPNNNNNNNN 3 PPPPNNNNNNNN 8 + * + * It is important for case 8 that the vma CCCC overlapping the + * region **** is never going to be extended over NNNN. Instead NNNN must + * be extended in region **** and CCCC must be removed. This way in + * all cases where vma_merge succeeds, the moment vma_merge drops the + * rmap_locks, the properties of the merged vma will be already + * correct for the whole merged range. Some of those properties like + * vm_page_prot/vm_flags may be accessed by rmap_walks and they must + * be correct for the whole merged range immediately after the + * rmap_locks are released. Otherwise if NNNN would be removed and + * CCCC would be extended over the NNNN range, remove_migration_ptes + * or other rmap walkers (if working on addresses beyond the "end" + * parameter) may establish ptes with the wrong permissions of CCCC + * instead of the right permissions of NNNN. 
+ * + * In the code below: + * PPPP is represented by *prev + * CCCC is represented by *curr or not represented at all (NULL) + * NNNN is represented by *next or not represented at all (NULL) + * **** is not represented - it will be merged and the vma containing the + * area is returned, or the function will return NULL + */ +static struct vm_area_struct +*vma_merge(struct vma_iterator *vmi, struct vm_area_struct *prev, + struct vm_area_struct *src, unsigned long addr, unsigned long end, + unsigned long vm_flags, pgoff_t pgoff, struct mempolicy *policy, + struct vm_userfaultfd_ctx vm_userfaultfd_ctx, + struct anon_vma_name *anon_name) +{ + struct mm_struct *mm = src->vm_mm; + struct anon_vma *anon_vma = src->anon_vma; + struct file *file = src->vm_file; + struct vm_area_struct *curr, *next, *res; + struct vm_area_struct *vma, *adjust, *remove, *remove2; + struct vm_area_struct *anon_dup = NULL; + struct vma_prepare vp; + pgoff_t vma_pgoff; + int err = 0; + bool merge_prev = false; + bool merge_next = false; + bool vma_expanded = false; + unsigned long vma_start = addr; + unsigned long vma_end = end; + pgoff_t pglen = (end - addr) >> PAGE_SHIFT; + long adj_start = 0; + + /* + * We later require that vma->vm_flags == vm_flags, + * so this tests vma->vm_flags & VM_SPECIAL, too. + */ + if (vm_flags & VM_SPECIAL) + return NULL; + + /* Does the input range span an existing VMA? (cases 5 - 8) */ + curr = find_vma_intersection(mm, prev ? prev->vm_end : 0, end); + + if (!curr || /* cases 1 - 4 */ + end == curr->vm_end) /* cases 6 - 8, adjacent VMA */ + next = vma_lookup(mm, end); + else + next = NULL; /* case 5 */ + + if (prev) { + vma_start = prev->vm_start; + vma_pgoff = prev->vm_pgoff; + + /* Can we merge the predecessor? */ + if (addr == prev->vm_end && mpol_equal(vma_policy(prev), policy) + && can_vma_merge_after(prev, vm_flags, anon_vma, file, + pgoff, vm_userfaultfd_ctx, anon_name)) { + merge_prev = true; + vma_prev(vmi); + } + } + + /* Can we merge the successor? 
*/ + if (next && mpol_equal(policy, vma_policy(next)) && + can_vma_merge_before(next, vm_flags, anon_vma, file, pgoff+pglen, + vm_userfaultfd_ctx, anon_name)) { + merge_next = true; + } + + /* Verify some invariant that must be enforced by the caller. */ + VM_WARN_ON(prev && addr <= prev->vm_start); + VM_WARN_ON(curr && (addr != curr->vm_start || end > curr->vm_end)); + VM_WARN_ON(addr >= end); + + if (!merge_prev && !merge_next) + return NULL; /* Not mergeable. */ + + if (merge_prev) + vma_start_write(prev); + + res = vma = prev; + remove = remove2 = adjust = NULL; + + /* Can we merge both the predecessor and the successor? */ + if (merge_prev && merge_next && + is_mergeable_anon_vma(prev->anon_vma, next->anon_vma, NULL)) { + vma_start_write(next); + remove = next; /* case 1 */ + vma_end = next->vm_end; + err = dup_anon_vma(prev, next, &anon_dup); + if (curr) { /* case 6 */ + vma_start_write(curr); + remove = curr; + remove2 = next; + /* + * Note that the dup_anon_vma below cannot overwrite err + * since the first caller would do nothing unless next + * has an anon_vma. 
+ */ + if (!next->anon_vma) + err = dup_anon_vma(prev, curr, &anon_dup); + } + } else if (merge_prev) { /* case 2 */ + if (curr) { + vma_start_write(curr); + if (end == curr->vm_end) { /* case 7 */ + /* + * can_vma_merge_after() assumed we would not be + * removing prev vma, so it skipped the check + * for vm_ops->close, but we are removing curr + */ + if (curr->vm_ops && curr->vm_ops->close) + err = -EINVAL; + remove = curr; + } else { /* case 5 */ + adjust = curr; + adj_start = (end - curr->vm_start); + } + if (!err) + err = dup_anon_vma(prev, curr, &anon_dup); + } + } else { /* merge_next */ + vma_start_write(next); + res = next; + if (prev && addr < prev->vm_end) { /* case 4 */ + vma_start_write(prev); + vma_end = addr; + adjust = next; + adj_start = -(prev->vm_end - addr); + err = dup_anon_vma(next, prev, &anon_dup); + } else { + /* + * Note that cases 3 and 8 are the ONLY ones where prev + * is permitted to be (but is not necessarily) NULL. + */ + vma = next; /* case 3 */ + vma_start = addr; + vma_end = next->vm_end; + vma_pgoff = next->vm_pgoff - pglen; + if (curr) { /* case 8 */ + vma_pgoff = curr->vm_pgoff; + vma_start_write(curr); + remove = curr; + err = dup_anon_vma(next, curr, &anon_dup); + } + } + } + + /* Error in anon_vma clone. 
*/ + if (err) + goto anon_vma_fail; + + if (vma_start < vma->vm_start || vma_end > vma->vm_end) + vma_expanded = true; + + if (vma_expanded) { + vma_iter_config(vmi, vma_start, vma_end); + } else { + vma_iter_config(vmi, adjust->vm_start + adj_start, + adjust->vm_end); + } + + if (vma_iter_prealloc(vmi, vma)) + goto prealloc_fail; + + init_multi_vma_prep(&vp, vma, adjust, remove, remove2); + VM_WARN_ON(vp.anon_vma && adjust && adjust->anon_vma && + vp.anon_vma != adjust->anon_vma); + + vma_prepare(&vp); + vma_adjust_trans_huge(vma, vma_start, vma_end, adj_start); + vma_set_range(vma, vma_start, vma_end, vma_pgoff); + + if (vma_expanded) + vma_iter_store(vmi, vma); + + if (adj_start) { + adjust->vm_start += adj_start; + adjust->vm_pgoff += adj_start >> PAGE_SHIFT; + if (adj_start < 0) { + WARN_ON(vma_expanded); + vma_iter_store(vmi, next); + } + } + + vma_complete(&vp, vmi, mm); + khugepaged_enter_vma(res, vm_flags); + return res; + +prealloc_fail: + if (anon_dup) + unlink_anon_vmas(anon_dup); + +anon_vma_fail: + vma_iter_set(vmi, addr); + vma_iter_load(vmi); + return NULL; +} + +/* + * We are about to modify one or multiple of a VMA's flags, policy, userfaultfd + * context and anonymous VMA name within the range [start, end). + * + * As a result, we might be able to merge the newly modified VMA range with an + * adjacent VMA with identical properties. + * + * If no merge is possible and the range does not span the entirety of the VMA, + * we then need to split the VMA to accommodate the change. + * + * The function returns either the merged VMA, the original VMA if a split was + * required instead, or an error if the split failed. 
+ */ +struct vm_area_struct *vma_modify(struct vma_iterator *vmi, + struct vm_area_struct *prev, + struct vm_area_struct *vma, + unsigned long start, unsigned long end, + unsigned long vm_flags, + struct mempolicy *policy, + struct vm_userfaultfd_ctx uffd_ctx, + struct anon_vma_name *anon_name) +{ + pgoff_t pgoff = vma->vm_pgoff + ((start - vma->vm_start) >> PAGE_SHIFT); + struct vm_area_struct *merged; + + merged = vma_merge(vmi, prev, vma, start, end, vm_flags, + pgoff, policy, uffd_ctx, anon_name); + if (merged) + return merged; + + if (vma->vm_start < start) { + int err = split_vma(vmi, vma, start, 1); + + if (err) + return ERR_PTR(err); + } + + if (vma->vm_end > end) { + int err = split_vma(vmi, vma, end, 0); + + if (err) + return ERR_PTR(err); + } + + return vma; +} + +/* + * Attempt to merge a newly mapped VMA with those adjacent to it. The caller + * must ensure that [start, end) does not overlap any existing VMA. + */ +struct vm_area_struct +*vma_merge_new_vma(struct vma_iterator *vmi, struct vm_area_struct *prev, + struct vm_area_struct *vma, unsigned long start, + unsigned long end, pgoff_t pgoff) +{ + return vma_merge(vmi, prev, vma, start, end, vma->vm_flags, pgoff, + vma_policy(vma), vma->vm_userfaultfd_ctx, anon_vma_name(vma)); +} + +/* + * Expand vma by delta bytes, potentially merging with an immediately adjacent + * VMA with identical properties. + */ +struct vm_area_struct *vma_merge_extend(struct vma_iterator *vmi, + struct vm_area_struct *vma, + unsigned long delta) +{ + pgoff_t pgoff = vma->vm_pgoff + vma_pages(vma); + + /* vma is specified as prev, so case 1 or 2 will apply. 
*/ + return vma_merge(vmi, vma, vma, vma->vm_end, vma->vm_end + delta, + vma->vm_flags, pgoff, vma_policy(vma), + vma->vm_userfaultfd_ctx, anon_vma_name(vma)); +} + +void unlink_file_vma_batch_init(struct unlink_vma_file_batch *vb) +{ + vb->count = 0; +} + +static void unlink_file_vma_batch_process(struct unlink_vma_file_batch *vb) +{ + struct address_space *mapping; + int i; + + mapping = vb->vmas[0]->vm_file->f_mapping; + i_mmap_lock_write(mapping); + for (i = 0; i < vb->count; i++) { + VM_WARN_ON_ONCE(vb->vmas[i]->vm_file->f_mapping != mapping); + __remove_shared_vm_struct(vb->vmas[i], mapping); + } + i_mmap_unlock_write(mapping); + + unlink_file_vma_batch_init(vb); +} + +void unlink_file_vma_batch_add(struct unlink_vma_file_batch *vb, + struct vm_area_struct *vma) +{ + if (vma->vm_file == NULL) + return; + + if ((vb->count > 0 && vb->vmas[0]->vm_file != vma->vm_file) || + vb->count == ARRAY_SIZE(vb->vmas)) + unlink_file_vma_batch_process(vb); + + vb->vmas[vb->count] = vma; + vb->count++; +} + +void unlink_file_vma_batch_final(struct unlink_vma_file_batch *vb) +{ + if (vb->count > 0) + unlink_file_vma_batch_process(vb); +} + +/* + * Unlink a file-based vm structure from its interval tree, to hide + * vma from rmap and vmtruncate before freeing its page tables. 
+ */ +void unlink_file_vma(struct vm_area_struct *vma) +{ + struct file *file = vma->vm_file; + + if (file) { + struct address_space *mapping = file->f_mapping; + + i_mmap_lock_write(mapping); + __remove_shared_vm_struct(vma, mapping); + i_mmap_unlock_write(mapping); + } +} + +void vma_link_file(struct vm_area_struct *vma) +{ + struct file *file = vma->vm_file; + struct address_space *mapping; + + if (file) { + mapping = file->f_mapping; + i_mmap_lock_write(mapping); + __vma_link_file(vma, mapping); + i_mmap_unlock_write(mapping); + } +} + +int vma_link(struct mm_struct *mm, struct vm_area_struct *vma) +{ + VMA_ITERATOR(vmi, mm, 0); + + vma_iter_config(&vmi, vma->vm_start, vma->vm_end); + if (vma_iter_prealloc(&vmi, vma)) + return -ENOMEM; + + vma_start_write(vma); + vma_iter_store(&vmi, vma); + vma_link_file(vma); + mm->map_count++; + validate_mm(mm); + return 0; +} + +/* + * Copy the vma structure to a new location in the same mm, + * prior to moving page table entries, to effect an mremap move. + */ +struct vm_area_struct *copy_vma(struct vm_area_struct **vmap, + unsigned long addr, unsigned long len, pgoff_t pgoff, + bool *need_rmap_locks) +{ + struct vm_area_struct *vma = *vmap; + unsigned long vma_start = vma->vm_start; + struct mm_struct *mm = vma->vm_mm; + struct vm_area_struct *new_vma, *prev; + bool faulted_in_anon_vma = true; + VMA_ITERATOR(vmi, mm, addr); + + /* + * If anonymous vma has not yet been faulted, update new pgoff + * to match new location, to increase its chance of merging. 
+ */ + if (unlikely(vma_is_anonymous(vma) && !vma->anon_vma)) { + pgoff = addr >> PAGE_SHIFT; + faulted_in_anon_vma = false; + } + + new_vma = find_vma_prev(mm, addr, &prev); + if (new_vma && new_vma->vm_start < addr + len) + return NULL; /* should never get here */ + + new_vma = vma_merge_new_vma(&vmi, prev, vma, addr, addr + len, pgoff); + if (new_vma) { + /* + * Source vma may have been merged into new_vma + */ + if (unlikely(vma_start >= new_vma->vm_start && + vma_start < new_vma->vm_end)) { + /* + * The only way we can get a vma_merge with + * self during an mremap is if the vma hasn't + * been faulted in yet and we were allowed to + * reset the dst vma->vm_pgoff to the + * destination address of the mremap to allow + * the merge to happen. mremap must change the + * vm_pgoff linearity between src and dst vmas + * (in turn preventing a vma_merge) to be + * safe. It is only safe to keep the vm_pgoff + * linear if there are no pages mapped yet. + */ + VM_BUG_ON_VMA(faulted_in_anon_vma, new_vma); + *vmap = vma = new_vma; + } + *need_rmap_locks = (new_vma->vm_pgoff <= vma->vm_pgoff); + } else { + new_vma = vm_area_dup(vma); + if (!new_vma) + goto out; + vma_set_range(new_vma, addr, addr + len, pgoff); + if (vma_dup_policy(vma, new_vma)) + goto out_free_vma; + if (anon_vma_clone(new_vma, vma)) + goto out_free_mempol; + if (new_vma->vm_file) + get_file(new_vma->vm_file); + if (new_vma->vm_ops && new_vma->vm_ops->open) + new_vma->vm_ops->open(new_vma); + if (vma_link(mm, new_vma)) + goto out_vma_link; + *need_rmap_locks = false; + } + return new_vma; + +out_vma_link: + if (new_vma->vm_ops && new_vma->vm_ops->close) + new_vma->vm_ops->close(new_vma); + + if (new_vma->vm_file) + fput(new_vma->vm_file); + + unlink_anon_vmas(new_vma); +out_free_mempol: + mpol_put(vma_policy(new_vma)); +out_free_vma: + vm_area_free(new_vma); +out: + return NULL; +} + +/* + * Rough compatibility check to quickly see if it's even worth looking + * at sharing an anon_vma. 
+ * + * They need to have the same vm_file, and the flags can only differ + * in things that mprotect may change. + * + * NOTE! The fact that we share an anon_vma doesn't _have_ to mean that + * we can merge the two vma's. For example, we refuse to merge a vma if + * there is a vm_ops->close() function, because that indicates that the + * driver is doing some kind of reference counting. But that doesn't + * really matter for the anon_vma sharing case. + */ +static int anon_vma_compatible(struct vm_area_struct *a, struct vm_area_struct *b) +{ + return a->vm_end == b->vm_start && + mpol_equal(vma_policy(a), vma_policy(b)) && + a->vm_file == b->vm_file && + !((a->vm_flags ^ b->vm_flags) & ~(VM_ACCESS_FLAGS | VM_SOFTDIRTY)) && + b->vm_pgoff == a->vm_pgoff + ((b->vm_start - a->vm_start) >> PAGE_SHIFT); +} + +/* + * Do some basic sanity checking to see if we can re-use the anon_vma + * from 'old'. The 'a'/'b' vma's are in VM order - one of them will be + * the same as 'old', the other will be the new one that is trying + * to share the anon_vma. + * + * NOTE! This runs with mmap_lock held for reading, so it is possible that + * the anon_vma of 'old' is concurrently in the process of being set up + * by another page fault trying to merge _that_. But that's ok: if it + * is being set up, that automatically means that it will be a singleton + * acceptable for merging, so we can do all of this optimistically. But + * we do that READ_ONCE() to make sure that we never re-load the pointer. + * + * IOW: that the "list_is_singular()" test on the anon_vma_chain only + * matters for the 'stable anon_vma' case (ie the thing we want to avoid + * is to return an anon_vma that is "complex" due to having gone through + * a fork). + * + * We also make sure that the two vma's are compatible (adjacent, + * and with the same memory policies). That's all stable, even with just + * a read lock on the mmap_lock. 
+ */ +static struct anon_vma *reusable_anon_vma(struct vm_area_struct *old, + struct vm_area_struct *a, + struct vm_area_struct *b) +{ + if (anon_vma_compatible(a, b)) { + struct anon_vma *anon_vma = READ_ONCE(old->anon_vma); + + if (anon_vma && list_is_singular(&old->anon_vma_chain)) + return anon_vma; + } + return NULL; +} + +/* + * find_mergeable_anon_vma is used by anon_vma_prepare, to check + * neighbouring vmas for a suitable anon_vma, before it goes off + * to allocate a new anon_vma. It checks because a repetitive + * sequence of mprotects and faults may otherwise lead to distinct + * anon_vmas being allocated, preventing vma merge in subsequent + * mprotect. + */ +struct anon_vma *find_mergeable_anon_vma(struct vm_area_struct *vma) +{ + struct anon_vma *anon_vma = NULL; + struct vm_area_struct *prev, *next; + VMA_ITERATOR(vmi, vma->vm_mm, vma->vm_end); + + /* Try next first. */ + next = vma_iter_load(&vmi); + if (next) { + anon_vma = reusable_anon_vma(next, vma, next); + if (anon_vma) + return anon_vma; + } + + prev = vma_prev(&vmi); + VM_BUG_ON_VMA(prev != vma, vma); + prev = vma_prev(&vmi); + /* Try prev next. */ + if (prev) + anon_vma = reusable_anon_vma(prev, prev, vma); + + /* + * We might reach here with anon_vma == NULL if we can't find + * any reusable anon_vma. + * There's no absolute need to look only at touching neighbours: + * we could search further afield for "compatible" anon_vmas. + * But it would probably just be a waste of time searching, + * or lead to too many vmas hanging off the same anon_vma. + * We're trying to allow mprotect remerging later on, + * not trying to minimize memory used for anon_vmas. 
+ */ + return anon_vma; +} + +static bool vm_ops_needs_writenotify(const struct vm_operations_struct *vm_ops) +{ + return vm_ops && (vm_ops->page_mkwrite || vm_ops->pfn_mkwrite); +} + +static bool vma_is_shared_writable(struct vm_area_struct *vma) +{ + return (vma->vm_flags & (VM_WRITE | VM_SHARED)) == + (VM_WRITE | VM_SHARED); +} + +static bool vma_fs_can_writeback(struct vm_area_struct *vma) +{ + /* No managed pages to writeback. */ + if (vma->vm_flags & VM_PFNMAP) + return false; + + return vma->vm_file && vma->vm_file->f_mapping && + mapping_can_writeback(vma->vm_file->f_mapping); +} + +/* + * Does this VMA require the underlying folios to have their dirty state + * tracked? + */ +bool vma_needs_dirty_tracking(struct vm_area_struct *vma) +{ + /* Only shared, writable VMAs require dirty tracking. */ + if (!vma_is_shared_writable(vma)) + return false; + + /* Does the filesystem need to be notified? */ + if (vm_ops_needs_writenotify(vma->vm_ops)) + return true; + + /* + * Even if the filesystem doesn't indicate a need for writenotify, if it + * can writeback, dirty tracking is still required. + */ + return vma_fs_can_writeback(vma); +} + +/* + * Some shared mappings will want the pages marked read-only + * to track write events. If so, we'll downgrade vm_page_prot + * to the private version (using protection_map[] without the + * VM_SHARED bit). + */ +bool vma_wants_writenotify(struct vm_area_struct *vma, pgprot_t vm_page_prot) +{ + /* If it was private or non-writable, the write bit is already clear */ + if (!vma_is_shared_writable(vma)) + return false; + + /* The backer wishes to know when pages are first written to? */ + if (vm_ops_needs_writenotify(vma->vm_ops)) + return true; + + /* The open routine did something to the protections that pgprot_modify + * won't preserve? */ + if (pgprot_val(vm_page_prot) != + pgprot_val(vm_pgprot_modify(vm_page_prot, vma->vm_flags))) + return false; + + /* + * Do we need to track softdirty? 
hugetlb does not support softdirty + * tracking yet. + */ + if (vma_soft_dirty_enabled(vma) && !is_vm_hugetlb_page(vma)) + return true; + + /* Do we need write faults for uffd-wp tracking? */ + if (userfaultfd_wp(vma)) + return true; + + /* Can the mapping track the dirty pages? */ + return vma_fs_can_writeback(vma); +} + +unsigned long count_vma_pages_range(struct mm_struct *mm, + unsigned long addr, unsigned long end) +{ + VMA_ITERATOR(vmi, mm, addr); + struct vm_area_struct *vma; + unsigned long nr_pages = 0; + + for_each_vma_range(vmi, vma, end) { + unsigned long vm_start = max(addr, vma->vm_start); + unsigned long vm_end = min(end, vma->vm_end); + + nr_pages += PHYS_PFN(vm_end - vm_start); + } + + return nr_pages; +} + +static DEFINE_MUTEX(mm_all_locks_mutex); + +static void vm_lock_anon_vma(struct mm_struct *mm, struct anon_vma *anon_vma) +{ + if (!test_bit(0, (unsigned long *) &anon_vma->root->rb_root.rb_root.rb_node)) { + /* + * The LSB of head.next can't change from under us + * because we hold the mm_all_locks_mutex. + */ + down_write_nest_lock(&anon_vma->root->rwsem, &mm->mmap_lock); + /* + * We can safely modify head.next after taking the + * anon_vma->root->rwsem. If some other vma in this mm shares + * the same anon_vma we won't take it again. + * + * No need of atomic instructions here, head.next + * can't change from under us thanks to the + * anon_vma->root->rwsem. + */ + if (__test_and_set_bit(0, (unsigned long *) + &anon_vma->root->rb_root.rb_root.rb_node)) + BUG(); + } +} + +static void vm_lock_mapping(struct mm_struct *mm, struct address_space *mapping) +{ + if (!test_bit(AS_MM_ALL_LOCKS, &mapping->flags)) { + /* + * AS_MM_ALL_LOCKS can't change from under us because + * we hold the mm_all_locks_mutex. + * + * Operations on ->flags have to be atomic because + * even if AS_MM_ALL_LOCKS is stable thanks to the + * mm_all_locks_mutex, there may be other cpus + * changing other bitflags in parallel to us. 
+ */ + if (test_and_set_bit(AS_MM_ALL_LOCKS, &mapping->flags)) + BUG(); + down_write_nest_lock(&mapping->i_mmap_rwsem, &mm->mmap_lock); + } +} + +/* + * This operation locks against the VM for all pte/vma/mm related + * operations that could ever happen on a certain mm. This includes + * vmtruncate, try_to_unmap, and all page faults. + * + * The caller must take the mmap_lock in write mode before calling + * mm_take_all_locks(). The caller isn't allowed to release the + * mmap_lock until mm_drop_all_locks() returns. + * + * mmap_lock in write mode is required in order to block all operations + * that could modify pagetables and free pages without need of + * altering the vma layout. It's also needed in write mode to prevent new + * anon_vmas from being associated with existing vmas. + * + * A single task can't take more than one mm_take_all_locks() in a row + * or it would deadlock. + * + * The LSB in anon_vma->rb_root.rb_node and the AS_MM_ALL_LOCKS bitflag in + * mapping->flags avoid taking the same lock twice, if more than one + * vma in this mm is backed by the same anon_vma or address_space. + * + * We take locks in the following order, according to the comment at the + * beginning of mm/rmap.c: + * - all hugetlbfs_i_mmap_rwsem_key locks (aka mapping->i_mmap_rwsem for + * hugetlb mapping); + * - all vmas marked locked + * - all i_mmap_rwsem locks; + * - all anon_vma->rwsem + * + * We can take all locks within these types randomly because the VM code + * doesn't nest them and we are protected from parallel mm_take_all_locks() by + * mm_all_locks_mutex. + * + * mm_take_all_locks() and mm_drop_all_locks() are expensive operations + * that may have to take thousands of locks. + * + * mm_take_all_locks() can fail if it's interrupted by signals.
+ */ +int mm_take_all_locks(struct mm_struct *mm) +{ + struct vm_area_struct *vma; + struct anon_vma_chain *avc; + VMA_ITERATOR(vmi, mm, 0); + + mmap_assert_write_locked(mm); + + mutex_lock(&mm_all_locks_mutex); + + /* + * vma_start_write() does not have a complement in mm_drop_all_locks() + * because vma_start_write() is always asymmetrical; it marks a VMA as + * being written to until mmap_write_unlock() or mmap_write_downgrade() + * is reached. + */ + for_each_vma(vmi, vma) { + if (signal_pending(current)) + goto out_unlock; + vma_start_write(vma); + } + + vma_iter_init(&vmi, mm, 0); + for_each_vma(vmi, vma) { + if (signal_pending(current)) + goto out_unlock; + if (vma->vm_file && vma->vm_file->f_mapping && + is_vm_hugetlb_page(vma)) + vm_lock_mapping(mm, vma->vm_file->f_mapping); + } + + vma_iter_init(&vmi, mm, 0); + for_each_vma(vmi, vma) { + if (signal_pending(current)) + goto out_unlock; + if (vma->vm_file && vma->vm_file->f_mapping && + !is_vm_hugetlb_page(vma)) + vm_lock_mapping(mm, vma->vm_file->f_mapping); + } + + vma_iter_init(&vmi, mm, 0); + for_each_vma(vmi, vma) { + if (signal_pending(current)) + goto out_unlock; + if (vma->anon_vma) + list_for_each_entry(avc, &vma->anon_vma_chain, same_vma) + vm_lock_anon_vma(mm, avc->anon_vma); + } + + return 0; + +out_unlock: + mm_drop_all_locks(mm); + return -EINTR; +} + +static void vm_unlock_anon_vma(struct anon_vma *anon_vma) +{ + if (test_bit(0, (unsigned long *) &anon_vma->root->rb_root.rb_root.rb_node)) { + /* + * The LSB of head.next can't change to 0 from under + * us because we hold the mm_all_locks_mutex. + * + * We must however clear the bitflag before unlocking + * the vma so the users using the anon_vma->rb_root will + * never see our bitflag. + * + * No need of atomic instructions here, head.next + * can't change from under us until we release the + * anon_vma->root->rwsem. 
+ */ + if (!__test_and_clear_bit(0, (unsigned long *) + &anon_vma->root->rb_root.rb_root.rb_node)) + BUG(); + anon_vma_unlock_write(anon_vma); + } +} + +static void vm_unlock_mapping(struct address_space *mapping) +{ + if (test_bit(AS_MM_ALL_LOCKS, &mapping->flags)) { + /* + * AS_MM_ALL_LOCKS can't change to 0 from under us + * because we hold the mm_all_locks_mutex. + */ + i_mmap_unlock_write(mapping); + if (!test_and_clear_bit(AS_MM_ALL_LOCKS, + &mapping->flags)) + BUG(); + } +} + +/* + * The mmap_lock cannot be released by the caller until + * mm_drop_all_locks() returns. + */ +void mm_drop_all_locks(struct mm_struct *mm) +{ + struct vm_area_struct *vma; + struct anon_vma_chain *avc; + VMA_ITERATOR(vmi, mm, 0); + + mmap_assert_write_locked(mm); + BUG_ON(!mutex_is_locked(&mm_all_locks_mutex)); + + for_each_vma(vmi, vma) { + if (vma->anon_vma) + list_for_each_entry(avc, &vma->anon_vma_chain, same_vma) + vm_unlock_anon_vma(avc->anon_vma); + if (vma->vm_file && vma->vm_file->f_mapping) + vm_unlock_mapping(vma->vm_file->f_mapping); + } + + mutex_unlock(&mm_all_locks_mutex); +} diff --git a/mm/vma.h b/mm/vma.h new file mode 100644 index 000000000000..6efdf1768a0a --- /dev/null +++ b/mm/vma.h @@ -0,0 +1,364 @@ +/* SPDX-License-Identifier: GPL-2.0-or-later */ +/* + * vma.h + * + * Core VMA manipulation API implemented in vma.c. + */ +#ifndef __MM_VMA_H +#define __MM_VMA_H + +/* + * VMA lock generalization + */ +struct vma_prepare { + struct vm_area_struct *vma; + struct vm_area_struct *adj_next; + struct file *file; + struct address_space *mapping; + struct anon_vma *anon_vma; + struct vm_area_struct *insert; + struct vm_area_struct *remove; + struct vm_area_struct *remove2; +}; + +struct unlink_vma_file_batch { + int count; + struct vm_area_struct *vmas[8]; +}; + +#ifdef CONFIG_DEBUG_VM_MAPLE_TREE +void validate_mm(struct mm_struct *mm); +#else +#define validate_mm(mm) do { } while (0) +#endif + +/* Required for expand_downwards(). 
*/ +void anon_vma_interval_tree_pre_update_vma(struct vm_area_struct *vma); + +/* Required for expand_downwards(). */ +void anon_vma_interval_tree_post_update_vma(struct vm_area_struct *vma); + +/* Required for do_brk_flags(). */ +void vma_prepare(struct vma_prepare *vp); + +/* Required for do_brk_flags(). */ +void init_vma_prep(struct vma_prepare *vp, + struct vm_area_struct *vma); + +/* Required for do_brk_flags(). */ +void vma_complete(struct vma_prepare *vp, + struct vma_iterator *vmi, struct mm_struct *mm); + +int vma_expand(struct vma_iterator *vmi, struct vm_area_struct *vma, + unsigned long start, unsigned long end, pgoff_t pgoff, + struct vm_area_struct *next); + +int vma_shrink(struct vma_iterator *vmi, struct vm_area_struct *vma, + unsigned long start, unsigned long end, pgoff_t pgoff); + +int +do_vmi_align_munmap(struct vma_iterator *vmi, struct vm_area_struct *vma, + struct mm_struct *mm, unsigned long start, + unsigned long end, struct list_head *uf, bool unlock); + +int do_vmi_munmap(struct vma_iterator *vmi, struct mm_struct *mm, + unsigned long start, size_t len, struct list_head *uf, + bool unlock); + +void remove_vma(struct vm_area_struct *vma, bool unreachable); + +void unmap_region(struct mm_struct *mm, struct ma_state *mas, + struct vm_area_struct *vma, struct vm_area_struct *prev, + struct vm_area_struct *next, unsigned long start, + unsigned long end, unsigned long tree_end, bool mm_wr_locked); + +/* Required by mmap_region(). */ +bool +can_vma_merge_before(struct vm_area_struct *vma, unsigned long vm_flags, + struct anon_vma *anon_vma, struct file *file, + pgoff_t vm_pgoff, struct vm_userfaultfd_ctx vm_userfaultfd_ctx, + struct anon_vma_name *anon_name); + +/* Required by mmap_region() and do_brk_flags(). 
*/ +bool +can_vma_merge_after(struct vm_area_struct *vma, unsigned long vm_flags, + struct anon_vma *anon_vma, struct file *file, + pgoff_t vm_pgoff, struct vm_userfaultfd_ctx vm_userfaultfd_ctx, + struct anon_vma_name *anon_name); + +struct vm_area_struct *vma_modify(struct vma_iterator *vmi, + struct vm_area_struct *prev, + struct vm_area_struct *vma, + unsigned long start, unsigned long end, + unsigned long vm_flags, + struct mempolicy *policy, + struct vm_userfaultfd_ctx uffd_ctx, + struct anon_vma_name *anon_name); + +/* We are about to modify the VMA's flags. */ +static inline struct vm_area_struct +*vma_modify_flags(struct vma_iterator *vmi, + struct vm_area_struct *prev, + struct vm_area_struct *vma, + unsigned long start, unsigned long end, + unsigned long new_flags) +{ + return vma_modify(vmi, prev, vma, start, end, new_flags, + vma_policy(vma), vma->vm_userfaultfd_ctx, + anon_vma_name(vma)); +} + +/* We are about to modify the VMA's flags and/or anon_name. */ +static inline struct vm_area_struct +*vma_modify_flags_name(struct vma_iterator *vmi, + struct vm_area_struct *prev, + struct vm_area_struct *vma, + unsigned long start, + unsigned long end, + unsigned long new_flags, + struct anon_vma_name *new_name) +{ + return vma_modify(vmi, prev, vma, start, end, new_flags, + vma_policy(vma), vma->vm_userfaultfd_ctx, new_name); +} + +/* We are about to modify the VMA's memory policy. */ +static inline struct vm_area_struct +*vma_modify_policy(struct vma_iterator *vmi, + struct vm_area_struct *prev, + struct vm_area_struct *vma, + unsigned long start, unsigned long end, + struct mempolicy *new_pol) +{ + return vma_modify(vmi, prev, vma, start, end, vma->vm_flags, + new_pol, vma->vm_userfaultfd_ctx, anon_vma_name(vma)); +} + +/* We are about to modify the VMA's flags and/or uffd context. 
*/ +static inline struct vm_area_struct +*vma_modify_flags_uffd(struct vma_iterator *vmi, + struct vm_area_struct *prev, + struct vm_area_struct *vma, + unsigned long start, unsigned long end, + unsigned long new_flags, + struct vm_userfaultfd_ctx new_ctx) +{ + return vma_modify(vmi, prev, vma, start, end, new_flags, + vma_policy(vma), new_ctx, anon_vma_name(vma)); +} + +struct vm_area_struct +*vma_merge_new_vma(struct vma_iterator *vmi, struct vm_area_struct *prev, + struct vm_area_struct *vma, unsigned long start, + unsigned long end, pgoff_t pgoff); + +struct vm_area_struct *vma_merge_extend(struct vma_iterator *vmi, + struct vm_area_struct *vma, + unsigned long delta); + +void unlink_file_vma_batch_init(struct unlink_vma_file_batch *vb); + +void unlink_file_vma_batch_final(struct unlink_vma_file_batch *vb); + +void unlink_file_vma_batch_add(struct unlink_vma_file_batch *vb, + struct vm_area_struct *vma); + +void unlink_file_vma(struct vm_area_struct *vma); + +void vma_link_file(struct vm_area_struct *vma); + +int vma_link(struct mm_struct *mm, struct vm_area_struct *vma); + +struct vm_area_struct *copy_vma(struct vm_area_struct **vmap, + unsigned long addr, unsigned long len, pgoff_t pgoff, + bool *need_rmap_locks); + +struct anon_vma *find_mergeable_anon_vma(struct vm_area_struct *vma); + +bool vma_needs_dirty_tracking(struct vm_area_struct *vma); +bool vma_wants_writenotify(struct vm_area_struct *vma, pgprot_t vm_page_prot); + +int mm_take_all_locks(struct mm_struct *mm); +void mm_drop_all_locks(struct mm_struct *mm); +unsigned long count_vma_pages_range(struct mm_struct *mm, + unsigned long addr, unsigned long end); + +static inline bool vma_wants_manual_pte_write_upgrade(struct vm_area_struct *vma) +{ + /* + * We want to check manually if we can change individual PTEs writable + * if we can't do that automatically for all PTEs in a mapping. For + * private mappings, that's always the case when we have write + * permissions as we properly have to handle COW. 
+ */ + if (vma->vm_flags & VM_SHARED) + return vma_wants_writenotify(vma, vma->vm_page_prot); + return !!(vma->vm_flags & VM_WRITE); +} + +#ifdef CONFIG_MMU +static inline pgprot_t vm_pgprot_modify(pgprot_t oldprot, unsigned long vm_flags) +{ + return pgprot_modify(oldprot, vm_get_page_prot(vm_flags)); +} +#endif + +static inline struct vm_area_struct *vma_prev_limit(struct vma_iterator *vmi, + unsigned long min) +{ + return mas_prev(&vmi->mas, min); +} + +static inline int vma_iter_store_gfp(struct vma_iterator *vmi, + struct vm_area_struct *vma, gfp_t gfp) +{ + if (vmi->mas.status != ma_start && + ((vmi->mas.index > vma->vm_start) || (vmi->mas.last < vma->vm_start))) + vma_iter_invalidate(vmi); + + __mas_set_range(&vmi->mas, vma->vm_start, vma->vm_end - 1); + mas_store_gfp(&vmi->mas, vma, gfp); + if (unlikely(mas_is_err(&vmi->mas))) + return -ENOMEM; + + return 0; +} + + +/* + * These three helpers classify VMAs for virtual memory accounting. + */ + +/* + * Executable code area - executable, not writable, not stack + */ +static inline bool is_exec_mapping(vm_flags_t flags) +{ + return (flags & (VM_EXEC | VM_WRITE | VM_STACK)) == VM_EXEC; +} + +/* + * Stack area (including shadow stacks) + * + * VM_GROWSUP / VM_GROWSDOWN VMAs are always private anonymous: + * do_mmap() forbids all other combinations.
+ */ +static inline bool is_stack_mapping(vm_flags_t flags) +{ + return ((flags & VM_STACK) == VM_STACK) || (flags & VM_SHADOW_STACK); +} + +/* + * Data area - private, writable, not stack + */ +static inline bool is_data_mapping(vm_flags_t flags) +{ + return (flags & (VM_WRITE | VM_SHARED | VM_STACK)) == VM_WRITE; +} + + +static inline void vma_iter_config(struct vma_iterator *vmi, + unsigned long index, unsigned long last) +{ + __mas_set_range(&vmi->mas, index, last - 1); +} + +static inline void vma_iter_reset(struct vma_iterator *vmi) +{ + mas_reset(&vmi->mas); +} + +static inline +struct vm_area_struct *vma_iter_prev_range_limit(struct vma_iterator *vmi, unsigned long min) +{ + return mas_prev_range(&vmi->mas, min); +} + +static inline +struct vm_area_struct *vma_iter_next_range_limit(struct vma_iterator *vmi, unsigned long max) +{ + return mas_next_range(&vmi->mas, max); +} + +static inline int vma_iter_area_lowest(struct vma_iterator *vmi, unsigned long min, + unsigned long max, unsigned long size) +{ + return mas_empty_area(&vmi->mas, min, max - 1, size); +} + +static inline int vma_iter_area_highest(struct vma_iterator *vmi, unsigned long min, + unsigned long max, unsigned long size) +{ + return mas_empty_area_rev(&vmi->mas, min, max - 1, size); +} + +/* + * VMA Iterator functions shared between nommu and mmap + */ +static inline int vma_iter_prealloc(struct vma_iterator *vmi, + struct vm_area_struct *vma) +{ + return mas_preallocate(&vmi->mas, vma, GFP_KERNEL); +} + +static inline void vma_iter_clear(struct vma_iterator *vmi) +{ + mas_store_prealloc(&vmi->mas, NULL); +} + +static inline struct vm_area_struct *vma_iter_load(struct vma_iterator *vmi) +{ + return mas_walk(&vmi->mas); +} + +/* Store a VMA with preallocated memory */ +static inline void vma_iter_store(struct vma_iterator *vmi, + struct vm_area_struct *vma) +{ + +#if defined(CONFIG_DEBUG_VM_MAPLE_TREE) + if (MAS_WARN_ON(&vmi->mas, vmi->mas.status != ma_start && + vmi->mas.index > 
vma->vm_start)) { + pr_warn("%lx > %lx\n store vma %lx-%lx\n into slot %lx-%lx\n", + vmi->mas.index, vma->vm_start, vma->vm_start, + vma->vm_end, vmi->mas.index, vmi->mas.last); + } + if (MAS_WARN_ON(&vmi->mas, vmi->mas.status != ma_start && + vmi->mas.last < vma->vm_start)) { + pr_warn("%lx < %lx\nstore vma %lx-%lx\ninto slot %lx-%lx\n", + vmi->mas.last, vma->vm_start, vma->vm_start, vma->vm_end, + vmi->mas.index, vmi->mas.last); + } +#endif + + if (vmi->mas.status != ma_start && + ((vmi->mas.index > vma->vm_start) || (vmi->mas.last < vma->vm_start))) + vma_iter_invalidate(vmi); + + __mas_set_range(&vmi->mas, vma->vm_start, vma->vm_end - 1); + mas_store_prealloc(&vmi->mas, vma); +} + +static inline unsigned long vma_iter_addr(struct vma_iterator *vmi) +{ + return vmi->mas.index; +} + +static inline unsigned long vma_iter_end(struct vma_iterator *vmi) +{ + return vmi->mas.last + 1; +} + +static inline int vma_iter_bulk_alloc(struct vma_iterator *vmi, + unsigned long count) +{ + return mas_expected_entries(&vmi->mas, count); +} + +static inline +struct vm_area_struct *vma_iter_prev_range(struct vma_iterator *vmi) +{ + return mas_prev_range(&vmi->mas, 0); +} + +#endif /* __MM_VMA_H */ diff --git a/mm/vma_internal.h b/mm/vma_internal.h new file mode 100644 index 000000000000..14c24d5cb582 --- /dev/null +++ b/mm/vma_internal.h @@ -0,0 +1,50 @@ +/* SPDX-License-Identifier: GPL-2.0-or-later */ +/* + * vma_internal.h + * + * Headers required by vma.c, which can be substituted accordingly when testing + * VMA functionality. 
+ */ + +#ifndef __MM_VMA_INTERNAL_H +#define __MM_VMA_INTERNAL_H + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include +#include + +#include "internal.h" + +#endif /* __MM_VMA_INTERNAL_H */ From patchwork Mon Jul 29 11:50:39 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Lorenzo Stoakes X-Patchwork-Id: 13744780 Received: from mx0b-00069f02.pphosted.com (mx0b-00069f02.pphosted.com [205.220.177.32]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id A7356147C60; Mon, 29 Jul 2024 11:51:35 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=fail smtp.client-ip=205.220.177.32 ARC-Seal: i=2; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1722253898; cv=fail; b=feeuXxUmGukEY/QUsxmgPrjC+Demm8BWi6fob7zLN0+mbAh99bYlrOeybKHbm6vdlRulgTmJkaDnbrCmjnPoxmN8/G+Db1zbVa6482iPMnBuNveH9je562a79RLXTc2eCpyiVYnAjhttzWCreXj2JT0DoiMG/Lg5VQwK2MjSQhg= ARC-Message-Signature: i=2; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1722253898; c=relaxed/simple; bh=SQwgd9K3Hgg1OKv8/IXrezGsWaKUwYTjImKl1oIXXd0=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: Content-Type:MIME-Version; b=WspQoUr36AwREJFd4J2LP+4B68Zq95QdIOsq+mYb8qH5g7iMpWAXXSStyITVFMJBAfMWVVwABy71NkfRKiKEgGXAS7gV6BhBdlwalkhue+JKDD1Ug3ec/J6o5oytC/QEeWDRzrnWEFW5pSjFqzwLb6fVwgXCCPNku3JALpJ33r4= ARC-Authentication-Results: i=2; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=oracle.com; spf=pass smtp.mailfrom=oracle.com; dkim=pass (2048-bit key) header.d=oracle.com header.i=@oracle.com header.b=Fmh1MQWV; dkim=pass (1024-bit 
key) header.d=oracle.onmicrosoft.com header.i=@oracle.onmicrosoft.com header.b=F0sovX24; arc=fail smtp.client-ip=205.220.177.32 From: Lorenzo Stoakes To: Andrew Morton Cc: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, "Liam R . Howlett" , Vlastimil Babka , Matthew Wilcox , Alexander Viro , Christian Brauner , Jan Kara , Eric Biederman , Kees Cook , Suren Baghdasaryan , SeongJae Park , Shuah Khan , Brendan Higgins , David Gow , Rae Moar Subject: [PATCH v4 5/7] MAINTAINERS: Add entry for new VMA files Date: Mon, 29 Jul 2024 12:50:39 +0100 Message-ID: X-Mailer: git-send-email 2.45.2 In-Reply-To: References: Precedence: bulk X-Mailing-List: linux-fsdevel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0
The vma files contain logic split from mmap.c for the most part and are all relevant to VMA logic, so maintain the same reviewers for
both.

Acked-by: Vlastimil Babka
Acked-by: Liam R. Howlett
Signed-off-by: Lorenzo Stoakes
---
 MAINTAINERS | 13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index 42decde38320..d4cc9f832d49 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -24383,6 +24383,19 @@ F:	include/uapi/linux/vsockmon.h
 F:	net/vmw_vsock/
 F:	tools/testing/vsock/

+VMA
+M:	Andrew Morton
+R:	Liam R. Howlett
+R:	Vlastimil Babka
+R:	Lorenzo Stoakes
+L:	linux-mm@kvack.org
+S:	Maintained
+W:	https://www.linux-mm.org
+T:	git git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
+F:	mm/vma.c
+F:	mm/vma.h
+F:	mm/vma_internal.h
+
 VMALLOC
 M:	Andrew Morton
 R:	Uladzislau Rezki

From patchwork Mon Jul 29 11:50:40 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Lorenzo Stoakes
X-Patchwork-Id: 13744782
Received: from mx0a-00069f02.pphosted.com (mx0a-00069f02.pphosted.com [205.220.165.32]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 81A111487FE; Mon, 29 Jul 2024 11:51:47 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org; arc=fail smtp.client-ip=205.220.165.32
ARC-Seal: i=2; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1722253909; cv=fail; b=RbbQK4mk2Bk8tBYvunAgCOAUvIZ0cSdNJWca0UofJpNUbF81J8TRtd2+PBeUjllQEkj6sOqx5/pqw39QisQoE3m2O1bHWow2DwFLO2k9xx8lExucZhukiwaiFZOC64NPEADMGvdgtp3x4fzK8MNzn4TgAORCdndzuKlJqX+ub9U=
ARC-Message-Signature: i=2; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1722253909; c=relaxed/simple; bh=WF5MaBf0OuKqyo4tlhdPrMGZMjR9ABPXDjTHLH//Dac=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: Content-Type:MIME-Version; b=cKSJjOurIPmZSdeKQ2Ah5p/xZOHGKvP4r4w0SVlJEsubCQNbGjIrz3ma/9DkNBu0wA4VkZxhSYgA04r+m20v3WF6q+afJqEgotjjfvmixNjsq/AknnCa7vg4QKI5qiXvhmOVrnIPGJbR+fovjotWRetKgEH25aPPwIzH5+hQal8=
ARC-Authentication-Results: i=2;
 smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=oracle.com; spf=pass smtp.mailfrom=oracle.com; dkim=pass (2048-bit key) header.d=oracle.com header.i=@oracle.com header.b=JYqZUL7f; dkim=pass (1024-bit key) header.d=oracle.onmicrosoft.com header.i=@oracle.onmicrosoft.com header.b=o3sbG6nI; arc=fail smtp.client-ip=205.220.165.32
From: Lorenzo Stoakes
To: Andrew Morton
Cc: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, "Liam R . Howlett" , Vlastimil Babka , Matthew Wilcox , Alexander Viro , Christian Brauner , Jan Kara , Eric Biederman , Kees Cook , Suren Baghdasaryan , SeongJae Park , Shuah Khan , Brendan Higgins , David Gow , Rae Moar
Subject: [PATCH v4 6/7] tools: separate out shared radix-tree components
Date: Mon, 29 Jul 2024 12:50:40 +0100
Message-ID: <1ee720c265808168e0d75608e687607d77c36719.1722251717.git.lorenzo.stoakes@oracle.com>
X-Mailer: git-send-email 2.45.2
In-Reply-To:
References:
Precedence: bulk
X-Mailing-List: linux-fsdevel@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0

The core components contained within the radix-tree tests which provide
shims for kernel headers and access to the maple tree are
useful for testing other things, so separate them out and make the radix
tree tests dependent on the shared components.

This lays the groundwork for us to add VMA tests of the newly introduced
vma.c file.

Acked-by: Vlastimil Babka
Reviewed-by: Liam R. Howlett
Signed-off-by: Lorenzo Stoakes
---
 tools/testing/radix-tree/.gitignore           |  1 +
 tools/testing/radix-tree/Makefile             | 72 ++++---------------
 tools/testing/radix-tree/xarray.c             | 10 +--
 .../generated => shared}/autoconf.h           |  0
 tools/testing/{radix-tree => shared}/linux.c  |  0
 .../{radix-tree => shared}/linux/bug.h        |  0
 .../{radix-tree => shared}/linux/cpu.h        |  0
 .../{radix-tree => shared}/linux/idr.h        |  0
 .../{radix-tree => shared}/linux/init.h       |  0
 .../{radix-tree => shared}/linux/kconfig.h    |  0
 .../{radix-tree => shared}/linux/kernel.h     |  0
 .../{radix-tree => shared}/linux/kmemleak.h   |  0
 .../{radix-tree => shared}/linux/local_lock.h |  0
 .../{radix-tree => shared}/linux/lockdep.h    |  0
 .../{radix-tree => shared}/linux/maple_tree.h |  0
 .../{radix-tree => shared}/linux/percpu.h     |  0
 .../{radix-tree => shared}/linux/preempt.h    |  0
 .../{radix-tree => shared}/linux/radix-tree.h |  0
 .../{radix-tree => shared}/linux/rcupdate.h   |  0
 .../{radix-tree => shared}/linux/xarray.h     |  0
 tools/testing/shared/maple-shared.h           |  9 +++
 tools/testing/shared/maple-shim.c             |  7 ++
 tools/testing/shared/shared.h                 | 33 +++++++++
 tools/testing/shared/shared.mk                | 72 +++++++++++++++++++
 .../trace/events/maple_tree.h                 |  0
 tools/testing/shared/xarray-shared.c          |  5 ++
 tools/testing/shared/xarray-shared.h          |  4 ++
 27 files changed, 144 insertions(+), 69 deletions(-)
 rename tools/testing/{radix-tree/generated => shared}/autoconf.h (100%)
 rename tools/testing/{radix-tree => shared}/linux.c (100%)
 rename tools/testing/{radix-tree => shared}/linux/bug.h (100%)
 rename tools/testing/{radix-tree => shared}/linux/cpu.h (100%)
 rename tools/testing/{radix-tree => shared}/linux/idr.h (100%)
 rename tools/testing/{radix-tree => shared}/linux/init.h (100%)
 rename tools/testing/{radix-tree => shared}/linux/kconfig.h (100%)
 rename tools/testing/{radix-tree => shared}/linux/kernel.h (100%)
 rename tools/testing/{radix-tree => shared}/linux/kmemleak.h (100%)
 rename tools/testing/{radix-tree => shared}/linux/local_lock.h (100%)
 rename tools/testing/{radix-tree => shared}/linux/lockdep.h (100%)
 rename tools/testing/{radix-tree => shared}/linux/maple_tree.h (100%)
 rename tools/testing/{radix-tree => shared}/linux/percpu.h (100%)
 rename tools/testing/{radix-tree => shared}/linux/preempt.h (100%)
 rename tools/testing/{radix-tree => shared}/linux/radix-tree.h (100%)
 rename tools/testing/{radix-tree => shared}/linux/rcupdate.h (100%)
 rename tools/testing/{radix-tree => shared}/linux/xarray.h (100%)
 create mode 100644 tools/testing/shared/maple-shared.h
 create mode 100644 tools/testing/shared/maple-shim.c
 create mode 100644 tools/testing/shared/shared.h
 create mode 100644 tools/testing/shared/shared.mk
 rename tools/testing/{radix-tree => shared}/trace/events/maple_tree.h (100%)
 create mode 100644 tools/testing/shared/xarray-shared.c
 create mode 100644 tools/testing/shared/xarray-shared.h

diff --git a/tools/testing/radix-tree/.gitignore b/tools/testing/radix-tree/.gitignore
index 49bccb90c35b..ce167a761981 100644
--- a/tools/testing/radix-tree/.gitignore
+++ b/tools/testing/radix-tree/.gitignore
@@ -1,4 +1,5 @@
 # SPDX-License-Identifier: GPL-2.0-only
+generated/autoconf.h
 generated/bit-length.h
 generated/map-shift.h
 idr.c
diff --git a/tools/testing/radix-tree/Makefile b/tools/testing/radix-tree/Makefile
index d1acd7d58850..8b3591a51e1f 100644
--- a/tools/testing/radix-tree/Makefile
+++ b/tools/testing/radix-tree/Makefile
@@ -1,77 +1,29 @@
 # SPDX-License-Identifier: GPL-2.0

-CFLAGS += -I. -I../../include -I../../../lib -g -Og -Wall \
-	-D_LGPL_SOURCE -fsanitize=address -fsanitize=undefined
-LDFLAGS += -fsanitize=address -fsanitize=undefined
-LDLIBS+= -lpthread -lurcu
-TARGETS = main idr-test multiorder xarray maple
-LIBS := slab.o find_bit.o bitmap.o hweight.o vsprintf.o
-CORE_OFILES := xarray.o radix-tree.o idr.o linux.o test.o maple.o $(LIBS)
-OFILES = main.o $(CORE_OFILES) regression1.o regression2.o regression3.o \
-	regression4.o tag_check.o multiorder.o idr-test.o iteration_check.o \
-	iteration_check_2.o benchmark.o
-
-ifndef SHIFT
-	SHIFT=3
-endif
+.PHONY: clean

-ifeq ($(BUILD), 32)
-	CFLAGS += -m32
-	LDFLAGS += -m32
-LONG_BIT := 32
-endif
-
-ifndef LONG_BIT
-LONG_BIT := $(shell getconf LONG_BIT)
-endif
+TARGETS = main idr-test multiorder xarray maple
+CORE_OFILES = $(SHARED_OFILES) xarray.o maple.o test.o
+OFILES = main.o $(CORE_OFILES) regression1.o regression2.o \
+	regression3.o regression4.o tag_check.o multiorder.o idr-test.o \
+	iteration_check.o iteration_check_2.o benchmark.o

 targets: generated/map-shift.h generated/bit-length.h $(TARGETS)

+include ../shared/shared.mk
+
 main: $(OFILES)

 idr-test.o: ../../../lib/test_ida.c
 idr-test: idr-test.o $(CORE_OFILES)

-xarray: $(CORE_OFILES)
+xarray: $(CORE_OFILES) xarray.o

-maple: $(CORE_OFILES)
+maple: $(CORE_OFILES) maple.o

 multiorder: multiorder.o $(CORE_OFILES)

 clean:
-	$(RM) $(TARGETS) *.o radix-tree.c idr.c generated/map-shift.h generated/bit-length.h
-
-vpath %.c ../../lib
-
-$(OFILES): Makefile *.h */*.h generated/map-shift.h generated/bit-length.h \
-	../../include/linux/*.h \
-	../../include/asm/*.h \
-	../../../include/linux/xarray.h \
-	../../../include/linux/maple_tree.h \
-	../../../include/linux/radix-tree.h \
-	../../../lib/radix-tree.h \
-	../../../include/linux/idr.h
-
-radix-tree.c: ../../../lib/radix-tree.c
-	sed -e 's/^static //' -e 's/__always_inline //' -e 's/inline //' < $< > $@
-
-idr.c: ../../../lib/idr.c
-	sed -e 's/^static //' -e 's/__always_inline //' -e 's/inline //' < $< > $@
-
-xarray.o: ../../../lib/xarray.c ../../../lib/test_xarray.c
-
-maple.o: ../../../lib/maple_tree.c ../../../lib/test_maple_tree.c
-
-generated/map-shift.h:
-	@if ! grep -qws $(SHIFT) generated/map-shift.h; then \
-		echo "#define XA_CHUNK_SHIFT $(SHIFT)" > \
-		generated/map-shift.h; \
-	fi
-
-generated/bit-length.h: FORCE
-	@if ! grep -qws CONFIG_$(LONG_BIT)BIT generated/bit-length.h; then \
-		echo "Generating $@"; \
-		echo "#define CONFIG_$(LONG_BIT)BIT 1" > $@; \
-	fi
+	$(RM) $(TARGETS) *.o radix-tree.c idr.c generated/*

-FORCE: ;
+$(OFILES): $(SHARED_DEPS) *.h
diff --git a/tools/testing/radix-tree/xarray.c b/tools/testing/radix-tree/xarray.c
index d0e53bff1eb6..253208a8541b 100644
--- a/tools/testing/radix-tree/xarray.c
+++ b/tools/testing/radix-tree/xarray.c
@@ -4,17 +4,9 @@
  * Copyright (c) 2018 Matthew Wilcox
  */

-#define XA_DEBUG
+#include "xarray-shared.h"
 #include "test.h"

-#define module_init(x)
-#define module_exit(x)
-#define MODULE_AUTHOR(x)
-#define MODULE_DESCRIPTION(X)
-#define MODULE_LICENSE(x)
-#define dump_stack() assert(0)
-
-#include "../../../lib/xarray.c"
 #undef XA_DEBUG
 #include "../../../lib/test_xarray.c"

diff --git a/tools/testing/radix-tree/generated/autoconf.h b/tools/testing/shared/autoconf.h
similarity index 100%
rename from tools/testing/radix-tree/generated/autoconf.h
rename to tools/testing/shared/autoconf.h
diff --git a/tools/testing/radix-tree/linux.c b/tools/testing/shared/linux.c
similarity index 100%
rename from tools/testing/radix-tree/linux.c
rename to tools/testing/shared/linux.c
diff --git a/tools/testing/radix-tree/linux/bug.h b/tools/testing/shared/linux/bug.h
similarity index 100%
rename from tools/testing/radix-tree/linux/bug.h
rename to tools/testing/shared/linux/bug.h
diff --git a/tools/testing/radix-tree/linux/cpu.h b/tools/testing/shared/linux/cpu.h
similarity index 100%
rename from tools/testing/radix-tree/linux/cpu.h
rename to tools/testing/shared/linux/cpu.h
diff --git a/tools/testing/radix-tree/linux/idr.h b/tools/testing/shared/linux/idr.h
similarity index 100%
rename from tools/testing/radix-tree/linux/idr.h
rename to tools/testing/shared/linux/idr.h
diff --git a/tools/testing/radix-tree/linux/init.h b/tools/testing/shared/linux/init.h
similarity index 100%
rename from tools/testing/radix-tree/linux/init.h
rename to tools/testing/shared/linux/init.h
diff --git a/tools/testing/radix-tree/linux/kconfig.h b/tools/testing/shared/linux/kconfig.h
similarity index 100%
rename from tools/testing/radix-tree/linux/kconfig.h
rename to tools/testing/shared/linux/kconfig.h
diff --git a/tools/testing/radix-tree/linux/kernel.h b/tools/testing/shared/linux/kernel.h
similarity index 100%
rename from tools/testing/radix-tree/linux/kernel.h
rename to tools/testing/shared/linux/kernel.h
diff --git a/tools/testing/radix-tree/linux/kmemleak.h b/tools/testing/shared/linux/kmemleak.h
similarity index 100%
rename from tools/testing/radix-tree/linux/kmemleak.h
rename to tools/testing/shared/linux/kmemleak.h
diff --git a/tools/testing/radix-tree/linux/local_lock.h b/tools/testing/shared/linux/local_lock.h
similarity index 100%
rename from tools/testing/radix-tree/linux/local_lock.h
rename to tools/testing/shared/linux/local_lock.h
diff --git a/tools/testing/radix-tree/linux/lockdep.h b/tools/testing/shared/linux/lockdep.h
similarity index 100%
rename from tools/testing/radix-tree/linux/lockdep.h
rename to tools/testing/shared/linux/lockdep.h
diff --git a/tools/testing/radix-tree/linux/maple_tree.h b/tools/testing/shared/linux/maple_tree.h
similarity index 100%
rename from tools/testing/radix-tree/linux/maple_tree.h
rename to tools/testing/shared/linux/maple_tree.h
diff --git a/tools/testing/radix-tree/linux/percpu.h b/tools/testing/shared/linux/percpu.h
similarity index 100%
rename from tools/testing/radix-tree/linux/percpu.h
rename to tools/testing/shared/linux/percpu.h
diff --git a/tools/testing/radix-tree/linux/preempt.h b/tools/testing/shared/linux/preempt.h
similarity index 100%
rename from tools/testing/radix-tree/linux/preempt.h
rename to tools/testing/shared/linux/preempt.h
diff --git a/tools/testing/radix-tree/linux/radix-tree.h b/tools/testing/shared/linux/radix-tree.h
similarity index 100%
rename from tools/testing/radix-tree/linux/radix-tree.h
rename to tools/testing/shared/linux/radix-tree.h
diff --git a/tools/testing/radix-tree/linux/rcupdate.h b/tools/testing/shared/linux/rcupdate.h
similarity index 100%
rename from tools/testing/radix-tree/linux/rcupdate.h
rename to tools/testing/shared/linux/rcupdate.h
diff --git a/tools/testing/radix-tree/linux/xarray.h b/tools/testing/shared/linux/xarray.h
similarity index 100%
rename from tools/testing/radix-tree/linux/xarray.h
rename to tools/testing/shared/linux/xarray.h
diff --git a/tools/testing/shared/maple-shared.h b/tools/testing/shared/maple-shared.h
new file mode 100644
index 000000000000..3d847edd149d
--- /dev/null
+++ b/tools/testing/shared/maple-shared.h
@@ -0,0 +1,9 @@
+/* SPDX-License-Identifier: GPL-2.0+ */
+
+#define CONFIG_DEBUG_MAPLE_TREE
+#define CONFIG_MAPLE_SEARCH
+#define MAPLE_32BIT (MAPLE_NODE_SLOTS > 31)
+#include "shared.h"
+#include
+#include
+#include "linux/init.h"
diff --git a/tools/testing/shared/maple-shim.c b/tools/testing/shared/maple-shim.c
new file mode 100644
index 000000000000..640df76f483e
--- /dev/null
+++ b/tools/testing/shared/maple-shim.c
@@ -0,0 +1,7 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+
+/* Very simple shim around the maple tree. */
+
+#include "maple-shared.h"
+
+#include "../../../lib/maple_tree.c"
diff --git a/tools/testing/shared/shared.h b/tools/testing/shared/shared.h
new file mode 100644
index 000000000000..f08f683812ad
--- /dev/null
+++ b/tools/testing/shared/shared.h
@@ -0,0 +1,33 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+#include
+#include
+#include
+#include
+
+#include
+#include
+
+#ifndef module_init
+#define module_init(x)
+#endif
+
+#ifndef module_exit
+#define module_exit(x)
+#endif
+
+#ifndef MODULE_AUTHOR
+#define MODULE_AUTHOR(x)
+#endif
+
+#ifndef MODULE_LICENSE
+#define MODULE_LICENSE(x)
+#endif
+
+#ifndef MODULE_DESCRIPTION
+#define MODULE_DESCRIPTION(x)
+#endif
+
+#ifndef dump_stack
+#define dump_stack() assert(0)
+#endif
diff --git a/tools/testing/shared/shared.mk b/tools/testing/shared/shared.mk
new file mode 100644
index 000000000000..a05f0588513a
--- /dev/null
+++ b/tools/testing/shared/shared.mk
@@ -0,0 +1,72 @@
+# SPDX-License-Identifier: GPL-2.0
+
+CFLAGS += -I../shared -I. -I../../include -I../../../lib -g -Og -Wall \
+	-D_LGPL_SOURCE -fsanitize=address -fsanitize=undefined
+LDFLAGS += -fsanitize=address -fsanitize=undefined
+LDLIBS += -lpthread -lurcu
+LIBS := slab.o find_bit.o bitmap.o hweight.o vsprintf.o
+SHARED_OFILES = xarray-shared.o radix-tree.o idr.o linux.o $(LIBS)
+
+SHARED_DEPS = Makefile ../shared/shared.mk ../shared/*.h generated/map-shift.h \
+	generated/bit-length.h generated/autoconf.h \
+	../../include/linux/*.h \
+	../../include/asm/*.h \
+	../../../include/linux/xarray.h \
+	../../../include/linux/maple_tree.h \
+	../../../include/linux/radix-tree.h \
+	../../../lib/radix-tree.h \
+	../../../include/linux/idr.h
+
+ifndef SHIFT
+	SHIFT=3
+endif
+
+ifeq ($(BUILD), 32)
+	CFLAGS += -m32
+	LDFLAGS += -m32
+LONG_BIT := 32
+endif
+
+ifndef LONG_BIT
+LONG_BIT := $(shell getconf LONG_BIT)
+endif
+
+%.o: ../shared/%.c
+	$(CC) -c $(CFLAGS) $< -o $@
+
+vpath %.c ../../lib
+
+$(SHARED_OFILES): $(SHARED_DEPS)
+
+radix-tree.c: ../../../lib/radix-tree.c
+	sed -e 's/^static //' -e 's/__always_inline //' -e 's/inline //' < $< > $@
+
+idr.c: ../../../lib/idr.c
+	sed -e 's/^static //' -e 's/__always_inline //' -e 's/inline //' < $< > $@
+
+xarray-shared.o: ../shared/xarray-shared.c ../../../lib/xarray.c \
+	../../../lib/test_xarray.c
+
+maple-shared.o: ../shared/maple-shared.c ../../../lib/maple_tree.c \
+	../../../lib/test_maple_tree.c
+
+generated/autoconf.h:
+	@mkdir -p generated
+	cp ../shared/autoconf.h generated/autoconf.h
+
+generated/map-shift.h:
+	@mkdir -p generated
+	@if ! grep -qws $(SHIFT) generated/map-shift.h; then \
+		echo "Generating $@"; \
+		echo "#define XA_CHUNK_SHIFT $(SHIFT)" > \
+		generated/map-shift.h; \
+	fi
+
+generated/bit-length.h: FORCE
+	@mkdir -p generated
+	@if ! grep -qws CONFIG_$(LONG_BIT)BIT generated/bit-length.h; then \
+		echo "Generating $@"; \
+		echo "#define CONFIG_$(LONG_BIT)BIT 1" > $@; \
+	fi
+
+FORCE: ;
diff --git a/tools/testing/radix-tree/trace/events/maple_tree.h b/tools/testing/shared/trace/events/maple_tree.h
similarity index 100%
rename from tools/testing/radix-tree/trace/events/maple_tree.h
rename to tools/testing/shared/trace/events/maple_tree.h
diff --git a/tools/testing/shared/xarray-shared.c b/tools/testing/shared/xarray-shared.c
new file mode 100644
index 000000000000..e90901958dcd
--- /dev/null
+++ b/tools/testing/shared/xarray-shared.c
@@ -0,0 +1,5 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+
+#include "xarray-shared.h"
+
+#include "../../../lib/xarray.c"
diff --git a/tools/testing/shared/xarray-shared.h b/tools/testing/shared/xarray-shared.h
new file mode 100644
index 000000000000..ac2d16ff53ae
--- /dev/null
+++ b/tools/testing/shared/xarray-shared.h
@@ -0,0 +1,4 @@
+/* SPDX-License-Identifier: GPL-2.0+ */
+
+#define XA_DEBUG
+#include "shared.h"

From patchwork Mon Jul 29 11:50:41 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Lorenzo Stoakes
X-Patchwork-Id: 13744783
From: Lorenzo Stoakes
To: Andrew Morton
Cc: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, "Liam R . Howlett" , Vlastimil Babka , Matthew Wilcox , Alexander Viro , Christian Brauner , Jan Kara , Eric Biederman , Kees Cook , Suren Baghdasaryan , SeongJae Park , Shuah Khan , Brendan Higgins , David Gow , Rae Moar
Subject: [PATCH v4 7/7] tools: add skeleton code for userland testing of VMA logic
Date: Mon, 29 Jul 2024 12:50:41 +0100
Message-ID: <533ffa2eec771cbe6b387dd049a7f128a53eb616.1722251717.git.lorenzo.stoakes@oracle.com>
X-Mailer: git-send-email 2.45.2
In-Reply-To:
References:
Precedence: bulk
X-Mailing-List: linux-fsdevel@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
X-Microsoft-Antispam-Message-Info:
QRcis9xl5rBqX6wshL0EwJNgqtbHed1CPIcoVw8FEZ4eUp+d+J91STPYMeQbEas2IiXPOitOzdFDoFDKdK5wLWMt9zqyDTSu2qpMYsIkE7/oTI2UGFuDPw+rc7tuB6G7HESIrestZaWL5FEixWEOAYkhDjx9a8xPXz8q4gpi5hXYxv+J2zO9XqQk7uFmWJNCfVxN/bxO7b3mpLZZQef27FHBIJDTr6dMobpNI+IQ765iGcDsOwxio87j2QAipnRi00PxavrAgLn8rHTIOQlFSqpGlB7MFg59FXzmEhtHIe1LVdd8UyOyLX/y5C0ct0Les+DdayvajD+7SOkUP87pkVHt1rbCuKJF2bjpEg0/t8lhFQbeWIy96fUwX+b0nV8rZDfcNNU9hjg4cDIi3NpWSYYjki+FoxKNVvuTMTEDQckUBmOpuPOBF4M8w2WC/boADkFvFHmwT6SZnT4mYuDzi5IguJgqEDIORUBGnh8bUdF7xJ/2mFyNot+svT/4MzBYqDh6KqDpB9tws5xQkwXJmC8EIK4ebdMGemJ1D1LFILNcit9O131eHd11XR1v5q7V13BTEHcUcNATDc0Uzsb6Fn8jHSLBJ0NprAs/yny8Mc0= X-OriginatorOrg: oracle.com X-MS-Exchange-CrossTenant-Network-Message-Id: fe70a73d-5ae0-4524-d0b6-08dcafc4c687 X-MS-Exchange-CrossTenant-AuthSource: SJ0PR10MB5613.namprd10.prod.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Internal X-MS-Exchange-CrossTenant-OriginalArrivalTime: 29 Jul 2024 11:51:26.1585 (UTC) X-MS-Exchange-CrossTenant-FromEntityHeader: Hosted X-MS-Exchange-CrossTenant-Id: 4e2c6054-71cb-48f1-bd6c-3a9705aca71b X-MS-Exchange-CrossTenant-MailboxType: HOSTED X-MS-Exchange-CrossTenant-UserPrincipalName: 0NbY9kDB2nRs4WdhaOV+XI5wgmarPIEtqINsSyJVKNpk6JbLzZJ6qF2rmJ03p/tulW9TmT+6vZY6jfjUwS8r4cQaOFYu+mQdIh+36rpornM= X-MS-Exchange-Transport-CrossTenantHeadersStamped: SJ0PR10MB4543 X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.293,Aquarius:18.0.1039,Hydra:6.0.680,FMLib:17.12.28.16 definitions=2024-07-29_10,2024-07-26_01,2024-05-17_01 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 phishscore=0 mlxlogscore=999 bulkscore=0 suspectscore=0 spamscore=0 malwarescore=0 adultscore=0 mlxscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2407110000 definitions=main-2407290080 X-Proofpoint-GUID: 71JI87S15r7gsF_mJvCU5IusftPwd7S_ X-Proofpoint-ORIG-GUID: 71JI87S15r7gsF_mJvCU5IusftPwd7S_ Establish a new userland VMA unit testing implementation under tools/testing which utilises existing logic providing maple tree 
support in userland, built on the now-shared code previously exclusive to radix tree testing. This provides fundamental VMA operations whose API is defined in mm/vma.h, while stubbing out superfluous functionality. This is a proof of concept: the test implementation is functional and sufficient to compile vma.c in userland, but it contains only cursory tests that demonstrate basic functionality. Tested-by: SeongJae Park Acked-by: Vlastimil Babka Reviewed-by: Liam R. Howlett Signed-off-by: Lorenzo Stoakes --- MAINTAINERS | 1 + tools/testing/vma/.gitignore | 7 + tools/testing/vma/Makefile | 16 + tools/testing/vma/linux/atomic.h | 12 + tools/testing/vma/linux/mmzone.h | 38 ++ tools/testing/vma/vma.c | 207 ++++++++ tools/testing/vma/vma_internal.h | 882 +++++++++++++++++++++++++++++++ 7 files changed, 1163 insertions(+) create mode 100644 tools/testing/vma/.gitignore create mode 100644 tools/testing/vma/Makefile create mode 100644 tools/testing/vma/linux/atomic.h create mode 100644 tools/testing/vma/linux/mmzone.h create mode 100644 tools/testing/vma/vma.c create mode 100644 tools/testing/vma/vma_internal.h diff --git a/MAINTAINERS b/MAINTAINERS index d4cc9f832d49..c0baa29a8323 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -24395,6 +24395,7 @@ T: git git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm F: mm/vma.c F: mm/vma.h F: mm/vma_internal.h +F: tools/testing/vma/ VMALLOC M: Andrew Morton diff --git a/tools/testing/vma/.gitignore b/tools/testing/vma/.gitignore new file mode 100644 index 000000000000..b003258eba79 --- /dev/null +++ b/tools/testing/vma/.gitignore @@ -0,0 +1,7 @@ +# SPDX-License-Identifier: GPL-2.0-only +generated/bit-length.h +generated/map-shift.h +generated/autoconf.h +idr.c +radix-tree.c +vma diff --git a/tools/testing/vma/Makefile b/tools/testing/vma/Makefile new file mode 100644 index 000000000000..bfc905d222cf --- /dev/null +++ b/tools/testing/vma/Makefile @@ -0,0 +1,16 @@ +# SPDX-License-Identifier: GPL-2.0-or-later + 
+.PHONY: default + +default: vma + +include ../shared/shared.mk + +OFILES = $(SHARED_OFILES) vma.o maple-shim.o +TARGETS = vma + +vma: $(OFILES) vma_internal.h ../../../mm/vma.c ../../../mm/vma.h + $(CC) $(CFLAGS) -o $@ $(OFILES) $(LDLIBS) + +clean: + $(RM) $(TARGETS) *.o radix-tree.c idr.c generated/map-shift.h generated/bit-length.h generated/autoconf.h diff --git a/tools/testing/vma/linux/atomic.h b/tools/testing/vma/linux/atomic.h new file mode 100644 index 000000000000..e01f66f98982 --- /dev/null +++ b/tools/testing/vma/linux/atomic.h @@ -0,0 +1,12 @@ +/* SPDX-License-Identifier: GPL-2.0-or-later */ + +#ifndef _LINUX_ATOMIC_H +#define _LINUX_ATOMIC_H + +#define atomic_t int32_t +#define atomic_inc(x) uatomic_inc(x) +#define atomic_read(x) uatomic_read(x) +#define atomic_set(x, y) do {} while (0) +#define U8_MAX UCHAR_MAX + +#endif /* _LINUX_ATOMIC_H */ diff --git a/tools/testing/vma/linux/mmzone.h b/tools/testing/vma/linux/mmzone.h new file mode 100644 index 000000000000..33cd1517f7a3 --- /dev/null +++ b/tools/testing/vma/linux/mmzone.h @@ -0,0 +1,38 @@ +/* SPDX-License-Identifier: GPL-2.0-or-later */ + +#ifndef _LINUX_MMZONE_H +#define _LINUX_MMZONE_H + +#include + +struct pglist_data *first_online_pgdat(void); +struct pglist_data *next_online_pgdat(struct pglist_data *pgdat); + +#define for_each_online_pgdat(pgdat) \ + for (pgdat = first_online_pgdat(); \ + pgdat; \ + pgdat = next_online_pgdat(pgdat)) + +enum zone_type { + __MAX_NR_ZONES +}; + +#define MAX_NR_ZONES __MAX_NR_ZONES +#define MAX_PAGE_ORDER 10 +#define MAX_ORDER_NR_PAGES (1 << MAX_PAGE_ORDER) + +#define pageblock_order MAX_PAGE_ORDER +#define pageblock_nr_pages BIT(pageblock_order) +#define pageblock_align(pfn) ALIGN((pfn), pageblock_nr_pages) +#define pageblock_start_pfn(pfn) ALIGN_DOWN((pfn), pageblock_nr_pages) + +struct zone { + atomic_long_t managed_pages; +}; + +typedef struct pglist_data { + struct zone node_zones[MAX_NR_ZONES]; + +} pg_data_t; + +#endif /* _LINUX_MMZONE_H */ diff --git 
a/tools/testing/vma/vma.c b/tools/testing/vma/vma.c new file mode 100644 index 000000000000..48e033c60d87 --- /dev/null +++ b/tools/testing/vma/vma.c @@ -0,0 +1,207 @@ +// SPDX-License-Identifier: GPL-2.0-or-later + +#include +#include +#include + +#include "maple-shared.h" +#include "vma_internal.h" + +/* + * Directly import the VMA implementation here. Our vma_internal.h wrapper + * provides userland-equivalent functionality for everything vma.c uses. + */ +#include "../../../mm/vma.c" + +const struct vm_operations_struct vma_dummy_vm_ops; + +#define ASSERT_TRUE(_expr) \ + do { \ + if (!(_expr)) { \ + fprintf(stderr, \ + "Assert FAILED at %s:%d:%s(): %s is FALSE.\n", \ + __FILE__, __LINE__, __FUNCTION__, #_expr); \ + return false; \ + } \ + } while (0) +#define ASSERT_FALSE(_expr) ASSERT_TRUE(!(_expr)) +#define ASSERT_EQ(_val1, _val2) ASSERT_TRUE((_val1) == (_val2)) +#define ASSERT_NE(_val1, _val2) ASSERT_TRUE((_val1) != (_val2)) + +static struct vm_area_struct *alloc_vma(struct mm_struct *mm, + unsigned long start, + unsigned long end, + pgoff_t pgoff, + vm_flags_t flags) +{ + struct vm_area_struct *ret = vm_area_alloc(mm); + + if (ret == NULL) + return NULL; + + ret->vm_start = start; + ret->vm_end = end; + ret->vm_pgoff = pgoff; + ret->__vm_flags = flags; + + return ret; +} + +static bool test_simple_merge(void) +{ + struct vm_area_struct *vma; + unsigned long flags = VM_READ | VM_WRITE | VM_MAYREAD | VM_MAYWRITE; + struct mm_struct mm = {}; + struct vm_area_struct *vma_left = alloc_vma(&mm, 0, 0x1000, 0, flags); + struct vm_area_struct *vma_middle = alloc_vma(&mm, 0x1000, 0x2000, 1, flags); + struct vm_area_struct *vma_right = alloc_vma(&mm, 0x2000, 0x3000, 2, flags); + VMA_ITERATOR(vmi, &mm, 0x1000); + + ASSERT_FALSE(vma_link(&mm, vma_left)); + ASSERT_FALSE(vma_link(&mm, vma_middle)); + ASSERT_FALSE(vma_link(&mm, vma_right)); + + vma = vma_merge_new_vma(&vmi, vma_left, vma_middle, 0x1000, + 0x2000, 1); + ASSERT_NE(vma, NULL); + + ASSERT_EQ(vma->vm_start, 0); 
+ ASSERT_EQ(vma->vm_end, 0x3000); + ASSERT_EQ(vma->vm_pgoff, 0); + ASSERT_EQ(vma->vm_flags, flags); + + vm_area_free(vma); + mtree_destroy(&mm.mm_mt); + + return true; +} + +static bool test_simple_modify(void) +{ + struct vm_area_struct *vma; + unsigned long flags = VM_READ | VM_WRITE | VM_MAYREAD | VM_MAYWRITE; + struct mm_struct mm = {}; + struct vm_area_struct *init_vma = alloc_vma(&mm, 0, 0x3000, 0, flags); + VMA_ITERATOR(vmi, &mm, 0x1000); + + ASSERT_FALSE(vma_link(&mm, init_vma)); + + /* + * The flags will not be changed, the vma_modify_flags() function + * performs the merge/split only. + */ + vma = vma_modify_flags(&vmi, init_vma, init_vma, + 0x1000, 0x2000, VM_READ | VM_MAYREAD); + ASSERT_NE(vma, NULL); + /* We modify the provided VMA, and on split allocate new VMAs. */ + ASSERT_EQ(vma, init_vma); + + ASSERT_EQ(vma->vm_start, 0x1000); + ASSERT_EQ(vma->vm_end, 0x2000); + ASSERT_EQ(vma->vm_pgoff, 1); + + /* + * Now walk through the three split VMAs and make sure they are as + * expected. 
+ */ + + vma_iter_set(&vmi, 0); + vma = vma_iter_load(&vmi); + + ASSERT_EQ(vma->vm_start, 0); + ASSERT_EQ(vma->vm_end, 0x1000); + ASSERT_EQ(vma->vm_pgoff, 0); + + vm_area_free(vma); + vma_iter_clear(&vmi); + + vma = vma_next(&vmi); + + ASSERT_EQ(vma->vm_start, 0x1000); + ASSERT_EQ(vma->vm_end, 0x2000); + ASSERT_EQ(vma->vm_pgoff, 1); + + vm_area_free(vma); + vma_iter_clear(&vmi); + + vma = vma_next(&vmi); + + ASSERT_EQ(vma->vm_start, 0x2000); + ASSERT_EQ(vma->vm_end, 0x3000); + ASSERT_EQ(vma->vm_pgoff, 2); + + vm_area_free(vma); + mtree_destroy(&mm.mm_mt); + + return true; +} + +static bool test_simple_expand(void) +{ + unsigned long flags = VM_READ | VM_WRITE | VM_MAYREAD | VM_MAYWRITE; + struct mm_struct mm = {}; + struct vm_area_struct *vma = alloc_vma(&mm, 0, 0x1000, 0, flags); + VMA_ITERATOR(vmi, &mm, 0); + + ASSERT_FALSE(vma_link(&mm, vma)); + + ASSERT_FALSE(vma_expand(&vmi, vma, 0, 0x3000, 0, NULL)); + + ASSERT_EQ(vma->vm_start, 0); + ASSERT_EQ(vma->vm_end, 0x3000); + ASSERT_EQ(vma->vm_pgoff, 0); + + vm_area_free(vma); + mtree_destroy(&mm.mm_mt); + + return true; +} + +static bool test_simple_shrink(void) +{ + unsigned long flags = VM_READ | VM_WRITE | VM_MAYREAD | VM_MAYWRITE; + struct mm_struct mm = {}; + struct vm_area_struct *vma = alloc_vma(&mm, 0, 0x3000, 0, flags); + VMA_ITERATOR(vmi, &mm, 0); + + ASSERT_FALSE(vma_link(&mm, vma)); + + ASSERT_FALSE(vma_shrink(&vmi, vma, 0, 0x1000, 0)); + + ASSERT_EQ(vma->vm_start, 0); + ASSERT_EQ(vma->vm_end, 0x1000); + ASSERT_EQ(vma->vm_pgoff, 0); + + vm_area_free(vma); + mtree_destroy(&mm.mm_mt); + + return true; +} + +int main(void) +{ + int num_tests = 0, num_fail = 0; + + maple_tree_init(); + +#define TEST(name) \ + do { \ + num_tests++; \ + if (!test_##name()) { \ + num_fail++; \ + fprintf(stderr, "Test " #name " FAILED\n"); \ + } \ + } while (0) + + TEST(simple_merge); + TEST(simple_modify); + TEST(simple_expand); + TEST(simple_shrink); + +#undef TEST + + printf("%d tests run, %d passed, %d failed.\n", + 
num_tests, num_tests - num_fail, num_fail); + + return num_fail == 0 ? EXIT_SUCCESS : EXIT_FAILURE; +} diff --git a/tools/testing/vma/vma_internal.h b/tools/testing/vma/vma_internal.h new file mode 100644 index 000000000000..093560e5b2ac --- /dev/null +++ b/tools/testing/vma/vma_internal.h @@ -0,0 +1,882 @@ +/* SPDX-License-Identifier: GPL-2.0+ */ +/* + * vma_internal.h + * + * Header providing userland wrappers and shims for the functionality provided + * by mm/vma_internal.h. + * + * We make the header guard the same as mm/vma_internal.h, so if this shim + * header is included, it precludes the inclusion of the kernel one. + */ + +#ifndef __MM_VMA_INTERNAL_H +#define __MM_VMA_INTERNAL_H + +#define __private +#define __bitwise +#define __randomize_layout + +#define CONFIG_MMU +#define CONFIG_PER_VMA_LOCK + +#include + +#include +#include +#include +#include +#include + +#define VM_WARN_ON(_expr) (WARN_ON(_expr)) +#define VM_WARN_ON_ONCE(_expr) (WARN_ON_ONCE(_expr)) +#define VM_BUG_ON(_expr) (BUG_ON(_expr)) +#define VM_BUG_ON_VMA(_expr, _vma) (BUG_ON(_expr)) + +#define VM_NONE 0x00000000 +#define VM_READ 0x00000001 +#define VM_WRITE 0x00000002 +#define VM_EXEC 0x00000004 +#define VM_SHARED 0x00000008 +#define VM_MAYREAD 0x00000010 +#define VM_MAYWRITE 0x00000020 +#define VM_GROWSDOWN 0x00000100 +#define VM_PFNMAP 0x00000400 +#define VM_LOCKED 0x00002000 +#define VM_IO 0x00004000 +#define VM_DONTEXPAND 0x00040000 +#define VM_ACCOUNT 0x00100000 +#define VM_MIXEDMAP 0x10000000 +#define VM_STACK VM_GROWSDOWN +#define VM_SHADOW_STACK VM_NONE +#define VM_SOFTDIRTY 0 + +#define VM_ACCESS_FLAGS (VM_READ | VM_WRITE | VM_EXEC) +#define VM_SPECIAL (VM_IO | VM_DONTEXPAND | VM_PFNMAP | VM_MIXEDMAP) + +#define FIRST_USER_ADDRESS 0UL +#define USER_PGTABLES_CEILING 0UL + +#define vma_policy(vma) NULL + +#define down_write_nest_lock(sem, nest_lock) + +#define pgprot_val(x) ((x).pgprot) +#define __pgprot(x) ((pgprot_t) { (x) } ) + +#define for_each_vma(__vmi, __vma) \ + while 
(((__vma) = vma_next(&(__vmi))) != NULL) + +/* The MM code likes to work with exclusive end addresses */ +#define for_each_vma_range(__vmi, __vma, __end) \ + while (((__vma) = vma_find(&(__vmi), (__end))) != NULL) + +#define offset_in_page(p) ((unsigned long)(p) & ~PAGE_MASK) + +#define PHYS_PFN(x) ((unsigned long)((x) >> PAGE_SHIFT)) + +#define test_and_set_bit(nr, addr) __test_and_set_bit(nr, addr) +#define test_and_clear_bit(nr, addr) __test_and_clear_bit(nr, addr) + +#define TASK_SIZE ((1ul << 47)-PAGE_SIZE) + +#define AS_MM_ALL_LOCKS 2 + +#define current NULL + +/* We hardcode this for now. */ +#define sysctl_max_map_count 0x1000000UL + +#define pgoff_t unsigned long +typedef unsigned long pgprotval_t; +typedef struct pgprot { pgprotval_t pgprot; } pgprot_t; +typedef unsigned long vm_flags_t; +typedef __bitwise unsigned int vm_fault_t; + +typedef struct refcount_struct { + atomic_t refs; +} refcount_t; + +struct kref { + refcount_t refcount; +}; + +struct anon_vma { + struct anon_vma *root; + struct rb_root_cached rb_root; +}; + +struct anon_vma_chain { + struct anon_vma *anon_vma; + struct list_head same_vma; +}; + +struct anon_vma_name { + struct kref kref; + /* The name needs to be at the end because it is dynamically sized. 
*/ + char name[]; +}; + +struct vma_iterator { + struct ma_state mas; +}; + +#define VMA_ITERATOR(name, __mm, __addr) \ + struct vma_iterator name = { \ + .mas = { \ + .tree = &(__mm)->mm_mt, \ + .index = __addr, \ + .node = NULL, \ + .status = ma_start, \ + }, \ + } + +struct address_space { + struct rb_root_cached i_mmap; + unsigned long flags; + atomic_t i_mmap_writable; +}; + +struct vm_userfaultfd_ctx {}; +struct mempolicy {}; +struct mmu_gather {}; +struct mutex {}; +#define DEFINE_MUTEX(mutexname) \ + struct mutex mutexname = {} + +struct mm_struct { + struct maple_tree mm_mt; + int map_count; /* number of VMAs */ + unsigned long total_vm; /* Total pages mapped */ + unsigned long locked_vm; /* Pages that have PG_mlocked set */ + unsigned long data_vm; /* VM_WRITE & ~VM_SHARED & ~VM_STACK */ + unsigned long exec_vm; /* VM_EXEC & ~VM_WRITE & ~VM_STACK */ + unsigned long stack_vm; /* VM_STACK */ +}; + +struct vma_lock { + struct rw_semaphore lock; +}; + + +struct file { + struct address_space *f_mapping; +}; + +struct vm_area_struct { + /* The first cache line has the info for VMA tree walking. */ + + union { + struct { + /* VMA covers [vm_start; vm_end) addresses within mm */ + unsigned long vm_start; + unsigned long vm_end; + }; +#ifdef CONFIG_PER_VMA_LOCK + struct rcu_head vm_rcu; /* Used for deferred freeing. */ +#endif + }; + + struct mm_struct *vm_mm; /* The address space we belong to. */ + pgprot_t vm_page_prot; /* Access permissions of this VMA. */ + + /* + * Flags, see mm.h. + * To modify use vm_flags_{init|reset|set|clear|mod} functions. 
+ */ + union { + const vm_flags_t vm_flags; + vm_flags_t __private __vm_flags; + }; + +#ifdef CONFIG_PER_VMA_LOCK + /* Flag to indicate areas detached from the mm->mm_mt tree */ + bool detached; + + /* + * Can only be written (using WRITE_ONCE()) while holding both: + * - mmap_lock (in write mode) + * - vm_lock->lock (in write mode) + * Can be read reliably while holding one of: + * - mmap_lock (in read or write mode) + * - vm_lock->lock (in read or write mode) + * Can be read unreliably (using READ_ONCE()) for pessimistic bailout + * while holding nothing (except RCU to keep the VMA struct allocated). + * + * This sequence counter is explicitly allowed to overflow; sequence + * counter reuse can only lead to occasional unnecessary use of the + * slowpath. + */ + int vm_lock_seq; + struct vma_lock *vm_lock; +#endif + + /* + * For areas with an address space and backing store, + * linkage into the address_space->i_mmap interval tree. + * + */ + struct { + struct rb_node rb; + unsigned long rb_subtree_last; + } shared; + + /* + * A file's MAP_PRIVATE vma can be in both i_mmap tree and anon_vma + * list, after a COW of one of the file pages. A MAP_SHARED vma + * can only be in the i_mmap tree. An anonymous MAP_PRIVATE, stack + * or brk vma (with NULL file) can only be in an anon_vma list. + */ + struct list_head anon_vma_chain; /* Serialized by mmap_lock & + * page_table_lock */ + struct anon_vma *anon_vma; /* Serialized by page_table_lock */ + + /* Function pointers to deal with this struct. */ + const struct vm_operations_struct *vm_ops; + + /* Information about our backing store: */ + unsigned long vm_pgoff; /* Offset (within vm_file) in PAGE_SIZE + units */ + struct file * vm_file; /* File we map to (can be NULL). */ + void * vm_private_data; /* was vm_pte (shared mem) */ + +#ifdef CONFIG_ANON_VMA_NAME + /* + * For private and shared anonymous mappings, a pointer to a null + * terminated string containing the name given to the vma, or NULL if + * unnamed. 
Serialized by mmap_lock. Use anon_vma_name to access. + */ + struct anon_vma_name *anon_name; +#endif +#ifdef CONFIG_SWAP + atomic_long_t swap_readahead_info; +#endif +#ifndef CONFIG_MMU + struct vm_region *vm_region; /* NOMMU mapping region */ +#endif +#ifdef CONFIG_NUMA + struct mempolicy *vm_policy; /* NUMA policy for the VMA */ +#endif +#ifdef CONFIG_NUMA_BALANCING + struct vma_numab_state *numab_state; /* NUMA Balancing state */ +#endif + struct vm_userfaultfd_ctx vm_userfaultfd_ctx; +} __randomize_layout; + +struct vm_fault {}; + +struct vm_operations_struct { + void (*open)(struct vm_area_struct * area); + /** + * @close: Called when the VMA is being removed from the MM. + * Context: User context. May sleep. Caller holds mmap_lock. + */ + void (*close)(struct vm_area_struct * area); + /* Called any time before splitting to check if it's allowed */ + int (*may_split)(struct vm_area_struct *area, unsigned long addr); + int (*mremap)(struct vm_area_struct *area); + /* + * Called by mprotect() to make driver-specific permission + * checks before mprotect() is finalised. The VMA must not + * be modified. Returns 0 if mprotect() can proceed. + */ + int (*mprotect)(struct vm_area_struct *vma, unsigned long start, + unsigned long end, unsigned long newflags); + vm_fault_t (*fault)(struct vm_fault *vmf); + vm_fault_t (*huge_fault)(struct vm_fault *vmf, unsigned int order); + vm_fault_t (*map_pages)(struct vm_fault *vmf, + pgoff_t start_pgoff, pgoff_t end_pgoff); + unsigned long (*pagesize)(struct vm_area_struct * area); + + /* notification that a previously read-only page is about to become + * writable, if an error is returned it will cause a SIGBUS */ + vm_fault_t (*page_mkwrite)(struct vm_fault *vmf); + + /* same as page_mkwrite when using VM_PFNMAP|VM_MIXEDMAP */ + vm_fault_t (*pfn_mkwrite)(struct vm_fault *vmf); + + /* called by access_process_vm when get_user_pages() fails, typically + * for use by special VMAs. 
See also generic_access_phys() for a generic + * implementation useful for any iomem mapping. + */ + int (*access)(struct vm_area_struct *vma, unsigned long addr, + void *buf, int len, int write); + + /* Called by the /proc/PID/maps code to ask the vma whether it + * has a special name. Returning non-NULL will also cause this + * vma to be dumped unconditionally. */ + const char *(*name)(struct vm_area_struct *vma); + +#ifdef CONFIG_NUMA + /* + * set_policy() op must add a reference to any non-NULL @new mempolicy + * to hold the policy upon return. Caller should pass NULL @new to + * remove a policy and fall back to surrounding context--i.e. do not + * install a MPOL_DEFAULT policy, nor the task or system default + * mempolicy. + */ + int (*set_policy)(struct vm_area_struct *vma, struct mempolicy *new); + + /* + * get_policy() op must add reference [mpol_get()] to any policy at + * (vma,addr) marked as MPOL_SHARED. The shared policy infrastructure + * in mm/mempolicy.c will do this automatically. + * get_policy() must NOT add a ref if the policy at (vma,addr) is not + * marked as MPOL_SHARED. vma policies are protected by the mmap_lock. + * If no [shared/vma] mempolicy exists at the addr, get_policy() op + * must return NULL--i.e., do not "fallback" to task or system default + * policy. + */ + struct mempolicy *(*get_policy)(struct vm_area_struct *vma, + unsigned long addr, pgoff_t *ilx); +#endif + /* + * Called by vm_normal_page() for special PTEs to find the + * page for @addr. This is useful if the default behavior + * (using pte_page()) would not find the correct page. 
+ */ + struct page *(*find_special_page)(struct vm_area_struct *vma, + unsigned long addr); +}; + +static inline void vma_iter_invalidate(struct vma_iterator *vmi) +{ + mas_pause(&vmi->mas); +} + +static inline pgprot_t pgprot_modify(pgprot_t oldprot, pgprot_t newprot) +{ + return __pgprot(pgprot_val(oldprot) | pgprot_val(newprot)); +} + +static inline pgprot_t vm_get_page_prot(unsigned long vm_flags) +{ + return __pgprot(vm_flags); +} + +static inline bool is_shared_maywrite(vm_flags_t vm_flags) +{ + return (vm_flags & (VM_SHARED | VM_MAYWRITE)) == + (VM_SHARED | VM_MAYWRITE); +} + +static inline bool vma_is_shared_maywrite(struct vm_area_struct *vma) +{ + return is_shared_maywrite(vma->vm_flags); +} + +static inline struct vm_area_struct *vma_next(struct vma_iterator *vmi) +{ + /* + * Uses mas_find() to get the first VMA when the iterator starts. + * Calling mas_next() could skip the first entry. + */ + return mas_find(&vmi->mas, ULONG_MAX); +} + +static inline bool vma_lock_alloc(struct vm_area_struct *vma) +{ + vma->vm_lock = calloc(1, sizeof(struct vma_lock)); + + if (!vma->vm_lock) + return false; + + init_rwsem(&vma->vm_lock->lock); + vma->vm_lock_seq = -1; + + return true; +} + +static inline void vma_assert_write_locked(struct vm_area_struct *); +static inline void vma_mark_detached(struct vm_area_struct *vma, bool detached) +{ + /* When detaching vma should be write-locked */ + if (detached) + vma_assert_write_locked(vma); + vma->detached = detached; +} + +extern const struct vm_operations_struct vma_dummy_vm_ops; + +static inline void vma_init(struct vm_area_struct *vma, struct mm_struct *mm) +{ + memset(vma, 0, sizeof(*vma)); + vma->vm_mm = mm; + vma->vm_ops = &vma_dummy_vm_ops; + INIT_LIST_HEAD(&vma->anon_vma_chain); + vma_mark_detached(vma, false); +} + +static inline struct vm_area_struct *vm_area_alloc(struct mm_struct *mm) +{ + struct vm_area_struct *vma = calloc(1, sizeof(struct vm_area_struct)); + + if (!vma) + return NULL; + + vma_init(vma, mm); 
+ if (!vma_lock_alloc(vma)) { + free(vma); + return NULL; + } + + return vma; +} + +static inline struct vm_area_struct *vm_area_dup(struct vm_area_struct *orig) +{ + struct vm_area_struct *new = calloc(1, sizeof(struct vm_area_struct)); + + if (!new) + return NULL; + + memcpy(new, orig, sizeof(*new)); + if (!vma_lock_alloc(new)) { + free(new); + return NULL; + } + INIT_LIST_HEAD(&new->anon_vma_chain); + + return new; +} + +/* + * These are defined in vma.h, but sadly vm_stat_account() is referenced by + * kernel/fork.c, so we have to make these broadly available there, and temporarily + * define them here to resolve the dependency cycle. + */ + +#define is_exec_mapping(flags) \ + ((flags & (VM_EXEC | VM_WRITE | VM_STACK)) == VM_EXEC) + +#define is_stack_mapping(flags) \ + (((flags & VM_STACK) == VM_STACK) || (flags & VM_SHADOW_STACK)) + +#define is_data_mapping(flags) \ + ((flags & (VM_WRITE | VM_SHARED | VM_STACK)) == VM_WRITE) + +static inline void vm_stat_account(struct mm_struct *mm, vm_flags_t flags, + long npages) +{ + WRITE_ONCE(mm->total_vm, READ_ONCE(mm->total_vm)+npages); + + if (is_exec_mapping(flags)) + mm->exec_vm += npages; + else if (is_stack_mapping(flags)) + mm->stack_vm += npages; + else if (is_data_mapping(flags)) + mm->data_vm += npages; +} + +#undef is_exec_mapping +#undef is_stack_mapping +#undef is_data_mapping + +/* Currently stubbed but we may later wish to un-stub. 
*/ +static inline void vm_acct_memory(long pages); +static inline void vm_unacct_memory(long pages) +{ + vm_acct_memory(-pages); +} + +static inline void mapping_allow_writable(struct address_space *mapping) +{ + atomic_inc(&mapping->i_mmap_writable); +} + +static inline void vma_set_range(struct vm_area_struct *vma, + unsigned long start, unsigned long end, + pgoff_t pgoff) +{ + vma->vm_start = start; + vma->vm_end = end; + vma->vm_pgoff = pgoff; +} + +static inline +struct vm_area_struct *vma_find(struct vma_iterator *vmi, unsigned long max) +{ + return mas_find(&vmi->mas, max - 1); +} + +static inline int vma_iter_clear_gfp(struct vma_iterator *vmi, + unsigned long start, unsigned long end, gfp_t gfp) +{ + __mas_set_range(&vmi->mas, start, end - 1); + mas_store_gfp(&vmi->mas, NULL, gfp); + if (unlikely(mas_is_err(&vmi->mas))) + return -ENOMEM; + + return 0; +} + +static inline void mmap_assert_locked(struct mm_struct *); +static inline struct vm_area_struct *find_vma_intersection(struct mm_struct *mm, + unsigned long start_addr, + unsigned long end_addr) +{ + unsigned long index = start_addr; + + mmap_assert_locked(mm); + return mt_find(&mm->mm_mt, &index, end_addr - 1); +} + +static inline +struct vm_area_struct *vma_lookup(struct mm_struct *mm, unsigned long addr) +{ + return mtree_load(&mm->mm_mt, addr); +} + +static inline struct vm_area_struct *vma_prev(struct vma_iterator *vmi) +{ + return mas_prev(&vmi->mas, 0); +} + +static inline void vma_iter_set(struct vma_iterator *vmi, unsigned long addr) +{ + mas_set(&vmi->mas, addr); +} + +static inline bool vma_is_anonymous(struct vm_area_struct *vma) +{ + return !vma->vm_ops; +} + +/* Defined in vma.h, so temporarily define here to avoid circular dependency. 
*/ +#define vma_iter_load(vmi) \ + mas_walk(&(vmi)->mas) + +static inline struct vm_area_struct * +find_vma_prev(struct mm_struct *mm, unsigned long addr, + struct vm_area_struct **pprev) +{ + struct vm_area_struct *vma; + VMA_ITERATOR(vmi, mm, addr); + + vma = vma_iter_load(&vmi); + *pprev = vma_prev(&vmi); + if (!vma) + vma = vma_next(&vmi); + return vma; +} + +#undef vma_iter_load + +static inline void vma_iter_init(struct vma_iterator *vmi, + struct mm_struct *mm, unsigned long addr) +{ + mas_init(&vmi->mas, &mm->mm_mt, addr); +} + +/* Stubbed functions. */ + +static inline struct anon_vma_name *anon_vma_name(struct vm_area_struct *vma) +{ + return NULL; +} + +static inline bool is_mergeable_vm_userfaultfd_ctx(struct vm_area_struct *vma, + struct vm_userfaultfd_ctx vm_ctx) +{ + return true; +} + +static inline bool anon_vma_name_eq(struct anon_vma_name *anon_name1, + struct anon_vma_name *anon_name2) +{ + return true; +} + +static inline void might_sleep(void) +{ +} + +static inline unsigned long vma_pages(struct vm_area_struct *vma) +{ + return (vma->vm_end - vma->vm_start) >> PAGE_SHIFT; +} + +static inline void fput(struct file *) +{ +} + +static inline void mpol_put(struct mempolicy *) +{ +} + +static inline void vma_lock_free(struct vm_area_struct *vma) +{ + free(vma->vm_lock); +} + +static inline void __vm_area_free(struct vm_area_struct *vma) +{ + vma_lock_free(vma); + free(vma); +} + +static inline void vm_area_free(struct vm_area_struct *vma) +{ + __vm_area_free(vma); +} + +static inline void lru_add_drain(void) +{ +} + +static inline void tlb_gather_mmu(struct mmu_gather *, struct mm_struct *) +{ +} + +static inline void update_hiwater_rss(struct mm_struct *) +{ +} + +static inline void update_hiwater_vm(struct mm_struct *) +{ +} + +static inline void unmap_vmas(struct mmu_gather *tlb, struct ma_state *mas, + struct vm_area_struct *vma, unsigned long start_addr, + unsigned long end_addr, unsigned long tree_end, + bool mm_wr_locked) +{ + (void)tlb; + 
(void)mas; + (void)vma; + (void)start_addr; + (void)end_addr; + (void)tree_end; + (void)mm_wr_locked; +} + +static inline void free_pgtables(struct mmu_gather *tlb, struct ma_state *mas, + struct vm_area_struct *vma, unsigned long floor, + unsigned long ceiling, bool mm_wr_locked) +{ + (void)tlb; + (void)mas; + (void)vma; + (void)floor; + (void)ceiling; + (void)mm_wr_locked; +} + +static inline void mapping_unmap_writable(struct address_space *) +{ +} + +static inline void flush_dcache_mmap_lock(struct address_space *) +{ +} + +static inline void tlb_finish_mmu(struct mmu_gather *) +{ +} + +static inline void get_file(struct file *) +{ +} + +static inline int vma_dup_policy(struct vm_area_struct *, struct vm_area_struct *) +{ + return 0; +} + +static inline int anon_vma_clone(struct vm_area_struct *, struct vm_area_struct *) +{ + return 0; +} + +static inline void vma_start_write(struct vm_area_struct *) +{ +} + +static inline void vma_adjust_trans_huge(struct vm_area_struct *vma, + unsigned long start, + unsigned long end, + long adjust_next) +{ + (void)vma; + (void)start; + (void)end; + (void)adjust_next; +} + +static inline void vma_iter_free(struct vma_iterator *vmi) +{ + mas_destroy(&vmi->mas); +} + +static inline void vm_acct_memory(long pages) +{ +} + +static inline void vma_interval_tree_insert(struct vm_area_struct *, + struct rb_root_cached *) +{ +} + +static inline void vma_interval_tree_remove(struct vm_area_struct *, + struct rb_root_cached *) +{ +} + +static inline void flush_dcache_mmap_unlock(struct address_space *) +{ +} + +static inline void anon_vma_interval_tree_insert(struct anon_vma_chain*, + struct rb_root_cached *) +{ +} + +static inline void anon_vma_interval_tree_remove(struct anon_vma_chain*, + struct rb_root_cached *) +{ +} + +static inline void uprobe_mmap(struct vm_area_struct *) +{ +} + +static inline void uprobe_munmap(struct vm_area_struct *vma, + unsigned long start, unsigned long end) +{ + (void)vma; + (void)start; + (void)end; +} 
+
+static inline void i_mmap_lock_write(struct address_space *)
+{
+}
+
+static inline void anon_vma_lock_write(struct anon_vma *)
+{
+}
+
+static inline void vma_assert_write_locked(struct vm_area_struct *)
+{
+}
+
+static inline void unlink_anon_vmas(struct vm_area_struct *)
+{
+}
+
+static inline void anon_vma_unlock_write(struct anon_vma *)
+{
+}
+
+static inline void i_mmap_unlock_write(struct address_space *)
+{
+}
+
+static inline void anon_vma_merge(struct vm_area_struct *,
+		struct vm_area_struct *)
+{
+}
+
+static inline int userfaultfd_unmap_prep(struct vm_area_struct *vma,
+		unsigned long start,
+		unsigned long end,
+		struct list_head *unmaps)
+{
+	(void)vma;
+	(void)start;
+	(void)end;
+	(void)unmaps;
+
+	return 0;
+}
+
+static inline void mmap_write_downgrade(struct mm_struct *)
+{
+}
+
+static inline void mmap_read_unlock(struct mm_struct *)
+{
+}
+
+static inline void mmap_write_unlock(struct mm_struct *)
+{
+}
+
+static inline bool can_modify_mm(struct mm_struct *mm,
+		unsigned long start,
+		unsigned long end)
+{
+	(void)mm;
+	(void)start;
+	(void)end;
+
+	return true;
+}
+
+static inline void arch_unmap(struct mm_struct *mm,
+		unsigned long start,
+		unsigned long end)
+{
+	(void)mm;
+	(void)start;
+	(void)end;
+}
+
+static inline void mmap_assert_locked(struct mm_struct *)
+{
+}
+
+static inline bool mpol_equal(struct mempolicy *, struct mempolicy *)
+{
+	return true;
+}
+
+static inline void khugepaged_enter_vma(struct vm_area_struct *vma,
+		unsigned long vm_flags)
+{
+	(void)vma;
+	(void)vm_flags;
+}
+
+static inline bool mapping_can_writeback(struct address_space *)
+{
+	return true;
+}
+
+static inline bool is_vm_hugetlb_page(struct vm_area_struct *)
+{
+	return false;
+}
+
+static inline bool vma_soft_dirty_enabled(struct vm_area_struct *)
+{
+	return false;
+}
+
+static inline bool userfaultfd_wp(struct vm_area_struct *)
+{
+	return false;
+}
+
+static inline void mmap_assert_write_locked(struct mm_struct *)
+{
+}
+
+static inline void mutex_lock(struct mutex *)
+{
+}
+
+static inline void mutex_unlock(struct mutex *)
+{
+}
+
+static inline bool mutex_is_locked(struct mutex *)
+{
+	return true;
+}
+
+static inline bool signal_pending(void *)
+{
+	return false;
+}
+
+#endif /* __MM_VMA_INTERNAL_H */