From patchwork Wed Feb 5 05:48:07 2025
From: Stephen Eta Zhou <stephen.eta.zhou@outlook.com>
To: rppt@kernel.org, akpm@linux-foundation.org
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: [PATCH] mm: optimize memblock_add_range() for improved performance
Date: Wed, 5 Feb 2025 05:48:07 +0000
Hi Mike Rapoport, Andrew Morton,

I have recently been researching the mm subsystem of the Linux kernel, and I came across the memblock_add_range() function, which piqued my interest. I found the implementation approach quite interesting, so I analyzed it and identified some areas for optimization.

Starting with this part of the code:

	if (type->cnt * 2 + 1 <= type->max)
		insert = true;

The idea here is good, but it has a flaw: the condition is rather restrictive and often cannot be taken. It only guarantees room for the worst case of 2 * cnt + 1 regions, so when there is enough free space for the regions actually being inserted, but not enough to satisfy 2 * cnt + 1, the insertion walk still has to be performed twice (once to count, once to insert).

So I came up with a solution: delayed allocation. Since the region array grows exponentially, around four expansions should be sufficient to handle the memory operations needed early in the kernel, before the buddy system takes over. Therefore, assuming that memory is adequate at the beginning, insertion can happen at any time.
If memory is insufficient during an insertion, we record the operation instead of performing it (the insertion becomes a record operation): we log the starting address of the remaining unprocessed space and the number of insertions still needed (this usually happens while resolving overlaps). After recording, memory is allocated and the insertion is attempted again. The benefit of this approach is that it significantly reduces the time cost, and because the starting address of the remaining unprocessed space was recorded, the next pass does not have to start from scratch; it resumes where it left off, somewhat like a checkpointed transfer.

I optimized memblock_add_range() according to this approach. I tested it in a qemu-arm environment, where it works correctly, and it passes all the test cases under linux/tools/testing/memblock. I used perf for performance profiling; here are my measurements (my CPU is a 13th Gen Intel(R) Core(TM) i7-13700).

The performance of memblock_add_range before the modification:

Samples: 3K of event 'cycles', Event count (approx.): 3853609007
  Children      Self  Comm  Shar  Symbol
     1.32%     1.32%  main  main  [.] memblock_add_range.isra.0

After the modification:

Samples: 3K of event 'cycles', Event count (approx.): 3839056584
  Children      Self  Comm  Shar  Symbol
     0.67%     0.67%  main  main  [.] memblock_add_range.isra.0

The best observed run reaches 0.38%:

Samples: 3K of event 'cycles', Event count (approx.): 3839056584
  Children      Self  Comm  Shar  Symbol
     0.38%     0.38%  main  main  [.] memblock_add_range.isra.0

To measure the best-case and average utilization, I wrote a shell script that runs each of the two versions (before and after the modification) 100 times and parses the perf output to compute the average, minimum, and maximum utilization of memblock_add_range.
Below are the results from the script, measured with perf on my 13th Gen Intel(R) Core(TM) i7-13700. They show a significant reduction in the time spent in memblock_add_range:

- Before the patch:
  - Average: 1.22%
  - Max: 1.63%, Min: 0.93%
- After the patch:
  - Average: 0.69%
  - Max: 0.94%, Min: 0.50%

That is roughly a 43% reduction in the average share of cycles spent in this function ((1.22 - 0.69) / 1.22), and the worst case after the patch (0.94%) is close to the best case before it (0.93%).

Here is my test script (it should be run only from the linux/tools/testing/memblock directory):

#!/bin/bash

PERF_DATA="perf.data"
TOTAL_RUNS=100
CHILDREN_PERCENTAGE=0
SELF_PERCENTAGE=0
CHILDREN_AVERAGE=0
SELF_AVERAGE=0
CHILDRENS=()
SELFS=()
MIN_CHILDREN=0
MAX_CHILDREN=0
MIN_SELF=0
MAX_SELF=0

function log() {
	echo -e "$*"
	echo -e "$*" >> perf_test.log
}

if [ -f "./perf_test.log" ]; then
	rm "./perf_test.log"
fi
touch perf_test.log

for i in $(seq 1 $TOTAL_RUNS)
do
	sudo perf record -e cycles -g ./main > /dev/null 2>&1
	read CHILDREN SELF <<< $(sudo perf report | grep "memblock_add_range.isra.0" | awk 'NR==2 {print $1, $2}' | sed 's/%//g')
	if [ -z "$CHILDREN" ]; then
		read CHILDREN SELF <<< $(sudo perf report | grep "memblock_add_range.isra.0" | awk 'NR==1 {print $1, $2}' | sed 's/%//g')
	fi
	if [ "$MIN_CHILDREN" == 0 ]; then
		MIN_CHILDREN=$CHILDREN
	fi
	if [ "$MIN_SELF" == 0 ]; then
		MIN_SELF=$SELF
	fi
	if (( $(echo "$CHILDREN > $MAX_CHILDREN" | bc -l) )); then
		MAX_CHILDREN=$CHILDREN
	elif (( $(echo "$CHILDREN < $MIN_CHILDREN" | bc -l) )); then
		MIN_CHILDREN=$CHILDREN
	fi
	if (( $(echo "$SELF > $MAX_SELF" | bc -l) )); then
		MAX_SELF=$SELF
	elif (( $(echo "$SELF < $MIN_SELF" | bc -l) )); then
		MIN_SELF=$SELF
	fi
	log "($i) memblock_add_range.isra.0: Children <$CHILDREN>, Self <$SELF>"
	CHILDRENS+=($CHILDREN)
	SELFS+=($SELF)
	sudo rm -f $PERF_DATA
done

for PERCENT in "${CHILDRENS[@]}"
do
	CHILDREN_PERCENTAGE=$(echo "$CHILDREN_PERCENTAGE + $PERCENT" | bc)
done
for PERCENT in "${SELFS[@]}"
do
	SELF_PERCENTAGE=$(echo "$SELF_PERCENTAGE + $PERCENT" | bc)
done
CHILDREN_AVERAGE=$(echo "scale=2; $CHILDREN_PERCENTAGE / $TOTAL_RUNS" | bc)
CHILDREN_AVERAGE=$(printf "%.2f" "$CHILDREN_AVERAGE")
SELF_AVERAGE=$(echo "scale=2; $SELF_PERCENTAGE / $TOTAL_RUNS" | bc)
SELF_AVERAGE=$(printf "%.2f" "$SELF_AVERAGE")

log ""
log "Result report:"
log "memblock_add_range.isra.0 Average: Children (Ave<$CHILDREN_AVERAGE%>, Min<$MIN_CHILDREN>, Max<$MAX_CHILDREN>) Self (Ave<$SELF_AVERAGE%>, Min<$MIN_SELF>, Max<$MAX_SELF>)"

Here is my patch:

From 1d4da808b1c6ef2a8706782c3fe724c62169f311 Mon Sep 17 00:00:00 2001
From: "stephen.eta.zhou" <stephen.eta.zhou@outlook.com>
Date: Wed, 5 Feb 2025 12:04:40 +0800
Subject: [PATCH] mm: optimize memblock_add_range() for improved performance

- Streamlined memory insertion logic to minimize redundant iterations.
- Improved handling of memory insufficiency to avoid excessive reallocations.

Signed-off-by: stephen.eta.zhou <stephen.eta.zhou@outlook.com>
---
 mm/memblock.c | 106 +++++++++++++++++++++++++++++++++-----------------
 1 file changed, 70 insertions(+), 36 deletions(-)

-- 
2.25.1

diff --git a/mm/memblock.c b/mm/memblock.c
index 95af35fd1389..75c76b39a364 100644
--- a/mm/memblock.c
+++ b/mm/memblock.c
@@ -585,16 +585,16 @@ static int __init_memblock memblock_add_range(struct memblock_type *type,
 				phys_addr_t base, phys_addr_t size,
 				int nid, enum memblock_flags flags)
 {
-	bool insert = false;
 	phys_addr_t obase = base;
 	phys_addr_t end = base + memblock_cap_size(base, &size);
-	int idx, nr_new, start_rgn = -1, end_rgn;
+	phys_addr_t rbase, rend;
+	int idx, nr_new, start_rgn, end_rgn;
 	struct memblock_region *rgn;
 
 	if (!size)
 		return 0;
 
-	/* special case for empty array */
+	/* Special case for empty array */
 	if (type->regions[0].size == 0) {
 		WARN_ON(type->cnt != 0 || type->total_size);
 		type->regions[0].base = base;
@@ -606,80 +606,114 @@ static int __init_memblock memblock_add_range(struct memblock_type *type,
 		return 0;
 	}
 
+	/* Delayed assignment, which is not necessary when the array is empty. */
+	start_rgn = -1;
 	/*
-	 * The worst case is when new range overlaps all existing regions,
-	 * then we'll need type->cnt + 1 empty regions in @type. So if
-	 * type->cnt * 2 + 1 is less than or equal to type->max, we know
-	 * that there is enough empty regions in @type, and we can insert
-	 * regions directly.
+	 * Originally, `end_rgn` didn't need to be assigned a value,
+	 * but due to the use of nested conditional expressions,
+	 * the compiler reports a warning that `end_rgn` is uninitialized.
+	 * Therefore, it has been given an initial value here
+	 * to eliminate the warning.
 	 */
-	if (type->cnt * 2 + 1 <= type->max)
-		insert = true;
+	end_rgn = -1;
 
 repeat:
 	/*
-	 * The following is executed twice. Once with %false @insert and
-	 * then with %true. The first counts the number of regions needed
-	 * to accommodate the new area. The second actually inserts them.
+	 * It is assumed that insertion is always possible under normal circumstances.
+	 * If memory is insufficient during insertion, the operation will record the need,
+	 * allocate memory, and then re-execute the insertion for the remaining portion.
 	 */
 	base = obase;
 	nr_new = 0;
 
 	for_each_memblock_type(idx, type, rgn) {
-		phys_addr_t rbase = rgn->base;
-		phys_addr_t rend = rbase + rgn->size;
+		rbase = rgn->base;
+		rend = rbase + rgn->size;
 
 		if (rbase >= end)
 			break;
 		if (rend <= base)
 			continue;
+
 		/*
-		 * @rgn overlaps. If it separates the lower part of new
-		 * area, insert that portion.
+		 * @rgn overlaps. If it separates the lower part of new area, insert that portion.
 		 */
 		if (rbase > base) {
 #ifdef CONFIG_NUMA
 			WARN_ON(nid != memblock_get_region_node(rgn));
 #endif
 			WARN_ON(flags != rgn->flags);
-			nr_new++;
-			if (insert) {
+			/*
+			 * If memory is insufficient, the space required will be recorded.
+			 * If memory is sufficient, the insertion will proceed.
+			 */
+			if (type->cnt >= type->max) {
+				/*
+				 * Record obase as the address where the
+				 * overlapping part has not been resolved,
+				 * so that when repeat restarts,
+				 * redundant operations of resolving the
+				 * overlapping addresses are avoided.
+				 */
+				if (nr_new == 0)
+					obase = base;
+				nr_new++;
+			} else {
 				if (start_rgn == -1)
 					start_rgn = idx;
 				end_rgn = idx + 1;
-				memblock_insert_region(type, idx++, base,
-						       rbase - base, nid,
-						       flags);
+				memblock_insert_region(type, idx++, base, rbase - base, nid, flags);
 			}
 		}
-		/* area below @rend is dealt with, forget about it */
+		/* Area below @rend is dealt with, forget about it */
 		base = min(rend, end);
 	}
 
-	/* insert the remaining portion */
+	/* Insert the remaining portion */
 	if (base < end) {
-		nr_new++;
-		if (insert) {
+		/*
+		 * Similarly, after handling the overlapping part,
+		 * it is still possible that memory is
+		 * insufficient. In that case, the space will be recorded once again.
+		 */
+		if (type->cnt >= type->max) {
+			/*
+			 * The address of obase needs to be recorded here as well. The purpose is to
+			 * handle the situation where,
+			 * after resolving the overlap, there is still a remaining space to
+			 * insert but memory is insufficient (i.e.,
+			 * no memory shortage occurred while resolving the overlap).
+			 * This means that space for
+			 * N (overlapping parts) + 1 (non-overlapping part) is required.
+			 * If obase is not recorded, after memory expansion,
+			 * base might revert to the original address to be
+			 * inserted (which could be overlapping).
+			 * This could lead to for_each_memblock_type attempting
+			 * to resolve the overlap again, causing multiple unnecessary iterations,
+			 * even if it's just a simple check.
+			 */
+			if (nr_new == 0)
+				obase = base;
+			nr_new++;
+		} else {
 			if (start_rgn == -1)
 				start_rgn = idx;
 			end_rgn = idx + 1;
-			memblock_insert_region(type, idx, base, end - base,
-					       nid, flags);
+			memblock_insert_region(type, idx, base, end - base, nid, flags);
 		}
 	}
 
-	if (!nr_new)
-		return 0;
-
 	/*
-	 * If this was the first round, resize array and repeat for actual
-	 * insertions; otherwise, merge and return.
+	 * Finally, check if memory insufficiency occurred during insertion.
+	 * If so, the memory will be expanded to an appropriate size,
+	 * and the remaining portion will be inserted again.
+	 * If not, it means memory is sufficient, and the regions will be merged directly.
 	 */
-	if (!insert) {
-		while (type->cnt + nr_new > type->max)
+	if (nr_new > 0) {
+		while (type->cnt + nr_new > type->max) {
 			if (memblock_double_array(type, obase, size) < 0)
 				return -ENOMEM;
-		insert = true;
+		}
 		goto repeat;
 	} else {
 		memblock_merge_regions(type, start_rgn, end_rgn);