From patchwork Thu Mar 6 05:45:29 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Bharata B Rao
X-Patchwork-Id: 14003845
From: Bharata B Rao
Subject: [RFC PATCH 1/4] mm: migrate: Allow misplaced migration without VMA too
Date: Thu, 6 Mar 2025 11:15:29 +0530
Message-ID: <20250306054532.221138-2-bharata@amd.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20250306054532.221138-1-bharata@amd.com>
References: <20250306054532.221138-1-bharata@amd.com>
Sender: owner-linux-mm@kvack.org
Precedence: bulk

migrate_misplaced_folio_prepare() can be called from a context where VMA
isn't available. Allow the migration to work from such contexts too.

Signed-off-by: Bharata B Rao
---
 mm/migrate.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/mm/migrate.c b/mm/migrate.c
index fb19a18892c8..5b21856a0dd0 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -2639,7 +2639,8 @@ static struct folio *alloc_misplaced_dst_folio(struct folio *src,
 
 /*
  * Prepare for calling migrate_misplaced_folio() by isolating the folio if
- * permitted. Must be called with the PTL still held.
+ * permitted. Must be called with the PTL still held if called with a non-NULL
+ * vma.
  */
 int migrate_misplaced_folio_prepare(struct folio *folio,
 		struct vm_area_struct *vma, int node)
@@ -2656,7 +2657,7 @@ int migrate_misplaced_folio_prepare(struct folio *folio,
 	 * See folio_likely_mapped_shared() on possible imprecision
 	 * when we cannot easily detect if a folio is shared.
 	 */
-	if ((vma->vm_flags & VM_EXEC) &&
+	if (vma && (vma->vm_flags & VM_EXEC) &&
 	    folio_likely_mapped_shared(folio))
 		return -EACCES;
 

From patchwork Thu Mar 6 05:45:30 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Bharata B Rao
X-Patchwork-Id: 14003846
From: Bharata B Rao
Subject: [RFC PATCH 2/4] mm: kpromoted: Hot page info collection and promotion daemon
Date: Thu, 6 Mar 2025 11:15:30 +0530
Message-ID: <20250306054532.221138-3-bharata@amd.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20250306054532.221138-1-bharata@amd.com>
References: <20250306054532.221138-1-bharata@amd.com>
Sender: owner-linux-mm@kvack.org
Precedence: bulk

kpromoted is a kernel daemon that accumulates hot page info from
different sources and tries to promote pages from slow tiers to
top tiers. One instance of this thread runs on each node that
has CPUs.

Subsystems that generate hot page access info can report that
to kpromoted via this API:

int kpromoted_record_access(u64 pfn, int nid, int src,
			    unsigned long time)

@pfn: The PFN of the memory accessed
@nid: The accessing NUMA node ID
@src: The temperature source (subsystem) that generated the
      access info
@time: The access time in jiffies

Some temperature sources may not provide the nid from which
the page was accessed. This is true for sources that scan page
tables for the PTE Accessed bit. Currently, the toptier node to
which such pages should be promoted is hard-coded.

Also, the access time provided by some sources may at best be
considered approximate. This is especially true for hot pages
detected by PTE A bit scanning.

kpromoted currently maintains the hot PFN records in hash lists
hashed by PFN value.
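
The bookkeeping described above can be illustrated with a small userspace
sketch. It is not the kernel code: every toy_* name is hypothetical,
timestamps are plain milliseconds instead of jiffies, the hash function is
a stand-in for the kernel's helpers, and there is no locking. Only the
5-second window, the threshold of 2 accesses, and the hard-coded fallback
node mirror values that appear in this patch.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define TOY_HASH_BITS    4               /* toy size; the patch uses order 16 */
#define TOY_NR_BUCKETS   (1u << TOY_HASH_BITS)
#define TOY_WINDOW_MS    5000ul          /* mirrors KPROMOTED_FREQ_WINDOW */
#define TOY_THRESHOLD    2               /* accesses per window to qualify */
#define TOY_NO_NODE      (-1)            /* stands in for NUMA_NO_NODE */
#define TOY_DEFAULT_NODE 1               /* hard-coded fallback target node */

/* Hypothetical userspace analogue of one hot-PFN record. */
struct toy_record {
	uint64_t pfn;
	unsigned long window_start_ms;   /* start of the current window */
	unsigned long recency_ms;        /* most recent access time */
	int frequency;                   /* accesses in the current window */
	int hot_node;                    /* node of the most recent access */
	struct toy_record *next;         /* bucket chain */
};

static struct toy_record *toy_table[TOY_NR_BUCKETS];

/* Multiplicative hash of the PFN; keeps the top TOY_HASH_BITS bits. */
static unsigned int toy_bucket(uint64_t pfn)
{
	return (unsigned int)((pfn * 0x9E3779B97F4A7C15ull) >>
			      (64 - TOY_HASH_BITS));
}

/* Record one access; returns the record, or NULL if allocation fails. */
static struct toy_record *toy_record_access(uint64_t pfn, int nid,
					    unsigned long now_ms)
{
	unsigned int bkt = toy_bucket(pfn);
	struct toy_record *r;

	for (r = toy_table[bkt]; r; r = r->next)
		if (r->pfn == pfn)
			break;

	if (!r) {
		/* First sighting: start a new record and a new window. */
		r = calloc(1, sizeof(*r));
		if (!r)
			return NULL;
		r->pfn = pfn;
		r->frequency = 1;
		r->window_start_ms = now_ms;
		r->next = toy_table[bkt];
		toy_table[bkt] = r;
	} else if (now_ms - r->window_start_ms > TOY_WINDOW_MS) {
		/* The old window expired: restart the count. */
		r->frequency = 1;
		r->window_start_ms = now_ms;
	} else {
		r->frequency++;
	}

	r->recency_ms = now_ms;
	/* Sources that cannot name the accessing node get the fallback. */
	r->hot_node = (nid == TOY_NO_NODE) ? TOY_DEFAULT_NODE : nid;
	return r;
}

/* A record qualifies once it reaches the per-window threshold. */
static int toy_is_candidate(const struct toy_record *r)
{
	return r->frequency >= TOY_THRESHOLD;
}
```

With these definitions, two accesses to the same PFN two seconds apart
make the record a candidate, while a third access eight seconds after the
first starts a new window and resets the count to one.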
Each record stores the following info:

struct page_hotness_info {
	unsigned long pfn;

	/* Time when this record was updated last */
	unsigned long last_update;

	/*
	 * Number of times this page was accessed in the
	 * current window
	 */
	int frequency;

	/* Most recent access time */
	unsigned long recency;

	/* Most recent access from this node */
	int hot_node;
	struct hlist_node hnode;
};

The way in which a page is categorized as hot enough to be
promoted is pretty primitive now.

Signed-off-by: Bharata B Rao
---
 include/linux/kpromoted.h     |  54 ++++++
 include/linux/mmzone.h        |   4 +
 include/linux/vm_event_item.h |  13 ++
 mm/Kconfig                    |   7 +
 mm/Makefile                   |   1 +
 mm/kpromoted.c                | 305 ++++++++++++++++++++++++++++++++++
 mm/mm_init.c                  |  10 ++
 mm/vmstat.c                   |  13 ++
 8 files changed, 407 insertions(+)
 create mode 100644 include/linux/kpromoted.h
 create mode 100644 mm/kpromoted.c

diff --git a/include/linux/kpromoted.h b/include/linux/kpromoted.h
new file mode 100644
index 000000000000..2bef3d74f03a
--- /dev/null
+++ b/include/linux/kpromoted.h
@@ -0,0 +1,54 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _LINUX_KPROMOTED_H
+#define _LINUX_KPROMOTED_H
+
+#include
+#include
+#include
+
+/* Page hotness temperature sources */
+enum kpromoted_src {
+	KPROMOTED_HW_HINTS,
+	KPROMOTED_PGTABLE_SCAN,
+};
+
+#ifdef CONFIG_KPROMOTED
+
+#define KPROMOTED_FREQ_WINDOW	(5 * MSEC_PER_SEC)
+
+/* 2 accesses within a window will make the page a promotion candidate */
+#define KPRMOTED_FREQ_THRESHOLD	2
+
+#define KPROMOTED_HASH_ORDER	16
+
+struct page_hotness_info {
+	unsigned long pfn;
+
+	/* Time when this record was updated last */
+	unsigned long last_update;
+
+	/*
+	 * Number of times this page was accessed in the
+	 * current window
+	 */
+	int frequency;
+
+	/* Most recent access time */
+	unsigned long recency;
+
+	/* Most recent access from this node */
+	int hot_node;
+	struct hlist_node hnode;
+};
+
+#define KPROMOTE_DELAY	MSEC_PER_SEC
+
+int kpromoted_record_access(u64 pfn, int nid, int src, unsigned long now);
+#else
+static inline int kpromoted_record_access(u64 pfn, int nid, int src,
+					  unsigned long now)
+{
+	return 0;
+}
+#endif /* CONFIG_KPROMOTED */
+#endif /* _LINUX_KPROMOTED_H */
diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index 9540b41894da..a5c4e789aa55 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -1459,6 +1459,10 @@ typedef struct pglist_data {
 #ifdef CONFIG_MEMORY_FAILURE
 	struct memory_failure_stats mf_stats;
 #endif
+#ifdef CONFIG_KPROMOTED
+	struct task_struct *kpromoted;
+	wait_queue_head_t kpromoted_wait;
+#endif
 } pg_data_t;
 
 #define node_present_pages(nid)	(NODE_DATA(nid)->node_present_pages)
diff --git a/include/linux/vm_event_item.h b/include/linux/vm_event_item.h
index f70d0958095c..b5823b037883 100644
--- a/include/linux/vm_event_item.h
+++ b/include/linux/vm_event_item.h
@@ -182,6 +182,19 @@ enum vm_event_item { PGPGIN, PGPGOUT, PSWPIN, PSWPOUT,
 		KSTACK_REST,
 #endif
 #endif /* CONFIG_DEBUG_STACK_USAGE */
+		KPROMOTED_RECORDED_ACCESSES,
+		KPROMOTED_RECORD_HWHINTS,
+		KPROMOTED_RECORD_PGTSCANS,
+		KPROMOTED_RECORD_TOPTIER,
+		KPROMOTED_RECORD_ADDED,
+		KPROMOTED_RECORD_EXISTS,
+		KPROMOTED_MIG_RIGHT_NODE,
+		KPROMOTED_MIG_NON_LRU,
+		KPROMOTED_MIG_COLD_OLD,
+		KPROMOTED_MIG_COLD_NOT_ACCESSED,
+		KPROMOTED_MIG_CANDIDATE,
+		KPROMOTED_MIG_PROMOTED,
+		KPROMOTED_MIG_DROPPED,
 		NR_VM_EVENT_ITEMS
 };
 
diff --git a/mm/Kconfig b/mm/Kconfig
index 1b501db06417..ceaa462a0ce6 100644
--- a/mm/Kconfig
+++ b/mm/Kconfig
@@ -1358,6 +1358,13 @@ config PT_RECLAIM
 
 	  Note: now only empty user PTE page table pages will be reclaimed.
 
+config KPROMOTED
+	bool "Kernel hot page promotion daemon"
+	def_bool y
+	depends on NUMA && MIGRATION && MMU
+	help
+	  Promote hot pages from lower tier to top tier by using the
+	  memory access information provided by various sources.
 
 source "mm/damon/Kconfig"
 
diff --git a/mm/Makefile b/mm/Makefile
index 850386a67b3e..bf4f5f18f1f9 100644
--- a/mm/Makefile
+++ b/mm/Makefile
@@ -147,3 +147,4 @@ obj-$(CONFIG_SHRINKER_DEBUG) += shrinker_debug.o
 obj-$(CONFIG_EXECMEM) += execmem.o
 obj-$(CONFIG_TMPFS_QUOTA) += shmem_quota.o
 obj-$(CONFIG_PT_RECLAIM) += pt_reclaim.o
+obj-$(CONFIG_KPROMOTED) += kpromoted.o
diff --git a/mm/kpromoted.c b/mm/kpromoted.c
new file mode 100644
index 000000000000..2a8b8495b6b3
--- /dev/null
+++ b/mm/kpromoted.c
@@ -0,0 +1,305 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * kpromoted is a kernel thread that runs on each node that has CPUs, i.e.,
+ * on regular nodes.
+ *
+ * Maintains a list of hot pages from lower tiers and promotes them.
+ */
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+
+static DEFINE_HASHTABLE(page_hotness_hash, KPROMOTED_HASH_ORDER);
+static struct mutex page_hotness_lock[1UL << KPROMOTED_HASH_ORDER];
+
+static int kpromote_page(struct page_hotness_info *phi)
+{
+	struct page *page = pfn_to_page(phi->pfn);
+	struct folio *folio;
+	int ret;
+
+	if (!page)
+		return 1;
+
+	folio = page_folio(page);
+	ret = migrate_misplaced_folio_prepare(folio, NULL, phi->hot_node);
+	if (ret)
+		return 1;
+
+	return migrate_misplaced_folio(folio, phi->hot_node);
+}
+
+static int page_should_be_promoted(struct page_hotness_info *phi)
+{
+	struct page *page = pfn_to_online_page(phi->pfn);
+	unsigned long now = jiffies;
+	struct folio *folio;
+
+	if (!page || is_zone_device_page(page))
+		return false;
+
+	folio = page_folio(page);
+	if (!folio_test_lru(folio)) {
+		count_vm_event(KPROMOTED_MIG_NON_LRU);
+		return false;
+	}
+	if (folio_nid(folio) == phi->hot_node) {
+		count_vm_event(KPROMOTED_MIG_RIGHT_NODE);
+		return false;
+	}
+
+	/* If the page was hot a while ago, don't promote */
+	if ((now - phi->last_update) > 2 * msecs_to_jiffies(KPROMOTED_FREQ_WINDOW)) {
+		count_vm_event(KPROMOTED_MIG_COLD_OLD);
+		return false;
+	}
+
+	/* If the page hasn't been accessed enough times, don't promote */
+	if (phi->frequency < KPRMOTED_FREQ_THRESHOLD) {
+		count_vm_event(KPROMOTED_MIG_COLD_NOT_ACCESSED);
+		return false;
+	}
+	return true;
+}
+
+/*
+ * Go through the page hotness information and migrate pages if required.
+ *
+ * Promoted pages are no longer tracked in the hot list.
+ * Cold pages are pruned from the list as well.
+ *
+ * TODO: Batching could be done
+ */
+static void kpromoted_migrate(pg_data_t *pgdat)
+{
+	int nid = pgdat->node_id;
+	struct page_hotness_info *phi;
+	struct hlist_node *tmp;
+	int nr_bkts = HASH_SIZE(page_hotness_hash);
+	int bkt;
+
+	for (bkt = 0; bkt < nr_bkts; bkt++) {
+		mutex_lock(&page_hotness_lock[bkt]);
+		hlist_for_each_entry_safe(phi, tmp, &page_hotness_hash[bkt], hnode) {
+			if (phi->hot_node != nid)
+				continue;
+
+			if (page_should_be_promoted(phi)) {
+				count_vm_event(KPROMOTED_MIG_CANDIDATE);
+				if (!kpromote_page(phi)) {
+					count_vm_event(KPROMOTED_MIG_PROMOTED);
+					hlist_del_init(&phi->hnode);
+					kfree(phi);
+				}
+			} else {
+				/*
+				 * Not a suitable page or cold page, stop tracking it.
+				 * TODO: Identify cold pages and drive demotion?
+ */ + count_vm_event(KPROMOTED_MIG_DROPPED); + hlist_del_init(&phi->hnode); + kfree(phi); + } + } + mutex_unlock(&page_hotness_lock[bkt]); + } +} + +static struct page_hotness_info *__kpromoted_lookup(unsigned long pfn, int bkt) +{ + struct page_hotness_info *phi; + + hlist_for_each_entry(phi, &page_hotness_hash[bkt], hnode) { + if (phi->pfn == pfn) + return phi; + } + return NULL; +} + +static struct page_hotness_info *kpromoted_lookup(unsigned long pfn, int bkt, unsigned long now) +{ + struct page_hotness_info *phi; + + phi = __kpromoted_lookup(pfn, bkt); + if (!phi) { + phi = kzalloc(sizeof(struct page_hotness_info), GFP_KERNEL); + if (!phi) + return ERR_PTR(-ENOMEM); + + phi->pfn = pfn; + phi->frequency = 1; + phi->last_update = now; + phi->recency = now; + hlist_add_head(&phi->hnode, &page_hotness_hash[bkt]); + count_vm_event(KPROMOTED_RECORD_ADDED); + } else { + count_vm_event(KPROMOTED_RECORD_EXISTS); + } + return phi; +} + +/* + * Called by subsystems that generate page hotness/access information. + * + * Records the memory access info for further action by kpromoted. + */ +int kpromoted_record_access(u64 pfn, int nid, int src, unsigned long now) +{ + struct page_hotness_info *phi; + struct page *page; + struct folio *folio; + int ret, bkt; + + count_vm_event(KPROMOTED_RECORDED_ACCESSES); + + switch (src) { + case KPROMOTED_HW_HINTS: + count_vm_event(KPROMOTED_RECORD_HWHINTS); + break; + case KPROMOTED_PGTABLE_SCAN: + count_vm_event(KPROMOTED_RECORD_PGTSCANS); + break; + default: + break; + } + + /* + * Record only accesses from lower tiers. + * Assuming node having CPUs as toptier for now. 
+ */ + if (node_is_toptier(pfn_to_nid(pfn))) { + count_vm_event(KPROMOTED_RECORD_TOPTIER); + return 0; + } + + page = pfn_to_online_page(pfn); + if (!page || is_zone_device_page(page)) + return 0; + + folio = page_folio(page); + if (!folio_test_lru(folio)) + return 0; + + bkt = hash_min(pfn, KPROMOTED_HASH_ORDER); + mutex_lock(&page_hotness_lock[bkt]); + phi = kpromoted_lookup(pfn, bkt, now); + /* kpromoted_lookup() returns an ERR_PTR() on failure, never NULL */ + ret = PTR_ERR_OR_ZERO(phi); + if (ret) + goto out; + + if ((now - phi->last_update) > msecs_to_jiffies(KPROMOTED_FREQ_WINDOW)) { + /* New window */ + phi->frequency = 1; /* TODO: Factor in the history */ + phi->last_update = now; + } else { + phi->frequency++; + } + phi->recency = now; + + /* + * TODOs: + * 1. Source nid is hard-coded for some temperature sources + * 2. Take action if hot_node changes - may be a shared page? + * 3. Maintain node info for every access within the window? + */ + phi->hot_node = (nid == NUMA_NO_NODE) ? 1 : nid; +out: + mutex_unlock(&page_hotness_lock[bkt]); + return ret; +} + +/* + * Go through the accumulated mem_access_info and migrate + * pages if required. 
+ */ +static void kpromoted_do_work(pg_data_t *pgdat) +{ + kpromoted_migrate(pgdat); +} + +static inline bool kpromoted_work_requested(pg_data_t *pgdat) +{ + return false; +} + +static int kpromoted(void *p) +{ + pg_data_t *pgdat = (pg_data_t *)p; + struct task_struct *tsk = current; + long timeout = msecs_to_jiffies(KPROMOTE_DELAY); + + const struct cpumask *cpumask = cpumask_of_node(pgdat->node_id); + + if (!cpumask_empty(cpumask)) + set_cpus_allowed_ptr(tsk, cpumask); + + while (!kthread_should_stop()) { + wait_event_timeout(pgdat->kpromoted_wait, + kpromoted_work_requested(pgdat), timeout); + kpromoted_do_work(pgdat); + } + return 0; +} + +static void kpromoted_run(int nid) +{ + pg_data_t *pgdat = NODE_DATA(nid); + + if (pgdat->kpromoted) + return; + + pgdat->kpromoted = kthread_run(kpromoted, pgdat, "kpromoted%d", nid); + if (IS_ERR(pgdat->kpromoted)) { + pr_err("Failed to start kpromoted on node %d\n", nid); + pgdat->kpromoted = NULL; + } +} + +static int kpromoted_cpu_online(unsigned int cpu) +{ + int nid; + + for_each_node_state(nid, N_CPU) { + pg_data_t *pgdat = NODE_DATA(nid); + const struct cpumask *mask; + + mask = cpumask_of_node(pgdat->node_id); + + if (cpumask_any_and(cpu_online_mask, mask) < nr_cpu_ids) + /* One of our CPUs online: restore mask */ + if (pgdat->kpromoted) + set_cpus_allowed_ptr(pgdat->kpromoted, mask); + } + return 0; +} + +static int __init kpromoted_init(void) +{ + int nid, ret, i; + + ret = cpuhp_setup_state_nocalls(CPUHP_AP_ONLINE_DYN, + "mm/promotion:online", + kpromoted_cpu_online, NULL); + if (ret < 0) { + pr_err("kpromoted: failed to register hotplug callbacks.\n"); + return ret; + } + + for (i = 0; i < (1UL << KPROMOTED_HASH_ORDER); i++) + mutex_init(&page_hotness_lock[i]); + + for_each_node_state(nid, N_CPU) + kpromoted_run(nid); + + return 0; +} + +subsys_initcall(kpromoted_init) diff --git a/mm/mm_init.c b/mm/mm_init.c index 2630cc30147e..d212df24f89b 100644 --- a/mm/mm_init.c +++ b/mm/mm_init.c @@ -1362,6 +1362,15 @@ 
static void pgdat_init_kcompactd(struct pglist_data *pgdat) static void pgdat_init_kcompactd(struct pglist_data *pgdat) {} #endif +#ifdef CONFIG_KPROMOTED +static void pgdat_init_kpromoted(struct pglist_data *pgdat) +{ + init_waitqueue_head(&pgdat->kpromoted_wait); +} +#else +static void pgdat_init_kpromoted(struct pglist_data *pgdat) {} +#endif + static void __meminit pgdat_init_internals(struct pglist_data *pgdat) { int i; @@ -1371,6 +1380,7 @@ static void __meminit pgdat_init_internals(struct pglist_data *pgdat) pgdat_init_split_queue(pgdat); pgdat_init_kcompactd(pgdat); + pgdat_init_kpromoted(pgdat); init_waitqueue_head(&pgdat->kswapd_wait); init_waitqueue_head(&pgdat->pfmemalloc_wait); diff --git a/mm/vmstat.c b/mm/vmstat.c index 16bfe1c694dd..618f44bae5c8 100644 --- a/mm/vmstat.c +++ b/mm/vmstat.c @@ -1466,6 +1466,19 @@ const char * const vmstat_text[] = { "kstack_rest", #endif #endif + "kpromoted_recorded_accesses", + "kpromoted_recorded_hwhints", + "kpromoted_recorded_pgtscans", + "kpromoted_record_toptier", + "kpromoted_record_added", + "kpromoted_record_exists", + "kpromoted_mig_right_node", + "kpromoted_mig_non_lru", + "kpromoted_mig_cold_old", + "kpromoted_mig_cold_not_accessed", + "kpromoted_mig_candidate", + "kpromoted_mig_promoted", + "kpromoted_mig_dropped", #endif /* CONFIG_VM_EVENT_COUNTERS || CONFIG_MEMCG */ }; #endif /* CONFIG_PROC_FS || CONFIG_SYSFS || CONFIG_NUMA || CONFIG_MEMCG */
From patchwork Thu Mar 6 05:45:31 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Bharata B Rao X-Patchwork-Id: 14003847
From: Bharata B Rao
Subject: [RFC PATCH 3/4] x86: ibs: In-kernel IBS driver for memory access profiling
Date: Thu, 6 Mar 2025 11:15:31 +0530
Message-ID: <20250306054532.221138-4-bharata@amd.com>
In-Reply-To: <20250306054532.221138-1-bharata@amd.com>
References: <20250306054532.221138-1-bharata@amd.com>

Use the IBS (Instruction Based Sampling) feature present in AMD processors for memory access tracking. The access information obtained from IBS via NMI is fed to the kpromoted daemon for further action.

In addition to much other information about a memory access, IBS provides the physical (and virtual) address of the access and indicates whether the access came from a slower tier. Only memory accesses originating from slower tiers are acted upon further by this driver.

The samples are initially accumulated in percpu buffers, which are flushed to kpromoted by a work item queued from irq_work.

About IBS
---------
IBS can be programmed to provide data about instruction execution periodically. This is done by programming the desired sample count (number of ops) in a control register. When the programmed number of ops has been dispatched, a micro-op gets tagged, information about the tagged micro-op's execution is populated in the IBS execution MSRs, and an interrupt is raised. While IBS provides a lot of data for each sample, for the purpose of memory access profiling we are interested in the linear and physical addresses of memory accesses that reached DRAM. Recent AMD processors provide further filtering that limits the sampling to those ops that had an L3 miss, which greatly reduces the number of non-useful samples.

While IBS provides the capability to sample both instruction fetch and execution, only IBS execution sampling is used here to collect data about memory accesses that occur during instruction execution.

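As an illustration of the percpu buffering described above, the following is a minimal userspace sketch of a fixed-size ring of samples with one producer and one consumer. It is not the driver code: the names sample_ring, ring_push and ring_pop are hypothetical, and the sketch ignores the NMI-safety and memory-ordering concerns that an in-kernel version has to deal with. One slot is deliberately left unused so that head == tail always means the ring is empty.

```c
#include <stdbool.h>

#define NR_SAMPLES 50	/* same size as IBS_NR_SAMPLES in this patch */

/* One recorded access: page frame, timestamp and accessing node. */
struct sample {
	unsigned long pfn;
	unsigned long time;
	int nid;
};

/*
 * Fixed-size ring (hypothetical name). One slot always stays empty,
 * so head == tail unambiguously means "no samples pending".
 */
struct sample_ring {
	struct sample samples[NR_SAMPLES];
	int head;	/* next slot the producer writes */
	int tail;	/* next slot the consumer reads */
};

/* Producer side: returns false (sample dropped) when the ring is full. */
static bool ring_push(struct sample_ring *r, unsigned long pfn, int nid,
		      unsigned long time)
{
	int next = r->head + 1;

	if (next >= NR_SAMPLES)
		next = 0;
	if (next == r->tail)
		return false;

	r->samples[r->head] = (struct sample){ .pfn = pfn, .time = time, .nid = nid };
	r->head = next;
	return true;
}

/* Consumer side: returns false when there is nothing left to drain. */
static bool ring_pop(struct sample_ring *r, struct sample *out)
{
	int next = r->tail + 1;

	if (r->head == r->tail)
		return false;
	if (next >= NR_SAMPLES)
		next = 0;

	*out = r->samples[r->tail];
	r->tail = next;
	return true;
}
```

With NR_SAMPLES set to 50, at most 49 samples are buffered at a time. Further pushes fail until the consumer drains the ring, which corresponds to a sample being dropped when the buffer is full.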
More information about IBS is available in Sec 13.3 of AMD64 Architecture Programmer's Manual, Volume 2: System Programming which is present at:

https://bugzilla.kernel.org/attachment.cgi?id=288923

Information about MSRs used for programming IBS can be found in Sec 2.1.14.4 of PPR Vol 1 for AMD Family 19h Model 11h B1 which is currently present at:

https://www.amd.com/system/files/TechDocs/55901_0.25.zip

Signed-off-by: Bharata B Rao
---
 arch/x86/events/amd/ibs.c        |  11 ++
 arch/x86/include/asm/ibs.h       |   7 +
 arch/x86/include/asm/msr-index.h |  16 ++
 arch/x86/mm/Makefile             |   3 +-
 arch/x86/mm/ibs.c                | 312 +++++++++++++++++++++++++++++++
 include/linux/vm_event_item.h    |  17 ++
 mm/vmstat.c                      |  17 ++
 7 files changed, 382 insertions(+), 1 deletion(-)
 create mode 100644 arch/x86/include/asm/ibs.h
 create mode 100644 arch/x86/mm/ibs.c

diff --git a/arch/x86/events/amd/ibs.c b/arch/x86/events/amd/ibs.c index e7a8b8758e08..35497e8c0846 100644 --- a/arch/x86/events/amd/ibs.c +++ b/arch/x86/events/amd/ibs.c @@ -13,8 +13,10 @@ #include #include #include +#include #include +#include #include "../perf_event.h" @@ -1539,6 +1541,15 @@ static __init int amd_ibs_init(void) { u32 caps; + /* + * TODO: Find a clean way to disable perf IBS so that IBS + * can be used for memory access profiling. 
+ */ + if (arch_hw_access_profiling) { + pr_info("IBS isn't available for perf use\n"); + return 0; + } + caps = __get_ibs_caps(); if (!caps) return -ENODEV; /* ibs not supported by the cpu */ diff --git a/arch/x86/include/asm/ibs.h b/arch/x86/include/asm/ibs.h new file mode 100644 index 000000000000..b5a4f2ca6330 --- /dev/null +++ b/arch/x86/include/asm/ibs.h @@ -0,0 +1,7 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _ASM_X86_IBS_H +#define _ASM_X86_IBS_H + +extern bool arch_hw_access_profiling; + +#endif /* _ASM_X86_IBS_H */ diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h index 72765b2fe0d8..12291e362b01 100644 --- a/arch/x86/include/asm/msr-index.h +++ b/arch/x86/include/asm/msr-index.h @@ -719,6 +719,22 @@ /* AMD Last Branch Record MSRs */ #define MSR_AMD64_LBR_SELECT 0xc000010e +/* AMD IBS MSR bits */ +#define MSR_AMD64_IBSOPDATA2_DATASRC 0x7 +#define MSR_AMD64_IBSOPDATA2_DATASRC_LCL_CACHE 0x1 +#define MSR_AMD64_IBSOPDATA2_DATASRC_PEER_CACHE_NEAR 0x2 +#define MSR_AMD64_IBSOPDATA2_DATASRC_DRAM 0x3 +#define MSR_AMD64_IBSOPDATA2_DATASRC_FAR_CCX_CACHE 0x5 +#define MSR_AMD64_IBSOPDATA2_DATASRC_EXT_MEM 0x8 +#define MSR_AMD64_IBSOPDATA2_RMTNODE 0x10 + +#define MSR_AMD64_IBSOPDATA3_LDOP BIT_ULL(0) +#define MSR_AMD64_IBSOPDATA3_STOP BIT_ULL(1) +#define MSR_AMD64_IBSOPDATA3_DCMISS BIT_ULL(7) +#define MSR_AMD64_IBSOPDATA3_LADDR_VALID BIT_ULL(17) +#define MSR_AMD64_IBSOPDATA3_PADDR_VALID BIT_ULL(18) +#define MSR_AMD64_IBSOPDATA3_L2MISS BIT_ULL(20) + /* Zen4 */ #define MSR_ZEN4_BP_CFG 0xc001102e #define MSR_ZEN4_BP_CFG_SHARED_BTB_FIX_BIT 5 diff --git a/arch/x86/mm/Makefile b/arch/x86/mm/Makefile index 690fbf48e853..3b1a5dbbac64 100644 --- a/arch/x86/mm/Makefile +++ b/arch/x86/mm/Makefile @@ -26,7 +26,8 @@ CFLAGS_REMOVE_pgprot.o = -pg endif obj-y := init.o init_$(BITS).o fault.o ioremap.o extable.o mmap.o \ - pgtable.o physaddr.o tlb.o cpu_entry_area.o maccess.o pgprot.o + pgtable.o physaddr.o tlb.o cpu_entry_area.o maccess.o pgprot.o 
\ + ibs.o obj-y += pat/ diff --git a/arch/x86/mm/ibs.c b/arch/x86/mm/ibs.c new file mode 100644 index 000000000000..5c966050ad86 --- /dev/null +++ b/arch/x86/mm/ibs.c @@ -0,0 +1,312 @@ +// SPDX-License-Identifier: GPL-2.0 + +#include +#include +#include +#include +#include + +#include +#include /* TODO: Move defns like IBS_OP_ENABLE into non-perf header */ +#include +#include + +bool arch_hw_access_profiling; +static u64 ibs_config __read_mostly; +static u32 ibs_caps; + +#define IBS_NR_SAMPLES 50 + +/* + * Basic access info captured for each memory access. + */ +struct ibs_sample { + unsigned long pfn; + unsigned long time; /* jiffies when accessed */ + int nid; /* Accessing node ID, if known */ +}; + +/* + * Percpu buffer of access samples. Samples are accumulated here + * before pushing them to kpromoted for further action. + */ +struct ibs_sample_pcpu { + struct ibs_sample samples[IBS_NR_SAMPLES]; + int head, tail; +}; + +struct ibs_sample_pcpu __percpu *ibs_s; + +/* + * The workqueue for pushing the percpu access samples to kpromoted. + */ +static struct work_struct ibs_work; +static struct irq_work ibs_irq_work; + +/* + * Record the IBS-reported access sample in percpu buffer. + * Called from IBS NMI handler. 
+ */ +static int ibs_push_sample(unsigned long pfn, int nid, unsigned long time) +{ + struct ibs_sample_pcpu *ibs_pcpu = raw_cpu_ptr(ibs_s); + int next = ibs_pcpu->head + 1; + + if (next >= IBS_NR_SAMPLES) + next = 0; + + if (next == ibs_pcpu->tail) + return 0; + + ibs_pcpu->samples[ibs_pcpu->head] = + (struct ibs_sample){ .pfn = pfn, .time = time, .nid = nid }; + ibs_pcpu->head = next; + return 1; +} + +static int ibs_pop_sample(struct ibs_sample *s) +{ + struct ibs_sample_pcpu *ibs_pcpu = raw_cpu_ptr(ibs_s); + + int next = ibs_pcpu->tail + 1; + + if (ibs_pcpu->head == ibs_pcpu->tail) + return 0; + + if (next >= IBS_NR_SAMPLES) + next = 0; + + *s = ibs_pcpu->samples[ibs_pcpu->tail]; + ibs_pcpu->tail = next; + return 1; +} + +/* + * Remove access samples from percpu buffer and send them + * to kpromoted for further action. + */ +static void ibs_work_handler(struct work_struct *work) +{ + struct ibs_sample s; + + while (ibs_pop_sample(&s)) + kpromoted_record_access(s.pfn, s.nid, KPROMOTED_HW_HINTS, + s.time); +} + +static void ibs_irq_handler(struct irq_work *i) +{ + schedule_work_on(smp_processor_id(), &ibs_work); +} + +/* + * IBS NMI handler: Process the memory access info reported by IBS. + * + * Reads the MSRs to collect all the information about the reported + * memory access, validates the access, stores the valid sample and + * schedules the work on this CPU to further process the sample. + */ +static int ibs_overflow_handler(unsigned int cmd, struct pt_regs *regs) +{ + struct mm_struct *mm = current->mm; + u64 ops_ctl, ops_data3, ops_data2; + u64 laddr = -1, paddr = -1; + u64 data_src, rmt_node; + struct page *page; + unsigned long pfn; + + rdmsrl(MSR_AMD64_IBSOPCTL, ops_ctl); + + /* + * When IBS sampling period is reprogrammed via read-modify-update + * of MSR_AMD64_IBSOPCTL, overflow NMIs could be generated with + * IBS_OP_ENABLE not set. For such cases, return as HANDLED. 
+ * + * With this, the handler will say "handled" for all NMIs that + * aren't related to this NMI. This stems from the limitation of + * having both status and control bits in one MSR. + */ + if (!(ops_ctl & IBS_OP_VAL)) + goto handled; + + wrmsrl(MSR_AMD64_IBSOPCTL, ops_ctl & ~IBS_OP_VAL); + + count_vm_event(HWHINT_NR_EVENTS); + + if (!user_mode(regs)) { + count_vm_event(HWHINT_KERNEL); + goto handled; + } + + if (!mm) { + count_vm_event(HWHINT_KTHREAD); + goto handled; + } + + rdmsrl(MSR_AMD64_IBSOPDATA3, ops_data3); + + /* Load/Store ops only */ + /* TODO: DataSrc isn't valid for stores, so filter out stores? */ + if (!(ops_data3 & (MSR_AMD64_IBSOPDATA3_LDOP | + MSR_AMD64_IBSOPDATA3_STOP))) { + count_vm_event(HWHINT_NON_LOAD_STORES); + goto handled; + } + + /* Discard the sample if it was L1 or L2 hit */ + if (!(ops_data3 & (MSR_AMD64_IBSOPDATA3_DCMISS | + MSR_AMD64_IBSOPDATA3_L2MISS))) { + count_vm_event(HWHINT_DC_L2_HITS); + goto handled; + } + + rdmsrl(MSR_AMD64_IBSOPDATA2, ops_data2); + data_src = ops_data2 & MSR_AMD64_IBSOPDATA2_DATASRC; + if (ibs_caps & IBS_CAPS_ZEN4) + data_src |= ((ops_data2 & 0xC0) >> 3); + + switch (data_src) { + case MSR_AMD64_IBSOPDATA2_DATASRC_LCL_CACHE: + count_vm_event(HWHINT_LOCAL_L3L1L2); + break; + case MSR_AMD64_IBSOPDATA2_DATASRC_PEER_CACHE_NEAR: + count_vm_event(HWHINT_LOCAL_PEER_CACHE_NEAR); + break; + case MSR_AMD64_IBSOPDATA2_DATASRC_DRAM: + count_vm_event(HWHINT_DRAM_ACCESSES); + break; + case MSR_AMD64_IBSOPDATA2_DATASRC_EXT_MEM: + count_vm_event(HWHINT_CXL_ACCESSES); + break; + case MSR_AMD64_IBSOPDATA2_DATASRC_FAR_CCX_CACHE: + count_vm_event(HWHINT_FAR_CACHE_HITS); + break; + } + + rmt_node = ops_data2 & MSR_AMD64_IBSOPDATA2_RMTNODE; + if (rmt_node) + count_vm_event(HWHINT_REMOTE_NODE); + + /* Is linear addr valid? 
*/ + if (ops_data3 & MSR_AMD64_IBSOPDATA3_LADDR_VALID) + rdmsrl(MSR_AMD64_IBSDCLINAD, laddr); + else { + count_vm_event(HWHINT_LADDR_INVALID); + goto handled; + } + + /* Discard kernel address accesses */ + if (laddr & (1UL << 63)) { + count_vm_event(HWHINT_KERNEL_ADDR); + goto handled; + } + + /* Is phys addr valid? */ + if (ops_data3 & MSR_AMD64_IBSOPDATA3_PADDR_VALID) + rdmsrl(MSR_AMD64_IBSDCPHYSAD, paddr); + else { + count_vm_event(HWHINT_PADDR_INVALID); + goto handled; + } + + pfn = PHYS_PFN(paddr); + page = pfn_to_online_page(pfn); + if (!page) + goto handled; + + if (!PageLRU(page)) { + count_vm_event(HWHINT_NON_LRU); + goto handled; + } + + if (!ibs_push_sample(pfn, numa_node_id(), jiffies)) { + count_vm_event(HWHINT_BUFFER_FULL); + goto handled; + } + + irq_work_queue(&ibs_irq_work); + count_vm_event(HWHINT_USEFUL_SAMPLES); + +handled: + return NMI_HANDLED; +} + +static inline int get_ibs_lvt_offset(void) +{ + u64 val; + + rdmsrl(MSR_AMD64_IBSCTL, val); + if (!(val & IBSCTL_LVT_OFFSET_VALID)) + return -EINVAL; + + return val & IBSCTL_LVT_OFFSET_MASK; +} + +static void setup_APIC_ibs(void) +{ + int offset; + + offset = get_ibs_lvt_offset(); + if (offset < 0) + goto failed; + + if (!setup_APIC_eilvt(offset, 0, APIC_EILVT_MSG_NMI, 0)) + return; +failed: + pr_warn("IBS APIC setup failed on cpu #%d\n", + smp_processor_id()); +} + +static void clear_APIC_ibs(void) +{ + int offset; + + offset = get_ibs_lvt_offset(); + if (offset >= 0) + setup_APIC_eilvt(offset, 0, APIC_EILVT_MSG_FIX, 1); +} + +static int x86_amd_ibs_access_profile_startup(unsigned int cpu) +{ + setup_APIC_ibs(); + return 0; +} + +static int x86_amd_ibs_access_profile_teardown(unsigned int cpu) +{ + clear_APIC_ibs(); + return 0; +} + +static int __init ibs_access_profiling_init(void) +{ + if (!boot_cpu_has(X86_FEATURE_IBS)) { + pr_info("IBS capability is unavailable for access profiling\n"); + return 0; + } + + ibs_s = alloc_percpu_gfp(struct ibs_sample_pcpu, __GFP_ZERO); + if (!ibs_s) + return 0; 
+
+	INIT_WORK(&ibs_work, ibs_work_handler);
+	init_irq_work(&ibs_irq_work, ibs_irq_handler);
+
+	/* Uses IBS Op sampling */
+	ibs_config = IBS_OP_CNT_CTL | IBS_OP_ENABLE;
+	ibs_caps = cpuid_eax(IBS_CPUID_FEATURES);
+	if (ibs_caps & IBS_CAPS_ZEN4)
+		ibs_config |= IBS_OP_L3MISSONLY;
+
+	register_nmi_handler(NMI_LOCAL, ibs_overflow_handler, 0, "ibs");
+
+	cpuhp_setup_state(CPUHP_AP_PERF_X86_AMD_IBS_STARTING,
+			  "x86/amd/ibs_access_profile:starting",
+			  x86_amd_ibs_access_profile_startup,
+			  x86_amd_ibs_access_profile_teardown);
+
+	pr_info("IBS setup for memory access profiling\n");
+	return 0;
+}
+
+arch_initcall(ibs_access_profiling_init);
diff --git a/include/linux/vm_event_item.h b/include/linux/vm_event_item.h
index b5823b037883..24279c46054c 100644
--- a/include/linux/vm_event_item.h
+++ b/include/linux/vm_event_item.h
@@ -195,6 +195,23 @@ enum vm_event_item { PGPGIN, PGPGOUT, PSWPIN, PSWPOUT,
 		KPROMOTED_MIG_CANDIDATE,
 		KPROMOTED_MIG_PROMOTED,
 		KPROMOTED_MIG_DROPPED,
+		HWHINT_NR_EVENTS,
+		HWHINT_KERNEL,
+		HWHINT_KTHREAD,
+		HWHINT_NON_LOAD_STORES,
+		HWHINT_DC_L2_HITS,
+		HWHINT_LOCAL_L3L1L2,
+		HWHINT_LOCAL_PEER_CACHE_NEAR,
+		HWHINT_FAR_CACHE_HITS,
+		HWHINT_DRAM_ACCESSES,
+		HWHINT_CXL_ACCESSES,
+		HWHINT_REMOTE_NODE,
+		HWHINT_LADDR_INVALID,
+		HWHINT_KERNEL_ADDR,
+		HWHINT_PADDR_INVALID,
+		HWHINT_NON_LRU,
+		HWHINT_BUFFER_FULL,
+		HWHINT_USEFUL_SAMPLES,
 		NR_VM_EVENT_ITEMS
 };
 
diff --git a/mm/vmstat.c b/mm/vmstat.c
index 618f44bae5c8..a21d3118d6f6 100644
--- a/mm/vmstat.c
+++ b/mm/vmstat.c
@@ -1479,6 +1479,23 @@ const char * const vmstat_text[] = {
 	"kpromoted_mig_candidate",
 	"kpromoted_mig_promoted",
 	"kpromoted_mig_dropped",
+	"hwhint_nr_events",
+	"hwhint_kernel",
+	"hwhint_kthread",
+	"hwhint_non_load_stores",
+	"hwhint_dc_l2_hits",
+	"hwhint_local_l3l1l2",
+	"hwhint_local_peer_cache_near",
+	"hwhint_far_cache_hits",
+	"hwhint_dram_accesses",
+	"hwhint_cxl_accesses",
+	"hwhint_remote_node",
+	"hwhint_invalid_laddr",
+	"hwhint_kernel_addr",
+	"hwhint_invalid_paddr",
+	"hwhint_non_lru",
+	"hwhint_buffer_full",
+	"hwhint_useful_samples",
 #endif /* CONFIG_VM_EVENT_COUNTERS || CONFIG_MEMCG */
 };
 #endif /* CONFIG_PROC_FS || CONFIG_SYSFS || CONFIG_NUMA || CONFIG_MEMCG */

From patchwork Thu Mar 6 05:45:32 2025
X-Patchwork-Submitter: Bharata B Rao
X-Patchwork-Id: 14003848
From: Bharata B Rao
Subject: [RFC PATCH 4/4] x86: ibs: Enable IBS profiling for memory accesses
Date: Thu, 6 Mar 2025 11:15:32 +0530
Message-ID: <20250306054532.221138-5-bharata@amd.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20250306054532.221138-1-bharata@amd.com>
References: <20250306054532.221138-1-bharata@amd.com>

Enable IBS memory access data collection for user memory
accesses by programming the required MSRs. The profiling
is turned ON only for user mode execution and turned OFF
for kernel mode execution. Profiling is explicitly disabled
for NMI handler too.

TODOs:

- IBS sampling rate is kept fixed for now.
- Arch/vendor separation/isolation of the code needs relook.

Signed-off-by: Bharata B Rao
---
 arch/x86/include/asm/entry-common.h |  3 +++
 arch/x86/include/asm/hardirq.h      |  2 ++
 arch/x86/include/asm/ibs.h          |  2 ++
 arch/x86/mm/ibs.c                   | 32 +++++++++++++++++++++++++++++
 4 files changed, 39 insertions(+)

diff --git a/arch/x86/include/asm/entry-common.h b/arch/x86/include/asm/entry-common.h
index 77d20555e04d..8127111c6ad3 100644
--- a/arch/x86/include/asm/entry-common.h
+++ b/arch/x86/include/asm/entry-common.h
@@ -9,10 +9,12 @@
 #include
 #include
 #include
+#include
 
 /* Check that the stack and regs on entry from user mode are sane. */
 static __always_inline void arch_enter_from_user_mode(struct pt_regs *regs)
 {
+	hw_access_profiling_stop();
 	if (IS_ENABLED(CONFIG_DEBUG_ENTRY)) {
 		/*
 		 * Make sure that the entry code gave us a sensible EFLAGS
@@ -98,6 +100,7 @@ static inline void arch_exit_to_user_mode_prepare(struct pt_regs *regs,
 static __always_inline void arch_exit_to_user_mode(void)
 {
 	amd_clear_divider();
+	hw_access_profiling_start();
 }
 #define arch_exit_to_user_mode arch_exit_to_user_mode
 
diff --git a/arch/x86/include/asm/hardirq.h b/arch/x86/include/asm/hardirq.h
index 6ffa8b75f4cd..b928fbbcf3e5 100644
--- a/arch/x86/include/asm/hardirq.h
+++ b/arch/x86/include/asm/hardirq.h
@@ -91,4 +91,6 @@ static __always_inline bool kvm_get_cpu_l1tf_flush_l1d(void)
 static __always_inline void kvm_set_cpu_l1tf_flush_l1d(void) { }
 #endif /* IS_ENABLED(CONFIG_KVM_INTEL) */
 
+#define arch_nmi_enter()	hw_access_profiling_stop()
+#define arch_nmi_exit()		hw_access_profiling_start()
 #endif /* _ASM_X86_HARDIRQ_H */
diff --git a/arch/x86/include/asm/ibs.h b/arch/x86/include/asm/ibs.h
index b5a4f2ca6330..6b480958534e 100644
--- a/arch/x86/include/asm/ibs.h
+++ b/arch/x86/include/asm/ibs.h
@@ -2,6 +2,8 @@
 #ifndef _ASM_X86_IBS_H
 #define _ASM_X86_IBS_H
 
+void hw_access_profiling_start(void);
+void hw_access_profiling_stop(void);
 extern bool arch_hw_access_profiling;
 
 #endif /* _ASM_X86_IBS_H */
diff --git a/arch/x86/mm/ibs.c b/arch/x86/mm/ibs.c
index 5c966050ad86..961d0c67ca50 100644
--- a/arch/x86/mm/ibs.c
+++ b/arch/x86/mm/ibs.c
@@ -15,6 +15,7 @@ bool arch_hw_access_profiling;
 static u64 ibs_config __read_mostly;
 static u32 ibs_caps;
 
+#define IBS_SAMPLE_PERIOD 10000
 #define IBS_NR_SAMPLES 50
 
 /*
@@ -99,6 +100,36 @@ static void ibs_irq_handler(struct irq_work *i)
 	schedule_work_on(smp_processor_id(), &ibs_work);
 }
 
+void hw_access_profiling_stop(void)
+{
+	u64 ops_ctl;
+
+	if (!arch_hw_access_profiling)
+		return;
+
+	rdmsrl(MSR_AMD64_IBSOPCTL, ops_ctl);
+	wrmsrl(MSR_AMD64_IBSOPCTL, ops_ctl & ~IBS_OP_ENABLE);
+}
+
+void hw_access_profiling_start(void)
+{
+	u64 config = 0;
+	unsigned int period = IBS_SAMPLE_PERIOD;
+
+	if (!arch_hw_access_profiling)
+		return;
+
+	/* Disable IBS for kernel thread */
+	if (!current->mm)
+		goto out;
+
+	config = (period >> 4) & IBS_OP_MAX_CNT;
+	config |= (period & IBS_OP_MAX_CNT_EXT_MASK);
+	config |= ibs_config;
+out:
+	wrmsrl(MSR_AMD64_IBSOPCTL, config);
+}
+
 /*
  * IBS NMI handler: Process the memory access info reported by IBS.
  *
@@ -305,6 +336,7 @@ static int __init ibs_access_profiling_init(void)
 			  x86_amd_ibs_access_profile_startup,
 			  x86_amd_ibs_access_profile_teardown);
 
+	arch_hw_access_profiling = true;
 	pr_info("IBS setup for memory access profiling\n");
 	return 0;
 }
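
Note on the IBSOPCTL value: the arithmetic in hw_access_profiling_start()
and hw_access_profiling_stop() can be checked in userspace with the sketch
below. It is not part of the patch. The helper names are hypothetical, and
the bit definitions are assumptions recalled from the kernel's
arch/x86/include/asm/perf_event.h, so verify them against your tree before
relying on the result.

```c
#include <stdint.h>

/*
 * ASSUMED bit layout of MSR_AMD64_IBSOPCTL, recalled from
 * arch/x86/include/asm/perf_event.h. Not taken from this patch.
 */
#define IBS_OP_L3MISSONLY       (1ULL << 16)
#define IBS_OP_ENABLE           (1ULL << 17)
#define IBS_OP_CNT_CTL          (1ULL << 19)
#define IBS_OP_MAX_CNT          0x0000FFFFULL
#define IBS_OP_MAX_CNT_EXT_MASK (0x7FULL << 20)

/*
 * Hypothetical helper: same arithmetic as hw_access_profiling_start().
 * The low 16 bits hold period / 16; bits 26:20 hold the matching
 * period bits in place. The low four period bits are dropped.
 */
static inline uint64_t ibs_op_ctl_for_period(unsigned int period,
					     uint64_t base_config)
{
	uint64_t config;

	config = (period >> 4) & IBS_OP_MAX_CNT;
	config |= period & IBS_OP_MAX_CNT_EXT_MASK;
	return config | base_config;
}

/*
 * Hypothetical helper: same masking as hw_access_profiling_stop().
 * Only the enable bit is cleared; the programmed period is kept.
 */
static inline uint64_t ibs_op_ctl_disabled(uint64_t ops_ctl)
{
	return ops_ctl & ~IBS_OP_ENABLE;
}
```

Under those assumed definitions, IBS_SAMPLE_PERIOD = 10000 with
ibs_config = IBS_OP_CNT_CTL | IBS_OP_ENABLE gives 0xA0271, and clearing
IBS_OP_ENABLE from that value leaves 0x80271. Since 10000 is a multiple
of 16, no precision is lost by the right shift; a period that is not a
multiple of 16 would be rounded down.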