From patchwork Mon Sep 30 05:51:12 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Huang, Ying"
X-Patchwork-Id: 13815409
From: Huang Ying
To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
 "Kirill A . Shutemov"
Cc: x86@kernel.org, David Hildenbrand, Andrew Morton, Oscar Salvador,
 linux-coco@lists.linux.dev, linux-mm@kvack.org,
 linux-kernel@vger.kernel.org, Huang Ying, Dan Williams, Kai Huang,
 "H. Peter Anvin", Andy Lutomirski
Subject: [PATCH] tdx, memory hotplug: Check whole hot-adding memory range for TDX
Date: Mon, 30 Sep 2024 13:51:12 +0800
Message-Id: <20240930055112.344206-1-ying.huang@intel.com>
X-Mailer: git-send-email 2.39.2

On systems with TDX (Trust Domain eXtensions) enabled, hot-added memory
ranges must be checked for TDX compatibility.  This is currently
implemented through memory hotplug notifiers, which run for each
memory_block.  If a memory range which isn't TDX compatible is
hot-added, for example, some CXL memory, a command like the following,

  $ echo 1 > /sys/devices/system/node/nodeX/memoryY/online

will report something like,

  bash: echo: write error: Operation not permitted

If pr_debug() is enabled, an error message like the one below will be
shown in the kernel log,

  online_pages [mem 0xXXXXXXXXXX-0xXXXXXXXXXX] failed

Both are too general to identify the root cause of the problem, which
will confuse users.  One solution is to print some error messages in
the TDX memory hotplug notifier.  However, memory hotplug notifiers are
called for each memory block, so this may lead to a large volume of
messages in the kernel log if a large number of memory blocks are
onlined with a script or automatically.  For example, the typical size
of a memory block is 128MB on x86_64, so onlining 64GB of CXL memory
would log 512 messages.

Therefore, in this patch, the whole hot-adding memory range is checked
for TDX compatibility through a newly added architecture specific
function (arch_check_hotplug_memory_range()).  If the range is
rejected, the memory hot-adding is aborted with a proper kernel log
message, which looks like the following,

  virt/tdx: Reject hot-adding memory range: 0xXXXXXXXX-0xXXXXXXXX for TDX compatibility.

The target use case is to support CXL memory on TDX enabled systems.
If the CXL memory isn't compatible with TDX, hot-adding of the whole
CXL memory range will be rejected.  The CXL memory can still be used
via the devdax interface.

This also makes the original TDX memory hotplug notifier useless, so
delete it.

Signed-off-by: "Huang, Ying"
Suggested-by: Dan Williams
Reviewed-by: Dan Williams
Acked-by: Kai Huang
Cc: Thomas Gleixner
Cc: Dave Hansen
Cc: "H. Peter Anvin"
Cc: Andy Lutomirski
Cc: Kirill A. Shutemov
Cc: David Hildenbrand
Cc: Oscar Salvador
Acked-by: David Hildenbrand
---
 arch/x86/include/asm/tdx.h     |  2 ++
 arch/x86/mm/init_64.c          |  6 ++++++
 arch/x86/virt/vmx/tdx/tdx.c    | 35 ++++++++++++----------------------
 include/linux/memory_hotplug.h |  3 +++
 mm/memory_hotplug.c            |  7 ++++++-
 5 files changed, 29 insertions(+), 24 deletions(-)

diff --git a/arch/x86/include/asm/tdx.h b/arch/x86/include/asm/tdx.h
index eba178996d84..6db5da34e4ba 100644
--- a/arch/x86/include/asm/tdx.h
+++ b/arch/x86/include/asm/tdx.h
@@ -116,11 +116,13 @@ static inline u64 sc_retry(sc_func_t func, u64 fn,
 int tdx_cpu_enable(void);
 int tdx_enable(void);
 const char *tdx_dump_mce_info(struct mce *m);
+int tdx_check_hotplug_memory_range(u64 start, u64 size);
 #else
 static inline void tdx_init(void) { }
 static inline int tdx_cpu_enable(void) { return -ENODEV; }
 static inline int tdx_enable(void) { return -ENODEV; }
 static inline const char *tdx_dump_mce_info(struct mce *m) { return NULL; }
+static inline int tdx_check_hotplug_memory_range(u64 start, u64 size) { return 0; }
 #endif /* CONFIG_INTEL_TDX_HOST */
 
 #endif /* !__ASSEMBLY__ */
diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
index ff253648706f..30a4ad4272ce 100644
--- a/arch/x86/mm/init_64.c
+++ b/arch/x86/mm/init_64.c
@@ -55,6 +55,7 @@
 #include
 #include
 #include
+#include
 
 #include "mm_internal.h"
 
@@ -974,6 +975,11 @@ int add_pages(int nid, unsigned long start_pfn, unsigned long nr_pages,
 	return ret;
 }
 
+int arch_check_hotplug_memory_range(u64 start, u64 size)
+{
+	return tdx_check_hotplug_memory_range(start, size);
+}
+
 int arch_add_memory(int nid, u64 start, u64 size,
 		    struct mhp_params *params)
 {
diff --git a/arch/x86/virt/vmx/tdx/tdx.c b/arch/x86/virt/vmx/tdx/tdx.c
index 4e2b2e2ac9f9..c477b04c5548 100644
--- a/arch/x86/virt/vmx/tdx/tdx.c
+++ b/arch/x86/virt/vmx/tdx/tdx.c
@@ -1388,36 +1388,32 @@ static bool is_tdx_memory(unsigned long start_pfn, unsigned long end_pfn)
 	return false;
 }
 
-static int tdx_memory_notifier(struct notifier_block *nb, unsigned long action,
-			       void *v)
+int tdx_check_hotplug_memory_range(u64 start, u64 size)
 {
-	struct memory_notify *mn = v;
-
-	if (action != MEM_GOING_ONLINE)
-		return NOTIFY_OK;
+	u64 start_pfn = PHYS_PFN(start);
+	u64 end_pfn = PHYS_PFN(start + size);
 
 	/*
 	 * Empty list means TDX isn't enabled. Allow any memory
-	 * to go online.
+	 * to be hot-added.
 	 */
 	if (list_empty(&tdx_memlist))
-		return NOTIFY_OK;
+		return 0;
 
 	/*
 	 * The TDX memory configuration is static and can not be
-	 * changed. Reject onlining any memory which is outside of
+	 * changed. Reject hot-adding any memory which is outside of
 	 * the static configuration whether it supports TDX or not.
 	 */
-	if (is_tdx_memory(mn->start_pfn, mn->start_pfn + mn->nr_pages))
-		return NOTIFY_OK;
+	if (is_tdx_memory(start_pfn, end_pfn))
+		return 0;
 
-	return NOTIFY_BAD;
+	pr_info("Reject hot-adding memory range: %#llx-%#llx for TDX compatibility.\n",
+		start, start + size);
+
+	return -EINVAL;
 }
 
-static struct notifier_block tdx_memory_nb = {
-	.notifier_call = tdx_memory_notifier,
-};
-
 static void __init check_tdx_erratum(void)
 {
 	/*
@@ -1465,13 +1461,6 @@ void __init tdx_init(void)
 		return;
 	}
 
-	err = register_memory_notifier(&tdx_memory_nb);
-	if (err) {
-		pr_err("initialization failed: register_memory_notifier() failed (%d)\n",
-		       err);
-		return;
-	}
-
 #if defined(CONFIG_ACPI) && defined(CONFIG_SUSPEND)
 	pr_info("Disable ACPI S3. Turn off TDX in the BIOS to use ACPI S3.\n");
 	acpi_suspend_lowlevel = NULL;
diff --git a/include/linux/memory_hotplug.h b/include/linux/memory_hotplug.h
index b27ddce5d324..c5ba7b909bb4 100644
--- a/include/linux/memory_hotplug.h
+++ b/include/linux/memory_hotplug.h
@@ -140,6 +140,9 @@ extern int try_online_node(int nid);
 
 extern int arch_add_memory(int nid, u64 start, u64 size,
 			   struct mhp_params *params);
+
+extern int arch_check_hotplug_memory_range(u64 start, u64 size);
+
 extern u64 max_mem_size;
 
 extern int mhp_online_type_from_str(const char *str);
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 621ae1015106..c4769f24b1e2 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -1305,6 +1305,11 @@ int try_online_node(int nid)
 	return ret;
 }
 
+int __weak arch_check_hotplug_memory_range(u64 start, u64 size)
+{
+	return 0;
+}
+
 static int check_hotplug_memory_range(u64 start, u64 size)
 {
 	/* memory range must be block size aligned */
@@ -1315,7 +1320,7 @@ static int check_hotplug_memory_range(u64 start, u64 size)
 		return -EINVAL;
 	}
 
-	return 0;
+	return arch_check_hotplug_memory_range(start, size);
 }
 
 static int online_memory_block(struct memory_block *mem, void *arg)
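
For readers who want to experiment with the logic outside the kernel, the
check introduced above can be modelled in user space.  The sketch below is
not kernel code and makes several assumptions: every sketch_* name is
hypothetical, the region array stands in for tdx_memlist, a 4 KiB page size
replaces PHYS_PFN(), and is_tdx_memory() is assumed to accept a range only
when it lies entirely inside one listed region.  The last helper reproduces
the arithmetic from the changelog (64GB onlined in 128MB blocks).

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12	/* assumption: 4 KiB pages */

/* Hypothetical stand-in for one entry of the kernel's tdx_memlist. */
struct sketch_tdx_region {
	uint64_t start_pfn;
	uint64_t end_pfn;	/* exclusive */
};

/*
 * Assumed semantics of is_tdx_memory(): the PFN range is accepted only
 * if it is fully contained in a single region of the static list.
 */
bool sketch_is_tdx_memory(const struct sketch_tdx_region *regions, size_t nr,
			  uint64_t start_pfn, uint64_t end_pfn)
{
	for (size_t i = 0; i < nr; i++) {
		if (start_pfn >= regions[i].start_pfn &&
		    end_pfn <= regions[i].end_pfn)
			return true;
	}
	return false;
}

/*
 * Model of the whole-range check: called once per hot-added range,
 * returns 0 to allow the range and -EINVAL to reject it.
 */
int sketch_check_hotplug_range(const struct sketch_tdx_region *regions,
			       size_t nr, uint64_t start, uint64_t size)
{
	uint64_t start_pfn = start >> SKETCH_PAGE_SHIFT;
	uint64_t end_pfn = (start + size) >> SKETCH_PAGE_SHIFT;

	/* An empty list models "TDX isn't enabled": allow any memory. */
	if (nr == 0)
		return 0;

	if (sketch_is_tdx_memory(regions, nr, start_pfn, end_pfn))
		return 0;

	return -EINVAL;
}

/* Number of memory blocks a per-block notifier would be called for. */
uint64_t sketch_nr_memory_blocks(uint64_t size, uint64_t block_size)
{
	return size / block_size;
}
```

With a single region covering the first 4 GiB, a 1 GiB range starting at
address 0 is allowed, while a 1 GiB range starting at 1 TiB is rejected by
one call.  A per-block notifier would instead have run once for each of the
blocks counted by sketch_nr_memory_blocks().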