From patchwork Tue Jul 11 04:48:32 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Aneesh Kumar K.V"
X-Patchwork-Id: 13308126
From: "Aneesh Kumar K.V"
To: linux-mm@kvack.org, akpm@linux-foundation.org, mpe@ellerman.id.au,
 linuxppc-dev@lists.ozlabs.org, npiggin@gmail.com,
 christophe.leroy@csgroup.eu
Cc: Oscar Salvador, David Hildenbrand, Michal Hocko, Vishal Verma,
 "Rafael J. Wysocki", Len Brown, Dan Williams, Dave Jiang, Dave Hansen,
 Huang Ying, "Aneesh Kumar K . V"
Subject: [PATCH v3 6/7] dax/kmem: Always enroll hotplugged memory for memmap_on_memory
Date: Tue, 11 Jul 2023 10:18:32 +0530
Message-ID: <20230711044834.72809-7-aneesh.kumar@linux.ibm.com>
X-Mailer: git-send-email 2.41.0
In-Reply-To: <20230711044834.72809-1-aneesh.kumar@linux.ibm.com>
References: <20230711044834.72809-1-aneesh.kumar@linux.ibm.com>
Sender: owner-linux-mm@kvack.org
Precedence: bulk

From: Vishal Verma

With DAX memory regions originating from CXL memory expanders or
NVDIMMs, the kmem driver may be hot-adding huge amounts of system memory
on a system without enough 'regular' main memory to support the memmap
for it. To avoid this, ensure that all kmem managed hotplugged memory is
added with the MHP_MEMMAP_ON_MEMORY flag to place the memmap on the
new memory region being hot added.

To do this, call add_memory_driver_managed() in chunks of
memory_block_size_bytes(), as that is a requirement for
memmap_on_memory.

Cc: "Rafael J. Wysocki"
Cc: Len Brown
Cc: Andrew Morton
Cc: David Hildenbrand
Cc: Oscar Salvador
Cc: Dan Williams
Cc: Dave Jiang
Cc: Dave Hansen
Cc: Huang Ying
Signed-off-by: Vishal Verma
Signed-off-by: Aneesh Kumar K.V
---
 drivers/dax/kmem.c | 81 +++++++++++++++++++++++++++++++++-------------
 1 file changed, 59 insertions(+), 22 deletions(-)

diff --git a/drivers/dax/kmem.c b/drivers/dax/kmem.c
index 898ca9505754..840bf7b40a44 100644
--- a/drivers/dax/kmem.c
+++ b/drivers/dax/kmem.c
@@ -12,6 +12,7 @@
 #include
 #include
 #include
+#include
 #include "dax-private.h"
 #include "bus.h"
 
@@ -105,6 +106,7 @@ static int dev_dax_kmem_probe(struct dev_dax *dev_dax)
 	data->mgid = rc;
 
 	for (i = 0; i < dev_dax->nr_range; i++) {
+		u64 cur_start, cur_len, remaining;
 		struct resource *res;
 		struct range range;
 
@@ -137,21 +139,42 @@ static int dev_dax_kmem_probe(struct dev_dax *dev_dax)
 		res->flags = IORESOURCE_SYSTEM_RAM;
 
 		/*
-		 * Ensure that future kexec'd kernels will not treat
-		 * this as RAM automatically.
+		 * Add memory in chunks of memory_block_size_bytes() so that
+		 * it is considered for MHP_MEMMAP_ON_MEMORY.
+		 * @range has already been aligned to memory_block_size_bytes(),
+		 * so the following loop will always break it down cleanly.
 		 */
-		rc = add_memory_driver_managed(data->mgid, range.start,
-				range_len(&range), kmem_name, MHP_NID_IS_MGID);
-
-		if (rc) {
-			dev_warn(dev, "mapping%d: %#llx-%#llx memory add failed\n",
-					i, range.start, range.end);
-			remove_resource(res);
-			kfree(res);
-			data->res[i] = NULL;
-			if (mapped)
-				continue;
-			goto err_request_mem;
+		cur_start = range.start;
+		cur_len = memory_block_size_bytes();
+		remaining = range_len(&range);
+		while (remaining) {
+			/*
+			 * If alignment rules are not satisfied we will
+			 * fall back to normal memmap allocation.
+			 */
+			mhp_t mhp_flags = MHP_NID_IS_MGID | MHP_MEMMAP_ON_MEMORY;
+			/*
+			 * Ensure that future kexec'd kernels will not treat
+			 * this as RAM automatically.
+			 */
+			rc = add_memory_driver_managed(data->mgid, cur_start,
+						       cur_len, kmem_name,
+						       mhp_flags);
+
+			if (rc) {
+				dev_warn(dev,
+					 "mapping%d: %#llx-%#llx memory add failed\n",
+					 i, cur_start, cur_start + cur_len - 1);
+				remove_resource(res);
+				kfree(res);
+				data->res[i] = NULL;
+				if (mapped)
+					continue;
+				goto err_request_mem;
+			}
+
+			cur_start += cur_len;
+			remaining -= cur_len;
 		}
 		mapped++;
 	}
@@ -186,25 +209,39 @@ static void dev_dax_kmem_remove(struct dev_dax *dev_dax)
 	 * unbind will succeed even if we return failure.
 	 */
 	for (i = 0; i < dev_dax->nr_range; i++) {
+
+		u64 cur_start, cur_len, remaining;
 		struct range range;
+		bool resource_remove;
 		int rc;
 
 		rc = dax_kmem_range(dev_dax, i, &range);
 		if (rc)
 			continue;
 
-		rc = remove_memory(range.start, range_len(&range));
-		if (rc == 0) {
+		resource_remove = true;
+		cur_start = range.start;
+		cur_len = memory_block_size_bytes();
+		remaining = range_len(&range);
+		while (remaining) {
+
+			rc = remove_memory(cur_start, cur_len);
+			if (rc) {
+				resource_remove = false;
+				dev_err(dev,
+					"mapping%d: %#llx-%#llx cannot be hotremoved until the next reboot\n",
+					i, cur_start, cur_start + cur_len - 1);
+			}
+			cur_start += cur_len;
+			remaining -= cur_len;
+		}
+		if (resource_remove) {
 			remove_resource(data->res[i]);
 			kfree(data->res[i]);
 			data->res[i] = NULL;
 			success++;
-			continue;
-		}
-		any_hotremove_failed = true;
-		dev_err(dev,
-			"mapping%d: %#llx-%#llx cannot be hotremoved until the next reboot\n",
-			i, range.start, range.end);
+		} else
+			any_hotremove_failed = true;
 	}
 
 	if (success >= dev_dax->nr_range) {
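
For readers following the chunking logic, here is a minimal user-space
sketch of the same loop shape. It is an illustration under stated
assumptions, not the kmem driver code: add_range_in_chunks(), chunk_op,
struct fake_backend, fake_add() and fake_remove() are hypothetical names,
and the callbacks merely stand in for add_memory_driver_managed() and
remove_memory(). Unlike the patch above, the sketch unwinds chunks that
were already added when a later chunk fails; that rollback is an addition
of the sketch.

```c
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical callback type standing in for add_memory_driver_managed()
 * and remove_memory(): returns 0 on success, non-zero on failure.
 */
typedef int (*chunk_op)(uint64_t start, uint64_t len, void *ctx);

/*
 * Add [start, start + len) in block_size pieces. If one piece fails,
 * remove the pieces added so far (newest first) and return the error, so
 * the caller never sees a half-populated range. Returns -1 if block_size
 * is zero or len is not a multiple of block_size.
 */
int add_range_in_chunks(uint64_t start, uint64_t len, uint64_t block_size,
			chunk_op add, chunk_op del, void *ctx)
{
	uint64_t cur = start;
	uint64_t remaining = len;
	int rc;

	if (block_size == 0 || len % block_size != 0)
		return -1;

	while (remaining) {
		rc = add(cur, block_size, ctx);
		if (rc) {
			fprintf(stderr, "%#llx-%#llx memory add failed\n",
				(unsigned long long)cur,
				(unsigned long long)(cur + block_size - 1));
			/* Roll back: walk backwards over the added chunks. */
			while (cur > start) {
				cur -= block_size;
				del(cur, block_size, ctx);
			}
			return rc;
		}
		cur += block_size;
		remaining -= block_size;
	}
	return 0;
}

/* Toy backend (assumption): counts live chunks, can fail the Nth add. */
struct fake_backend {
	int live_chunks;
	int adds_seen;
	int fail_on_add;	/* 1-based add call to fail; 0 means never */
};

int fake_add(uint64_t start, uint64_t len, void *ctx)
{
	struct fake_backend *b = ctx;

	(void)start;
	(void)len;
	if (++b->adds_seen == b->fail_on_add)
		return -2;	/* arbitrary non-zero error code */
	b->live_chunks++;
	return 0;
}

int fake_remove(uint64_t start, uint64_t len, void *ctx)
{
	struct fake_backend *b = ctx;

	(void)start;
	(void)len;
	b->live_chunks--;
	return 0;
}
```

With a 128 MiB block size and a four-block range, a backend set up to fail
on the third add ends with zero live chunks, because the two successful
adds are removed again before the error is returned.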