From patchwork Tue Jul 18 02:44:07 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Aneesh Kumar K.V"
X-Patchwork-Id: 13316692
From: "Aneesh Kumar K.V"
To: linux-mm@kvack.org, akpm@linux-foundation.org, mpe@ellerman.id.au,
 linuxppc-dev@lists.ozlabs.org, npiggin@gmail.com,
 christophe.leroy@csgroup.eu
Cc: Oscar Salvador, David Hildenbrand, Michal Hocko, Vishal Verma,
 "Aneesh Kumar K.V"
Subject: [PATCH v4 4/6] mm/hotplug: Allow pageblock alignment via altmap
 reservation
Date: Tue, 18 Jul 2023 08:14:07 +0530
Message-ID: <20230718024409.95742-5-aneesh.kumar@linux.ibm.com>
X-Mailer: git-send-email 2.41.0
In-Reply-To: <20230718024409.95742-1-aneesh.kumar@linux.ibm.com>
References: <20230718024409.95742-1-aneesh.kumar@linux.ibm.com>

Add a "force" mode to the memory_hotplug.memmap_on_memory parameter that
allows pageblock alignment by reserving pages in the vmemmap altmap area.
This implies that some pages are reserved in every memory block. It also
makes the memmap on memory feature useful with a wider range of memory
block size values.

Signed-off-by: Aneesh Kumar K.V
---
 mm/memory_hotplug.c | 109 ++++++++++++++++++++++++++++++++++++++------
 1 file changed, 96 insertions(+), 13 deletions(-)

diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 5921c81fcb70..c409f5ff6a59 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -41,17 +41,85 @@
 #include "internal.h"
 #include "shuffle.h"
 
+enum {
+	MEMMAP_ON_MEMORY_DISABLE = 0,
+	MEMMAP_ON_MEMORY_ENABLE,
+	MEMMAP_ON_MEMORY_FORCE,
+};
+
+static int memmap_mode __read_mostly = MEMMAP_ON_MEMORY_DISABLE;
+
+static inline unsigned long memory_block_align_base(unsigned long size)
+{
+	if (memmap_mode == MEMMAP_ON_MEMORY_FORCE) {
+		unsigned long align;
+		unsigned long nr_vmemmap_pages = size >> PAGE_SHIFT;
+		unsigned long vmemmap_size;
+
+		vmemmap_size = DIV_ROUND_UP(nr_vmemmap_pages * sizeof(struct page), PAGE_SIZE);
+		align = pageblock_align(vmemmap_size) - vmemmap_size;
+		return align;
+	} else
+		return 0;
+}
+
 #ifdef CONFIG_MHP_MEMMAP_ON_MEMORY
 /*
  * memory_hotplug.memmap_on_memory parameter
  */
-static bool memmap_on_memory __ro_after_init;
-module_param(memmap_on_memory, bool, 0444);
-MODULE_PARM_DESC(memmap_on_memory, "Enable memmap on memory for memory hotplug");
+static int set_memmap_mode(const char *val, const struct kernel_param *kp)
+{
+	int ret, mode;
+	bool enabled;
+
+	if (sysfs_streq(val, "force") || sysfs_streq(val, "FORCE")) {
+		mode = MEMMAP_ON_MEMORY_FORCE;
+		goto matched;
+	}
+
+	ret = kstrtobool(val, &enabled);
+	if (ret < 0)
+		return ret;
+	if (enabled)
+		mode = MEMMAP_ON_MEMORY_ENABLE;
+	else
+		mode = MEMMAP_ON_MEMORY_DISABLE;
+
+matched:
+	*((int *)kp->arg) = mode;
+	if (mode == MEMMAP_ON_MEMORY_FORCE) {
+		pr_info("Memory hotplug will reserve %ld pages in each memory block\n",
+			memory_block_align_base(memory_block_size_bytes()));
+	}
+	return 0;
+}
+
+static int get_memmap_mode(char *buffer, const struct kernel_param *kp)
+{
+	if (*((int *)kp->arg) == MEMMAP_ON_MEMORY_FORCE)
+		return sprintf(buffer, "force\n");
+	if (*((int *)kp->arg) == MEMMAP_ON_MEMORY_ENABLE)
+		return sprintf(buffer, "y\n");
+
+	return sprintf(buffer, "n\n");
+}
+
+static const struct kernel_param_ops memmap_mode_ops = {
+	.set = set_memmap_mode,
+	.get = get_memmap_mode,
+};
+module_param_cb(memmap_on_memory, &memmap_mode_ops, &memmap_mode, 0444);
+MODULE_PARM_DESC(memmap_on_memory, "Enable memmap on memory for memory hotplug\n"
+	"With value \"force\" it could result in memory wastage due to memmap size limitations \n"
+	"For example, if the memmap for a memory block requires 1 MiB, but the pageblock \n"
+	"size is 2 MiB, 1 MiB of hotplugged memory will be wasted. Note that there are \n"
+	"still cases where the feature cannot be enforced: for example, if the memmap is \n"
+	"smaller than a single page, or if the architecture does not support the forced \n"
+	"mode in all configurations. (y/n/force)");
 
 static inline bool mhp_memmap_on_memory(void)
 {
-	return memmap_on_memory;
+	return !!memmap_mode;
 }
 #else
 static inline bool mhp_memmap_on_memory(void)
@@ -1264,7 +1332,6 @@ static inline bool arch_supports_memmap_on_memory(unsigned long size)
 
 static bool mhp_supports_memmap_on_memory(unsigned long size)
 {
-
 	unsigned long nr_vmemmap_pages = size >> PAGE_SHIFT;
 	unsigned long vmemmap_size = nr_vmemmap_pages * sizeof(struct page);
 	unsigned long remaining_size = size - vmemmap_size;
@@ -1295,10 +1362,23 @@ static bool mhp_supports_memmap_on_memory(unsigned long size)
 	 *       altmap as an alternative source of memory, and we do not exactly
 	 *       populate a single PMD.
 	 */
-	return mhp_memmap_on_memory() &&
-	       size == memory_block_size_bytes() &&
-	       IS_ALIGNED(remaining_size, (pageblock_nr_pages << PAGE_SHIFT)) &&
-	       arch_supports_memmap_on_memory(size);
+	if (!mhp_memmap_on_memory() || size != memory_block_size_bytes())
+		return false;
+
+	/*
+	 * Make sure the vmemmap allocation is fully contained
+	 * so that we always allocate vmemmap memory from altmap area.
+	 */
+	if (!IS_ALIGNED(vmemmap_size, PAGE_SIZE))
+		return false;
+	/*
+	 * Without page reservation remaining pages should be pageblock aligned.
+	 */
+	if (memmap_mode != MEMMAP_ON_MEMORY_FORCE &&
+	    !IS_ALIGNED(remaining_size, (pageblock_nr_pages << PAGE_SHIFT)))
+		return false;
+
+	return arch_supports_memmap_on_memory(size);
 }
 
 /*
@@ -1311,7 +1391,11 @@ int __ref add_memory_resource(int nid, struct resource *res, mhp_t mhp_flags)
 {
 	struct mhp_params params = { .pgprot = pgprot_mhp(PAGE_KERNEL) };
 	enum memblock_flags memblock_flags = MEMBLOCK_NONE;
-	struct vmem_altmap mhp_altmap = {};
+	struct vmem_altmap mhp_altmap = {
+		.base_pfn = PHYS_PFN(res->start),
+		.end_pfn = PHYS_PFN(res->end),
+		.reserve = memory_block_align_base(resource_size(res)),
+	};
 	struct memory_group *group = NULL;
 	u64 start, size;
 	bool new_node = false;
@@ -1356,8 +1440,7 @@ int __ref add_memory_resource(int nid, struct resource *res, mhp_t mhp_flags)
 	 */
 	if (mhp_flags & MHP_MEMMAP_ON_MEMORY) {
 		if (mhp_supports_memmap_on_memory(size)) {
-			mhp_altmap.free = PHYS_PFN(size);
-			mhp_altmap.base_pfn = PHYS_PFN(start);
+			mhp_altmap.free = PHYS_PFN(size) - mhp_altmap.reserve;
 			params.altmap = &mhp_altmap;
 		}
 		/* fallback to not using altmap */
@@ -1369,7 +1452,7 @@ int __ref add_memory_resource(int nid, struct resource *res, mhp_t mhp_flags)
 		goto error;
 
 	/* create memory block devices after memory was added */
-	ret = create_memory_block_devices(start, size, mhp_altmap.alloc,
+	ret = create_memory_block_devices(start, size, mhp_altmap.alloc + mhp_altmap.reserve,
 					  group);
 	if (ret) {
 		arch_remove_memory(start, size, NULL);
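
For illustration only, the reservation arithmetic done by memory_block_align_base() and the alignment checks in mhp_supports_memmap_on_memory() can be tried out in userspace. The sketch below is not kernel code: the sketch_* names are hypothetical, and the constants (4 KiB pages, a 64-byte struct page, a 2 MiB pageblock) are assumptions chosen to match the 1 MiB / 2 MiB example in the parameter description. Real values depend on the architecture and configuration.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed values for illustration; the kernel derives these per config. */
#define SKETCH_PAGE_SIZE        4096UL  /* assumed 4 KiB base pages */
#define SKETCH_STRUCT_PAGE_SIZE 64UL    /* assumed sizeof(struct page) */
#define SKETCH_PAGEBLOCK_PAGES  512UL   /* assumed 2 MiB pageblock */

/* Number of pages needed to hold the memmap of one memory block. */
static unsigned long sketch_vmemmap_pages(unsigned long block_bytes)
{
	unsigned long nr_pages = block_bytes / SKETCH_PAGE_SIZE;
	unsigned long memmap_bytes = nr_pages * SKETCH_STRUCT_PAGE_SIZE;

	/* Round up to whole pages, like DIV_ROUND_UP() in the patch. */
	return (memmap_bytes + SKETCH_PAGE_SIZE - 1) / SKETCH_PAGE_SIZE;
}

/*
 * Force-mode reserve: the padding pages that bring the end of the
 * memmap up to the next pageblock boundary.
 */
static unsigned long sketch_reserve_pages(unsigned long block_bytes)
{
	unsigned long vmemmap = sketch_vmemmap_pages(block_bytes);
	unsigned long aligned;

	/* The mask trick below only works for power-of-two pageblocks. */
	assert((SKETCH_PAGEBLOCK_PAGES & (SKETCH_PAGEBLOCK_PAGES - 1)) == 0);

	aligned = (vmemmap + SKETCH_PAGEBLOCK_PAGES - 1) &
		  ~(SKETCH_PAGEBLOCK_PAGES - 1);
	return aligned - vmemmap;
}

/*
 * Alignment part of the support check: the memmap must fill whole
 * pages, and without force mode the remaining memory must also be
 * pageblock aligned.
 */
static bool sketch_alignment_ok(unsigned long block_bytes, bool force)
{
	unsigned long nr_pages = block_bytes / SKETCH_PAGE_SIZE;
	unsigned long memmap_bytes = nr_pages * SKETCH_STRUCT_PAGE_SIZE;
	unsigned long remaining = block_bytes - memmap_bytes;
	unsigned long pageblock_bytes = SKETCH_PAGEBLOCK_PAGES * SKETCH_PAGE_SIZE;

	if (memmap_bytes % SKETCH_PAGE_SIZE)
		return false;
	if (!force && remaining % pageblock_bytes)
		return false;
	return true;
}
```

Under these assumptions a 64 MiB memory block needs a 1 MiB memmap (256 pages), so sketch_alignment_ok(64UL << 20, false) fails while force mode passes with a 256-page reserve. A 128 MiB block needs exactly one 2 MiB pageblock of memmap, so it passes without force and reserves nothing.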