From patchwork Fri Jan 3 08:50:02 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Zhenhua Huang
X-Patchwork-Id: 13925370
From: Zhenhua Huang
Subject: [PATCH v3] arm64: mm: Populate vmemmap at the page level for hotplugged sections
Date: Fri, 3 Jan 2025 16:50:02 +0800
Message-ID: <20250103085002.27243-1-quic_zhenhuah@quicinc.com>
X-Mailer: git-send-email 2.25.1
Sender: owner-linux-mm@kvack.org

Commit c1cc1552616d ("arm64: MMU initialisation") optimized vmemmap
population to use PMD section mappings. That was suitable at the time
because the hotplug granule was always 128M. However, commit
ba72b4c8cf60 ("mm/sparsemem: support sub-section hotplug") introduced a
2M hotplug granule, which broke that arm64 assumption.

Consider the vmemmap_free() -> unmap_hotplug_pmd_range() path: when
pmd_sect() is true, the entire PMD section is cleared, even if other
sub-sections covered by it are still in use. For example, suppose
pagemap1 and pagemap2 are covered by a single PMD entry and are
hot-added sequentially. When pagemap1 is then removed, vmemmap_free()
clears the entire PMD entry, freeing the struct page metadata for the
whole section even though pagemap2 is still active.

To address the issue, prevent PMD/PUD/CONT mappings for both the linear
map and vmemmap for non-boot sections if the mapping size exceeds 2MB
(the sub-section size). 2MB blocks are permitted only in a 4KB page
configuration.

Cc: stable@vger.kernel.org # v5.4+
Fixes: ba72b4c8cf60 ("mm/sparsemem: support sub-section hotplug")
Signed-off-by: Zhenhua Huang
---
Hi Catalin and Anshuman,

Based on your review comments, I arrived at the patch below and tested
it on my setup. I have not folded in patchset #2, since this patch seems
to be enough for backporting. Please let me know if you have further
suggestions.

 arch/arm64/mm/mmu.c | 33 +++++++++++++++++++++++++++++----
 1 file changed, 29 insertions(+), 4 deletions(-)

diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index e2739b69e11b..2b4d23f01d85 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -42,9 +42,11 @@
 #include
 #include
 
-#define NO_BLOCK_MAPPINGS	BIT(0)
+#define NO_PMD_BLOCK_MAPPINGS	BIT(0)
 #define NO_CONT_MAPPINGS	BIT(1)
 #define NO_EXEC_MAPPINGS	BIT(2)	/* assumes FEAT_HPDS is not used */
+#define NO_PUD_BLOCK_MAPPINGS	BIT(3)	/* Hotplug case: do not want block mapping for PUD */
+#define NO_BLOCK_MAPPINGS	(NO_PMD_BLOCK_MAPPINGS | NO_PUD_BLOCK_MAPPINGS)
 
 u64 kimage_voffset __ro_after_init;
 EXPORT_SYMBOL(kimage_voffset);
@@ -254,7 +256,7 @@ static void init_pmd(pmd_t *pmdp, unsigned long addr, unsigned long end,
 
 		/* try section mapping first */
 		if (((addr | next | phys) & ~PMD_MASK) == 0 &&
-		    (flags & NO_BLOCK_MAPPINGS) == 0) {
+		    (flags & NO_PMD_BLOCK_MAPPINGS) == 0) {
 			pmd_set_huge(pmdp, phys, prot);
 
 			/*
@@ -356,10 +358,11 @@ static void alloc_init_pud(p4d_t *p4dp, unsigned long addr, unsigned long end,
 
 		/*
 		 * For 4K granule only, attempt to put down a 1GB block
+		 * Hotplug case: do not attempt 1GB block
 		 */
 		if (pud_sect_supported() &&
 		   ((addr | next | phys) & ~PUD_MASK) == 0 &&
-		    (flags & NO_BLOCK_MAPPINGS) == 0) {
+		    (flags & NO_PUD_BLOCK_MAPPINGS) == 0) {
 			pud_set_huge(pudp, phys, prot);
 
 			/*
@@ -1175,9 +1178,16 @@ int __meminit vmemmap_check_pmd(pmd_t *pmdp, int node,
 int __meminit vmemmap_populate(unsigned long start, unsigned long end, int node,
 		struct vmem_altmap *altmap)
 {
+	unsigned long start_pfn;
+	struct mem_section *ms;
+
 	WARN_ON((start < VMEMMAP_START) || (end > VMEMMAP_END));
 
-	if (!IS_ENABLED(CONFIG_ARM64_4K_PAGES))
+	start_pfn = page_to_pfn((struct page *)start);
+	ms = __pfn_to_section(start_pfn);
+
+	/* Hotplugged section not support hugepages */
+	if (!IS_ENABLED(CONFIG_ARM64_4K_PAGES) || !early_section(ms))
 		return vmemmap_populate_basepages(start, end, node, altmap);
 	else
 		return vmemmap_populate_hugepages(start, end, node, altmap);
@@ -1339,9 +1349,24 @@ int arch_add_memory(int nid, u64 start, u64 size,
 		    struct mhp_params *params)
 {
 	int ret, flags = NO_EXEC_MAPPINGS;
+	unsigned long start_pfn = page_to_pfn((struct page *)start);
+	struct mem_section *ms = __pfn_to_section(start_pfn);
 
 	VM_BUG_ON(!mhp_range_allowed(start, size, true));
 
+	/* Should not be invoked by early section */
+	WARN_ON(early_section(ms));
+
+	if (IS_ENABLED(CONFIG_ARM64_4K_PAGES))
+		/*
+		 * As per subsection granule is 2M, allow PMD block mapping in
+		 * case 4K PAGES.
+		 * Other cases forbid section mapping.
+		 */
+		flags |= NO_PUD_BLOCK_MAPPINGS | NO_CONT_MAPPINGS;
+	else
+		flags |= NO_BLOCK_MAPPINGS | NO_CONT_MAPPINGS;
+
 	if (can_set_direct_map())
 		flags |= NO_BLOCK_MAPPINGS | NO_CONT_MAPPINGS;