From patchwork Sun Aug 28 00:55:44 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Baoquan He
X-Patchwork-Id: 12957128
From: Baoquan He
To: linux-kernel@vger.kernel.org
Cc: linux-arm-kernel@lists.infradead.org, catalin.marinas@arm.com,
 ardb@kernel.org, rppt@kernel.org, guanghuifeng@linux.alibaba.com,
 mark.rutland@arm.com, will@kernel.org, linux-mm@kvack.org,
 thunder.leizhen@huawei.com, wangkefeng.wang@huawei.com,
 kexec@lists.infradead.org, Baoquan He
Subject: [PATCH 1/2] arm64, kdump: enforce to take 4G as the crashkernel low memory end
Date: Sun, 28 Aug 2022 08:55:44 +0800
Message-Id: <20220828005545.94389-2-bhe@redhat.com>
In-Reply-To: <20220828005545.94389-1-bhe@redhat.com>
References: <20220828005545.94389-1-bhe@redhat.com>

Problem:
=======
On arm64, block and section mappings are supported for building page
tables. However, the kernel currently forces base page mapping for the
whole linear mapping if CONFIG_ZONE_DMA or CONFIG_ZONE_DMA32 is enabled
and the crashkernel kernel parameter is set. This lengthens the linear
mapping process during boot and causes severe performance degradation
at run time.

Root cause:
==========
On arm64, crashkernel reservation relies on knowing the upper limit of
the low memory zone, because it needs to reserve memory in that zone so
that devices' DMA addressing in the kdump kernel can be satisfied.
However, the limit on arm64 is variable, and it can only be determined
late, once bootmem_init() is called.

We also need to map the crashkernel region with base page granularity
when doing the linear mapping, because kdump needs to protect the
crashkernel region via set_memory_valid(,0) after the kdump kernel is
loaded. However, arm64 does not cope well with splitting an already
built block or section mapping due to some CPU restrictions [1]. And
unfortunately, the linear mapping is done before bootmem_init().

To resolve the above conflict on arm64, the compromise is to force base
page mapping for the entire linear mapping if crashkernel is set and
CONFIG_ZONE_DMA or CONFIG_ZONE_DMA32 is enabled. Hence performance is
sacrificed.

Solution:
=========
To fix the problem, we should always take 4G as the crashkernel low
memory end when CONFIG_ZONE_DMA or CONFIG_ZONE_DMA32 is enabled.
With this, we don't need to defer the crashkernel reservation until
bootmem_init() is called to set arm64_dma_phys_limit. As soon as
memblock init is done, we can determine the upper limit of the low
memory zone:

1) both CONFIG_ZONE_DMA and CONFIG_ZONE_DMA32 are disabled, or
   memblock_start_of_DRAM() > 4G:
     limit = PHYS_ADDR_MAX+1  (corner cases)
2) CONFIG_ZONE_DMA or CONFIG_ZONE_DMA32 is enabled:
     limit = 4G  (generic case)

[1] https://lore.kernel.org/all/YrIIJkhKWSuAqkCx@arm.com/T/#u

Signed-off-by: Baoquan He
Reviewed-by: Zhen Lei
Signed-off-by: Mike Rapoport
---
 arch/arm64/mm/init.c | 24 ++++++++++++++----------
 arch/arm64/mm/mmu.c  | 38 ++++++++++++++++++++++----------------
 2 files changed, 36 insertions(+), 26 deletions(-)

diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
index b9af30be813e..8ae55afdd11c 100644
--- a/arch/arm64/mm/init.c
+++ b/arch/arm64/mm/init.c
@@ -90,10 +90,22 @@ phys_addr_t __ro_after_init arm64_dma_phys_limit;
 phys_addr_t __ro_after_init arm64_dma_phys_limit = PHYS_MASK + 1;
 #endif
 
+static phys_addr_t __init crash_addr_low_max(void)
+{
+	phys_addr_t low_mem_mask = U32_MAX;
+	phys_addr_t phys_start = memblock_start_of_DRAM();
+
+	if ((!IS_ENABLED(CONFIG_ZONE_DMA) && !IS_ENABLED(CONFIG_ZONE_DMA32)) ||
+	    (phys_start > U32_MAX))
+		low_mem_mask = PHYS_ADDR_MAX;
+
+	return min(low_mem_mask, memblock_end_of_DRAM() - 1) + 1;
+}
+
 /* Current arm64 boot protocol requires 2MB alignment */
 #define CRASH_ALIGN			SZ_2M
 
-#define CRASH_ADDR_LOW_MAX		arm64_dma_phys_limit
+#define CRASH_ADDR_LOW_MAX		crash_addr_low_max()
 #define CRASH_ADDR_HIGH_MAX		(PHYS_MASK + 1)
 
 static int __init reserve_crashkernel_low(unsigned long long low_size)
@@ -389,8 +401,7 @@ void __init arm64_memblock_init(void)
 
 	early_init_fdt_scan_reserved_mem();
 
-	if (!defer_reserve_crashkernel())
-		reserve_crashkernel();
+	reserve_crashkernel();
 
 	high_memory = __va(memblock_end_of_DRAM() - 1) + 1;
 }
@@ -434,13 +445,6 @@ void __init bootmem_init(void)
 	 */
 	dma_contiguous_reserve(arm64_dma_phys_limit);
 
-	/*
-	 * request_standard_resources() depends on crashkernel's memory being
-	 * reserved, so do it here.
-	 */
-	if (defer_reserve_crashkernel())
-		reserve_crashkernel();
-
 	memblock_dump_all();
 }
 
diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index e7ad44585f40..cdd338fa2115 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -547,13 +547,12 @@ static void __init map_mem(pgd_t *pgdp)
 	memblock_mark_nomap(kernel_start, kernel_end - kernel_start);
 
 #ifdef CONFIG_KEXEC_CORE
-	if (crash_mem_map) {
-		if (defer_reserve_crashkernel())
-			flags |= NO_BLOCK_MAPPINGS | NO_CONT_MAPPINGS;
-		else if (crashk_res.end)
-			memblock_mark_nomap(crashk_res.start,
-			    resource_size(&crashk_res));
-	}
+	if (crashk_res.end)
+		memblock_mark_nomap(crashk_res.start,
+				    resource_size(&crashk_res));
+	if (crashk_low_res.end)
+		memblock_mark_nomap(crashk_low_res.start,
+				    resource_size(&crashk_low_res));
 #endif
 
 	/* map all the memory banks */
@@ -589,16 +588,23 @@ static void __init map_mem(pgd_t *pgdp)
 	 * through /sys/kernel/kexec_crash_size interface.
 	 */
 #ifdef CONFIG_KEXEC_CORE
-	if (crash_mem_map && !defer_reserve_crashkernel()) {
-		if (crashk_res.end) {
-			__map_memblock(pgdp, crashk_res.start,
-				       crashk_res.end + 1,
-				       PAGE_KERNEL,
-				       NO_BLOCK_MAPPINGS | NO_CONT_MAPPINGS);
-			memblock_clear_nomap(crashk_res.start,
-					     resource_size(&crashk_res));
-		}
+	if (crashk_res.end) {
+		__map_memblock(pgdp, crashk_res.start,
+			       crashk_res.end + 1,
+			       PAGE_KERNEL,
+			       NO_BLOCK_MAPPINGS | NO_CONT_MAPPINGS);
+		memblock_clear_nomap(crashk_res.start,
+				     resource_size(&crashk_res));
 	}
+	if (crashk_low_res.end) {
+		__map_memblock(pgdp, crashk_low_res.start,
+			       crashk_low_res.end + 1,
+			       PAGE_KERNEL,
+			       NO_BLOCK_MAPPINGS | NO_CONT_MAPPINGS);
+		memblock_clear_nomap(crashk_low_res.start,
+				     resource_size(&crashk_low_res));
+	}
+
 #endif
 }
 
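
Illustration (not part of the patch): the limit selection described in
the commit message can be modeled in plain userspace C to check the two
cases by hand. This is only a sketch under stated assumptions. The
names model_crash_addr_low_max(), model_min(), MODEL_U32_MAX and
MODEL_PHYS_ADDR_MAX are hypothetical stand-ins, phys_addr_t is assumed
to be 64 bits wide, and the memblock queries and IS_ENABLED() checks
are replaced by ordinary parameters. Like the function in the diff, the
model also clamps the result to the end of DRAM.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: a 64-bit physical address type, as on arm64. */
typedef uint64_t phys_addr_t;

/* Hypothetical stand-ins for the kernel's U32_MAX and PHYS_ADDR_MAX. */
#define MODEL_U32_MAX		((phys_addr_t)0xffffffffULL)
#define MODEL_PHYS_ADDR_MAX	(~(phys_addr_t)0)

static phys_addr_t model_min(phys_addr_t a, phys_addr_t b)
{
	return a < b ? a : b;
}

/*
 * Userspace model of the limit selection:
 *   has_dma_zone: true if CONFIG_ZONE_DMA or CONFIG_ZONE_DMA32 is enabled
 *   dram_start:   models memblock_start_of_DRAM()
 *   dram_end:     models memblock_end_of_DRAM() (exclusive end address)
 * Returns the exclusive upper bound for the low crashkernel region.
 */
static phys_addr_t model_crash_addr_low_max(bool has_dma_zone,
					    phys_addr_t dram_start,
					    phys_addr_t dram_end)
{
	phys_addr_t low_mem_mask = MODEL_U32_MAX;

	assert(dram_end > dram_start);

	/* Corner cases: no DMA zones, or all DRAM sits above 4G. */
	if (!has_dma_zone || dram_start > MODEL_U32_MAX)
		low_mem_mask = MODEL_PHYS_ADDR_MAX;

	/* Never report a limit beyond the end of DRAM. */
	return model_min(low_mem_mask, dram_end - 1) + 1;
}
```

For example, with a DMA zone enabled and DRAM spanning 0x80000000 to
0x880000000, the model yields 4G (0x100000000). With both DMA zones
disabled, the same layout yields the end of DRAM instead.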