From patchwork Sat Aug 1 13:08:54 2020
X-Patchwork-Submitter: chenzhou
X-Patchwork-Id: 11696085
From: Chen Zhou
Subject: [PATCH v11 3/5] arm64: kdump: reimplement crashkernel=X
Date: Sat, 1 Aug 2020 21:08:54 +0800
Message-ID: <20200801130856.86625-4-chenzhou10@huawei.com>
In-Reply-To: <20200801130856.86625-1-chenzhou10@huawei.com>
References: <20200801130856.86625-1-chenzhou10@huawei.com>
Cc: wangkefeng.wang@huawei.com, arnd@arndb.de, linux-doc@vger.kernel.org,
 chenzhou10@huawei.com, xiexiuqi@huawei.com, kexec@lists.infradead.org,
 linux-kernel@vger.kernel.org, robh+dt@kernel.org, horms@verge.net.au,
 nsaenzjulienne@suse.de, huawei.libin@huawei.com, guohanjun@huawei.com,
 linux-arm-kernel@lists.infradead.org

There are the following issues in arm64 kdump:
1. We use crashkernel=X to reserve the crash kernel region below 4G, which
fails when there is not enough low memory.
2. If the crash kernel region is reserved above 4G, the crash dump kernel
fails to boot because there is no low memory available for allocation.
3. Since commit 1a8e1cef7603 ("arm64: use both ZONE_DMA and ZONE_DMA32"),
if the memory reserved for the crash dump kernel falls in ZONE_DMA32, devices
in the crash dump kernel that need ZONE_DMA fail to allocate.

To solve these issues, change the behavior of crashkernel=X.
crashkernel=X tries a low allocation in ZONE_DMA, and falls back to a high
allocation if that fails.

If the requested size X is too large and leaves very little free memory in
ZONE_DMA after a low allocation, the system may not work normally. So add a
threshold and go for a high allocation directly if the requested size is too
large. The threshold is set to half of the low memory.

If crash_base is outside ZONE_DMA, try to allocate at least 256M in
ZONE_DMA automatically. "crashkernel=Y,low" can be used to specify the size
of the low allocation.

For non-RPi4 platforms, replace ZONE_DMA mentioned above with ZONE_DMA32.

Another minor change: there may be two regions reserved for the crash dump
kernel. To distinguish the low region from the high region without affecting
the use of existing kexec-tools, rename the low region to
"Crash kernel (low)".

Signed-off-by: Chen Zhou
---
 arch/arm64/include/asm/kexec.h |  4 +++
 arch/arm64/kernel/setup.c      |  8 +++++-
 arch/arm64/mm/init.c           | 51 ++++++++++++++++++++++++++++++----
 3 files changed, 57 insertions(+), 6 deletions(-)

diff --git a/arch/arm64/include/asm/kexec.h b/arch/arm64/include/asm/kexec.h
index 1a2f27f12794..92ed53d0bf21 100644
--- a/arch/arm64/include/asm/kexec.h
+++ b/arch/arm64/include/asm/kexec.h
@@ -28,7 +28,11 @@
 /* 2M alignment for crash kernel regions */
 #define CRASH_ALIGN		SZ_2M
 
+#ifdef CONFIG_ZONE_DMA
+#define CRASH_ADDR_LOW_MAX	arm64_dma_phys_limit
+#else
 #define CRASH_ADDR_LOW_MAX	arm64_dma32_phys_limit
+#endif
 
 #ifndef __ASSEMBLY__
 
diff --git a/arch/arm64/kernel/setup.c b/arch/arm64/kernel/setup.c
index 93b3844cf442..4dc51a2ac012 100644
--- a/arch/arm64/kernel/setup.c
+++ b/arch/arm64/kernel/setup.c
@@ -238,7 +238,13 @@ static void __init request_standard_resources(void)
 		    kernel_data.end <= res->end)
 			request_resource(res, &kernel_data);
 #ifdef CONFIG_KEXEC_CORE
-		/* Userspace will find "Crash kernel" region in /proc/iomem. */
+		/*
+		 * Userspace will find "Crash kernel" region in /proc/iomem.
+		 * Note: the low region is renamed as Crash kernel (low).
+		 */
+		if (crashk_low_res.end && crashk_low_res.start >= res->start &&
+		    crashk_low_res.end <= res->end)
+			request_resource(res, &crashk_low_res);
 		if (crashk_res.end && crashk_res.start >= res->start &&
 		    crashk_res.end <= res->end)
 			request_resource(res, &crashk_res);
diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
index a3d0193f6a0a..53c8916fd32f 100644
--- a/arch/arm64/mm/init.c
+++ b/arch/arm64/mm/init.c
@@ -70,6 +70,14 @@ phys_addr_t arm64_dma_phys_limit __ro_after_init;
 phys_addr_t arm64_dma32_phys_limit __ro_after_init;
 
 #ifdef CONFIG_KEXEC_CORE
+
+/*
+ * Add a threshold for required memory size of crashkernel. If required memory
+ * size is greater than threshold, just go for high allocation directly. The
+ * value of threshold is set as half of the total low memory.
+ */
+#define REQUIRED_MEMORY_THRESHOLD	(memblock_mem_size(CRASH_ADDR_LOW_MAX >> \
+		PAGE_SHIFT) >> 1)
 /*
  * reserve_crashkernel() - reserves memory for crash kernel
  *
@@ -90,11 +98,22 @@ static void __init reserve_crashkernel(void)
 
 	crash_size = PAGE_ALIGN(crash_size);
 
-	if (crash_base == 0) {
-		/* Current arm64 boot protocol requires 2MB alignment */
-		crash_base = memblock_find_in_range(0, CRASH_ADDR_LOW_MAX,
-				crash_size, CRASH_ALIGN);
-		if (crash_base == 0) {
+	if (!crash_base) {
+		/*
+		 * Current arm64 boot protocol requires 2MB alignment.
+		 * If required memory size is greater than threshold, just go
+		 * for high allocation directly.
+		 * If required memory size is less than or equal to threshold,
+		 * try low allocation firstly, and then fall back to high allocation
+		 * if it fails.
+		 */
+		if (crash_size <= REQUIRED_MEMORY_THRESHOLD)
+			crash_base = memblock_find_in_range(0, CRASH_ADDR_LOW_MAX,
+					crash_size, CRASH_ALIGN);
+		if (!crash_base)
+			crash_base = memblock_find_in_range(0, MEMBLOCK_ALLOC_ACCESSIBLE,
+					crash_size, SZ_2M);
+		if (!crash_base) {
 			pr_warn("cannot allocate crashkernel (size:0x%llx)\n",
 				crash_size);
 			return;
@@ -118,6 +137,28 @@ static void __init reserve_crashkernel(void)
 	}
 	memblock_reserve(crash_base, crash_size);
 
+	if (crash_base >= CRASH_ADDR_LOW_MAX) {
+		const char *rename = "Crash kernel (low)";
+
+		if (reserve_crashkernel_low()) {
+			memblock_free(crash_base, crash_size);
+			return;
+		}
+
+		/*
+		 * In order to distinct from the high region and make no effect
+		 * to the use of existing kexec-tools, rename the low region as
+		 * "Crash kernel (low)".
+		 */
+		crashk_low_res.name = rename;
+		/*
+		 * The low region is intended to be used for crash dump kernel
+		 * devices, just mark the low region as "nomap" simply.
+		 */
+		memblock_mark_nomap(crashk_low_res.start,
+				    resource_size(&crashk_low_res));
+	}
+
 	pr_info("crashkernel reserved: 0x%016llx - 0x%016llx (%lld MB)\n",
 		crash_base, crash_base + crash_size, crash_size >> 20);
 
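
For readers following the placement policy in reserve_crashkernel(), the
decision order can be sketched as a small userspace C model. This is an
illustration only, not kernel code: struct mem_model, pick_placement() and
needs_low_region() are hypothetical names, sizes are plain numbers (think
MiB), and free memory is reduced to one largest free block on each side of
the low limit. The whole-range fallback is assumed to prefer high memory, as
a top-down search would.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model types; none of these names exist in the kernel. */
enum crash_placement { CRASH_LOW, CRASH_HIGH, CRASH_NONE };

struct mem_model {
	uint64_t low_total;	/* total memory below the low limit */
	uint64_t low_free;	/* largest free block below the low limit */
	uint64_t high_free;	/* largest free block at or above the limit */
};

/*
 * Model of the order of attempts described in the patch:
 *   1. If size <= half of the low memory, try a low allocation first.
 *   2. Otherwise, or if step 1 fails, search the whole range.
 *      (Assumption: the whole-range search prefers high memory.)
 *   3. If nothing fits, give up.
 */
static enum crash_placement pick_placement(const struct mem_model *m,
					   uint64_t size)
{
	uint64_t threshold = m->low_total >> 1;

	if (size <= threshold && size <= m->low_free)
		return CRASH_LOW;

	/* Whole-range fallback: high memory first, then low memory. */
	if (size <= m->high_free)
		return CRASH_HIGH;
	if (size <= m->low_free)
		return CRASH_LOW;

	return CRASH_NONE;
}

/*
 * When the main region ends up above the low limit, the patch reserves
 * an additional low region (reported as "Crash kernel (low)").
 */
static bool needs_low_region(enum crash_placement p)
{
	return p == CRASH_HIGH;
}
```

With low_total = 4096, low_free = 2048 and high_free = 8192, a request of
512 is placed low, while a request of 3000 exceeds the threshold of 2048 and
goes high, which in turn requires the extra low region.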