diff mbox series

mips: kdump: Crash kernel should be able to see old memories

Message ID 1618829807-12522-1-git-send-email-tangyouling@loongson.cn (mailing list archive)
State Superseded
Headers show
Series mips: kdump: Crash kernel should be able to see old memories | expand

Commit Message

Youling Tang April 19, 2021, 10:56 a.m. UTC
From: Huacai Chen <chenhc@lemote.com>

kexec-tools use mem=X@Y to pass usable memories to crash kernel, but in
commit a94e4f24ec836c8984f83959 ("MIPS: init: Drop boot_mem_map") all
BIOS passed memories are removed by early_parse_mem(). I think this is
reasonable for a normal kernel but not for a crash kernel, because a
crash kernel should be able to see all old memories, even though it is
not supposed to use them.

Fixes: a94e4f24ec836c8984f83959 ("MIPS: init: Drop boot_mem_map")
Signed-off-by: Huacai Chen <chenhuacai@kernel.org>
Signed-off-by: Youling Tang <tangyouling@loongson.cn>
---
 arch/mips/kernel/setup.c | 2 ++
 1 file changed, 2 insertions(+)

Comments

Jiaxun Yang April 20, 2021, 1:11 a.m. UTC | #1
在 2021/4/19 18:56, Youling Tang 写道:
> From: Huacai Chen <chenhc@lemote.com>
>
> kexec-tools use mem=X@Y to pass usable memories to crash kernel, but in
> commit a94e4f24ec836c8984f83959 ("MIPS: init: Drop boot_mem_map") all
> BIOS passed memories are removed by early_parse_mem(). I think this is
> reasonable for a normal kernel but not for a crash kernel, because a
> crash kernel should be able to see all old memories, even though it is
> not supposed to use them.
>
> Fixes: a94e4f24ec836c8984f83959 ("MIPS: init: Drop boot_mem_map")
> Signed-off-by: Huacai Chen <chenhuacai@kernel.org>
> Signed-off-by: Youling Tang <tangyouling@loongson.cn>
> ---
>   arch/mips/kernel/setup.c | 2 ++
>   1 file changed, 2 insertions(+)
>
> diff --git a/arch/mips/kernel/setup.c b/arch/mips/kernel/setup.c
> index b86e241..ac90d3b 100644
> --- a/arch/mips/kernel/setup.c
> +++ b/arch/mips/kernel/setup.c
> @@ -351,8 +351,10 @@ static int __init early_parse_mem(char *p)
>   	 */
>   	if (usermem == 0) {
>   		usermem = 1;
> +#ifndef CONFIG_CRASH_DUMP

Why depend on a config instead of a runtime variable?

Btw as you are fixing my commit please keep me CCed.


Thanks.


- Jiaxun
Youling Tang April 20, 2021, 5:22 a.m. UTC | #2
Hi, Jiaxun

On 04/20/2021 09:11 AM, Jiaxun Yang wrote:
>
> 在 2021/4/19 18:56, Youling Tang 写道:
>> From: Huacai Chen <chenhc@lemote.com>
>>
>> kexec-tools use mem=X@Y to pass usable memories to crash kernel, but in
>> commit a94e4f24ec836c8984f83959 ("MIPS: init: Drop boot_mem_map") all
>> BIOS passed memories are removed by early_parse_mem(). I think this is
>> reasonable for a normal kernel but not for a crash kernel, because a
>> crash kernel should be able to see all old memories, even though it is
>> not supposed to use them.
>>
>> Fixes: a94e4f24ec836c8984f83959 ("MIPS: init: Drop boot_mem_map")
>> Signed-off-by: Huacai Chen <chenhuacai@kernel.org>
>> Signed-off-by: Youling Tang <tangyouling@loongson.cn>
>> ---
>>   arch/mips/kernel/setup.c | 2 ++
>>   1 file changed, 2 insertions(+)
>>
>> diff --git a/arch/mips/kernel/setup.c b/arch/mips/kernel/setup.c
>> index b86e241..ac90d3b 100644
>> --- a/arch/mips/kernel/setup.c
>> +++ b/arch/mips/kernel/setup.c
>> @@ -351,8 +351,10 @@ static int __init early_parse_mem(char *p)
>>        */
>>       if (usermem == 0) {
>>           usermem = 1;
>> +#ifndef CONFIG_CRASH_DUMP
>
> Why depend on a config instead of a runtime variable?
>
If not depend on config, we can determine whether the command line contains
the "elfcorehdr=" parameter, because the "mem=" and "elfcorhdr=" parameters
are automatically added in kexec-tools. So if there is an "elfcorehdr="
parameter in the command line, it means that the currently running kernel
is a capture kernel, and the memblock_remove() operation is not called.

The revised patch is as follows:
         if (usermem == 0) {
                 usermem = 1;
-               memblock_remove(memblock_start_of_DRAM(),
-                       memblock_end_of_DRAM() - memblock_start_of_DRAM());
+               if (!strstr(boot_command_line, "elfcorehdr")) {
+                       memblock_remove(memblock_start_of_DRAM(),
+                               memblock_end_of_DRAM() - 
memblock_start_of_DRAM());
+               }

Do you think it is feasible?
> Btw as you are fixing my commit please keep me CCed.
Sorry, I will add your CCed.

Thanks,
Youling
>
> Thanks.
>
>
> - Jiaxun
>
Maciej W. Rozycki June 8, 2021, 12:16 a.m. UTC | #3
On Tue, 20 Apr 2021, Youling Tang wrote:

> > Why depend on a config instead of a runtime variable?
> > 
> If not depend on config, we can determine whether the command line contains
> the "elfcorehdr=" parameter, because the "mem=" and "elfcorhdr=" parameters
> are automatically added in kexec-tools. So if there is an "elfcorehdr="
> parameter in the command line, it means that the currently running kernel
> is a capture kernel, and the memblock_remove() operation is not called.
> 
> The revised patch is as follows:
>         if (usermem == 0) {
>                 usermem = 1;
> -               memblock_remove(memblock_start_of_DRAM(),
> -                       memblock_end_of_DRAM() - memblock_start_of_DRAM());
> +               if (!strstr(boot_command_line, "elfcorehdr")) {
> +                       memblock_remove(memblock_start_of_DRAM(),
> +                               memblock_end_of_DRAM() -
> memblock_start_of_DRAM());
> +               }
> 
> Do you think it is feasible?

 This looks like a hack to me.  How do other platforms solve it, e.g. x86?

  Maciej
Youling Tang June 8, 2021, 7:46 a.m. UTC | #4
Hi, Maciej

On 06/08/2021 08:16 AM, Maciej W. Rozycki wrote:
> On Tue, 20 Apr 2021, Youling Tang wrote:
>
>>> Why depend on a config instead of a runtime variable?
>>>
>> If not depend on config, we can determine whether the command line contains
>> the "elfcorehdr=" parameter, because the "mem=" and "elfcorhdr=" parameters
>> are automatically added in kexec-tools. So if there is an "elfcorehdr="
>> parameter in the command line, it means that the currently running kernel
>> is a capture kernel, and the memblock_remove() operation is not called.
>>
>> The revised patch is as follows:
>>          if (usermem == 0) {
>>                  usermem = 1;
>> -               memblock_remove(memblock_start_of_DRAM(),
>> -                       memblock_end_of_DRAM() - memblock_start_of_DRAM());
>> +               if (!strstr(boot_command_line, "elfcorehdr")) {
>> +                       memblock_remove(memblock_start_of_DRAM(),
>> +                               memblock_end_of_DRAM() -
>> memblock_start_of_DRAM());
>> +               }
>>
>> Do you think it is feasible?
>   This looks like a hack to me.  How do other platforms solve it, e.g. x86?
In the x86 architecture, when parsing "mem=" or "memmap=", there is no
requirement to process visible operations on old memory.

In the MIPS architecture, there should be no need to make the old memory
visible. The reason for adding the patch at the time is that without the
patch, the following panic will occur during the kdump operation.

[    0.229919] CPU 0 Unable to handle kernel paging request at virtual 
address 0000000001004c80, epc == fffff4
[    0.245191] Oops[#1]:
[    0.247365] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 
5.12.0-rc2kdump+ #48
[    0.254211] Hardware name: THTF CX TL630 
Series/THTF-LS3A4000-7A1000-ML4A, BIOS V2.1.1 06/29/2020
[    0.263055] $ 0   : 0000000000000000 ffffffff8432ede4 
ffffffffffffffe0 0000000001004c80
[    0.271030] $ 4   : 0000000000040132 9800000006400000 
0000000000004000 0000000000000000
[    0.279006] $ 8   : 0000000000000000 0000000000000000 
980000000aafa000 980000000aaf6100
[    0.286982] $12   : 98000000061c7bc8 ffffffffffffffff 
fffffffffffffffc 0000000000000000
[    0.294958] $16   : 0000000000004000 9800000006400000 
0000000000000000 98000000061c7c40
[    0.302934] $20   : 0000000000000000 0000000000000000 
ffffffff85070000 0000000000040132
[    0.310912] $24   : 0000000000000000 ffffffff845bb520
[    0.318887] $28   : 98000000061c4000 98000000061c7bc0 
0000000000004024 ffffffff8432ede4
[    0.326863] Hi    : 000000000000000e
[    0.330418] Lo    : ffffffffcfd21432
[    0.333973] epc   : ffffffff84020808 copy_oldmem_page+0x58/0x1a0
[    0.339956] ra    : ffffffff8432ede4 read_from_oldmem.part.8+0xdc/0x120
[    0.346544] Status: 5400cce3 KX SX UX KERNEL EXL IE
[    0.351486] Cause : 10000008 (ExcCode 02)
[    0.355473] BadVA : 0000000001004c80
[    0.359029] PrId  : 0014c004 (ICT Loongson-3)
[    0.363364] Modules linked in:
[    0.366398] Process swapper/0 (pid: 1, threadinfo=(____ptrval____), 
task=(____ptrval____), tls=00000000000)
[    0.376543] Stack : 0000000000004000 ffffffff8432ede4 
9800000006400000 0000000000000000
[    0.384518]         00000000000001f4 98000000063fc000 
98000000063fc120 98000000063fc040
[    0.392494]         ffffffff84da0000 0000000000004024 
ffffffff84c7f010 0000000000000004
[    0.400469]         9800000006400000 ffffffff84f60208 
980000025fffe500 ffffffff84c7efb8
[    0.408445]         00000001004c8000 0000000100000000 
0000000000000000 a8fc2d8167e1bd00
[    0.416420]         0000000000000270 ffffffff8432ede4 
ffffffff84c60000 a8fc2d8167e1bd00
[    0.424396]         0000000000000270 ffffffff85070000 
0000000000000000 ffffffff84e60000
[    0.432371]         ffffffff85070000 ffffffff84da0000 
0000000000000005 0000000000000006
[    0.440348]         ffffffff84f7c568 ffffffff84f6093c 
0000000000000000 98000000061c7d48
[    0.448324]         ffffffff84e90000 98000000061c7d38 
ffffffff84daa968 ffffffff84da0000
[    0.456300]         ...
[    0.458728] Call Trace:
[    0.461154] [<ffffffff84020808>] copy_oldmem_page+0x58/0x1a0
[    0.466792] [<ffffffff8432ede4>] read_from_oldmem.part.8+0xdc/0x120
[    0.473033] [<ffffffff84f60208>] 
merge_note_headers_elf64.constprop.14+0xa8/0x308
[    0.480489] [<ffffffff84f6093c>] vmcore_init+0x1d8/0x5fc
[    0.485776] [<ffffffff84000c9c>] do_one_initcall+0x54/0x2b8
[    0.491325] [<ffffffff84f451e8>] kernel_init_freeable+0x1e4/0x234
[    0.497394] [<ffffffff84aba374>] kernel_init+0x1c/0x128
[    0.502597] [<ffffffff84005d0c>] ret_from_kernel_thread+0x14/0x1c
[    0.508669]
[    0.510139] Code: 2402ffe0  01224824  0123182d <dc640000> 00044dbe  
000948f8  d94a4803  1140003b  7c844b02
[    0.519848]
[    0.521331] ---[ end trace 0f67ff1443cfce3c ]---
[    0.525927] Kernel panic - not syncing: Attempted to kill init! 
exitcode=0x0000000b
[    0.533554] ---[ end Kernel panic - not syncing: Attempted to kill 
init! exitcode=0x0000000b ]---


The reason is that kmap_local_pfn() will call kmap_atomic(pfn_to_page(pfn)),
and when pfn_to_page(), the page does not exist, so a panic occurs when the
page is accessed.
With this patch, the old memory is made visible, and the page is available
to the old memory, so the page can be accessed normally without panic.

So the root of the problem is that there is a problem with the 
implementation
of copy_oldmem_page(). The panic can be solved in the following two ways.


Method One:
This function can refer to the implementation of arm64, and use memremap()
to map an available range instead of getting addr directly through 
pfn_to_page().
The patch is as follows:
diff --git a/arch/mips/kernel/crash_dump.c b/arch/mips/kernel/crash_dump.c
index 2e50f551..e122477 100644
--- a/arch/mips/kernel/crash_dump.c
+++ b/arch/mips/kernel/crash_dump.c
@@ -23,16 +23,21 @@ ssize_t copy_oldmem_page(unsigned long pfn, char *buf,
         if (!csize)
                 return 0;

-       vaddr = kmap_local_pfn(pfn);
+       vaddr = memremap(__pfn_to_phys(pfn), PAGE_SIZE, MEMREMAP_WB);

         if (!userbuf) {
                 memcpy(buf, vaddr + offset, csize);
         } else {
-               if (copy_to_user(buf, vaddr + offset, csize))
+               if (copy_to_user(buf, vaddr + offset, csize)) {
+                       memunmap(vaddr);
                         csize = -EFAULT;
+               }
         }

-       kunmap_local(vaddr);
+       memunmap(vaddr);

         return csize;
  }


Method Two:
This function can also refer to the implementation of ia64 to obtain the
virtual address corresponding to pfn through __va().
The patch is as follows:

diff --git a/arch/mips/kernel/crash_dump.c b/arch/mips/kernel/crash_dump.c
index 2e50f551..e122477 100644
--- a/arch/mips/kernel/crash_dump.c
+++ b/arch/mips/kernel/crash_dump.c
@@ -23,16 +23,21 @@ ssize_t copy_oldmem_page(unsigned long pfn, char *buf,
         if (!csize)
                 return 0;

-       vaddr = kmap_local_pfn(pfn);
+       vaddr = __va(pfn << PAGE_SHIFT);

         if (!userbuf) {
                 memcpy(buf, vaddr + offset, csize);
         } else {
                 if (copy_to_user(buf, vaddr + offset, csize))
                         csize = -EFAULT;
         }

-       kunmap_local(vaddr);

         return csize;
  }
>
>    Maciej
Thanks,
Youling
diff mbox series

Patch

diff --git a/arch/mips/kernel/setup.c b/arch/mips/kernel/setup.c
index b86e241..ac90d3b 100644
--- a/arch/mips/kernel/setup.c
+++ b/arch/mips/kernel/setup.c
@@ -351,8 +351,10 @@  static int __init early_parse_mem(char *p)
 	 */
 	if (usermem == 0) {
 		usermem = 1;
+#ifndef CONFIG_CRASH_DUMP
 		memblock_remove(memblock_start_of_DRAM(),
 			memblock_end_of_DRAM() - memblock_start_of_DRAM());
+#endif
 	}
 	start = 0;
 	size = memparse(p, &p);