
arm64, kaslr: export offset in VMCOREINFO ELF notes

Message ID 1531949864-27447-1-git-send-email-bhsharma@redhat.com (mailing list archive)
State New, archived

Commit Message

Bhupesh Sharma July 18, 2018, 9:37 p.m. UTC
Include KASLR offset in VMCOREINFO ELF notes to assist in debugging.

The makedumpfile user-space utility will need a fixup to use this KASLR
offset, so that it can translate symbol addresses from vmlinux to kernel
run-time addresses when arm64 boots with KASLR.

I already have those fixups ready; they will be sent upstream once this
patch makes it through (see [0]).

I tested this on my Qualcomm Amberwing board both for KASLR and
non-KASLR boot cases:

Without this patch:
   # cat > scrub.conf << EOF
   [vmlinux]
   erase jiffies
   erase init_task.utime
   for tsk in init_task.tasks.next within task_struct:tasks
       erase tsk.utime
   endfor
   EOF

  # makedumpfile --split -d 31 -x vmlinux --config scrub.conf vmcore dumpfile_{1,2,3}
  readpage_elf: Attempt to read non-existent page at 0xffffa8a5bf180000.
  readmem: type_addr: 1, addr:ffffa8a5bf180000, size:8
  vaddr_to_paddr_arm64: Can't read pgd
  readmem: Can't convert a virtual address(ffff0000092a542c) to physical
  address.
  readmem: type_addr: 0, addr:ffff0000092a542c, size:390
  check_release: Can't get the address of system_utsname

After this patch check_release() is ok, and we are also able to erase
symbols from vmcore (I checked this with kernel 4.18.0-rc4+):

  # makedumpfile --split -d 31 -x vmlinux --config scrub.conf vmcore dumpfile_{1,2,3}
  The kernel version is not supported.
  The makedumpfile operation may be incomplete.
  Checking for memory holes                         : [100.0 %] \
  Checking for memory holes                         : [100.0 %] |
  Checking foExcluding unnecessary pages                       : [100.0 %]
  \
  Excluding unnecessary pages                       : [100.0 %] \

  The dumpfiles are saved to dumpfile_1, dumpfile_2, and dumpfile_3.

  makedumpfile Completed.

[0] https://github.com/bhupesh-sharma/makedumpfile/commit/555e5ae0fb2b21797c450ad55950e81c470224ef

Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: James Morse <james.morse@arm.com>
Signed-off-by: Bhupesh Sharma <bhsharma@redhat.com>
---
 arch/arm64/kernel/machine_kexec.c | 1 +
 1 file changed, 1 insertion(+)

Comments

James Morse July 19, 2018, 11:31 a.m. UTC | #1
Hi Bhupesh,

On 18/07/18 22:37, Bhupesh Sharma wrote:
> Include KASLR offset in VMCOREINFO ELF notes to assist in debugging.
> 
> makedumpfile user-space utility will need fixup to use this KASLR offset
> to work with cases where we need to find a way to translate symbol
> address from vmlinux to kernel run time address in case of KASLR boot on
> arm64.

You need the kernel VA for a symbol. Isn't this what kallsyms is for?
| root@frikadeller:~# cat /proc/kallsyms | grep swapper_pg_dir
| ffff5404610d0000 B swapper_pg_dir

This is the KASLR address, the vmlinux has:
| root@frikadeller:~/linux/build_arm64# nm -s vmlinux | grep swapper_pg_dir
| ffff0000096d0000 B swapper_pg_dir


This is in the vmcoreinfo too, so you can work it out from the vmcore too:
| root@frikadeller:~# dd if=/proc/kcore bs=8K count=1 2>/dev/null | strings |
| grep swapper_pg_dir
| SYMBOL(swapper_pg_dir)=ffff5404610d0000


I picked swapper_pg_dir, but you could use any of the vmcore:SYMBOL() addresses
to work out this offset. (you should expect the kernel to rename these symbols
at a whim).
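
As a rough illustration of that calculation, a minimal C sketch could look
like the following. The helper name offset_from_symbol() is hypothetical (it
is not a kernel or makedumpfile function); it only assumes you have one
run-time address and one link-time address for the same symbol:

```c
#include <stdint.h>

/*
 * Hypothetical helper: derive the relocation offset from the run-time
 * address (e.g. a vmcoreinfo SYMBOL() value) and the link-time address
 * (e.g. from 'nm vmlinux') of the same symbol.
 */
uint64_t offset_from_symbol(uint64_t runtime_addr, uint64_t linktime_addr)
{
	/* Unsigned 64-bit subtraction; wraps modulo 2^64 by definition. */
	return runtime_addr - linktime_addr;
}
```

With the swapper_pg_dir values above, 0xffff5404610d0000 - 0xffff0000096d0000
gives an offset of 0x540457a00000.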


Thanks,

James
Bhupesh Sharma July 19, 2018, 2:55 p.m. UTC | #2
Hi James,

On Thu, Jul 19, 2018 at 5:01 PM, James Morse <james.morse@arm.com> wrote:
> Hi Bhupesh,
>
> On 18/07/18 22:37, Bhupesh Sharma wrote:
>> Include KASLR offset in VMCOREINFO ELF notes to assist in debugging.
>>
>> makedumpfile user-space utility will need fixup to use this KASLR offset
>> to work with cases where we need to find a way to translate symbol
>> address from vmlinux to kernel run time address in case of KASLR boot on
>> arm64.
>
> You need the kernel VA for a symbol. Isn't this what kallsyms is for?
> | root@frikadeller:~# cat /proc/kallsyms | grep swapper_pg_dir
> | ffff5404610d0000 B swapper_pg_dir

'KERNELOFFSET=' is already used by other archs like x86_64. See
'arch/x86/kernel/machine_kexec_64.c':

void arch_crash_save_vmcoreinfo(void)
{
    <..snip..>
    vmcoreinfo_append_str("KERNELOFFSET=%lx\n",
                  kaslr_offset());
    <..snip..>
}

It's just that we are enabling this feature for arm64 now that
KASLR boot cases are more widely seen on arm64 boards (as newer EFI
firmware implementations are available which support EFI_RNG_PROTOCOL
and hence KASLR boot).

And we want existing user-space application(s) to work on arm64
distributions the same way they have historically worked on other archs
like x86_64 (in most cases the user-space user is not even aware whether
the underlying hardware they are developing for or using is arm64 or
x86_64).
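
For illustration, a user-space tool that already holds the VMCOREINFO note
text could pick this value up roughly as follows. This is a minimal sketch,
not makedumpfile's actual code: the function name parse_kerneloffset() is
hypothetical, and it assumes the note is available as NUL-terminated text
with one KEY=value pair per line:

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: scan VMCOREINFO text for a line that starts with
 * "KERNELOFFSET=" and parse the hexadecimal value after it.
 * Returns 0 on success, -1 if the key is absent or has no hex digits.
 */
int parse_kerneloffset(const char *vmcoreinfo, uint64_t *offset)
{
	static const char key[] = "KERNELOFFSET=";
	const size_t keylen = sizeof(key) - 1;
	const char *line = vmcoreinfo;

	while (line && *line) {
		if (strncmp(line, key, keylen) == 0) {
			char *end;
			unsigned long long val;

			val = strtoull(line + keylen, &end, 16);
			if (end == line + keylen)
				return -1;	/* nothing parsable after '=' */
			*offset = val;
			return 0;
		}
		/* Advance to the start of the next line, if there is one. */
		line = strchr(line, '\n');
		if (line)
			line++;
	}
	return -1;
}
```

A caller would treat a -1 return as "no KERNELOFFSET exported", which is what
an arm64 vmcore looks like without this patch.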

> This is the KASLR address, the vmlinux has:
> | root@frikadeller:~/linux/build_arm64# nm -s vmlinux | grep swapper_pg_dir
> | ffff0000096d0000 B swapper_pg_dir
>
>
> This is in the vmcoreinfo too, so you can work if out from the vmcore too:
> | root@frikadeller:~# dd if=/proc/kcore bs=8K count=1 2>/dev/null | strings |
> | grep swapper_pg_dir
> | SYMBOL(swapper_pg_dir)=ffff5404610d0000
>
>
> I picked swapper_pg_dir, but you could use any of the vmcore:SYMBOL() addresses
> to work out this offset. (you should expect the kernel to rename these symbols
> at a whim).
>

Perhaps you missed what the above makedumpfile command was doing, so
let me summarize again:

The above makedumpfile command is trying to 'split' a vmcore file into
smaller sub dumpfiles on the basis of some filtering rules. These
rules are defined via a .conf file (scrub.conf in my example below).

Let's see what MAKEDUMPFILE(8) documents about the '--split' option:

--split
       Split the dump data to multiple DUMPFILEs in parallel. If
       specifying DUMPFILEs on different storage devices, a device can
       share I/O load with other devices and it reduces time for saving
       the dump data. The file size of each DUMPFILE is smaller than the
       system memory size which is divided by the number of DUMPFILEs.
       This feature supports only the kdump-compressed format.

So, this use-case expects 'vmlinux' and 'vmcore' as mandatory inputs.

Now, coming back to the use-case:

# makedumpfile --split -d 31 -x vmlinux --config scrub.conf vmcore dumpfile_{1,2,3}

Here, 'scrub.conf' is defined to erase symbols 'jiffies',
'init_task.utime' and 'tsk.utime':

 # cat > scrub.conf << EOF
   [vmlinux]
   erase jiffies
   erase init_task.utime
   for tsk in init_task.tasks.next within task_struct:tasks
       erase tsk.utime
   endfor
   EOF

This is usually used to erase company-confidential or non-important
symbols from a vmcore before handing it over to a debugger (which uses
this vmcore to determine the root-cause of a crash) - as there can be
some sensitive symbols which a reporter may not want the debugger to
read.

So, in this use case both vmlinux (which contains the symbols) and
vmcore are mandatory inputs, and we cannot use kallsyms, as that breaks
the existing user-space use-cases. The .conf file can also be
user-defined (and hence the user can pick any symbols/functions, which
might not even be present in 'kallsyms'). So no, we cannot use
'kallsyms' here.

Thanks,
Bhupesh
James Morse July 23, 2018, 5:05 p.m. UTC | #3
Hi Bhupesh,

(CC: +mips list, looks like mips is missing vmcore's KERNELOFFSET too.
Start of this thread: https://lkml.org/lkml/2018/7/18/951 )

On 19/07/18 15:55, Bhupesh Sharma wrote:
> On Thu, Jul 19, 2018 at 5:01 PM, James Morse <james.morse@arm.com> wrote:
>> On 18/07/18 22:37, Bhupesh Sharma wrote:
>>> Include KASLR offset in VMCOREINFO ELF notes to assist in debugging.
>>>
>>> makedumpfile user-space utility will need fixup to use this KASLR offset
>>> to work with cases where we need to find a way to translate symbol
>>> address from vmlinux to kernel run time address in case of KASLR boot on
>>> arm64.
>>
>> You need the kernel VA for a symbol. Isn't this what kallsyms is for?
>> | root@frikadeller:~# cat /proc/kallsyms | grep swapper_pg_dir
>> | ffff5404610d0000 B swapper_pg_dir

> Its already used by other archs like x86_64.
> Its just that we are enabling this feature for arm64 now that the
> KASLR boot cases are more widely seen on arm64 boards (as newer EFI
> firmware implementations are available which support EFI_RNG_PROTOCOL
> and hence KASLR boot).
> 
> And we want existing user-space application(s) to work similarly on
> arm64 distributions as they work historically on other archs like
> x86_64 (in most cases the user-space user is not even aware, if he is
> developing for or using an underlying hardware which is arm64 or
> x86_64)

Aha, so it's ABI. This is the information that should be in the commit message as
it describes why this patch should be merged.

... Ideally core code would do this, that way this information won't be missed
when an architecture adds KASLR support.

But mips has CONFIG_RANDOMIZE_BASE, and doesn't provide kaslr_offset(),
and x86 always provides this value, not just if CONFIG_RANDOMIZE_BASE is
selected. I can't see a way to do this without touching all three architectures.
(we can try and tidy it up once it's clear whether mips needs this too)


I think the patch is fine, but could you re-post it with a commit message that
describes that vmcore parsing in user-space already expects this value in the
notes. We're providing it for portability of those existing tools with x86.


>> I picked swapper_pg_dir, but you could use any of the vmcore:SYMBOL() addresses
>> to work out this offset. (you should expect the kernel to rename these symbols
>> at a whim).
>>
> 
> Perhaps you missed what the above makedumpfile command was doing, so
> let me summarize again:

Yes, I glossed over it; all that seemed relevant is that you are trying to find
the kernel-va of a symbol from the value in the vmlinux.


> as it breaks the
> existing user-space use-cases and the .conf file can be user-defined
> (and hence he can pick any symbol/functions

My suggestion was you can calculate the offset between the link-time and
run-time address from information you already have. You just need one of each.
This would be better as it's independent of how the kernel does the relocation.

But, this is irrelevant as 'KERNELOFFSET=' is already an ABI string.


> which might not even be present in 'kallsyms').

Eh? How can that happen? I thought even modules had their symbols added to kallsyms.



Thanks,

James
Bhupesh Sharma July 25, 2018, 7:57 p.m. UTC | #4
Hello James,

On Mon, Jul 23, 2018 at 10:35 PM, James Morse <james.morse@arm.com> wrote:
> Hi Bhupesh,
>
> (CC: +mips list, looks like mips is missing vmcore's KERNELOFFSET too.
> Start of this thread: https://lkml.org/lkml/2018/7/18/951 )

Yes, but the current upstream makedumpfile doesn't seem to contain
mips specific support (please see
<git://git.code.sf.net/p/makedumpfile/code> for details), so I am not
sure they need the 'KERNELOFFSET=' supported in kernel as they are
probably not using it in the user-space.

If someone from the mips list can help suggest if my understanding is
correct it would be great. Just as a side note, I don't have access to
mips hardware, so I will be happy to update the v2 to add mips kernel
bits as well, in case someone is willing to give it a try on their
mips hardware.

> On 19/07/18 15:55, Bhupesh Sharma wrote:
>> On Thu, Jul 19, 2018 at 5:01 PM, James Morse <james.morse@arm.com> wrote:
>>> On 18/07/18 22:37, Bhupesh Sharma wrote:
>>>> Include KASLR offset in VMCOREINFO ELF notes to assist in debugging.
>>>>
>>>> makedumpfile user-space utility will need fixup to use this KASLR offset
>>>> to work with cases where we need to find a way to translate symbol
>>>> address from vmlinux to kernel run time address in case of KASLR boot on
>>>> arm64.
>>>
>>> You need the kernel VA for a symbol. Isn't this what kallsyms is for?
>>> | root@frikadeller:~# cat /proc/kallsyms | grep swapper_pg_dir
>>> | ffff5404610d0000 B swapper_pg_dir
>
>> Its already used by other archs like x86_64.
>> Its just that we are enabling this feature for arm64 now that the
>> KASLR boot cases are more widely seen on arm64 boards (as newer EFI
>> firmware implementations are available which support EFI_RNG_PROTOCOL
>> and hence KASLR boot).
>>
>> And we want existing user-space application(s) to work similarly on
>> arm64 distributions as they work historically on other archs like
>> x86_64 (in most cases the user-space user is not even aware, if he is
>> developing for or using an underlying hardware which is arm64 or
>> x86_64)
>
> Aha, so its ABI. This is the information that should be in the commit message as
> it describes why this patch should be merged.

Sure, will add the description in the commit message of v2.

> ... Ideally core code would do this, that way this information won't be missed
> when an architecture adds KASLR support.
>
> But mips has CONFIG_RANDOMIZE_BASE, and doesn't provide kaslr_offset(),
> and x86 always provides this value, not just if CONFIG_RANDOMIZE_BASE is
> selected. I can't see a way to do this without touching all three architectures.
> (we can try and tidy it up once its clear whether mips needs this too)

Yes, I checked internally at Red Hat and no use-cases for mips
surfaced for this feature. So, if someone from the mips list can help
clarify, it will help me re-write the v2 accordingly.

> I think the patch is fine, but could you re-post it with a commit message that
> describes that vmcore parsing in user-space already expects this value in the
> notes. We're providing it for portability of those existing tools with x86.

Sure.

>>> I picked swapper_pg_dir, but you could use any of the vmcore:SYMBOL() addresses
>>> to work out this offset. (you should expect the kernel to rename these symbols
>>> at a whim).
>>>
>>
>> Perhaps you missed what the above makedumpfile command was doing, so
>> let me summarize again:
>
> Yes, I glossed over it, all that seemed relevant is you are trying to find the
> kernel-va of a symbol from the value in the vmlinux.

Yes, that's correct.

>
>> as it breaks the
>> existing user-space use-cases and the .conf file can be user-defined
>> (and hence he can pick any symbol/functions
>
> My suggestion was you can calculate the offset between the link-time and
> run-time address from information you already have. You just need one of each.
> This would be better as its independent of how the kernel does the relocation.
>
> But, this is irrelevant as 'KERNELOFFSET=' is already an ABI string.

Indeed. We would like to have ABI strings similar across archs.

>> which might not even be present in 'kallsyms').
>
> Eh? How can that happen? I thought even modules had their symbols added to kallsyms.

Sorry, when I re-read my reply, I think I did not explain it well in
terms of the distribution use-cases we usually encounter. Let me try
again:

One of the reasons is that existing user-space implementations (for
archs like x86_64) historically use 'vmlinux' for filtering of
symbols from 'vmcore'.

Usually when an end-user uses a distribution kernel on their hardware
and faces issues with it, they share the 'vmcore' (crash dump) and
'vmlinux' with the distribution provider. 'vmlinux' can be used by
utilities like gdb/crash-utility to debug the reason for the kernel crash.

Even if we assume the user manages to save the 'kallsyms' on some
external media (e.g. a usb stick) when the kernel crashed, the symbol
addresses in 'kallsyms' change on each boot (as KASLR randomizes
them), so we cannot reliably combine them with 'KERNELOFFSET=' to
calculate the run-time addresses of the symbols.

For example, let me share the following values of a real use-case:

Let's say we are looking to find and erase the symbol 'jiffies' from
the vmcore, using (1) - vmlinux and (2) - kallsyms:

(1) vmlinux - Address of 'jiffies' -> 0xffff000009291980
(2) kallsyms - Address of 'jiffies' -> 0xffff4ce385291980 (value seen
during the boot when primary kernel crashed)

In this particular boot, makedumpfile reports 'kaslr_offset' as 0x2934e3000000.
So, if we use:
(1) vmlinux - We calculate Address of 'jiffies' as ->
0xffff000009291980 + 0x2934e3000000 = 0xFFFF2934EC291980
(2) kallsyms - We calculate Address of 'jiffies' as ->
0xffff4ce385291980 + 0x2934e3000000 = 0xFFFF761868291980

When we check the 'vmcore' we can see that the address of 'jiffies' is
indeed 0xFFFF2934EC291980 and not 0xFFFF761868291980:

crash> sym jiffies
ffff2934ec291980 (D) jiffies

Since the address of 'jiffies' pointed to by 'kallsyms' will change
on every boot, it's probably not a good source for such user-space
use-cases.
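
The arithmetic above can be checked with a small C sketch. The helper name
vmlinux_to_runtime() is hypothetical (it is not taken from makedumpfile); it
simply adds the reported offset to a link-time address using 64-bit unsigned
arithmetic:

```c
#include <stdint.h>

/*
 * Hypothetical helper: translate a link-time (vmlinux) symbol address to
 * its run-time address by adding the KASLR offset exported in the vmcore.
 */
uint64_t vmlinux_to_runtime(uint64_t linktime_addr, uint64_t kaslr_offset)
{
	return linktime_addr + kaslr_offset;
}
```

With the values above, vmlinux_to_runtime(0xffff000009291980, 0x2934e3000000)
gives 0xffff2934ec291980, matching the 'crash> sym jiffies' output, whereas
feeding in the kallsyms address 0xffff4ce385291980 gives 0xffff761868291980.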

Will send out a v2 shortly (after waiting for some inputs from the mips guys).

Thanks,
Bhupesh

Patch

diff --git a/arch/arm64/kernel/machine_kexec.c b/arch/arm64/kernel/machine_kexec.c
index f62effc6e064..028df356a5fd 100644
--- a/arch/arm64/kernel/machine_kexec.c
+++ b/arch/arm64/kernel/machine_kexec.c
@@ -360,4 +360,5 @@  void arch_crash_save_vmcoreinfo(void)
 						kimage_voffset);
 	vmcoreinfo_append_str("NUMBER(PHYS_OFFSET)=0x%llx\n",
 						PHYS_OFFSET);
+	vmcoreinfo_append_str("KERNELOFFSET=%lx\n", kaslr_offset());
 }
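
To make the effect of the added line concrete, the following user-space
sketch mimics the format string with snprintf(). It is only an illustration:
format_kerneloffset() is a hypothetical name, and %llx with unsigned long
long is used here for portability, whereas the kernel line above uses %lx
with the value returned by kaslr_offset():

```c
#include <stdio.h>

/*
 * Hypothetical user-space stand-in for the appended vmcoreinfo line.
 * Writes "KERNELOFFSET=<hex>\n" into buf and returns the number of
 * characters that the full line needs (as snprintf() does).
 */
int format_kerneloffset(char *buf, size_t len, unsigned long long offset)
{
	return snprintf(buf, len, "KERNELOFFSET=%llx\n", offset);
}
```

For an offset of 0x2934e3000000 this yields the line
"KERNELOFFSET=2934e3000000", with no "0x" prefix, which is the form a parser
of the note has to expect.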