[0/5] arm64: mte: add core dump support

Message ID 20211208121941.494956-1-catalin.marinas@arm.com (mailing list archive)

Message

Catalin Marinas Dec. 8, 2021, 12:19 p.m. UTC
Hi,

Add core dump support for MTE tags. When a core file is generated and
the user has mappings with PROT_MTE, segments with the PT_ARM_MEMTAG_MTE
type are dumped. These correspond to the PT_LOAD segments for the same
virtual addresses.

The last patch documents the core file format. The tags are dumped
packed, two tags per byte (unlike ptrace, where we have one tag per byte),
and there is no header to define the format; the layout is fixed for the
PT_ARM_MEMTAG_MTE type.
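
To make the packing concrete, here is a minimal Python sketch that converts
between the packed core-file form (two 4-bit tags per byte) and a one tag per
element form like the ptrace one. The nibble order is not stated in this cover
letter, so the sketch assumes the lower-addressed granule lives in the low
nibble and exposes that as a parameter; the function names are hypothetical.

```python
def unpack_tags(packed, low_nibble_first=True):
    """Expand packed 4-bit tags into a list with one tag per element.

    low_nibble_first is an ASSUMPTION about the nibble order, which the
    cover letter does not specify.
    """
    tags = []
    for byte in packed:
        lo, hi = byte & 0x0F, byte >> 4
        tags.extend((lo, hi) if low_nibble_first else (hi, lo))
    return tags


def pack_tags(tags, low_nibble_first=True):
    """Pack an even-length sequence of 4-bit tags, two tags per byte."""
    if len(tags) % 2:
        raise ValueError("need an even number of tags")
    out = bytearray()
    for first, second in zip(tags[0::2], tags[1::2]):
        lo, hi = (first, second) if low_nibble_first else (second, first)
        out.append(((hi & 0x0F) << 4) | (lo & 0x0F))
    return bytes(out)
```

With this layout a byte 0xbb unpacks to two 0xb tags regardless of the nibble
order, which is why a uniformly tagged region cannot be used to tell the two
orders apart.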

Below you can see the output of 'readelf -a core' for a program mapping
two regions with PROT_MTE, one 2 pages long and the other 4 pages long.
Half of the first page in each range was filled with 0xb and 0xa tags
respectively.

Program Headers:
  Type             Offset   VirtAddr           PhysAddr           FileSiz  MemSiz   Flg Align
  ...
  LOAD             0x030000 0x0000ffff80034000 0x0000000000000000 0x000000 0x002000 RW  0x1000
  LOAD             0x030000 0x0000ffff80036000 0x0000000000000000 0x004000 0x004000 RW  0x1000
  ...
  LOPROC+0x5441470 0x05b000 0x0000ffff80034000 0x0000000000000000 0x000100 0x002000     0
  LOPROC+0x5441470 0x05b100 0x0000ffff80036000 0x0000000000000000 0x000200 0x004000     0

The relevant 'od -tx1 core' output:

05b000 bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb
*
05b040 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
*
05b100 aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa
*
05b140 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
*
05b300
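
As a rough illustration of how a reader of the core file could use these
headers, the sketch below maps a virtual address to the file offset of the
byte holding its tag. It assumes one 4-bit tag per 16-byte granule, packed two
per byte starting at p_offset, and it only returns which of the two tags in
the byte applies (0 for the first granule of the pair, 1 for the second),
since the nibble order is not given here. The function name and parameters
are hypothetical.

```python
GRANULE = 16        # bytes of memory covered by one MTE tag
TAGS_PER_BYTE = 2   # tags are packed two per byte in the core file


def tag_file_offset(addr, p_vaddr, p_offset, p_memsz):
    """Return (file_offset, index_in_byte) of the tag covering addr.

    p_vaddr, p_offset and p_memsz are the fields of the tag segment's
    program header.
    """
    if not p_vaddr <= addr < p_vaddr + p_memsz:
        raise ValueError("address outside the tagged segment")
    granule = (addr - p_vaddr) // GRANULE
    return p_offset + granule // TAGS_PER_BYTE, granule % TAGS_PER_BYTE


# Second segment above: the second half of its first page starts at +0x800,
# which lands on byte offset 0x5b140, where the zero tags begin in the dump.
offset, index = tag_file_offset(0xffff80036800, 0xffff80036000, 0x5b100, 0x4000)
```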

Catalin Marinas (5):
  elfcore: Replace CONFIG_{IA64,UML} checks with a new option
  elf: Introduce the ARM MTE ELF segment type
  arm64: mte: Define the number of bytes for storing the tags in a page
  arm64: mte: Dump the MTE tags in the core file
  arm64: mte: Document the core dump file format

 .../arm64/memory-tagging-extension.rst        |  22 ++++
 arch/arm64/Kconfig                            |   1 +
 arch/arm64/include/asm/mte-def.h              |   1 +
 arch/arm64/kernel/Makefile                    |   1 +
 arch/arm64/kernel/elfcore.c                   | 123 ++++++++++++++++++
 arch/arm64/lib/mte.S                          |   4 +-
 arch/arm64/mm/mteswap.c                       |   2 +-
 arch/ia64/Kconfig                             |   1 +
 arch/x86/um/Kconfig                           |   1 +
 fs/Kconfig.binfmt                             |   3 +
 include/linux/elfcore.h                       |   4 +-
 include/uapi/linux/elf.h                      |   3 +
 12 files changed, 161 insertions(+), 5 deletions(-)
 create mode 100644 arch/arm64/kernel/elfcore.c

Comments

Eric W. Biederman Dec. 8, 2021, 5:21 p.m. UTC | #1
Catalin Marinas <catalin.marinas@arm.com> writes:

> Hi,
>
> Add core dump support for MTE tags. When a core file is generated and
> the user has mappings with PROT_MTE, segments with the PT_ARM_MEMTAG_MTE
> type are dumped. These correspond to the PT_LOAD segments for the same
> virtual addresses.

Why did you choose to encode this information as a program header
instead of as a note?

I can't see anything fundamentally wrong with encoding this information
as a new program header type, but I also don't know what makes this
information special enough that it doesn't work as a note.

The advantage of encoding things as a note is that pretty much everyone
already knows what to do with notes, including notes they do not understand.

If this was something the loader would need when loading an application,
and the loader could parse this program header as well that would
definitely be justification for using a program header.

I also don't know what an MTE tag is.  A memory type extension?

Eric

Catalin Marinas Dec. 8, 2021, 5:57 p.m. UTC | #2
On Wed, Dec 08, 2021 at 11:21:24AM -0600, Eric W. Biederman wrote:
> Catalin Marinas <catalin.marinas@arm.com> writes:
> > Add core dump support for MTE tags. When a core file is generated and
> > the user has mappings with PROT_MTE, segments with the PT_ARM_MEMTAG_MTE
> > type are dumped. These correspond to the PT_LOAD segments for the same
> > virtual addresses.
> 
> Why did you choose to encode this information as a program header
> instead of as a note?

That's how we started, even had binutils patches ready to merge until we
realised that elf64_note::n_descsz is 32-bit only.

For MTE, the tags need (vma_size / PAGE_SIZE * 128) bytes in the
coredump, that is, 2^(vma_shift - 5) bytes for a vma of 2^vma_shift
bytes. In theory a vma can be 52-bit, so we'd need a theoretical 47-bit
size for the content of a note. elf64_phdr::p_filesz, OTOH, is a 64-bit
value.
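
The arithmetic can be checked with a short Python sketch. It assumes 4 KiB
pages (which is what makes 128 bytes of packed tags per page: 4096 / 16
granules, two tags per byte); the constant and function names are
hypothetical, not taken from the patches.

```python
PAGE_SIZE = 4096            # ASSUMPTION: 4 KiB pages
TAG_BYTES_PER_PAGE = 128    # 256 granules per page, two 4-bit tags per byte
NOTE_DESCSZ_MAX = 2**32 - 1 # n_descsz is a 32-bit field


def tag_dump_size(vma_size):
    """Bytes of packed tags needed to dump a vma of vma_size bytes."""
    return vma_size // PAGE_SIZE * TAG_BYTES_PER_PAGE


def fits_in_one_note(vma_size):
    """True if the tags for this vma would fit in a single note's payload."""
    return tag_dump_size(vma_size) <= NOTE_DESCSZ_MAX


# The two example regions from the cover letter: 2 pages and 4 pages.
small, large = tag_dump_size(0x2000), tag_dump_size(0x4000)  # 0x100, 0x200

# A 52-bit vma needs a 47-bit tag payload, far beyond a 32-bit n_descsz.
worst_case = tag_dump_size(1 << 52)
```

Under these assumptions a single note stops being sufficient once a vma
reaches 2^37 bytes (128 GiB), since its tags would then need 2^32 bytes.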

We could split this into multiple notes but, as I try to describe below,
I think its designation is closer to a PT_LOAD segment than a note
(well, without the load part).

> I also don't know what an MTE tag is.  A memory type extension?

Sorry, I should have described it in the cover letter: Memory Tagging
Extension (pretty much like SPARC ADI). This hardware feature allows
every 16 bytes in memory to have an associated "tag". On access, the top
byte of the pointer (actually bits 59:56) is compared with the
in-memory tag. If they don't match, a fault is raised. Typical use-case:
heap allocators set a tag for a range of memory and return a pointer
with the corresponding top byte set. Out of bounds access or use after
free can be caught (with some probability since we only have 16 tags in
total).
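
The check described above can be modelled in a few lines of Python. This is
only a simplified software model of the comparison, not how the hardware or
the kernel implements it: memory tags are kept in a dict keyed by granule
index, untagged granules default to tag 0, and all names are hypothetical.

```python
GRANULE = 16                 # bytes covered by one tag
TAG_SHIFT = 56               # the tag occupies pointer bits 59:56
TAG_MASK = 0xF
ADDR_MASK = (1 << 56) - 1    # model: the top byte is not part of the address


def pointer_tag(ptr):
    """Extract the 4-bit tag from bits 59:56 of a pointer."""
    return (ptr >> TAG_SHIFT) & TAG_MASK


def with_tag(ptr, tag):
    """Return ptr with bits 59:56 replaced by tag."""
    return (ptr & ~(TAG_MASK << TAG_SHIFT)) | ((tag & TAG_MASK) << TAG_SHIFT)


def access_ok(ptr, memory_tags):
    """True if the pointer tag matches the tag of the granule it points to."""
    granule = (ptr & ADDR_MASK) // GRANULE
    return pointer_tag(ptr) == memory_tags.get(granule, 0)


# An allocator tags one 16-byte granule with 0xb and hands out a matching pointer.
base = 0x0000ffff80034000
memory_tags = {base // GRANULE: 0xb}
p = with_tag(base, 0xb)

in_bounds = access_ok(p, memory_tags)            # tags match
out_of_bounds = access_ok(p + 16, memory_tags)   # next granule still has tag 0
```

With only 16 tag values, an out-of-bounds access into memory that happens to
carry the same tag would pass this check, which is the "some probability"
caveat above.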

Now, when we do a core dump, it would be useful to the debugger to know,
for a corresponding PT_LOAD segment, what the in-memory tags were, if
any.

> If this was something the loader would need when loading an application,
> and the loader could parse this program header as well that would
> definitely be justification for using a program header.

We don't currently have a use for the loader to parse this but it's
possible in theory to, say, tag some data or bss ranges with something
other than the default 0 (though most likely this would be the loader
picking a random tag rather than deciding its value at build-time).

Thanks.