Message ID | 20230929031716.it.155-kees@kernel.org (mailing list archive) |
---|---|
Headers | show |
Series | binfmt_elf: Support segments with 0 filesz and misaligned starts | expand |
Hello Kees, On Thu, 28 Sep 2023, Kees Cook wrote: > This is the continuation of the work Eric started for handling > "p_memsz > p_filesz" in arbitrary segments (rather than just the last, > BSS, segment). I've added the suggested changes: > > - drop unused "elf_bss" variable > - refactor load_elf_interp() to use elf_load() > - refactor load_elf_library() to use elf_load() > - report padzero() errors when PROT_WRITE is present > - drop vm_brk() While I was debugging the initial issue I stumbled over the following - care to take it as part of this series? ----->8 [PATCH] mm: vm_brk_flags don't bail out while holding lock Calling vm_brk_flags() with flags set other than VM_EXEC will exit the function without releasing the mmap_write_lock. Just do the sanity check before the lock is acquired. This doesn't fix an actual issue since no caller sets a flag other than VM_EXEC. Cc: Andrew Morton <akpm@linux-foundation.org> Cc: linux-mm@kvack.org Signed-off-by: Sebastian Ott <sebott@redhat.com> --- mm/mmap.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/mm/mmap.c b/mm/mmap.c index b56a7f0c9f85..7ed286662839 100644 --- a/mm/mmap.c +++ b/mm/mmap.c @@ -3143,13 +3143,13 @@ int vm_brk_flags(unsigned long addr, unsigned long request, unsigned long flags) if (!len) return 0; - if (mmap_write_lock_killable(mm)) - return -EINTR; - /* Until we need other flags, refuse anything except VM_EXEC. */ if ((flags & (~VM_EXEC)) != 0) return -EINVAL; + if (mmap_write_lock_killable(mm)) + return -EINTR; + ret = check_brk_limits(addr, len); if (ret) goto limits_failed;
On Fri, Sep 29, 2023 at 4:24 AM Kees Cook <keescook@chromium.org> wrote: > > Hi, > > This is the continuation of the work Eric started for handling > "p_memsz > p_filesz" in arbitrary segments (rather than just the last, > BSS, segment). I've added the suggested changes: > > - drop unused "elf_bss" variable > - refactor load_elf_interp() to use elf_load() > - refactor load_elf_library() to use elf_load() > - report padzero() errors when PROT_WRITE is present > - drop vm_brk() > > Thanks! > > -Kees > > v4: > - refactor load_elf_library() too > - don't refactor padzero(), just test in the only remaining caller > - drop now-unused vm_brk() > v3: https://lore.kernel.org/all/20230927033634.make.602-kees@kernel.org > v2: https://lore.kernel.org/lkml/87sf71f123.fsf@email.froward.int.ebiederm.org > v1: https://lore.kernel.org/lkml/87jzsemmsd.fsf_-_@email.froward.int.ebiederm.org > > Eric W. Biederman (1): > binfmt_elf: Support segments with 0 filesz and misaligned starts > > Kees Cook (5): > binfmt_elf: elf_bss no longer used by load_elf_binary() > binfmt_elf: Use elf_load() for interpreter > binfmt_elf: Use elf_load() for library > binfmt_elf: Only report padzero() errors when PROT_WRITE > mm: Remove unused vm_brk() > > fs/binfmt_elf.c | 214 ++++++++++++++++----------------------------- > include/linux/mm.h | 3 +- > mm/mmap.c | 6 -- > mm/nommu.c | 5 -- > 4 files changed, 76 insertions(+), 152 deletions(-) Sorry for taking so long to take a look at this. While I didn't test PPC64 (I don't own PPC64 hardware, and I wasn't the original reporter), I did manage to craft a reduced test case of: a.out: Program Headers: Type Offset VirtAddr PhysAddr FileSiz MemSiz Flags Align PHDR 0x0000000000000040 0x0000000000000040 0x0000000000000040 0x00000000000001f8 0x00000000000001f8 R 0x8 INTERP 0x0000000000000238 0x0000000000000238 0x0000000000000238 0x0000000000000020 0x0000000000000020 R 0x1 [Requesting program interpreter: /home/pfalcato/musl/lib/libc.so] LOAD 0x0000000000000000 0x0000000000000000 0x0000000000000000 0x0000000000000428 0x0000000000000428 R 0x1000 LOAD 0x0000000000001000 0x0000000000001000 0x0000000000001000 0x00000000000000cd 0x00000000000000cd R E 0x1000 LOAD 0x0000000000002000 0x0000000000002000 0x0000000000002000 0x0000000000000084 0x0000000000000084 R 0x1000 LOAD 0x0000000000002e50 0x0000000000003e50 0x0000000000003e50 0x00000000000001c8 0x00000000000001c8 RW 0x1000 DYNAMIC 0x0000000000002e50 0x0000000000003e50 0x0000000000003e50 0x0000000000000180 0x0000000000000180 RW 0x8 GNU_STACK 0x0000000000000000 0x0000000000000000 0x0000000000000000 0x0000000000000000 0x0000000000000000 RW 0x10 GNU_RELRO 0x0000000000002e50 0x0000000000003e50 0x0000000000003e50 0x00000000000001b0 0x00000000000001b0 R 0x1 /home/pfalcato/musl/lib/libc.so: Program Headers: Type Offset VirtAddr PhysAddr FileSiz MemSiz Flags Align PHDR 0x0000000000000040 0x0000000000000040 0x0000000000000040 0x0000000000000230 0x0000000000000230 R 0x8 LOAD 0x0000000000000000 0x0000000000000000 0x0000000000000000 0x0000000000049d9c 0x0000000000049d9c R 0x1000 LOAD 0x0000000000049da0 0x000000000004ada0 0x000000000004ada0 0x0000000000057d30 0x0000000000057d30 R E 0x1000 LOAD 0x00000000000a1ad0 0x00000000000a3ad0 0x00000000000a3ad0 0x00000000000005f0 0x00000000000015f0 RW 0x1000 LOAD 0x00000000000a20c0 0x00000000000a60c0 0x00000000000a60c0 0x0000000000000428 0x0000000000002f80 RW 0x1000 DYNAMIC 0x00000000000a1f38 0x00000000000a3f38 0x00000000000a3f38 0x0000000000000110 0x0000000000000110 RW 0x8 GNU_RELRO 0x00000000000a1ad0 0x00000000000a3ad0 0x00000000000a3ad0 0x00000000000005f0 0x0000000000002530 R 0x1 GNU_EH_FRAME 0x0000000000049d10 0x0000000000049d10 0x0000000000049d10 0x0000000000000024 0x0000000000000024 R 0x4 GNU_STACK 0x0000000000000000 0x0000000000000000 0x0000000000000000 0x0000000000000000 0x0000000000000000 RW 0x0 NOTE 0x0000000000000270 0x0000000000000270 0x0000000000000270 0x0000000000000018 0x0000000000000018 R 0x4 Section to Segment mapping: Segment Sections... 00 01 .note.gnu.build-id .dynsym .gnu.hash .hash .dynstr .rela.dyn .rela.plt .rodata .eh_frame_hdr .eh_frame 02 .text .plt 03 .data.rel.ro .dynamic .got .toc 04 .data .got.plt .bss 05 .dynamic 06 .data.rel.ro .dynamic .got .toc 07 .eh_frame_hdr 08 09 .note.gnu.build-id So on that end, you can take my Tested-by: Pedro Falcato <pedro.falcato@gmail.com> Although this still doesn't address the other bug I found (https://github.com/heatd/elf-bug-questionmark), where segments can accidentally overwrite cleared BSS if we end up in a situation where e.g we have the following segments: Program Headers: Type Offset VirtAddr PhysAddr FileSiz MemSiz Flags Align LOAD 0x0000000000001000 0x0000000000400000 0x0000000000400000 0x0000000000000045 0x0000000000000045 R E 0x1000 LOAD 0x0000000000002000 0x0000000000401000 0x0000000000401000 0x000000000000008c 0x000000000000008c R 0x1000 LOAD 0x0000000000000000 0x0000000000402000 0x0000000000402000 0x0000000000000000 0x0000000000000801 RW 0x1000 LOAD 0x0000000000002801 0x0000000000402801 0x0000000000402801 0x0000000000000007 0x0000000000000007 RW 0x1000 NOTE 0x0000000000002068 0x0000000000401068 0x0000000000401068 0x0000000000000024 0x0000000000000024 0x4 Section to Segment mapping: Segment Sections... 00 .text 01 .rodata .note.gnu.property .note.gnu.build-id 02 .bss 03 .data 04 .note.gnu.build-id since the mmap of .data will end up happening over .bss.
Pedro Falcato <pedro.falcato@gmail.com> writes: > On Fri, Sep 29, 2023 at 4:24 AM Kees Cook <keescook@chromium.org> wrote: >> >> Hi, >> >> This is the continuation of the work Eric started for handling >> "p_memsz > p_filesz" in arbitrary segments (rather than just the last, >> BSS, segment). I've added the suggested changes: >> >> - drop unused "elf_bss" variable >> - refactor load_elf_interp() to use elf_load() >> - refactor load_elf_library() to use elf_load() >> - report padzero() errors when PROT_WRITE is present >> - drop vm_brk() >> >> Thanks! >> >> -Kees >> >> v4: >> - refactor load_elf_library() too >> - don't refactor padzero(), just test in the only remaining caller >> - drop now-unused vm_brk() >> v3: https://lore.kernel.org/all/20230927033634.make.602-kees@kernel.org >> v2: https://lore.kernel.org/lkml/87sf71f123.fsf@email.froward.int.ebiederm.org >> v1: https://lore.kernel.org/lkml/87jzsemmsd.fsf_-_@email.froward.int.ebiederm.org >> >> Eric W. Biederman (1): >> binfmt_elf: Support segments with 0 filesz and misaligned starts >> >> Kees Cook (5): >> binfmt_elf: elf_bss no longer used by load_elf_binary() >> binfmt_elf: Use elf_load() for interpreter >> binfmt_elf: Use elf_load() for library >> binfmt_elf: Only report padzero() errors when PROT_WRITE >> mm: Remove unused vm_brk() >> >> fs/binfmt_elf.c | 214 ++++++++++++++++----------------------------- >> include/linux/mm.h | 3 +- >> mm/mmap.c | 6 -- >> mm/nommu.c | 5 -- >> 4 files changed, 76 insertions(+), 152 deletions(-) > > Sorry for taking so long to take a look at this. > While I didn't test PPC64 (I don't own PPC64 hardware, and I wasn't > the original reporter), I did manage to craft a reduced test case of: > > a.out: > > Program Headers: > Type Offset VirtAddr PhysAddr > FileSiz MemSiz Flags Align > PHDR 0x0000000000000040 0x0000000000000040 0x0000000000000040 > 0x00000000000001f8 0x00000000000001f8 R 0x8 > INTERP 0x0000000000000238 0x0000000000000238 0x0000000000000238 > 0x0000000000000020 0x0000000000000020 R 0x1 > [Requesting program interpreter: /home/pfalcato/musl/lib/libc.so] > LOAD 0x0000000000000000 0x0000000000000000 0x0000000000000000 > 0x0000000000000428 0x0000000000000428 R 0x1000 > LOAD 0x0000000000001000 0x0000000000001000 0x0000000000001000 > 0x00000000000000cd 0x00000000000000cd R E 0x1000 > LOAD 0x0000000000002000 0x0000000000002000 0x0000000000002000 > 0x0000000000000084 0x0000000000000084 R 0x1000 > LOAD 0x0000000000002e50 0x0000000000003e50 0x0000000000003e50 > 0x00000000000001c8 0x00000000000001c8 RW 0x1000 > DYNAMIC 0x0000000000002e50 0x0000000000003e50 0x0000000000003e50 > 0x0000000000000180 0x0000000000000180 RW 0x8 > GNU_STACK 0x0000000000000000 0x0000000000000000 0x0000000000000000 > 0x0000000000000000 0x0000000000000000 RW 0x10 > GNU_RELRO 0x0000000000002e50 0x0000000000003e50 0x0000000000003e50 > 0x00000000000001b0 0x00000000000001b0 R 0x1 > > /home/pfalcato/musl/lib/libc.so: > Program Headers: > Type Offset VirtAddr PhysAddr > FileSiz MemSiz Flags Align > PHDR 0x0000000000000040 0x0000000000000040 0x0000000000000040 > 0x0000000000000230 0x0000000000000230 R 0x8 > LOAD 0x0000000000000000 0x0000000000000000 0x0000000000000000 > 0x0000000000049d9c 0x0000000000049d9c R 0x1000 > LOAD 0x0000000000049da0 0x000000000004ada0 0x000000000004ada0 > 0x0000000000057d30 0x0000000000057d30 R E 0x1000 > LOAD 0x00000000000a1ad0 0x00000000000a3ad0 0x00000000000a3ad0 > 0x00000000000005f0 0x00000000000015f0 RW 0x1000 > LOAD 0x00000000000a20c0 0x00000000000a60c0 0x00000000000a60c0 > 0x0000000000000428 0x0000000000002f80 RW 0x1000 > DYNAMIC 0x00000000000a1f38 0x00000000000a3f38 0x00000000000a3f38 > 0x0000000000000110 0x0000000000000110 RW 0x8 > GNU_RELRO 0x00000000000a1ad0 0x00000000000a3ad0 0x00000000000a3ad0 > 0x00000000000005f0 0x0000000000002530 R 0x1 > GNU_EH_FRAME 0x0000000000049d10 0x0000000000049d10 0x0000000000049d10 > 0x0000000000000024 0x0000000000000024 R 0x4 > GNU_STACK 0x0000000000000000 0x0000000000000000 0x0000000000000000 > 0x0000000000000000 0x0000000000000000 RW 0x0 > NOTE 0x0000000000000270 0x0000000000000270 0x0000000000000270 > 0x0000000000000018 0x0000000000000018 R 0x4 > > Section to Segment mapping: > Segment Sections... > 00 > 01 .note.gnu.build-id .dynsym .gnu.hash .hash .dynstr .rela.dyn > .rela.plt .rodata .eh_frame_hdr .eh_frame > 02 .text .plt > 03 .data.rel.ro .dynamic .got .toc > 04 .data .got.plt .bss > 05 .dynamic > 06 .data.rel.ro .dynamic .got .toc > 07 .eh_frame_hdr > 08 > 09 .note.gnu.build-id > > > So on that end, you can take my > > Tested-by: Pedro Falcato <pedro.falcato@gmail.com> > > Although this still doesn't address the other bug I found > (https://github.com/heatd/elf-bug-questionmark), where segments can > accidentally overwrite cleared BSS if we end up in a situation where > e.g we have the following segments: > > Program Headers: > Type Offset VirtAddr PhysAddr > FileSiz MemSiz Flags Align > LOAD 0x0000000000001000 0x0000000000400000 0x0000000000400000 > 0x0000000000000045 0x0000000000000045 R E 0x1000 > LOAD 0x0000000000002000 0x0000000000401000 0x0000000000401000 > 0x000000000000008c 0x000000000000008c R 0x1000 > LOAD 0x0000000000000000 0x0000000000402000 0x0000000000402000 > 0x0000000000000000 0x0000000000000801 RW 0x1000 > LOAD 0x0000000000002801 0x0000000000402801 0x0000000000402801 > 0x0000000000000007 0x0000000000000007 RW 0x1000 > NOTE 0x0000000000002068 0x0000000000401068 0x0000000000401068 > 0x0000000000000024 0x0000000000000024 0x4 > > Section to Segment mapping: > Segment Sections... > 00 .text > 01 .rodata .note.gnu.property .note.gnu.build-id > 02 .bss > 03 .data > 04 .note.gnu.build-id > > since the mmap of .data will end up happening over .bss. This is simply invalid userspace, doubly so with the alignment set to 0x1000. The best the kernel can do is have a pre-pass through the elf program headers (before the point of no-return) and if they describe overlapping segments or out of order segments, cause execve syscall to fail. Eric
Sebastian Ott <sebott@redhat.com> writes: > Hello Kees, > > On Thu, 28 Sep 2023, Kees Cook wrote: >> This is the continuation of the work Eric started for handling >> "p_memsz > p_filesz" in arbitrary segments (rather than just the last, >> BSS, segment). I've added the suggested changes: >> >> - drop unused "elf_bss" variable >> - refactor load_elf_interp() to use elf_load() >> - refactor load_elf_library() to use elf_load() >> - report padzero() errors when PROT_WRITE is present >> - drop vm_brk() > > While I was debugging the initial issue I stumbled over the following > - care to take it as part of this series? > > ----->8 > [PATCH] mm: vm_brk_flags don't bail out while holding lock > > Calling vm_brk_flags() with flags set other than VM_EXEC > will exit the function without releasing the mmap_write_lock. > > Just do the sanity check before the lock is acquired. This > doesn't fix an actual issue since no caller sets a flag other > than VM_EXEC. That seems like a sensible patch. Have you by any chance read this code enough to understand what is gained by calling vm_brk_flags rather than vm_mmap without a file? Unless there is a real advantage it probably makes sense to replace the call of vm_brk_flags with vm_mmap(NULL, ...) as binfmt_elf_fdpic has already done. That would allow removing vm_brk_flags and sys_brk would be the last caller of do_brk_flags. Eric > Cc: Andrew Morton <akpm@linux-foundation.org> > Cc: linux-mm@kvack.org > Signed-off-by: Sebastian Ott <sebott@redhat.com> > --- > mm/mmap.c | 6 +++--- > 1 file changed, 3 insertions(+), 3 deletions(-) > > diff --git a/mm/mmap.c b/mm/mmap.c > index b56a7f0c9f85..7ed286662839 100644 > --- a/mm/mmap.c > +++ b/mm/mmap.c > @@ -3143,13 +3143,13 @@ int vm_brk_flags(unsigned long addr, unsigned long request, unsigned long flags) > if (!len) > return 0; > > - if (mmap_write_lock_killable(mm)) > - return -EINTR; > - > /* Until we need other flags, refuse anything except VM_EXEC. */ > if ((flags & (~VM_EXEC)) != 0) > return -EINVAL; > > + if (mmap_write_lock_killable(mm)) > + return -EINTR; > + > ret = check_brk_limits(addr, len); > if (ret) > goto limits_failed;
On Fri, Sep 29, 2023 at 12:58:18PM +0100, Pedro Falcato wrote: > So on that end, you can take my > > Tested-by: Pedro Falcato <pedro.falcato@gmail.com> Thanks!
On Fri, Sep 29, 2023 at 01:33:50PM +0200, Sebastian Ott wrote: > Hello Kees, > > On Thu, 28 Sep 2023, Kees Cook wrote: > > This is the continuation of the work Eric started for handling > > "p_memsz > p_filesz" in arbitrary segments (rather than just the last, > > BSS, segment). I've added the suggested changes: > > > > - drop unused "elf_bss" variable > > - refactor load_elf_interp() to use elf_load() > > - refactor load_elf_library() to use elf_load() > > - report padzero() errors when PROT_WRITE is present > > - drop vm_brk() > > While I was debugging the initial issue I stumbled over the following > - care to take it as part of this series? > ----->8 > [PATCH] mm: vm_brk_flags don't bail out while holding lock > > Calling vm_brk_flags() with flags set other than VM_EXEC > will exit the function without releasing the mmap_write_lock. > > Just do the sanity check before the lock is acquired. This > doesn't fix an actual issue since no caller sets a flag other > than VM_EXEC. Oh, eek. Yeah, that seems like a good idea. :) Reviewed-by: Kees Cook <keescook@chromium.org> -Kees