mbox series

[v2,0/5] Fix ELF / FDPIC ELF core dumping, and use mmap_sem properly in there

Message ID 20200429214954.44866-1-jannh@google.com (mailing list archive)
Headers show
Series Fix ELF / FDPIC ELF core dumping, and use mmap_sem properly in there | expand

Message

Jann Horn April 29, 2020, 9:49 p.m. UTC
At the moment, we have that rather ugly mmget_still_valid() helper to
work around <https://crbug.com/project-zero/1790>: ELF core dumping
doesn't take the mmap_sem while traversing the task's VMAs, and if
anything (like userfaultfd) then remotely messes with the VMA tree,
fireworks ensue. So at the moment we use mmget_still_valid() to bail
out in any writers that might be operating on a remote mm's VMAs.

With this series, I'm trying to get rid of the need for that as
cleanly as possible.
In particular, I want to avoid holding the mmap_sem across unbounded
sleeps.


Patches 1, 2 and 3 are relatively unrelated cleanups in the core
dumping code.

Patches 4 and 5 implement the main change: Instead of repeatedly
accessing the VMA list with sleeps in between, we snapshot it at the
start with proper locking, and then later we just use our copy of
the VMA list. This ensures that the kernel won't crash, that VMA
metadata in the coredump is consistent even in the presence of
concurrent modifications, and that any virtual addresses that aren't
being concurrently modified have their contents show up in the core
dump properly.

The disadvantage of this approach is that we need a bit more memory
during core dumping for storing metadata about all VMAs.

After this series has landed, we should be able to rip out
mmget_still_valid().


Testing done so far:

 - Creating a simple core dump on X86-64 still works.
 - The created coredump on X86-64 opens in GDB, and both the stack and the
   exectutable look vaguely plausible.
 - 32-bit ARM compiles with FDPIC support, both with MMU and !MMU config.

I'm CCing some folks from the architectures that use FDPIC in case
anyone wants to give this a spin.


This series is based on
<https://lore.kernel.org/linux-fsdevel/20200427200626.1622060-1-hch@lst.de/>
(Christoph Hellwig's "remove set_fs calls from the coredump code v4").


changed in v2:
 - replace "Fix handling of partial writes in dump_emit()" with
   "Let dump_emit() bail out on short writes" (Linus)
 - get rid of the useless complicated cache flushing in
   "Take mmap_sem in get_dump_page()" (Linus)

Jann Horn (5):
  binfmt_elf_fdpic: Stop using dump_emit() on user pointers on !MMU
  coredump: Let dump_emit() bail out on short writes
  coredump: Refactor page range dumping into common helper
  binfmt_elf, binfmt_elf_fdpic: Use a VMA list snapshot
  mm/gup: Take mmap_sem in get_dump_page()

 fs/binfmt_elf.c          | 170 ++++++++++++---------------------------
 fs/binfmt_elf_fdpic.c    | 106 +++++++++---------------
 fs/coredump.c            | 123 +++++++++++++++++++++++++---
 include/linux/coredump.h |  12 +++
 mm/gup.c                 |  60 +++++++-------
 5 files changed, 245 insertions(+), 226 deletions(-)


base-commit: 6a8b55ed4056ea5559ebe4f6a4b247f627870d4c
prerequisite-patch-id: c0a20b414eebc48fe0a8ca570b05de34c7980396
prerequisite-patch-id: 51973b8db0fa4b114e0c3fd8936b634d9d5061c5
prerequisite-patch-id: 0e1e8de282ca6d458dc6cbdc6b6ec5879edd8a05
prerequisite-patch-id: d5ee749c4d3a22ec80bd0dd88aadf89aeb569db8
prerequisite-patch-id: 46ce14e59e98e212a1eca0aef69c6dcdb62b8242

Comments

Russell King (Oracle) April 29, 2020, 9:56 p.m. UTC | #1
On Wed, Apr 29, 2020 at 11:49:49PM +0200, Jann Horn wrote:
> At the moment, we have that rather ugly mmget_still_valid() helper to
> work around <https://crbug.com/project-zero/1790>: ELF core dumping
> doesn't take the mmap_sem while traversing the task's VMAs, and if
> anything (like userfaultfd) then remotely messes with the VMA tree,
> fireworks ensue. So at the moment we use mmget_still_valid() to bail
> out in any writers that might be operating on a remote mm's VMAs.
> 
> With this series, I'm trying to get rid of the need for that as
> cleanly as possible.
> In particular, I want to avoid holding the mmap_sem across unbounded
> sleeps.
> 
> 
> Patches 1, 2 and 3 are relatively unrelated cleanups in the core
> dumping code.
> 
> Patches 4 and 5 implement the main change: Instead of repeatedly
> accessing the VMA list with sleeps in between, we snapshot it at the
> start with proper locking, and then later we just use our copy of
> the VMA list. This ensures that the kernel won't crash, that VMA
> metadata in the coredump is consistent even in the presence of
> concurrent modifications, and that any virtual addresses that aren't
> being concurrently modified have their contents show up in the core
> dump properly.
> 
> The disadvantage of this approach is that we need a bit more memory
> during core dumping for storing metadata about all VMAs.
> 
> After this series has landed, we should be able to rip out
> mmget_still_valid().
> 
> 
> Testing done so far:
> 
>  - Creating a simple core dump on X86-64 still works.
>  - The created coredump on X86-64 opens in GDB, and both the stack and the
>    exectutable look vaguely plausible.
>  - 32-bit ARM compiles with FDPIC support, both with MMU and !MMU config.
> 
> I'm CCing some folks from the architectures that use FDPIC in case
> anyone wants to give this a spin.

I've never had any reason to use FDPIC, and I don't have any binaries
that would use it.  Nicolas Pitre added ARM support, so I guess he
would be the one to talk to about it.  (Added Nicolas.)
Linus Torvalds April 29, 2020, 11:03 p.m. UTC | #2
On Wed, Apr 29, 2020 at 2:57 PM Russell King - ARM Linux admin
<linux@armlinux.org.uk> wrote:
>
> I've never had any reason to use FDPIC, and I don't have any binaries
> that would use it.  Nicolas Pitre added ARM support, so I guess he
> would be the one to talk to about it.  (Added Nicolas.)

While we're at it, is there anybody who knows binfmt_flat?

It might be Nicolas too.

binfmt_flat doesn't do core-dumping, but it has some other oddities.
In particular, I'd like to bring sanity to the installation of the new
creds, and all the _normal_ binfmt cases do it largely close together
with setup_new_exec().

binfmt_flat is doing odd things. It's doing this:

        /* Flush all traces of the currently running executable */
        if (id == 0) {
                ret = flush_old_exec(bprm);
                if (ret)
                        goto err;

                /* OK, This is the point of no return */
                set_personality(PER_LINUX_32BIT);
                setup_new_exec(bprm);
        }

in load_flat_file() - which is also used to loading _libraries_. Where
it makes no sense at all.

It does the

        install_exec_creds(bprm);

in load_flat_binary() (which makes more sense: that is only for actual
binary loading, no library case).

I would _like_ for every binfmt loader to do

        /* Flush all traces of the currently running executable */
        retval = flush_old_exec(bprm);
        if (retval)
                return retval;

   .. possibly set up personalities here ..

        setup_new_exec(bprm);
        install_exec_creds(bprm);

all together, and at least merge 'setup_new_exec()' with 'install_exec_creds()'.

And I think all the binfmt handlers would be ok with that, but the
flat one in particular is really oddly set up.

*Particularly* with that flush_old_exec/setup_new_exec() being done by
the same routine that is also loading libraries (and called from
'calc_reloc()' from binary loading too).

Adding Greg Ungerer for m68knommu. Can somebody sort out why that
flush_old_exec/setup_new_exec() isn't in load_flat_binary() like
install_exec_creds() is?

Most of that file goes back to pre-git days. And most of the commits
since are not so much about binfmt_flat, as they are about cleanups or
changes elsewhere where binfmt_flat was just a victim.

               Linus
Nicolas Pitre April 30, 2020, 1:27 a.m. UTC | #3
On Wed, 29 Apr 2020, Linus Torvalds wrote:

> While we're at it, is there anybody who knows binfmt_flat?

I'd say Greg Ungerer.

> It might be Nicolas too.

I only contributed the necessary changes to make it work on targets with 
a MMU. Once fdpic started to worked I used that instead.

FWIW I couldn't find a toolchain that would produce FLAT binaries with 
dynamic libs on ARM so I only used static binaries.


Nicolas
Nicolas Pitre April 30, 2020, 1:59 a.m. UTC | #4
On Wed, 29 Apr 2020, Russell King - ARM Linux admin wrote:

> On Wed, Apr 29, 2020 at 11:49:49PM +0200, Jann Horn wrote:
> > At the moment, we have that rather ugly mmget_still_valid() helper to
> > work around <https://crbug.com/project-zero/1790>: ELF core dumping
> > doesn't take the mmap_sem while traversing the task's VMAs, and if
> > anything (like userfaultfd) then remotely messes with the VMA tree,
> > fireworks ensue. So at the moment we use mmget_still_valid() to bail
> > out in any writers that might be operating on a remote mm's VMAs.
> > 
> > With this series, I'm trying to get rid of the need for that as
> > cleanly as possible.
> > In particular, I want to avoid holding the mmap_sem across unbounded
> > sleeps.
> > 
> > 
> > Patches 1, 2 and 3 are relatively unrelated cleanups in the core
> > dumping code.
> > 
> > Patches 4 and 5 implement the main change: Instead of repeatedly
> > accessing the VMA list with sleeps in between, we snapshot it at the
> > start with proper locking, and then later we just use our copy of
> > the VMA list. This ensures that the kernel won't crash, that VMA
> > metadata in the coredump is consistent even in the presence of
> > concurrent modifications, and that any virtual addresses that aren't
> > being concurrently modified have their contents show up in the core
> > dump properly.
> > 
> > The disadvantage of this approach is that we need a bit more memory
> > during core dumping for storing metadata about all VMAs.
> > 
> > After this series has landed, we should be able to rip out
> > mmget_still_valid().
> > 
> > 
> > Testing done so far:
> > 
> >  - Creating a simple core dump on X86-64 still works.
> >  - The created coredump on X86-64 opens in GDB, and both the stack and the
> >    exectutable look vaguely plausible.
> >  - 32-bit ARM compiles with FDPIC support, both with MMU and !MMU config.
> > 
> > I'm CCing some folks from the architectures that use FDPIC in case
> > anyone wants to give this a spin.
> 
> I've never had any reason to use FDPIC, and I don't have any binaries
> that would use it.  Nicolas Pitre added ARM support, so I guess he
> would be the one to talk to about it.  (Added Nicolas.)

It's been a while since I worked with it. However Christophe Lyon (in 
CC) added support for ARM FDPIC to mainline gcc recently, so hopefully 
he might still be set up and able to help.


Nicolas
Greg Ungerer April 30, 2020, 2:10 p.m. UTC | #5
On 30/4/20 9:03 am, Linus Torvalds wrote:
> On Wed, Apr 29, 2020 at 2:57 PM Russell King - ARM Linux admin
> <linux@armlinux.org.uk> wrote:
>>
>> I've never had any reason to use FDPIC, and I don't have any binaries
>> that would use it.  Nicolas Pitre added ARM support, so I guess he
>> would be the one to talk to about it.  (Added Nicolas.)
> 
> While we're at it, is there anybody who knows binfmt_flat?
> 
> It might be Nicolas too.
> 
> binfmt_flat doesn't do core-dumping, but it has some other oddities.
> In particular, I'd like to bring sanity to the installation of the new
> creds, and all the _normal_ binfmt cases do it largely close together
> with setup_new_exec().
> 
> binfmt_flat is doing odd things. It's doing this:
> 
>          /* Flush all traces of the currently running executable */
>          if (id == 0) {
>                  ret = flush_old_exec(bprm);
>                  if (ret)
>                          goto err;
> 
>                  /* OK, This is the point of no return */
>                  set_personality(PER_LINUX_32BIT);
>                  setup_new_exec(bprm);
>          }
> 
> in load_flat_file() - which is also used to loading _libraries_. Where
> it makes no sense at all.

I haven't looked at the shared lib support in there for a long time,
but I thought that "id" is only 0 for the actual final program.
Libraries have a slot or id number associated with them.

> It does the
> 
>          install_exec_creds(bprm);
> 
> in load_flat_binary() (which makes more sense: that is only for actual
> binary loading, no library case).
> 
> I would _like_ for every binfmt loader to do
> 
>          /* Flush all traces of the currently running executable */
>          retval = flush_old_exec(bprm);
>          if (retval)
>                  return retval;
> 
>     .. possibly set up personalities here ..
> 
>          setup_new_exec(bprm);
>          install_exec_creds(bprm);
> 
> all together, and at least merge 'setup_new_exec()' with 'install_exec_creds()'.
> 
> And I think all the binfmt handlers would be ok with that, but the
> flat one in particular is really oddly set up.
> 
> *Particularly* with that flush_old_exec/setup_new_exec() being done by
> the same routine that is also loading libraries (and called from
> 'calc_reloc()' from binary loading too).
> 
> Adding Greg Ungerer for m68knommu. Can somebody sort out why that
> flush_old_exec/setup_new_exec() isn't in load_flat_binary() like
> install_exec_creds() is?
> 
> Most of that file goes back to pre-git days. And most of the commits
> since are not so much about binfmt_flat, as they are about cleanups or
> changes elsewhere where binfmt_flat was just a victim.

I'll have a look at this.

Quick hack test shows moving setup_new_exec(bprm) to be just before
install_exec_creds(bprm) works fine for the static binaries case.
Doing the flush_old_exec(bprm) there too crashed out - I'll need to
dig into that to see why.

Regards
Greg
Rich Felker April 30, 2020, 2:51 p.m. UTC | #6
On Fri, May 01, 2020 at 12:10:05AM +1000, Greg Ungerer wrote:
> 
> 
> On 30/4/20 9:03 am, Linus Torvalds wrote:
> >On Wed, Apr 29, 2020 at 2:57 PM Russell King - ARM Linux admin
> ><linux@armlinux.org.uk> wrote:
> >>
> >>I've never had any reason to use FDPIC, and I don't have any binaries
> >>that would use it.  Nicolas Pitre added ARM support, so I guess he
> >>would be the one to talk to about it.  (Added Nicolas.)
> >
> >While we're at it, is there anybody who knows binfmt_flat?
> >
> >It might be Nicolas too.
> >
> >binfmt_flat doesn't do core-dumping, but it has some other oddities.
> >In particular, I'd like to bring sanity to the installation of the new
> >creds, and all the _normal_ binfmt cases do it largely close together
> >with setup_new_exec().
> >
> >binfmt_flat is doing odd things. It's doing this:
> >
> >         /* Flush all traces of the currently running executable */
> >         if (id == 0) {
> >                 ret = flush_old_exec(bprm);
> >                 if (ret)
> >                         goto err;
> >
> >                 /* OK, This is the point of no return */
> >                 set_personality(PER_LINUX_32BIT);
> >                 setup_new_exec(bprm);
> >         }
> >
> >in load_flat_file() - which is also used to loading _libraries_. Where
> >it makes no sense at all.
> 
> I haven't looked at the shared lib support in there for a long time,
> but I thought that "id" is only 0 for the actual final program.
> Libraries have a slot or id number associated with them.

This sounds correct. My understanding of FLAT shared library support
is that it's really bad and based on having preassigned slot indices
for each library on the system, and a global array per-process to give
to data base address for each library. Libraries are compiled to know
their own slot numbers so that they just load from fixed_reg[slot_id]
to get what's effectively their GOT pointer.

I'm not sure if anybody has actually used this in over a decade. Last
time I looked the tooling appeared broken, but in this domain lots of
users have forked private tooling that's not publicly available or at
least not publicly indexed, so it's hard to say for sure.

Rich
Linus Torvalds April 30, 2020, 4:54 p.m. UTC | #7
On Thu, Apr 30, 2020 at 7:10 AM Greg Ungerer <gerg@linux-m68k.org> wrote:
>
> > in load_flat_file() - which is also used to loading _libraries_. Where
> > it makes no sense at all.
>
> I haven't looked at the shared lib support in there for a long time,
> but I thought that "id" is only 0 for the actual final program.
> Libraries have a slot or id number associated with them.

Yes, that was my assumption, but looking at the code, it really isn't
obvious that that is the case at all.

'id' gets calculated from fields that very much look like they could
be zero (eg by taking the top bits from another random field).

> > Most of that file goes back to pre-git days. And most of the commits
> > since are not so much about binfmt_flat, as they are about cleanups or
> > changes elsewhere where binfmt_flat was just a victim.
>
> I'll have a look at this.

Thanks.

> Quick hack test shows moving setup_new_exec(bprm) to be just before
> install_exec_creds(bprm) works fine for the static binaries case.
> Doing the flush_old_exec(bprm) there too crashed out - I'll need to
> dig into that to see why.

Just moving setup_new_exec() would at least allow us to then join the
two together, and just say "setup_new_exec() does the credential
installation too".

So to some degree, that's the important one.

But that flush_old_exec() does look odd in load_flat_file(). It's not
like anything but executing a binary should flush the old exec.
Certainly not loading a library, however odd that flat library code
is.

My _guess_ is that the reason for this is that "load_flat_file()" also
does a lot of verification of the file and does that whole "return
-ENOEXEC if the file format isn't right". So we don't want to flush
the old exec before that is done, but we obviously also don't want to
flush the old exec after we've actually loaded the new one into
memory..

So the location of flush_old_exec() makes that kind of sense, but it
would have made it better if that flat file support had a clear
separation of "check the file" from "load the file".

Oh well. As mentioned, the whole "at least put setup_new_exec() and
install_exec_creds() together" is the bigger thing.

But if it's true that nobody really uses the odd flat library support
any more and there are no testers, maybe we should consider ripping it
out...

                Linus
Eric W. Biederman April 30, 2020, 7:07 p.m. UTC | #8
Linus Torvalds <torvalds@linux-foundation.org> writes:

> On Thu, Apr 30, 2020 at 7:10 AM Greg Ungerer <gerg@linux-m68k.org> wrote:

>> > Most of that file goes back to pre-git days. And most of the commits
>> > since are not so much about binfmt_flat, as they are about cleanups or
>> > changes elsewhere where binfmt_flat was just a victim.
>>
>> I'll have a look at this.
>
> Thanks.
>
>> Quick hack test shows moving setup_new_exec(bprm) to be just before
>> install_exec_creds(bprm) works fine for the static binaries case.
>> Doing the flush_old_exec(bprm) there too crashed out - I'll need to
>> dig into that to see why.
>
> Just moving setup_new_exec() would at least allow us to then join the
> two together, and just say "setup_new_exec() does the credential
> installation too".

But it is only half a help if we allow failure points between
flush_old_exec and install_exec_creds.

Greg do things work acceptably if install_exec_creds is moved to right
after setup_new_exec? (patch below)

Looking at the code in load_flat_file after setup_new_exec it looks like
the kinds of things that in binfmt_elf.c we do after install_exec_creds
(aka vm_map).  So I think we want install_exec_creds sooner, instead
of setup_new_exec later.

> But if it's true that nobody really uses the odd flat library support
> any more and there are no testers, maybe we should consider ripping it
> out...

I looked a little deeper and there is another reason to think about
ripping out the flat library loader.  The code is recursive, and
supports a maximum of 4 shared libraries in the entire system.

load_flat_binary
	load_flat_file
        	calc_reloc
                	load_flat_shared_libary
                        	load_flat_file
                                	....

I am mystified with what kind of system can survive with a grand total
of 4 shared libaries.  I think my a.out slackware system that I ran on
my i486 had more shared libraries.

Having read just a bit more it is definitely guaranteed (by the code)
that the first time load_flat_file is called id 0 will be used (aka id 0
is guaranteed to be the binary), and the ids 1, 2, 3 and 4 will only be
used if a relocation includes that id to reference an external shared
library.  That part of the code is drop dead simple.

---

This is what I was thinking about applying.

diff --git a/fs/binfmt_flat.c b/fs/binfmt_flat.c
index 831a2b25ba79..1a1d1fcb893f 100644
--- a/fs/binfmt_flat.c
+++ b/fs/binfmt_flat.c
@@ -541,6 +541,7 @@ static int load_flat_file(struct linux_binprm *bprm,
 		/* OK, This is the point of no return */
 		set_personality(PER_LINUX_32BIT);
 		setup_new_exec(bprm);
+		install_exec_creds(bprm);
 	}
 
 	/*
@@ -963,8 +964,6 @@ static int load_flat_binary(struct linux_binprm *bprm)
 		}
 	}
 
-	install_exec_creds(bprm);
-
 	set_binfmt(&flat_format);
 
 #ifdef CONFIG_MMU
Rob Landley April 30, 2020, 9:13 p.m. UTC | #9
On 4/30/20 9:51 AM, Rich Felker wrote:
> This sounds correct. My understanding of FLAT shared library support
> is that it's really bad and based on having preassigned slot indices
> for each library on the system, and a global array per-process to give
> to data base address for each library. Libraries are compiled to know
> their own slot numbers so that they just load from fixed_reg[slot_id]
> to get what's effectively their GOT pointer.
> 
> I'm not sure if anybody has actually used this in over a decade. Last
> time I looked the tooling appeared broken, but in this domain lots of
> users have forked private tooling that's not publicly available or at
> least not publicly indexed, so it's hard to say for sure.

Lots of people in this area are also still using 10 year old tools because it
breaks every time they upgrade.

Heck, nommu support for architectures musl doesn't support yet is _explicitly_
the main thing keeping uClibc alive:

  https://www.openwall.com/lists/musl/2015/05/30/1

Rob
Greg Ungerer May 1, 2020, 5:44 a.m. UTC | #10
On 1/5/20 5:07 am, Eric W. Biederman wrote:
> Linus Torvalds <torvalds@linux-foundation.org> writes:
> 
>> On Thu, Apr 30, 2020 at 7:10 AM Greg Ungerer <gerg@linux-m68k.org> wrote:
> 
>>>> Most of that file goes back to pre-git days. And most of the commits
>>>> since are not so much about binfmt_flat, as they are about cleanups or
>>>> changes elsewhere where binfmt_flat was just a victim.
>>>
>>> I'll have a look at this.
>>
>> Thanks.
>>
>>> Quick hack test shows moving setup_new_exec(bprm) to be just before
>>> install_exec_creds(bprm) works fine for the static binaries case.
>>> Doing the flush_old_exec(bprm) there too crashed out - I'll need to
>>> dig into that to see why.
>>
>> Just moving setup_new_exec() would at least allow us to then join the
>> two together, and just say "setup_new_exec() does the credential
>> installation too".
> 
> But it is only half a help if we allow failure points between
> flush_old_exec and install_exec_creds.
> 
> Greg do things work acceptably if install_exec_creds is moved to right
> after setup_new_exec? (patch below)

Yes, confirmed. Worked fine with that patch applied.


> Looking at the code in load_flat_file after setup_new_exec it looks like
> the kinds of things that in binfmt_elf.c we do after install_exec_creds
> (aka vm_map).  So I think we want install_exec_creds sooner, instead
> of setup_new_exec later.
> 
>> But if it's true that nobody really uses the odd flat library support
>> any more and there are no testers, maybe we should consider ripping it
>> out...
> 
> I looked a little deeper and there is another reason to think about
> ripping out the flat library loader.  The code is recursive, and
> supports a maximum of 4 shared libraries in the entire system.
> 
> load_flat_binary
> 	load_flat_file
>          	calc_reloc
>                  	load_flat_shared_libary
>                          	load_flat_file
>                                  	....
> 
> I am mystified with what kind of system can survive with a grand total
> of 4 shared libaries.  I think my a.out slackware system that I ran on
> my i486 had more shared libraries.

The kind of embedded systems that were built with this stuff 20 years
ago didn't have lots of applications and libraries. I think we found
back then that most of your savings were from making libc shared.
Less significant gains from making other libraries shared. And there
was a bit of extra pain in setting them up with the shared library
code generation options (that had to be unique for each one).

The whole mechanism is a bit of hack, and there was a few other
limitations with the way it worked (I don't recall what they were
right now).

I am definitely in favor of removing it.

Regards
Greg



> Having read just a bit more it is definitely guaranteed (by the code)
> that the first time load_flat_file is called id 0 will be used (aka id 0
> is guaranteed to be the binary), and the ids 1, 2, 3 and 4 will only be
> used if a relocation includes that id to reference an external shared
> library.  That part of the code is drop dead simple.
> 
> ---
> 
> This is what I was thinking about applying.
> 
> diff --git a/fs/binfmt_flat.c b/fs/binfmt_flat.c
> index 831a2b25ba79..1a1d1fcb893f 100644
> --- a/fs/binfmt_flat.c
> +++ b/fs/binfmt_flat.c
> @@ -541,6 +541,7 @@ static int load_flat_file(struct linux_binprm *bprm,
>   		/* OK, This is the point of no return */
>   		set_personality(PER_LINUX_32BIT);
>   		setup_new_exec(bprm);
> +		install_exec_creds(bprm);
>   	}
>   
>   	/*
> @@ -963,8 +964,6 @@ static int load_flat_binary(struct linux_binprm *bprm)
>   		}
>   	}
>   
> -	install_exec_creds(bprm);
> -
>   	set_binfmt(&flat_format);
>   
>   #ifdef CONFIG_MMU
> 
>
Greg Ungerer May 1, 2020, 6 a.m. UTC | #11
On 1/5/20 12:51 am, Rich Felker wrote:
> On Fri, May 01, 2020 at 12:10:05AM +1000, Greg Ungerer wrote:
>>
>>
>> On 30/4/20 9:03 am, Linus Torvalds wrote:
>>> On Wed, Apr 29, 2020 at 2:57 PM Russell King - ARM Linux admin
>>> <linux@armlinux.org.uk> wrote:
>>>>
>>>> I've never had any reason to use FDPIC, and I don't have any binaries
>>>> that would use it.  Nicolas Pitre added ARM support, so I guess he
>>>> would be the one to talk to about it.  (Added Nicolas.)
>>>
>>> While we're at it, is there anybody who knows binfmt_flat?
>>>
>>> It might be Nicolas too.
>>>
>>> binfmt_flat doesn't do core-dumping, but it has some other oddities.
>>> In particular, I'd like to bring sanity to the installation of the new
>>> creds, and all the _normal_ binfmt cases do it largely close together
>>> with setup_new_exec().
>>>
>>> binfmt_flat is doing odd things. It's doing this:
>>>
>>>          /* Flush all traces of the currently running executable */
>>>          if (id == 0) {
>>>                  ret = flush_old_exec(bprm);
>>>                  if (ret)
>>>                          goto err;
>>>
>>>                  /* OK, This is the point of no return */
>>>                  set_personality(PER_LINUX_32BIT);
>>>                  setup_new_exec(bprm);
>>>          }
>>>
>>> in load_flat_file() - which is also used to loading _libraries_. Where
>>> it makes no sense at all.
>>
>> I haven't looked at the shared lib support in there for a long time,
>> but I thought that "id" is only 0 for the actual final program.
>> Libraries have a slot or id number associated with them.
> 
> This sounds correct. My understanding of FLAT shared library support
> is that it's really bad and based on having preassigned slot indices
> for each library on the system, and a global array per-process to give
> to data base address for each library. Libraries are compiled to know
> their own slot numbers so that they just load from fixed_reg[slot_id]
> to get what's effectively their GOT pointer.
> 
> I'm not sure if anybody has actually used this in over a decade. Last
> time I looked the tooling appeared broken, but in this domain lots of
> users have forked private tooling that's not publicly available or at
> least not publicly indexed, so it's hard to say for sure.

Be at least 12 or 13 years since I last had a working shared library
build for m68knommu. I have not bothered with it since then, not that I
even used it much when it worked. Seemed more pain than it was worth.

Regards
Greg
Greg Ungerer May 1, 2020, 7:14 a.m. UTC | #12
On 1/5/20 2:54 am, Linus Torvalds wrote:
> On Thu, Apr 30, 2020 at 7:10 AM Greg Ungerer <gerg@linux-m68k.org> wrote:
>>
>>> in load_flat_file() - which is also used to loading _libraries_. Where
>>> it makes no sense at all.
>>
>> I haven't looked at the shared lib support in there for a long time,
>> but I thought that "id" is only 0 for the actual final program.
>> Libraries have a slot or id number associated with them.
> 
> Yes, that was my assumption, but looking at the code, it really isn't
> obvious that that is the case at all.
> 
> 'id' gets calculated from fields that very much look like they could
> be zero (eg by taking the top bits from another random field).
> 
>>> Most of that file goes back to pre-git days. And most of the commits
>>> since are not so much about binfmt_flat, as they are about cleanups or
>>> changes elsewhere where binfmt_flat was just a victim.
>>
>> I'll have a look at this.
> 
> Thanks.
> 
>> Quick hack test shows moving setup_new_exec(bprm) to be just before
>> install_exec_creds(bprm) works fine for the static binaries case.
>> Doing the flush_old_exec(bprm) there too crashed out - I'll need to
>> dig into that to see why.
> 
> Just moving setup_new_exec() would at least allow us to then join the
> two together, and just say "setup_new_exec() does the credential
> installation too".
> 
> So to some degree, that's the important one.
> 
> But that flush_old_exec() does look odd in load_flat_file(). It's not
> like anything but executing a binary should flush the old exec.
> Certainly not loading a library, however odd that flat library code
> is.
> 
> My _guess_ is that the reason for this is that "load_flat_file()" also
> does a lot of verification of the file and does that whole "return
> -ENOEXEC if the file format isn't right". So we don't want to flush
> the old exec before that is done, but we obviously also don't want to
> flush the old exec after we've actually loaded the new one into
> memory..

Yeah, that is what it looks like. Looking at the history, the introduction
of setup_new_exec() [in commit 221af7f87b974] was probably just
added where the the existing flush_old_exec() was.


> So the location of flush_old_exec() makes that kind of sense, but it
> would have made it better if that flat file support had a clear
> separation of "check the file" from "load the file".
> 
> Oh well. As mentioned, the whole "at least put setup_new_exec() and
> install_exec_creds() together" is the bigger thing.
> 
> But if it's true that nobody really uses the odd flat library support
> any more and there are no testers, maybe we should consider ripping it
> out...

I am for that. If nobody pipes up and complains I'll look at taking it out.

Regards
Greg
Eric W. Biederman May 1, 2020, 11:13 a.m. UTC | #13
Greg Ungerer <gerg@linux-m68k.org> writes:

> On 1/5/20 5:07 am, Eric W. Biederman wrote:
>> Linus Torvalds <torvalds@linux-foundation.org> writes:
>>
>>> On Thu, Apr 30, 2020 at 7:10 AM Greg Ungerer <gerg@linux-m68k.org> wrote:
>>
>>>>> Most of that file goes back to pre-git days. And most of the commits
>>>>> since are not so much about binfmt_flat, as they are about cleanups or
>>>>> changes elsewhere where binfmt_flat was just a victim.
>>>>
>>>> I'll have a look at this.
>>>
>>> Thanks.
>>>
>>>> Quick hack test shows moving setup_new_exec(bprm) to be just before
>>>> install_exec_creds(bprm) works fine for the static binaries case.
>>>> Doing the flush_old_exec(bprm) there too crashed out - I'll need to
>>>> dig into that to see why.
>>>
>>> Just moving setup_new_exec() would at least allow us to then join the
>>> two together, and just say "setup_new_exec() does the credential
>>> installation too".
>>
>> But it is only half a help if we allow failure points between
>> flush_old_exec and install_exec_creds.
>>
>> Greg do things work acceptably if install_exec_creds is moved to right
>> after setup_new_exec? (patch below)
>
> Yes, confirmed. Worked fine with that patch applied.

Good.  Thank you.

That is what we need for other cleanups.  All three of those together.

>> This is what I was thinking about applying.
>>
>> diff --git a/fs/binfmt_flat.c b/fs/binfmt_flat.c
>> index 831a2b25ba79..1a1d1fcb893f 100644
>> --- a/fs/binfmt_flat.c
>> +++ b/fs/binfmt_flat.c
>> @@ -541,6 +541,7 @@ static int load_flat_file(struct linux_binprm *bprm,
>>   		/* OK, This is the point of no return */
>>   		set_personality(PER_LINUX_32BIT);
>>   		setup_new_exec(bprm);
>> +		install_exec_creds(bprm);
>>   	}
>>     	/*
>> @@ -963,8 +964,6 @@ static int load_flat_binary(struct linux_binprm *bprm)
>>   		}
>>   	}
>>   -	install_exec_creds(bprm);
>> -
>>   	set_binfmt(&flat_format);
>>     #ifdef CONFIG_MMU

Eric
Rob Landley May 1, 2020, 7:09 p.m. UTC | #14
On 5/1/20 1:00 AM, Greg Ungerer wrote:
>> This sounds correct. My understanding of FLAT shared library support
>> is that it's really bad and based on having preassigned slot indices
>> for each library on the system, and a global array per-process to give
>> to data base address for each library. Libraries are compiled to know
>> their own slot numbers so that they just load from fixed_reg[slot_id]
>> to get what's effectively their GOT pointer.

fdpic is to elf what binflt is to a.out, and a.out shared libraries were never
pretty. Or easy.

>> I'm not sure if anybody has actually used this in over a decade. Last
>> time I looked the tooling appeared broken, but in this domain lots of
>> users have forked private tooling that's not publicly available or at
>> least not publicly indexed, so it's hard to say for sure.
> 
> Be at least 12 or 13 years since I last had a working shared library
> build for m68knommu. I have not bothered with it since then, not that I
> even used it much when it worked. Seemed more pain than it was worth.

Shared libraries worked fine with fdpic on sh2 last I checked, it's basically
just ELF PIC with the ability to move the 4 segments (text/rodata/bss/data)
independently of each other. (4 base pointers, no waiting.)

I don't think I've _ever_ used shared binflt libraries. I left myself
breadcrumbs back when I was wrestling with that stuff:

  https://landley.net/notes-2014.html#07-12-2014

But it looks like that last time I touched anything using elf2flt was:

  https://landley.net/notes-2018.html#08-05-2018

And that was just because arm's fdpic support stayed out of tree for years so I
dug up binflt and gave it another go. (It sucked so much I wound up building
static pie for cortex-m, taking the efficiency hit, and moving on. Running pie
binaries on nommu _works_, it's just incredibly inefficient. Since the writeable
and readable segments of the ELF are all relative to the same single base
pointer, you can't share the read-only parts of the binaries without address
remapping, so if you launch 4 instances of PIE bash on nommu you've loaded 4
instances of the bash text and rodata, and of course none of it can even be
demand faulted. In theory shared libraries _do_ help there but I hit some ld.so
bug and didn't want to debug a half-assed solution, so big hammer and moved on
until arm fdpic got merged and fixed it _properly_...)

Rob

P.S. The reason for binflt is bare metal hardware engineers who are conceptually
uncomfortable with software love them, because it's as close to "objcopy -O
binary" as they can get. Meanwhile on j-core we've had an 8k ROM boot loader
that loads vmlinux images and does the ELF relocations for 5 years now, and ever
since the switch to device tree that's our _only_ way to feed a dtb to the
kernel without statically linking it in, so it's ELF all the way down for us.