
Fix bss mapping for the interpreter in binfmt_elf

Message ID 1462963020-11097-1-git-send-email-hecmargi@upv.es (mailing list archive)
State New, archived

Commit Message

Hector Marco-Gisbert May 11, 2016, 10:37 a.m. UTC
While working on a new ASLR for userspace, we detected an error in the
interpreter loader.

The size of the bss section for some interpreters is not correctly
calculated, resulting in unnecessary calls to vm_brk() with enormous size
values.

The bug appears when loading an interpreter with a small bss. Once the
last loadable segment has been loaded, the bss section is zeroed up to the
page boundary and the elf_bss variable is rounded up to that page
boundary.  Because of this rounding, last_bss can end up below elf_bss,
and the unsigned subtraction "last_bss - elf_bss" then wraps around to a
huge value.

Although the error is quite easy to trigger, it has never manifested
itself because of some peculiarities of the bug. A brief description
follows:


$ size /lib32/ld-2.19.so 
   text    data     bss     dec     hex filename
 128456    2964     192  131612   2021c /lib32/ld-2.19.so


An execution example with:
  - last_bss: 0xb7794938
  - elf_bss:  0xb7794878

        
From fs/binfmt_elf.c:
---------------------------------------------------------------------------
     if (last_bss > elf_bss) { 
             /*
              * Now fill out the bss section.  First pad the last page up
              * to the page boundary, and then perform a mmap to make sure
              * that there are zero-mapped pages up to and including the
              * last bss page.
              */
             if (padzero(elf_bss)) {
                     error = -EFAULT;
                     goto out;
             }

             /* What we have mapped so far */
             elf_bss = ELF_PAGESTART(elf_bss + ELF_MIN_ALIGN - 1);

             <---------- elf_bss here is 0xb7795000

             /* Map the last of the bss segment */
             error = vm_brk(elf_bss, last_bss - elf_bss); <-- overflow!
             if (BAD_ADDR(error))
                     goto out;
     }

     error = load_addr;
out:
     return error;
}
---------------------------------------------------------------------------


The size value passed to the vm_brk() call (last_bss - elf_bss) is
0xfffffffffffff938; internally this size is page-aligned in the do_brk()
function, resulting in a 0-length request.

static unsigned long do_brk(unsigned long addr, unsigned long len)
{
        ...
        len = PAGE_ALIGN(len);
        if (!len)
                return addr;


Since a segment of 0 bytes is perfectly valid, do_brk() returns the
requested address to vm_brk(), and because that address is page aligned
(done by the line just before the vm_brk() call) the error is not caught
by "BAD_ADDR(error)" and load_elf_interp() does not return any error.
Note that the vm_brk() call is not necessary at all in this case.

In brief, if the end of the bss lies in the same page as the last loaded
segment, then the size of the final bss mapping is incorrectly calculated.


This patch rounds the last_bss variable up to the page boundary and only
performs the vm_brk() call when necessary.


Signed-off-by: Hector Marco-Gisbert <hecmargi@upv.es>
Acked-by: Ismael Ripoll Ripoll <iripoll@upv.es>
---
 fs/binfmt_elf.c | 12 ++++++++----
 1 file changed, 8 insertions(+), 4 deletions(-)

Comments

Hector Marco-Gisbert June 22, 2016, 4:28 p.m. UTC | #1
Second try ...

Although it does not seem dangerous, I think it is worth fixing this to
avoid potential future problems.

Any thoughts on this?


Thanks,
Hector.


> While working on a new ASLR for userspace we detected an error in the
> interpret loader.
> 
> The size of the bss section for some interpreters is not correctly
> calculated resulting in unnecessary calls to vm_brk() with enormous size
> values.
> 
> The bug appears when loading some interpreters with a small bss size. Once
> the last loadable segment has been loaded, the bss section is zeroed up to
> the page boundary and the elf_bss variable is updated to this new page
> boundary.  Because of this update (alignment), the last_bss could be less
> than elf_bss and the subtraction "last_bss - elf_bss" value could overflow.
> 
> Although it is quite easy to check this error, it has not been manifested
> because some peculiarities of the bug. Following is a brief description:
> 
> 
> $ size /lib32/ld-2.19.so 
>    text    data     bss     dec     hex filename
>  128456    2964     192  131612   2021c /lib32/ld-2.19.so
> 
> 
> An execution example with:
>   - last_bss: 0xb7794938
>   - elf_bss:  0xb7794878
> 
>         
> From fs/binfmt_elf.c:
> ---------------------------------------------------------------------------
>      if (last_bss > elf_bss) { 
>              /*
>               * Now fill out the bss section.  First pad the last page up
>               * to the page boundary, and then perform a mmap to make sure
>               * that there are zero-mapped pages up to and including the
>               * last bss page.
>               */
>              if (padzero(elf_bss)) {
>                      error = -EFAULT;
>                      goto out;
>              }
> 
>              /* What we have mapped so far */
>              elf_bss = ELF_PAGESTART(elf_bss + ELF_MIN_ALIGN - 1);
> 
>              <---------- elf_bss here is 0xb7795000
> 
>              /* Map the last of the bss segment */
>              error = vm_brk(elf_bss, last_bss - elf_bss); <-- overflow!
>              if (BAD_ADDR(error))
>                      goto out;
>      }
> 
>      error = load_addr;
> out:
>      return error;
> }
> ---------------------------------------------------------------------------
> 
> 
> The size value requested to the vm_brk() call (last_bss - elf_bss) is
> 0xfffffffffffff938 and internally this size is page aligned in the do_brk()
> function resulting in a 0 length request.
> 
> static unsigned long do_brk(unsigned long addr, unsigned long len)
> {
>         ...
>         len = PAGE_ALIGN(len);
>         if (!len)
>                 return addr;
> 
> 
> Since a segment of 0 bytes is perfectly valid, it returns the requested
> address to vm_brk() and because it is page aligned (done by the previous
> line to the vm_brk() call the "error" is not detected by
> "BAD_ADDR(error)" and the "load_elf_interp()" functions does not
> returns any error. Note that vm_brk() is not necessary at all.
> 
> In brief, if the end of the bss is in the same page than the last segment
> loaded then the size of the last of bss segment is incorrectly calculated.
> 
> 
> This patch set up to the page boundary of the last_bss variable and only do
> the vm_brk() call when necessary.
> 
> 
> Signed-off-by: Hector Marco-Gisbert <hecmargi@upv.es>
> Acked-by: Ismael Ripoll Ripoll <iripoll@upv.es>
> ---
>  fs/binfmt_elf.c | 12 ++++++++----
>  1 file changed, 8 insertions(+), 4 deletions(-)
> 
> diff --git a/fs/binfmt_elf.c b/fs/binfmt_elf.c
> index 81381cc..acfbdc2 100644
> --- a/fs/binfmt_elf.c
> +++ b/fs/binfmt_elf.c
> @@ -526,7 +526,7 @@ static unsigned long load_elf_interp(struct elfhdr *interp_elf_ex,
>  	int load_addr_set = 0;
>  	unsigned long last_bss = 0, elf_bss = 0;
>  	unsigned long error = ~0UL;
> -	unsigned long total_size;
> +	unsigned long total_size, size;
>  	int i;
>  
>  	/* First of all, some simple consistency checks */
> @@ -626,11 +626,15 @@ static unsigned long load_elf_interp(struct elfhdr *interp_elf_ex,
>  
>  		/* What we have mapped so far */
>  		elf_bss = ELF_PAGESTART(elf_bss + ELF_MIN_ALIGN - 1);
> +		last_bss = ELF_PAGESTART(last_bss + ELF_MIN_ALIGN - 1);
>  
>  		/* Map the last of the bss segment */
> -		error = vm_brk(elf_bss, last_bss - elf_bss);
> -		if (BAD_ADDR(error))
> -			goto out;
> +		size = last_bss - elf_bss;
> +		if (size) {
> +			error = vm_brk(elf_bss, size);
> +			if (BAD_ADDR(error))
> +				goto out;
> +		}
>  	}
>  
>  	error = load_addr;
>
Kees Cook June 22, 2016, 10:31 p.m. UTC | #2
On Wed, May 11, 2016 at 3:37 AM, Hector Marco-Gisbert <hecmargi@upv.es> wrote:
> While working on a new ASLR for userspace we detected an error in the
> interpret loader.
>
> The size of the bss section for some interpreters is not correctly
> calculated resulting in unnecessary calls to vm_brk() with enormous size
> values.
>
> The bug appears when loading some interpreters with a small bss size. Once
> the last loadable segment has been loaded, the bss section is zeroed up to
> the page boundary and the elf_bss variable is updated to this new page
> boundary.  Because of this update (alignment), the last_bss could be less
> than elf_bss and the subtraction "last_bss - elf_bss" value could overflow.
>
> Although it is quite easy to check this error, it has not been manifested
> because some peculiarities of the bug. Following is a brief description:

Heh, the bugs cancel each other out! :P

>
>
> $ size /lib32/ld-2.19.so
>    text    data     bss     dec     hex filename
>  128456    2964     192  131612   2021c /lib32/ld-2.19.so
>
>
> An execution example with:
>   - last_bss: 0xb7794938
>   - elf_bss:  0xb7794878
>
>
> From fs/binfmt_elf.c:
> ---------------------------------------------------------------------------
>      if (last_bss > elf_bss) {
>              /*
>               * Now fill out the bss section.  First pad the last page up
>               * to the page boundary, and then perform a mmap to make sure
>               * that there are zero-mapped pages up to and including the
>               * last bss page.
>               */
>              if (padzero(elf_bss)) {
>                      error = -EFAULT;
>                      goto out;
>              }

Tangent: shouldn't this padzero() call happen even in the case where
last_bss <= elf_bss? The elf_map call has total_size bumped up to page
size, so junk may have been loaded from the file still, even if
last_bss == elf_bss, etc. Maybe I'm missing something.

>              /* What we have mapped so far */
>              elf_bss = ELF_PAGESTART(elf_bss + ELF_MIN_ALIGN - 1);
>
>              <---------- elf_bss here is 0xb7795000
>
>              /* Map the last of the bss segment */
>              error = vm_brk(elf_bss, last_bss - elf_bss); <-- overflow!
>              if (BAD_ADDR(error))
>                      goto out;
>      }
>
>      error = load_addr;
> out:
>      return error;
> }
> ---------------------------------------------------------------------------
>
>
> The size value requested to the vm_brk() call (last_bss - elf_bss) is
> 0xfffffffffffff938 and internally this size is page aligned in the do_brk()
> function resulting in a 0 length request.
>
> static unsigned long do_brk(unsigned long addr, unsigned long len)
> {
>         ...
>         len = PAGE_ALIGN(len);

I feel like do_brk should detect wrap-around and error out...

>         if (!len)
>                 return addr;
>
>
> Since a segment of 0 bytes is perfectly valid, it returns the requested
> address to vm_brk() and because it is page aligned (done by the previous
> line to the vm_brk() call the "error" is not detected by
> "BAD_ADDR(error)" and the "load_elf_interp()" functions does not
> returns any error. Note that vm_brk() is not necessary at all.
>
> In brief, if the end of the bss is in the same page than the last segment
> loaded then the size of the last of bss segment is incorrectly calculated.
>
>
> This patch set up to the page boundary of the last_bss variable and only do
> the vm_brk() call when necessary.
>
>
> Signed-off-by: Hector Marco-Gisbert <hecmargi@upv.es>
> Acked-by: Ismael Ripoll Ripoll <iripoll@upv.es>
> ---
>  fs/binfmt_elf.c | 12 ++++++++----
>  1 file changed, 8 insertions(+), 4 deletions(-)
>
> diff --git a/fs/binfmt_elf.c b/fs/binfmt_elf.c
> index 81381cc..acfbdc2 100644
> --- a/fs/binfmt_elf.c
> +++ b/fs/binfmt_elf.c
> @@ -526,7 +526,7 @@ static unsigned long load_elf_interp(struct elfhdr *interp_elf_ex,
>         int load_addr_set = 0;
>         unsigned long last_bss = 0, elf_bss = 0;
>         unsigned long error = ~0UL;
> -       unsigned long total_size;
> +       unsigned long total_size, size;
>         int i;
>
>         /* First of all, some simple consistency checks */
> @@ -626,11 +626,15 @@ static unsigned long load_elf_interp(struct elfhdr *interp_elf_ex,
>
>                 /* What we have mapped so far */
>                 elf_bss = ELF_PAGESTART(elf_bss + ELF_MIN_ALIGN - 1);

This math is just ELF_PAGEALIGN(elf_bss)...

> +               last_bss = ELF_PAGESTART(last_bss + ELF_MIN_ALIGN - 1);

I don't think this is correct, actually. It works for the
less-than-a-single-page case, but not for the intended case. I don't
like that this hides the PAGE_ALIGN() (on len) and PAGE_SHIFT (on
addr) that happens inside do_brk either, and this will break if
ELF_MIN_ALIGN != PAGE_SIZE.

>
>                 /* Map the last of the bss segment */
> -               error = vm_brk(elf_bss, last_bss - elf_bss);
> -               if (BAD_ADDR(error))
> -                       goto out;
> +               size = last_bss - elf_bss;
> +               if (size) {
> +                       error = vm_brk(elf_bss, size);
> +                       if (BAD_ADDR(error))
> +                               goto out;
> +               }

So, I think what's needed here are three patches:

1) move padzero() before the last_bss > elf_bss case, since the
zero-filling of the ELF_PAGE should have nothing to do with the
relationship of last_bss and elf_bss. AIUI, it's all about the
relationship of elf_bss to the ELF_PAGE size.

2) handle the math on elf_bss vs last_bss correctly. I think this
requires that both be ELF_PAGE aligned, since that's the expected
granularity of the mappings. elf_bss already had alignment-based
padding applied, so the "start" of the vm_brk should be moved
forward as done in the original code. However, since the "end" of the
vm_brk will already be PAGE_ALIGNed, last_bss should get aligned
here explicitly to avoid hiding it. Something like:

    elf_bss = ELF_PAGEALIGN(elf_bss);
    last_bss = ELF_PAGEALIGN(last_bss);

And leave the vm_brk as-is since it already checks for size==0.

3) add code to do_brk() to check for these kinds of overflows, which
shouldn't happen:

static int do_brk(unsigned long addr, unsigned long request)
{
...
    unsigned long len = PAGE_ALIGN(request);

    if (len < request) {
        WARN_ONCE(...);
        return -ENOMEM;
    }
    if (!len)
        return 0;
...


How's that sound?

-Kees
Hector Marco-Gisbert July 4, 2016, 5:01 p.m. UTC | #3
> On Wed, May 11, 2016 at 3:37 AM, Hector Marco-Gisbert <hecmargi@upv.es> wrote:
>> While working on a new ASLR for userspace we detected an error in the
>> interpret loader.
>>
>> The size of the bss section for some interpreters is not correctly
>> calculated resulting in unnecessary calls to vm_brk() with enormous size
>> values.
>>
>> The bug appears when loading some interpreters with a small bss size. Once
>> the last loadable segment has been loaded, the bss section is zeroed up to
>> the page boundary and the elf_bss variable is updated to this new page
>> boundary.  Because of this update (alignment), the last_bss could be less
>> than elf_bss and the subtraction "last_bss - elf_bss" value could overflow.
>>
>> Although it is quite easy to check this error, it has not been manifested
>> because some peculiarities of the bug. Following is a brief description:
>
> Heh, the bugs cancel each other out! :P
>
>>
>>
>> $ size /lib32/ld-2.19.so
>>    text    data     bss     dec     hex filename
>>  128456    2964     192  131612   2021c /lib32/ld-2.19.so
>>
>>
>> An execution example with:
>>   - last_bss: 0xb7794938
>>   - elf_bss:  0xb7794878
>>
>>
>> From fs/binfmt_elf.c:
>> ---------------------------------------------------------------------------
>>      if (last_bss > elf_bss) {
>>              /*
>>               * Now fill out the bss section.  First pad the last page up
>>               * to the page boundary, and then perform a mmap to make sure
>>               * that there are zero-mapped pages up to and including the
>>               * last bss page.
>>               */
>>              if (padzero(elf_bss)) {
>>                      error = -EFAULT;
>>                      goto out;
>>              }
>
> Tangent: shouldn't this padzero() call happen even in the case where
> last_bss <= elf_bss? The elf_map call has total_size bumped up to page
> size, so junk may have been loaded from the file still, even if
> last_bss == elf_bss, etc. Maybe I'm missing something.
>

When elf_map() is called the first time (with total_size != 0), a mmap()
call with the full image size is performed and immediately (in the same
elf_map() call) an unmap() is done, so that only the first loadable
segment remains mapped. The total_size here is used to reserve enough
space for the whole ELF (avoiding overlaps caused by the randomization).

Subsequent calls to elf_map() (one per loadable segment) set total_size to
zero, resulting in mmaps of p_filesz bytes.  This p_filesz is bumped up to
the ELF page size, which causes junk to be loaded from the file.  I think
this is intended to preserve the benefits of memory-mapped files, so no
calls to padzero() are made for these maps.

Later, after loading all segments, if last_bss > elf_bss (memsz > filesz
of the last segment) then we need to call padzero(elf_bss) to fill the bss
with zeros, otherwise the bss could contain junk.

The junk appears at the end of each loadable segment (as long as it does
not end exactly at a page boundary). And AFAIK it has not been considered
a problem, because the junk comes from the file itself, so there is no
memory leak. Typically, the last segment contains the bss, and only the
bss must be zeroed.

I don't see why it would be necessary to call padzero() when there is no
bss section (last_bss == elf_bss). The same goes for last_bss < elf_bss,
which is a symptom of an error (intentional or not), because p_filesz must
always be <= p_memsz.


>>              /* What we have mapped so far */
>>              elf_bss = ELF_PAGESTART(elf_bss + ELF_MIN_ALIGN - 1);
>>
>>              <---------- elf_bss here is 0xb7795000
>>
>>              /* Map the last of the bss segment */
>>              error = vm_brk(elf_bss, last_bss - elf_bss); <-- overflow!
>>              if (BAD_ADDR(error))
>>                      goto out;
>>      }
>>
>>      error = load_addr;
>> out:
>>      return error;
>> }
>> ---------------------------------------------------------------------------
>>
>>
>> The size value requested to the vm_brk() call (last_bss - elf_bss) is
>> 0xfffffffffffff938 and internally this size is page aligned in the do_brk()
>> function resulting in a 0 length request.
>>
>> static unsigned long do_brk(unsigned long addr, unsigned long len)
>> {
>>         ...
>>         len = PAGE_ALIGN(len);
>
> I feel like do_brk should detect wrap-around and error out...

Yes!

>
>>         if (!len)
>>                 return addr;
>>
>>
>> Since a segment of 0 bytes is perfectly valid, it returns the requested
>> address to vm_brk() and because it is page aligned (done by the previous
>> line to the vm_brk() call the "error" is not detected by
>> "BAD_ADDR(error)" and the "load_elf_interp()" functions does not
>> returns any error. Note that vm_brk() is not necessary at all.
>>
>> In brief, if the end of the bss is in the same page than the last segment
>> loaded then the size of the last of bss segment is incorrectly calculated.
>>
>>
>> This patch set up to the page boundary of the last_bss variable and only do
>> the vm_brk() call when necessary.
>>
>>
>> Signed-off-by: Hector Marco-Gisbert <hecmargi@upv.es>
>> Acked-by: Ismael Ripoll Ripoll <iripoll@upv.es>
>> ---
>>  fs/binfmt_elf.c | 12 ++++++++----
>>  1 file changed, 8 insertions(+), 4 deletions(-)
>>
>> diff --git a/fs/binfmt_elf.c b/fs/binfmt_elf.c
>> index 81381cc..acfbdc2 100644
>> --- a/fs/binfmt_elf.c
>> +++ b/fs/binfmt_elf.c
>> @@ -526,7 +526,7 @@ static unsigned long load_elf_interp(struct elfhdr *interp_elf_ex,
>>         int load_addr_set = 0;
>>         unsigned long last_bss = 0, elf_bss = 0;
>>         unsigned long error = ~0UL;
>> -       unsigned long total_size;
>> +       unsigned long total_size, size;
>>         int i;
>>
>>         /* First of all, some simple consistency checks */
>> @@ -626,11 +626,15 @@ static unsigned long load_elf_interp(struct elfhdr *interp_elf_ex,
>>
>>                 /* What we have mapped so far */
>>                 elf_bss = ELF_PAGESTART(elf_bss + ELF_MIN_ALIGN - 1);
>
> This math is just ELF_PAGEALIGN(elf_bss)...
>
>> +               last_bss = ELF_PAGESTART(last_bss + ELF_MIN_ALIGN - 1);
>
> I don't think this is correct, actually. It works for the
> less-than-a-single-page case, but not for the intended case. I don't
> like that this hides the PAGE_ALIGN() (on len) and PAGE_SHIFT (on
> addr) that happens inside do_brk either, and this will break if
> ELF_MIN_ALIGN != PAGE_SIZE.

I don't see your point. ELF_MIN_ALIGN is always greater than or equal to
PAGE_SIZE, and the elf_map() calls align the segment size to the ELF page
size. padzero() also uses ELF_MIN_ALIGN to calculate the number of bytes
to clear.


>
>>
>>                 /* Map the last of the bss segment */
>> -               error = vm_brk(elf_bss, last_bss - elf_bss);
>> -               if (BAD_ADDR(error))
>> -                       goto out;
>> +               size = last_bss - elf_bss;
>> +               if (size) {
>> +                       error = vm_brk(elf_bss, size);
>> +                       if (BAD_ADDR(error))
>> +                               goto out;
>> +               }
>
> So, I think what's needed here are three patches:
>
> 1) move padzero() before the last_bss > elf_bss case, since the
> zero-filling of the ELF_PAGE should have nothing to do with the
> relationship of last_bss and elf_bss. AIUI, it's all about the
> relationship of elf_bss to the ELF_PAGE size.

As I pointed out above, I'm not sure about this.

>
> 2) handle the math on elf_bss vs last_bss correctly. I think this
> requires that both be ELF_PAGE aligned, since that's the expected
> granularity of the mappings. elf_bss already had alignment-based
> padding happen in, so the "start" of the vm_brk should be moved
> forward as done in the original code. However, since the "end" of the
> vm_brk will already be PAGE_ALIGNed then last_bss should get aligned
> here to avoid hiding it. Something like:
>
>     elf_bss = ELF_PAGEALIGN(elf_bss);
>     last_bss = ELF_PAGEALIGN(last_bss);
>
> And leave the vm_brk as-is since it already checks for size==0.

Yes, it is much clearer to use ELF_PAGEALIGN.

>
> 3) add code to do_brk() to check for these kinds of overflows, which
> shouldn't happen:
>
> static int do_brk(unsigned long addr, unsigned long request)
> {
> ...
>     unsigned long len = PAGE_ALIGN(request);
>
>     if (len < request) {
>         WARN_ONCE(...);
>         return -ENOMEM;
>     }
>     if (!len)
>         return 0;
> ...
>
>
> How's that sound?


It sounds good :). Another possibility would be:

static int do_brk(unsigned long addr, unsigned long len)
{
...
    if (len > TASK_SIZE) {
        WARN_ONCE(...);
        return -ENOMEM;
    }
...



Hector Marco.


>
> -Kees
>
Kees Cook July 8, 2016, 9:10 p.m. UTC | #4
On Mon, Jul 4, 2016 at 1:01 PM, Hector Marco-Gisbert <hecmargi@upv.es> wrote:
>> On Wed, May 11, 2016 at 3:37 AM, Hector Marco-Gisbert <hecmargi@upv.es> wrote:
>>> While working on a new ASLR for userspace we detected an error in the
>>> interpret loader.
>>>
>>> The size of the bss section for some interpreters is not correctly
>>> calculated resulting in unnecessary calls to vm_brk() with enormous size
>>> values.
>>>
>>> The bug appears when loading some interpreters with a small bss size. Once
>>> the last loadable segment has been loaded, the bss section is zeroed up to
>>> the page boundary and the elf_bss variable is updated to this new page
>>> boundary.  Because of this update (alignment), the last_bss could be less
>>> than elf_bss and the subtraction "last_bss - elf_bss" value could overflow.
>>>
>>> Although it is quite easy to check this error, it has not been manifested
>>> because some peculiarities of the bug. Following is a brief description:
>>
>> Heh, the bugs cancel each other out! :P
>>
>>>
>>>
>>> $ size /lib32/ld-2.19.so
>>>    text    data     bss     dec     hex filename
>>>  128456    2964     192  131612   2021c /lib32/ld-2.19.so
>>>
>>>
>>> An execution example with:
>>>   - last_bss: 0xb7794938
>>>   - elf_bss:  0xb7794878
>>>
>>>
>>> From fs/binfmt_elf.c:
>>> ---------------------------------------------------------------------------
>>>      if (last_bss > elf_bss) {
>>>              /*
>>>               * Now fill out the bss section.  First pad the last page up
>>>               * to the page boundary, and then perform a mmap to make sure
>>>               * that there are zero-mapped pages up to and including the
>>>               * last bss page.
>>>               */
>>>              if (padzero(elf_bss)) {
>>>                      error = -EFAULT;
>>>                      goto out;
>>>              }
>>
>> Tangent: shouldn't this padzero() call happen even in the case where
>> last_bss <= elf_bss? The elf_map call has total_size bumped up to page
>> size, so junk may have been loaded from the file still, even if
>> last_bss == elf_bss, etc. Maybe I'm missing something.
>>
>
> When elf_map() is called the first time (with total_size !=0) a mmap() call
> with the full image size is performed and immediately (in the same elf_map()
> call) an unmap() is done in order to have only the first segment loadable
> mapped. So, the total_size here is used to ensure enough space for the elf
> (avoid overlapping because of the randomization).
>
> Subsequent calls to elf_map() (per loadable segment) sets total_size to zero
> resulting in mmaps of p_filesz size.  This p_filesz is bumped up to ELF page
> size which causes to load junk from the file.  I think this is intended to
> preserve the memory-mapped file benefits, so no calls to padzero() are made for
> these maps.
>
> Later, after loading all segments, if last_bss > elf_bss (memsz > filesz of the
> last segment) then we need to call padzero(elf_bss) to fill the bss with zeros,
> otherwise the bss could contain junk.

Right, though it does seem weird to me to leave junk in memory.
Regardless, this would only fix the last case, not the others, so I
agree: it should stay.

> The problem of junk occurs at the end of each loadable segment (as far at it
> does not end at page boundary). And AFAIK it has not been considered to be a
> problem, because the junk comes from the file itself, and so no memory leak.
> Typically, the last segment contains the bss (which must be zeroed). Only the
> bss must be zeroed.
>
> I don't see why it is necessary to call padzero() when there is no bss section
> (last_bss == elf_bss). Same for (last_bss <= elf_bss) which is a symptom of an
> error (intentional or not), because the p_filesz must always be <= p_memsz.
>
>
>>>              /* What we have mapped so far */
>>>              elf_bss = ELF_PAGESTART(elf_bss + ELF_MIN_ALIGN - 1);
>>>
>>>              <---------- elf_bss here is 0xb7795000
>>>
>>>              /* Map the last of the bss segment */
>>>              error = vm_brk(elf_bss, last_bss - elf_bss); <-- overflow!
>>>              if (BAD_ADDR(error))
>>>                      goto out;
>>>      }
>>>
>>>      error = load_addr;
>>> out:
>>>      return error;
>>> }
>>> ---------------------------------------------------------------------------
>>>
>>>
>>> The size value requested to the vm_brk() call (last_bss - elf_bss) is
>>> 0xfffffffffffff938 and internally this size is page aligned in the do_brk()
>>> function resulting in a 0 length request.
>>>
>>> static unsigned long do_brk(unsigned long addr, unsigned long len)
>>> {
>>>         ...
>>>         len = PAGE_ALIGN(len);
>>
>> I feel like do_brk should detect wrap-around and error out...
>
> Yes!
>
>>
>>>         if (!len)
>>>                 return addr;
>>>
>>>
>>> Since a segment of 0 bytes is perfectly valid, it returns the requested
>>> address to vm_brk() and because it is page aligned (done by the previous
>>> line to the vm_brk() call the "error" is not detected by
>>> "BAD_ADDR(error)" and the "load_elf_interp()" functions does not
>>> returns any error. Note that vm_brk() is not necessary at all.
>>>
>>> In brief, if the end of the bss is in the same page than the last segment
>>> loaded then the size of the last of bss segment is incorrectly calculated.
>>>
>>>
>>> This patch set up to the page boundary of the last_bss variable and only do
>>> the vm_brk() call when necessary.
>>>
>>>
>>> Signed-off-by: Hector Marco-Gisbert <hecmargi@upv.es>
>>> Acked-by: Ismael Ripoll Ripoll <iripoll@upv.es>
>>> ---
>>>  fs/binfmt_elf.c | 12 ++++++++----
>>>  1 file changed, 8 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/fs/binfmt_elf.c b/fs/binfmt_elf.c
>>> index 81381cc..acfbdc2 100644
>>> --- a/fs/binfmt_elf.c
>>> +++ b/fs/binfmt_elf.c
>>> @@ -526,7 +526,7 @@ static unsigned long load_elf_interp(struct elfhdr *interp_elf_ex,
>>>         int load_addr_set = 0;
>>>         unsigned long last_bss = 0, elf_bss = 0;
>>>         unsigned long error = ~0UL;
>>> -       unsigned long total_size;
>>> +       unsigned long total_size, size;
>>>         int i;
>>>
>>>         /* First of all, some simple consistency checks */
>>> @@ -626,11 +626,15 @@ static unsigned long load_elf_interp(struct elfhdr *interp_elf_ex,
>>>
>>>                 /* What we have mapped so far */
>>>                 elf_bss = ELF_PAGESTART(elf_bss + ELF_MIN_ALIGN - 1);
>>
>> This math is just ELF_PAGEALIGN(elf_bss)...
>>
>>> +               last_bss = ELF_PAGESTART(last_bss + ELF_MIN_ALIGN - 1);
>>
>> I don't think this is correct, actually. It works for the
>> less-than-a-single-page case, but not for the intended case. I don't
>> like that this hides the PAGE_ALIGN() (on len) and PAGE_SHIFT (on
>> addr) that happens inside do_brk either, and this will break if
>> ELF_MIN_ALIGN != PAGE_SIZE.
>
> I don't see your point. ELF_MIN_ALIGN is always greater or equal than PAGE_SIZE
> and elf_map() calls align the segment size to ELF page. Also padzero() uses
> ELF_MIN_ALIGN to calculate the number of bytes to clear.

Mostly I just didn't like all the hidden side-effects, so it bears
cleaning up regardless.

>>>                 /* Map the last of the bss segment */
>>> -               error = vm_brk(elf_bss, last_bss - elf_bss);
>>> -               if (BAD_ADDR(error))
>>> -                       goto out;
>>> +               size = last_bss - elf_bss;
>>> +               if (size) {
>>> +                       error = vm_brk(elf_bss, size);
>>> +                       if (BAD_ADDR(error))
>>> +                               goto out;
>>> +               }
>>
>> So, I think what's needed here are three patches:
>>
>> 1) move padzero() before the last_bss > elf_bss case, since the
>> zero-filling of the ELF_PAGE should have nothing to do with the
>> relationship of last_bss and elf_bss. AIUI, it's all about the
>> relationship of elf_bss to the ELF_PAGE size.
>
> As I pointed above I'm not sure of this.
>
>>
>> 2) handle the math on elf_bss vs last_bss correctly. I think this
>> requires that both be ELF_PAGE aligned, since that's the expected
>> granularity of the mappings. elf_bss already had alignment-based
>> padding happen in, so the "start" of the vm_brk should be moved
>> forward as done in the original code. However, since the "end" of the
>> vm_brk will already be PAGE_ALIGNed then last_bss should get aligned
>> here to avoid hiding it. Something like:
>>
>>     elf_bss = ELF_PAGEALIGN(elf_bss);
>>     last_bss = ELF_PAGEALIGN(last_bss);
>>
>> And leave the vm_brk as-is since it already checks for size==0.
>
> Yes, it is much clearer to use ELF_PAGEALIGN.
>
>>
>> 3) add code to do_brk() to check for these kinds of overflows, which
>> shouldn't happen:
>>
>> static int do_brk(unsigned long addr, unsigned long request)
>> {
>> ...
>>     unsigned long len = PAGE_ALIGN(request);
>>
>>     if (len < request) {
>>         WARN_ONCE(...);
>>         return -ENOMEM;
>>     }
>>     if (!len)
>>         return 0;
>> ...
>>
>>
>> How's that sound?
>
>
> It sounds good :). Another possibility would be:
>
> static int do_brk(unsigned long addr, unsigned long len)
> {
> ...
>     if (len > TASK_SIZE) {
>         WARN_ONCE(...);
>         return -ENOMEM;
>     }
> ...

Hm, maybe (len < request || len > TASK_SIZE) ?

I'll give it a spin.

-Kees

>
>
>
> Hector Marco.
>
>
>>
>> -Kees
>>
>
> --
> Dr. Hector Marco-Gisbert @ http://hmarco.org/
> Cyber Security Researcher @ http://cybersecurity.upv.es
> Universitat Politècnica de València (Spain)
>

Patch

diff --git a/fs/binfmt_elf.c b/fs/binfmt_elf.c
index 81381cc..acfbdc2 100644
--- a/fs/binfmt_elf.c
+++ b/fs/binfmt_elf.c
@@ -526,7 +526,7 @@  static unsigned long load_elf_interp(struct elfhdr *interp_elf_ex,
 	int load_addr_set = 0;
 	unsigned long last_bss = 0, elf_bss = 0;
 	unsigned long error = ~0UL;
-	unsigned long total_size;
+	unsigned long total_size, size;
 	int i;
 
 	/* First of all, some simple consistency checks */
@@ -626,11 +626,15 @@  static unsigned long load_elf_interp(struct elfhdr *interp_elf_ex,
 
 		/* What we have mapped so far */
 		elf_bss = ELF_PAGESTART(elf_bss + ELF_MIN_ALIGN - 1);
+		last_bss = ELF_PAGESTART(last_bss + ELF_MIN_ALIGN - 1);
 
 		/* Map the last of the bss segment */
-		error = vm_brk(elf_bss, last_bss - elf_bss);
-		if (BAD_ADDR(error))
-			goto out;
+		size = last_bss - elf_bss;
+		if (size) {
+			error = vm_brk(elf_bss, size);
+			if (BAD_ADDR(error))
+				goto out;
+		}
 	}
 
 	error = load_addr;