exec: Account for argv/envp pointers

Message ID 20170622001720.GA32173@beast (mailing list archive)
State New, archived

Commit Message

Kees Cook June 22, 2017, 12:17 a.m. UTC
When limiting the argv/envp strings during exec to 1/4 of the stack limit,
the storage of the pointers to the strings was not included. This means
that an exec with huge numbers of tiny strings could eat 1/4 of the
stack limit in strings and then additional space would be later used
by the pointers to the strings. For example, on 32-bit with an 8MB stack
rlimit, an exec with 1677721 single-byte strings would consume less than
2MB of stack, the max (8MB / 4) amount allowed, but the pointers to the
strings would consume the remaining stack space (1677721 * 4 ==
6710884). The total (1677721 + 6710884 == 8388605) would exhaust
stack space entirely. Controlling this stack exhaustion could result in
pathological behavior in setuid binaries (CVE-2017-1000365).

Fixes: b6a2fea39318 ("mm: variable length argument support")
Cc: stable@vger.kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
---
 fs/exec.c | 20 ++++++++++++++++----
 1 file changed, 16 insertions(+), 4 deletions(-)
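
The arithmetic in the commit message can be reproduced with a small userspace sketch. This is not kernel code: the function and macro names are hypothetical, and it compares raw byte counts, whereas the kernel check works on the size of the argument vma. It only shows why the old rule accepts the example exec and the new rule rejects it.

```c
#include <stdio.h>

/* Values taken from the commit message; all names here are hypothetical. */
#define STACK_RLIMIT (8UL * 1024 * 1024) /* 8MB stack rlimit, in bytes */
#define PTR_BYTES_32 4UL                 /* sizeof(void *) on a 32-bit kernel */

/* Bytes of string data for n single-byte strings, as counted in the commit message. */
unsigned long string_bytes(unsigned long n)
{
	return n;
}

/* Bytes needed for one 32-bit pointer per string. */
unsigned long pointer_bytes(unsigned long n)
{
	return n * PTR_BYTES_32;
}

/* Pre-patch rule: only the strings are compared against 1/4 of the rlimit. */
int old_check_passes(unsigned long n)
{
	return string_bytes(n) <= STACK_RLIMIT / 4;
}

/* Post-patch rule: strings plus pointers are compared against 1/4 of the rlimit. */
int new_check_passes(unsigned long n)
{
	return string_bytes(n) + pointer_bytes(n) <= STACK_RLIMIT / 4;
}

/* Print the figures for a given string count. */
void print_example(unsigned long n)
{
	printf("strings=%lu pointers=%lu total=%lu limit/4=%lu old=%d new=%d\n",
	       string_bytes(n), pointer_bytes(n),
	       string_bytes(n) + pointer_bytes(n), STACK_RLIMIT / 4,
	       old_check_passes(n), new_check_passes(n));
}
```

Calling print_example(1677721UL) from a main() prints the numbers quoted above for the 32-bit, 8MB case.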

Comments

Rik van Riel June 22, 2017, 1:39 a.m. UTC | #1
On Wed, 2017-06-21 at 17:17 -0700, Kees Cook wrote:
> When limiting the argv/envp strings during exec to 1/4 of the stack limit,
> the storage of the pointers to the strings was not included. This means
> that an exec with huge numbers of tiny strings could eat 1/4 of the
> stack limit in strings and then additional space would be later used
> by the pointers to the strings. For example, on 32-bit with a 8MB stack
> rlimit, an exec with 1677721 single-byte strings would consume less than
> 2MB of stack, the max (8MB / 4) amount allowed, but the pointers to the
> strings would consume the remaining additional stack space (1677721 *
> 4 == 6710884). The result (1677721 + 6710884 == 8388605) would exhaust
> stack space entirely. Controlling this stack exhaustion could result in
> pathological behavior in setuid binaries (CVE-2017-1000365).
> 
> Fixes: b6a2fea39318 ("mm: variable length argument support")
> Cc: stable@vger.kernel.org
> Signed-off-by: Kees Cook <keescook@chromium.org>

Acked-by: Rik van Riel <riel@redhat.com>

Michal Hocko June 23, 2017, 1:59 p.m. UTC | #2
On Wed 21-06-17 17:17:20, Kees Cook wrote:
> When limiting the argv/envp strings during exec to 1/4 of the stack limit,
> the storage of the pointers to the strings was not included. This means
> that an exec with huge numbers of tiny strings could eat 1/4 of the
> stack limit in strings and then additional space would be later used
> by the pointers to the strings. For example, on 32-bit with a 8MB stack
> rlimit, an exec with 1677721 single-byte strings would consume less than
> 2MB of stack, the max (8MB / 4) amount allowed, but the pointers to the
> strings would consume the remaining additional stack space (1677721 *
> 4 == 6710884). The result (1677721 + 6710884 == 8388605) would exhaust
> stack space entirely. Controlling this stack exhaustion could result in
> pathological behavior in setuid binaries (CVE-2017-1000365).
> 
> Fixes: b6a2fea39318 ("mm: variable length argument support")
> Cc: stable@vger.kernel.org
> Signed-off-by: Kees Cook <keescook@chromium.org>
> ---
>  fs/exec.c | 20 ++++++++++++++++----
>  1 file changed, 16 insertions(+), 4 deletions(-)
> 
> diff --git a/fs/exec.c b/fs/exec.c
> index 72934df68471..8079ca70cfda 100644
> --- a/fs/exec.c
> +++ b/fs/exec.c
> @@ -220,8 +220,18 @@ static struct page *get_arg_page(struct linux_binprm *bprm, unsigned long pos,
>  
>  	if (write) {
>  		unsigned long size = bprm->vma->vm_end - bprm->vma->vm_start;
> +		unsigned long ptr_size;
>  		struct rlimit *rlim;
>  
> +		/*
> +		 * Since the stack will hold pointers to the strings, we
> +		 * must account for them as well.
> +		 */
> +		ptr_size = (bprm->argc + bprm->envc) * sizeof(void *);
> +		if (ptr_size > ULONG_MAX - size)
> +			goto fail;
> +		size += ptr_size;
> +
>  		acct_arg_size(bprm, size / PAGE_SIZE);

Doesn't this over account? I mean this gets called for partial arguments
as they fit into a page so a single argument can get into this function
multiple times AFAIU. I also do not understand why would you want to
account bprm->argc + bprm->envc pointers for each argument.

>  
>  		/*
> @@ -239,13 +249,15 @@ static struct page *get_arg_page(struct linux_binprm *bprm, unsigned long pos,
>  		 *    to work from.
>  		 */
>  		rlim = current->signal->rlim;
> -		if (size > ACCESS_ONCE(rlim[RLIMIT_STACK].rlim_cur) / 4) {
> -			put_page(page);
> -			return NULL;
> -		}
> +		if (size > READ_ONCE(rlim[RLIMIT_STACK].rlim_cur) / 4)
> +			goto fail;
>  	}
>  
>  	return page;
> +
> +fail:
> +	put_page(page);
> +	return NULL;
>  }
>  
>  static void put_arg_page(struct page *page)
> -- 
> 2.7.4
> 
> 
> -- 
> Kees Cook
> Pixel Security

Kees Cook June 23, 2017, 2:05 p.m. UTC | #3
On Fri, Jun 23, 2017 at 6:59 AM, Michal Hocko <mhocko@kernel.org> wrote:
> On Wed 21-06-17 17:17:20, Kees Cook wrote:
>> When limiting the argv/envp strings during exec to 1/4 of the stack limit,
>> the storage of the pointers to the strings was not included. This means
>> that an exec with huge numbers of tiny strings could eat 1/4 of the
>> stack limit in strings and then additional space would be later used
>> by the pointers to the strings. For example, on 32-bit with a 8MB stack
>> rlimit, an exec with 1677721 single-byte strings would consume less than
>> 2MB of stack, the max (8MB / 4) amount allowed, but the pointers to the
>> strings would consume the remaining additional stack space (1677721 *
>> 4 == 6710884). The result (1677721 + 6710884 == 8388605) would exhaust
>> stack space entirely. Controlling this stack exhaustion could result in
>> pathological behavior in setuid binaries (CVE-2017-1000365).
>>
>> Fixes: b6a2fea39318 ("mm: variable length argument support")
>> Cc: stable@vger.kernel.org
>> Signed-off-by: Kees Cook <keescook@chromium.org>
>> ---
>>  fs/exec.c | 20 ++++++++++++++++----
>>  1 file changed, 16 insertions(+), 4 deletions(-)
>>
>> diff --git a/fs/exec.c b/fs/exec.c
>> index 72934df68471..8079ca70cfda 100644
>> --- a/fs/exec.c
>> +++ b/fs/exec.c
>> @@ -220,8 +220,18 @@ static struct page *get_arg_page(struct linux_binprm *bprm, unsigned long pos,
>>
>>       if (write) {
>>               unsigned long size = bprm->vma->vm_end - bprm->vma->vm_start;
>> +             unsigned long ptr_size;
>>               struct rlimit *rlim;
>>
>> +             /*
>> +              * Since the stack will hold pointers to the strings, we
>> +              * must account for them as well.
>> +              */
>> +             ptr_size = (bprm->argc + bprm->envc) * sizeof(void *);
>> +             if (ptr_size > ULONG_MAX - size)
>> +                     goto fail;
>> +             size += ptr_size;
>> +
>>               acct_arg_size(bprm, size / PAGE_SIZE);
>
> Doesn't this over account? I mean this gets called for partial arguments
> as they fit into a page so a single argument can get into this function
> multiple times AFAIU. I also do not understand why would you want to
> account bprm->argc + bprm->envc pointers for each argument.

Based on what I could understand in acct_arg_size(), this is called
repeatedly with the "current" size (it handles the difference
between prior calls, see calls like acct_arg_size(bprm, 0)).

The size calculation is the entire vma while each arg page is built,
so each time we get here it's calculating how far it is currently
(rather than each call being just the newly added size from the arg
page). As a result, we need to always add the entire size of the
pointers, so that on the last call to get_arg_page() we'll actually
have the entire correct size.
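
As a rough illustration of the "difference between prior calls" behavior described above, here is a minimal userspace model. The struct and function names are hypothetical and the counter is a stand-in for the mm's page accounting; it is a sketch of the idea, not the kernel's acct_arg_size().

```c
/* Userspace model of delta-based accounting; names are hypothetical. */
struct arg_acct {
	unsigned long vma_pages; /* pages charged by the previous call */
	long counter;            /* stand-in for the mm's page counter */
};

/*
 * The caller always passes the cumulative page total. Only the change
 * since the previous call is applied, so repeated calls with the same
 * total charge nothing extra, and a call with 0 undoes the whole charge.
 */
void acct_model(struct arg_acct *a, unsigned long pages)
{
	long diff = (long)(pages - a->vma_pages);

	if (!diff)
		return;

	a->vma_pages = pages;
	a->counter += diff;
}
```

Because each call carries the running total, adding the full pointer size on every call does not accumulate: the counter ends up at the last total passed in.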

-Kees

>
>>
>>               /*
>> @@ -239,13 +249,15 @@ static struct page *get_arg_page(struct linux_binprm *bprm, unsigned long pos,
>>                *    to work from.
>>                */
>>               rlim = current->signal->rlim;
>> -             if (size > ACCESS_ONCE(rlim[RLIMIT_STACK].rlim_cur) / 4) {
>> -                     put_page(page);
>> -                     return NULL;
>> -             }
>> +             if (size > READ_ONCE(rlim[RLIMIT_STACK].rlim_cur) / 4)
>> +                     goto fail;
>>       }
>>
>>       return page;
>> +
>> +fail:
>> +     put_page(page);
>> +     return NULL;
>>  }
>>
>>  static void put_arg_page(struct page *page)
>> --
>> 2.7.4
>>
>>
>> --
>> Kees Cook
>> Pixel Security
>
> --
> Michal Hocko
> SUSE Labs

Michal Hocko June 23, 2017, 2:18 p.m. UTC | #4
On Fri 23-06-17 07:05:37, Kees Cook wrote:
> On Fri, Jun 23, 2017 at 6:59 AM, Michal Hocko <mhocko@kernel.org> wrote:
[...]
> >> --- a/fs/exec.c
> >> +++ b/fs/exec.c
> >> @@ -220,8 +220,18 @@ static struct page *get_arg_page(struct linux_binprm *bprm, unsigned long pos,
> >>
> >>       if (write) {
> >>               unsigned long size = bprm->vma->vm_end - bprm->vma->vm_start;
> >> +             unsigned long ptr_size;
> >>               struct rlimit *rlim;
> >>
> >> +             /*
> >> +              * Since the stack will hold pointers to the strings, we
> >> +              * must account for them as well.
> >> +              */
> >> +             ptr_size = (bprm->argc + bprm->envc) * sizeof(void *);
> >> +             if (ptr_size > ULONG_MAX - size)
> >> +                     goto fail;
> >> +             size += ptr_size;
> >> +
> >>               acct_arg_size(bprm, size / PAGE_SIZE);
> >
> > Doesn't this over account? I mean this gets called for partial arguments
> > as they fit into a page so a single argument can get into this function
> > multiple times AFAIU. I also do not understand why would you want to
> > account bprm->argc + bprm->envc pointers for each argument.
> 
> Based on what I could understand in acct_arg_size(), this is called
> repeatedly with the "current" size (it handles the difference
> between prior calls, see calls like acct_arg_size(bprm, 0)).
> 
> The size calculation is the entire vma while each arg page is built,
> so each time we get here it's calculating how far it is currently
> (rather than each call being just the newly added size from the arg
> page). As a result, we need to always add the entire size of the
> pointers, so that on the last call to get_arg_page() we'll actually
> have the entire correct size.

Ohh, I forgot about this tricky part. The code just looks confusing
because we are mixing 2 things together here. This deserves a comment I
guess.

Other than that feel free to add
Acked-by: Michal Hocko <mhocko@suse.com>

Thanks!

Patch

diff --git a/fs/exec.c b/fs/exec.c
index 72934df68471..8079ca70cfda 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -220,8 +220,18 @@  static struct page *get_arg_page(struct linux_binprm *bprm, unsigned long pos,
 
 	if (write) {
 		unsigned long size = bprm->vma->vm_end - bprm->vma->vm_start;
+		unsigned long ptr_size;
 		struct rlimit *rlim;
 
+		/*
+		 * Since the stack will hold pointers to the strings, we
+		 * must account for them as well.
+		 */
+		ptr_size = (bprm->argc + bprm->envc) * sizeof(void *);
+		if (ptr_size > ULONG_MAX - size)
+			goto fail;
+		size += ptr_size;
+
 		acct_arg_size(bprm, size / PAGE_SIZE);
 
 		/*
@@ -239,13 +249,15 @@  static struct page *get_arg_page(struct linux_binprm *bprm, unsigned long pos,
 		 *    to work from.
 		 */
 		rlim = current->signal->rlim;
-		if (size > ACCESS_ONCE(rlim[RLIMIT_STACK].rlim_cur) / 4) {
-			put_page(page);
-			return NULL;
-		}
+		if (size > READ_ONCE(rlim[RLIMIT_STACK].rlim_cur) / 4)
+			goto fail;
 	}
 
 	return page;
+
+fail:
+	put_page(page);
+	return NULL;
 }
 
 static void put_arg_page(struct page *page)
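
The overflow-safe addition used in the patch (test against ULONG_MAX - size before adding) can be tried in isolation with a userspace sketch. The function name and parameters below are hypothetical, and like the patch it assumes the pointer count times sizeof(void *) does not itself overflow.

```c
#include <limits.h>

/*
 * Hypothetical userspace version of the check: returns 1 if the string
 * area plus the pointer array fits in 1/4 of the stack limit, else 0.
 *   size        - bytes already used by the argument strings
 *   nptrs       - number of argv + envp pointers
 *   stack_limit - current stack rlimit in bytes
 */
int arg_size_ok(unsigned long size, unsigned long nptrs,
		unsigned long stack_limit)
{
	/* Assumes this multiplication does not overflow, as the patch does. */
	unsigned long ptr_size = nptrs * sizeof(void *);

	/* size + ptr_size would wrap around, so reject before adding. */
	if (ptr_size > ULONG_MAX - size)
		return 0;
	size += ptr_size;

	return size <= stack_limit / 4;
}
```

Checking against ULONG_MAX - size first means the sum is never computed when it would wrap, which is why the comparison is written that way rather than as size + ptr_size < size after the fact.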