diff mbox series

[v2] fs/exec: require argv[0] presence in do_execveat_common()

Message ID 20220126114447.25776-1-ariadne@dereferenced.org (mailing list archive)
State New, archived
Headers show
Series [v2] fs/exec: require argv[0] presence in do_execveat_common() | expand

Commit Message

Ariadne Conill Jan. 26, 2022, 11:44 a.m. UTC
In several other operating systems, it is a hard requirement that the
first argument to execve(2) be the name of a program, thus prohibiting
a scenario where argc < 1.  POSIX 2017 also recommends this behaviour,
but it is not an explicit requirement[0]:

    The argument arg0 should point to a filename string that is
    associated with the process being started by one of the exec
    functions.

To ensure that execve(2) with argc < 1 is not a useful gadget for
shellcode to use, we can validate this in do_execveat_common() and
fail for this scenario, effectively blocking successful exploitation
of CVE-2021-4034 and similar bugs which depend on this gadget.

The use of -EFAULT for this case is similar to other systems, such
as FreeBSD, OpenBSD and Solaris.  QNX uses -EINVAL for this case.

Interestingly, Michael Kerrisk opened an issue about this in 2008[1],
but there was no consensus to support fixing this issue then.
Hopefully now that CVE-2021-4034 shows practical exploitative use
of this bug in a shellcode, we can reconsider.

[0]: https://pubs.opengroup.org/onlinepubs/9699919799/functions/exec.html
[1]: https://bugzilla.kernel.org/show_bug.cgi?id=8408

Changes from v1:
- Rework commit message significantly.
- Make the argv[0] check explicit rather than hijacking the error-check
  for count().

Signed-off-by: Ariadne Conill <ariadne@dereferenced.org>
---
 fs/exec.c | 4 ++++
 1 file changed, 4 insertions(+)

Comments

Matthew Wilcox Jan. 26, 2022, 2:40 p.m. UTC | #1
On Wed, Jan 26, 2022 at 11:44:47AM +0000, Ariadne Conill wrote:
> In several other operating systems, it is a hard requirement that the
> first argument to execve(2) be the name of a program, thus prohibiting
> a scenario where argc < 1.  POSIX 2017 also recommends this behaviour,
> but it is not an explicit requirement[0]:
> 
>     The argument arg0 should point to a filename string that is
>     associated with the process being started by one of the exec
>     functions.
> 
> To ensure that execve(2) with argc < 1 is not a useful gadget for
> shellcode to use, we can validate this in do_execveat_common() and
> fail for this scenario, effectively blocking successful exploitation
> of CVE-2021-4034 and similar bugs which depend on this gadget.
> 
> The use of -EFAULT for this case is similar to other systems, such
> as FreeBSD, OpenBSD and Solaris.  QNX uses -EINVAL for this case.
> 
> Interestingly, Michael Kerrisk opened an issue about this in 2008[1],
> but there was no consensus to support fixing this issue then.
> Hopefully now that CVE-2021-4034 shows practical exploitative use
> of this bug in a shellcode, we can reconsider.
> 
> [0]: https://pubs.opengroup.org/onlinepubs/9699919799/functions/exec.html
> [1]: https://bugzilla.kernel.org/show_bug.cgi?id=8408
> 
> Changes from v1:
> - Rework commit message significantly.
> - Make the argv[0] check explicit rather than hijacking the error-check
>   for count().
> 
> Signed-off-by: Ariadne Conill <ariadne@dereferenced.org>
> ---
>  fs/exec.c | 4 ++++
>  1 file changed, 4 insertions(+)
> 
> diff --git a/fs/exec.c b/fs/exec.c
> index 79f2c9483302..e52c41991aab 100644
> --- a/fs/exec.c
> +++ b/fs/exec.c
> @@ -1899,6 +1899,10 @@ static int do_execveat_common(int fd, struct filename *filename,
>  	retval = count(argv, MAX_ARG_STRINGS);
>  	if (retval < 0)
>  		goto out_free;
> +	if (retval == 0) {
> +		retval = -EFAULT;
> +		goto out_free;
> +	}

I don't object to the concept, but it's a more common pattern in Linux
to do this:

	retval = count(argv, MAX_ARG_STRINGS);
+	if (retval == 0)
+		retval = -EFAULT;
	if (retval < 0)
		goto out_free;

(aka I like my bikesheds painted in Toasty Eggshell)
Matthew Wilcox Jan. 26, 2022, 2:59 p.m. UTC | #2
On Wed, Jan 26, 2022 at 11:44:47AM +0000, Ariadne Conill wrote:
> Interestingly, Michael Kerrisk opened an issue about this in 2008[1],
> but there was no consensus to support fixing this issue then.
> Hopefully now that CVE-2021-4034 shows practical exploitative use
> of this bug in a shellcode, we can reconsider.
> 
> [0]: https://pubs.opengroup.org/onlinepubs/9699919799/functions/exec.html
> [1]: https://bugzilla.kernel.org/show_bug.cgi?id=8408

Having now read 8408 ... if ABI change is a concern (and I really doubt
it is), we could treat calling execve() with a NULL argv as if the
caller had passed an array of length 1 with the first element set to
NULL.  Just like we reopen fds 0,1,2 for suid execs if they were closed.
Kees Cook Jan. 26, 2022, 4:40 p.m. UTC | #3
On Wed, Jan 26, 2022 at 02:59:52PM +0000, Matthew Wilcox wrote:
> On Wed, Jan 26, 2022 at 11:44:47AM +0000, Ariadne Conill wrote:
> > Interestingly, Michael Kerrisk opened an issue about this in 2008[1],
> > but there was no consensus to support fixing this issue then.
> > Hopefully now that CVE-2021-4034 shows practical exploitative use
> > of this bug in a shellcode, we can reconsider.
> > 
> > [0]: https://pubs.opengroup.org/onlinepubs/9699919799/functions/exec.html
> > [1]: https://bugzilla.kernel.org/show_bug.cgi?id=8408
> 
> Having now read 8408 ... if ABI change is a concern (and I really doubt
> it is), we could treat calling execve() with a NULL argv as if the
> caller had passed an array of length 1 with the first element set to
> NULL.  Just like we reopen fds 0,1,2 for suid execs if they were closed.

I was having similar thoughts this morning. We can't actually change the
argc, though, because of the various tests (see the debian code search
links) that explicitly tests for argc == 0 in the child. But, the flaw
is not the count, but rather that argv == argp in the argc == 0 case.
(Or that argv NULL-checking iteration begins at argv[1].)

But that would could fix easily by just adding an extra NULL. e.g.:

Currently:

argc = 1
argv = "foo", NULL
envp = "bar=baz", ..., NULL

argc = 0
argv = NULL
envp = "bar=baz", ..., NULL

We could just make the argc = 0 case be:

argc = 0
argv = NULL, NULL
envp = "bar=baz", ..., NULL

We need to be careful with the stack utilization counts, though, so I'm
thinking we could actually make this completely unconditional and just
pad envp by 1 NULL on the user stack:

argv = "what", "ever", NULL
       NULL
envp = "bar=baz", ..., NULL

My only concern there is that there may be some code out there that
depends on envp immediately following the trailing argv NULL, so I think
my preference would be to pad only in the argc == 0 case and correctly
manage the stack utilization.
Eric W. Biederman Jan. 26, 2022, 4:57 p.m. UTC | #4
Matthew Wilcox <willy@infradead.org> writes:

> On Wed, Jan 26, 2022 at 11:44:47AM +0000, Ariadne Conill wrote:
>> Interestingly, Michael Kerrisk opened an issue about this in 2008[1],
>> but there was no consensus to support fixing this issue then.
>> Hopefully now that CVE-2021-4034 shows practical exploitative use
>> of this bug in a shellcode, we can reconsider.
>> 
>> [0]: https://pubs.opengroup.org/onlinepubs/9699919799/functions/exec.html
>> [1]: https://bugzilla.kernel.org/show_bug.cgi?id=8408
>
> Having now read 8408 ... if ABI change is a concern (and I really doubt
> it is), we could treat calling execve() with a NULL argv as if the
> caller had passed an array of length 1 with the first element set to
> NULL.  Just like we reopen fds 0,1,2 for suid execs if they were
> closed.

Where do we reopen fds 0,1,2 for suid execs?  I feel silly but I looked
through the code fs/exec.c quickly and I could not see it.


I am attracted to the notion of converting an empty argv array passed
to the kernel into something we can safely pass to userspace.

I think it would need to be having the first entry point to "" instead
of the first entry being NULL.  That would maintain the invariant that you
can always dereference a pointer in the argv array.

Eric
Ariadne Conill Jan. 26, 2022, 5:32 p.m. UTC | #5
Hi,

On Wed, 26 Jan 2022, Eric W. Biederman wrote:

> Matthew Wilcox <willy@infradead.org> writes:
>
>> On Wed, Jan 26, 2022 at 11:44:47AM +0000, Ariadne Conill wrote:
>>> Interestingly, Michael Kerrisk opened an issue about this in 2008[1],
>>> but there was no consensus to support fixing this issue then.
>>> Hopefully now that CVE-2021-4034 shows practical exploitative use
>>> of this bug in a shellcode, we can reconsider.
>>>
>>> [0]: https://pubs.opengroup.org/onlinepubs/9699919799/functions/exec.html
>>> [1]: https://bugzilla.kernel.org/show_bug.cgi?id=8408
>>
>> Having now read 8408 ... if ABI change is a concern (and I really doubt
>> it is), we could treat calling execve() with a NULL argv as if the
>> caller had passed an array of length 1 with the first element set to
>> NULL.  Just like we reopen fds 0,1,2 for suid execs if they were
>> closed.
>
> Where do we reopen fds 0,1,2 for suid execs?  I feel silly but I looked
> through the code fs/exec.c quickly and I could not see it.
>
>
> I am attracted to the notion of converting an empty argv array passed
> to the kernel into something we can safely pass to userspace.
>
> I think it would need to be having the first entry point to "" instead
> of the first entry being NULL.  That would maintain the invariant that you
> can always dereference a pointer in the argv array.

Yes, I think this is correct, because there's a lot of programs out there 
which will try to blindly read from argv[0], assuming it is present. 
Ensuring we wind up with {"", NULL} would be the way I would want to 
approach this if we go that route.

This approach would solve the problem with pkexec, but I still think there 
is some wisdom in denying with -EFAULT outright like other systems do.

Ariadne
Ariadne Conill Jan. 26, 2022, 5:41 p.m. UTC | #6
Hi,

On Wed, 26 Jan 2022, Matthew Wilcox wrote:

> On Wed, Jan 26, 2022 at 11:44:47AM +0000, Ariadne Conill wrote:
>> In several other operating systems, it is a hard requirement that the
>> first argument to execve(2) be the name of a program, thus prohibiting
>> a scenario where argc < 1.  POSIX 2017 also recommends this behaviour,
>> but it is not an explicit requirement[0]:
>>
>>     The argument arg0 should point to a filename string that is
>>     associated with the process being started by one of the exec
>>     functions.
>>
>> To ensure that execve(2) with argc < 1 is not a useful gadget for
>> shellcode to use, we can validate this in do_execveat_common() and
>> fail for this scenario, effectively blocking successful exploitation
>> of CVE-2021-4034 and similar bugs which depend on this gadget.
>>
>> The use of -EFAULT for this case is similar to other systems, such
>> as FreeBSD, OpenBSD and Solaris.  QNX uses -EINVAL for this case.
>>
>> Interestingly, Michael Kerrisk opened an issue about this in 2008[1],
>> but there was no consensus to support fixing this issue then.
>> Hopefully now that CVE-2021-4034 shows practical exploitative use
>> of this bug in a shellcode, we can reconsider.
>>
>> [0]: https://pubs.opengroup.org/onlinepubs/9699919799/functions/exec.html
>> [1]: https://bugzilla.kernel.org/show_bug.cgi?id=8408
>>
>> Changes from v1:
>> - Rework commit message significantly.
>> - Make the argv[0] check explicit rather than hijacking the error-check
>>   for count().
>>
>> Signed-off-by: Ariadne Conill <ariadne@dereferenced.org>
>> ---
>>  fs/exec.c | 4 ++++
>>  1 file changed, 4 insertions(+)
>>
>> diff --git a/fs/exec.c b/fs/exec.c
>> index 79f2c9483302..e52c41991aab 100644
>> --- a/fs/exec.c
>> +++ b/fs/exec.c
>> @@ -1899,6 +1899,10 @@ static int do_execveat_common(int fd, struct filename *filename,
>>  	retval = count(argv, MAX_ARG_STRINGS);
>>  	if (retval < 0)
>>  		goto out_free;
>> +	if (retval == 0) {
>> +		retval = -EFAULT;
>> +		goto out_free;
>> +	}
>
> I don't object to the concept, but it's a more common pattern in Linux
> to do this:
>
> 	retval = count(argv, MAX_ARG_STRINGS);
> +	if (retval == 0)
> +		retval = -EFAULT;
> 	if (retval < 0)
> 		goto out_free;

Yeah, that seems fine.  We can of course do it that way, which I will 
revise the patch to do if we decide to stick with denial over making a 
"safe" argv instead.

> (aka I like my bikesheds painted in Toasty Eggshell)

Toasty Eggshell is a nice color for a bikeshed :)

Ariadne
Matthew Wilcox Jan. 26, 2022, 6:03 p.m. UTC | #7
On Wed, Jan 26, 2022 at 10:57:29AM -0600, Eric W. Biederman wrote:
> Matthew Wilcox <willy@infradead.org> writes:
> 
> > On Wed, Jan 26, 2022 at 11:44:47AM +0000, Ariadne Conill wrote:
> >> Interestingly, Michael Kerrisk opened an issue about this in 2008[1],
> >> but there was no consensus to support fixing this issue then.
> >> Hopefully now that CVE-2021-4034 shows practical exploitative use
> >> of this bug in a shellcode, we can reconsider.
> >> 
> >> [0]: https://pubs.opengroup.org/onlinepubs/9699919799/functions/exec.html
> >> [1]: https://bugzilla.kernel.org/show_bug.cgi?id=8408
> >
> > Having now read 8408 ... if ABI change is a concern (and I really doubt
> > it is), we could treat calling execve() with a NULL argv as if the
> > caller had passed an array of length 1 with the first element set to
> > NULL.  Just like we reopen fds 0,1,2 for suid execs if they were
> > closed.
> 
> Where do we reopen fds 0,1,2 for suid execs?  I feel silly but I looked
> through the code fs/exec.c quickly and I could not see it.

I'm wondering if I misremembered and it's being done in ld.so
rather than in the kernel?  That might be the right place to put
this fix too.

> I am attracted to the notion of converting an empty argv array passed
> to the kernel into something we can safely pass to userspace.
> 
> I think it would need to be having the first entry point to "" instead
> of the first entry being NULL.  That would maintain the invariant that you
> can always dereference a pointer in the argv array.

Yes, I like that better than NULL.
Ariadne Conill Jan. 26, 2022, 6:38 p.m. UTC | #8
Hi,

On Wed, 26 Jan 2022, Matthew Wilcox wrote:

> On Wed, Jan 26, 2022 at 10:57:29AM -0600, Eric W. Biederman wrote:
>> Matthew Wilcox <willy@infradead.org> writes:
>>
>>> On Wed, Jan 26, 2022 at 11:44:47AM +0000, Ariadne Conill wrote:
>>>> Interestingly, Michael Kerrisk opened an issue about this in 2008[1],
>>>> but there was no consensus to support fixing this issue then.
>>>> Hopefully now that CVE-2021-4034 shows practical exploitative use
>>>> of this bug in a shellcode, we can reconsider.
>>>>
>>>> [0]: https://pubs.opengroup.org/onlinepubs/9699919799/functions/exec.html
>>>> [1]: https://bugzilla.kernel.org/show_bug.cgi?id=8408
>>>
>>> Having now read 8408 ... if ABI change is a concern (and I really doubt
>>> it is), we could treat calling execve() with a NULL argv as if the
>>> caller had passed an array of length 1 with the first element set to
>>> NULL.  Just like we reopen fds 0,1,2 for suid execs if they were
>>> closed.
>>
>> Where do we reopen fds 0,1,2 for suid execs?  I feel silly but I looked
>> through the code fs/exec.c quickly and I could not see it.
>
> I'm wondering if I misremembered and it's being done in ld.so
> rather than in the kernel?  That might be the right place to put
> this fix too.
>
>> I am attracted to the notion of converting an empty argv array passed
>> to the kernel into something we can safely pass to userspace.
>>
>> I think it would need to be having the first entry point to "" instead
>> of the first entry being NULL.  That would maintain the invariant that you
>> can always dereference a pointer in the argv array.
>
> Yes, I like that better than NULL.

If we are doing {"", NULL}, then I think it makes sense that we could just 
say argc == 1 at that point, which probably sidesteps the concern Jann 
raised with the {NULL, NULL} patch, no?

Ariadne
Kees Cook Jan. 26, 2022, 8:09 p.m. UTC | #9
On Wed, Jan 26, 2022 at 11:44:47AM +0000, Ariadne Conill wrote:
> In several other operating systems, it is a hard requirement that the
> first argument to execve(2) be the name of a program, thus prohibiting
> a scenario where argc < 1.  POSIX 2017 also recommends this behaviour,
> but it is not an explicit requirement[0]:
> 
>     The argument arg0 should point to a filename string that is
>     associated with the process being started by one of the exec
>     functions.
> 
> To ensure that execve(2) with argc < 1 is not a useful gadget for
> shellcode to use, we can validate this in do_execveat_common() and
> fail for this scenario, effectively blocking successful exploitation
> of CVE-2021-4034 and similar bugs which depend on this gadget.
> 
> The use of -EFAULT for this case is similar to other systems, such
> as FreeBSD, OpenBSD and Solaris.  QNX uses -EINVAL for this case.
> 
> Interestingly, Michael Kerrisk opened an issue about this in 2008[1],
> but there was no consensus to support fixing this issue then.
> Hopefully now that CVE-2021-4034 shows practical exploitative use
> of this bug in a shellcode, we can reconsider.
> 
> [0]: https://pubs.opengroup.org/onlinepubs/9699919799/functions/exec.html
> [1]: https://bugzilla.kernel.org/show_bug.cgi?id=8408
> 
> Changes from v1:
> - Rework commit message significantly.
> - Make the argv[0] check explicit rather than hijacking the error-check
>   for count().
> 
> Signed-off-by: Ariadne Conill <ariadne@dereferenced.org>
> ---
>  fs/exec.c | 4 ++++
>  1 file changed, 4 insertions(+)
> 
> diff --git a/fs/exec.c b/fs/exec.c
> index 79f2c9483302..e52c41991aab 100644
> --- a/fs/exec.c
> +++ b/fs/exec.c
> @@ -1899,6 +1899,10 @@ static int do_execveat_common(int fd, struct filename *filename,
>  	retval = count(argv, MAX_ARG_STRINGS);
>  	if (retval < 0)
>  		goto out_free;
> +	if (retval == 0) {
> +		retval = -EFAULT;
> +		goto out_free;
> +	}
>  	bprm->argc = retval;
>  
>  	retval = count(envp, MAX_ARG_STRINGS);
> -- 
> 2.34.1

Okay, so, the dangerous condition is userspace iterating through envp
when it thinks it's iterating argv.

Assuming it is not okay to break valgrind's test suite:
https://sources.debian.org/src/valgrind/1:3.18.1-1/none/tests/execve.c/?hl=22#L22
we cannot reject a NULL argv (test will fail), and we cannot mutate
argc=0 into argc=1 (test will enter infinite loop).

Perhaps we need to reject argv=NULL when envp!=NULL, and add a
pr_warn_once() about using a NULL argv?

I note that glibc already warns about NULL argv:
argc0.c:7:3: warning: null argument where non-null required (argument 2)
[-Wnonnull]
    7 |   execve(argv[0], NULL, envp);
      |   ^~~~~~

in the future we could expand this to only looking at argv=NULL?
Ariadne Conill Jan. 26, 2022, 8:23 p.m. UTC | #10
Hi,

On Wed, 26 Jan 2022, Kees Cook wrote:

> On Wed, Jan 26, 2022 at 11:44:47AM +0000, Ariadne Conill wrote:
>> In several other operating systems, it is a hard requirement that the
>> first argument to execve(2) be the name of a program, thus prohibiting
>> a scenario where argc < 1.  POSIX 2017 also recommends this behaviour,
>> but it is not an explicit requirement[0]:
>>
>>     The argument arg0 should point to a filename string that is
>>     associated with the process being started by one of the exec
>>     functions.
>>
>> To ensure that execve(2) with argc < 1 is not a useful gadget for
>> shellcode to use, we can validate this in do_execveat_common() and
>> fail for this scenario, effectively blocking successful exploitation
>> of CVE-2021-4034 and similar bugs which depend on this gadget.
>>
>> The use of -EFAULT for this case is similar to other systems, such
>> as FreeBSD, OpenBSD and Solaris.  QNX uses -EINVAL for this case.
>>
>> Interestingly, Michael Kerrisk opened an issue about this in 2008[1],
>> but there was no consensus to support fixing this issue then.
>> Hopefully now that CVE-2021-4034 shows practical exploitative use
>> of this bug in a shellcode, we can reconsider.
>>
>> [0]: https://pubs.opengroup.org/onlinepubs/9699919799/functions/exec.html
>> [1]: https://bugzilla.kernel.org/show_bug.cgi?id=8408
>>
>> Changes from v1:
>> - Rework commit message significantly.
>> - Make the argv[0] check explicit rather than hijacking the error-check
>>   for count().
>>
>> Signed-off-by: Ariadne Conill <ariadne@dereferenced.org>
>> ---
>>  fs/exec.c | 4 ++++
>>  1 file changed, 4 insertions(+)
>>
>> diff --git a/fs/exec.c b/fs/exec.c
>> index 79f2c9483302..e52c41991aab 100644
>> --- a/fs/exec.c
>> +++ b/fs/exec.c
>> @@ -1899,6 +1899,10 @@ static int do_execveat_common(int fd, struct filename *filename,
>>  	retval = count(argv, MAX_ARG_STRINGS);
>>  	if (retval < 0)
>>  		goto out_free;
>> +	if (retval == 0) {
>> +		retval = -EFAULT;
>> +		goto out_free;
>> +	}
>>  	bprm->argc = retval;
>>
>>  	retval = count(envp, MAX_ARG_STRINGS);
>> --
>> 2.34.1
>
> Okay, so, the dangerous condition is userspace iterating through envp
> when it thinks it's iterating argv.
>
> Assuming it is not okay to break valgrind's test suite:
> https://sources.debian.org/src/valgrind/1:3.18.1-1/none/tests/execve.c/?hl=22#L22
> we cannot reject a NULL argv (test will fail), and we cannot mutate
> argc=0 into argc=1 (test will enter infinite loop).
>
> Perhaps we need to reject argv=NULL when envp!=NULL, and add a
> pr_warn_once() about using a NULL argv?

Sure, I can rework the patch to do it for only the envp != NULL case.

I think we should combine it with the {NULL, NULL} padding patch in this 
case though, since it appears to work, that way the execve(..., NULL, 
NULL) case gets some protection.

> I note that glibc already warns about NULL argv:
> argc0.c:7:3: warning: null argument where non-null required (argument 2)
> [-Wnonnull]
>    7 |   execve(argv[0], NULL, envp);
>      |   ^~~~~~
>
> in the future we could expand this to only looking at argv=NULL?

I don't think musl's headers generate a diagnostic for this, but main(0, 
{NULL}) is not a supported use-case at least as far as Alpine is 
concerned.  I am sure it is the same with the other musl distributions.

Will send a v3 patch with this logic change and move to EINVAL shortly.

Ariadne
Kees Cook Jan. 26, 2022, 8:56 p.m. UTC | #11
On Wed, Jan 26, 2022 at 02:23:59PM -0600, Ariadne Conill wrote:
> Hi,
> 
> On Wed, 26 Jan 2022, Kees Cook wrote:
> 
> > On Wed, Jan 26, 2022 at 11:44:47AM +0000, Ariadne Conill wrote:
> > > In several other operating systems, it is a hard requirement that the
> > > first argument to execve(2) be the name of a program, thus prohibiting
> > > a scenario where argc < 1.  POSIX 2017 also recommends this behaviour,
> > > but it is not an explicit requirement[0]:
> > > 
> > >     The argument arg0 should point to a filename string that is
> > >     associated with the process being started by one of the exec
> > >     functions.
> > > 
> > > To ensure that execve(2) with argc < 1 is not a useful gadget for
> > > shellcode to use, we can validate this in do_execveat_common() and
> > > fail for this scenario, effectively blocking successful exploitation
> > > of CVE-2021-4034 and similar bugs which depend on this gadget.
> > > 
> > > The use of -EFAULT for this case is similar to other systems, such
> > > as FreeBSD, OpenBSD and Solaris.  QNX uses -EINVAL for this case.
> > > 
> > > Interestingly, Michael Kerrisk opened an issue about this in 2008[1],
> > > but there was no consensus to support fixing this issue then.
> > > Hopefully now that CVE-2021-4034 shows practical exploitative use
> > > of this bug in a shellcode, we can reconsider.
> > > 
> > > [0]: https://pubs.opengroup.org/onlinepubs/9699919799/functions/exec.html
> > > [1]: https://bugzilla.kernel.org/show_bug.cgi?id=8408
> > > 
> > > Changes from v1:
> > > - Rework commit message significantly.
> > > - Make the argv[0] check explicit rather than hijacking the error-check
> > >   for count().
> > > 
> > > Signed-off-by: Ariadne Conill <ariadne@dereferenced.org>
> > > ---
> > >  fs/exec.c | 4 ++++
> > >  1 file changed, 4 insertions(+)
> > > 
> > > diff --git a/fs/exec.c b/fs/exec.c
> > > index 79f2c9483302..e52c41991aab 100644
> > > --- a/fs/exec.c
> > > +++ b/fs/exec.c
> > > @@ -1899,6 +1899,10 @@ static int do_execveat_common(int fd, struct filename *filename,
> > >  	retval = count(argv, MAX_ARG_STRINGS);
> > >  	if (retval < 0)
> > >  		goto out_free;
> > > +	if (retval == 0) {
> > > +		retval = -EFAULT;
> > > +		goto out_free;
> > > +	}
> > >  	bprm->argc = retval;
> > > 
> > >  	retval = count(envp, MAX_ARG_STRINGS);
> > > --
> > > 2.34.1
> > 
> > Okay, so, the dangerous condition is userspace iterating through envp
> > when it thinks it's iterating argv.
> > 
> > Assuming it is not okay to break valgrind's test suite:
> > https://sources.debian.org/src/valgrind/1:3.18.1-1/none/tests/execve.c/?hl=22#L22
> > we cannot reject a NULL argv (test will fail), and we cannot mutate
> > argc=0 into argc=1 (test will enter infinite loop).
> > 
> > Perhaps we need to reject argv=NULL when envp!=NULL, and add a
> > pr_warn_once() about using a NULL argv?
> 
> Sure, I can rework the patch to do it for only the envp != NULL case.
> 
> I think we should combine it with the {NULL, NULL} padding patch in this
> case though, since it appears to work, that way the execve(..., NULL, NULL)
> case gets some protection.

I don't think the padding will actually work correctly, for the reason
Jann pointed out. My testing shows that suddenly my envp becomes NULL,
but libc is just counting argc to find envp to pass into main.

> > I note that glibc already warns about NULL argv:
> > argc0.c:7:3: warning: null argument where non-null required (argument 2)
> > [-Wnonnull]
> >    7 |   execve(argv[0], NULL, envp);
> >      |   ^~~~~~
> > 
> > in the future we could expand this to only looking at argv=NULL?
> 
> I don't think musl's headers generate a diagnostic for this, but main(0,
> {NULL}) is not a supported use-case at least as far as Alpine is concerned.
> I am sure it is the same with the other musl distributions.
> 
> Will send a v3 patch with this logic change and move to EINVAL shortly.

I took a spin too. Refuses execve(..., NULL, !NULL), injects "" argv[0]
for execve(..., NULL, NULL):


diff --git a/fs/exec.c b/fs/exec.c
index a098c133d8d7..0565089d5f9e 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -1917,9 +1917,40 @@ static int do_execveat_common(int fd, struct filename *filename,
 	if (retval < 0)
 		goto out_free;
 
-	retval = copy_strings(bprm->argc, argv, bprm);
-	if (retval < 0)
-		goto out_free;
+	if (likely(bprm->argc > 0)) {
+		retval = copy_strings(bprm->argc, argv, bprm);
+		if (retval < 0)
+			goto out_free;
+	} else {
+		const char * const argv0 = "";
+
+		/*
+		 * Start making some noise about the argc == NULL case that
+		 * POSIX doesn't like and other Unix-like systems refuse.
+		 */
+		pr_warn_once("process '%s' used a NULL argv\n", bprm->filename);
+
+		/*
+		 * Refuse to execute when argc == 0 and envc > 0, since this
+		 * can lead to userspace iterating envp if it fails to check
+		 * for argc == 0.
+		 *
+		 * i.e. continue to allow: execve(path, NULL, NULL);
+		 */
+		if (bprm->envc > 0) {
+			retval = -EINVAL;
+			goto out_free;
+		}
+
+		/*
+		 * Force an argv of {"", NULL} if argc == 0 so that broken
+		 * userspace that assumes argc != 0 will not be surprised.
+		 */
+		bprm->argc = 1;
+		retval = copy_strings_kernel(bprm->argc, &argv0, bprm);
+		if (retval < 0)
+			goto out_free;
+	}
 
 	retval = bprm_execve(bprm, fd, filename, flags);
 out_free:


$ cat argc0.c
#include <stdio.h>
#include <unistd.h>

int main(int argc, char *argv[], char *envp[])
{
        if (argv[0][0] != '\0') {
                printf("execve(argv[0], NULL, envp);\n");
                execve(argv[0], NULL, envp);
                perror("execve");
                printf("execve(argv[0], NULL, NULL);\n");
                execve(argv[0], NULL, NULL);
                return 0;
        }
        printf("argc=%d\n", argc);
        printf("argv[0]%p=%s\n", &argv[0], argv[0]);
        printf("argv[1]%p=%s\n", &argv[1], argv[1]);
        printf("envp[0]%p=%s\n", &envp[0], envp[0]);
        return 0;
}

$ gcc -Wall argc0.c -o argc0
argc0.c: In function 'main':
argc0.c:8:3: warning: null argument where non-null required (argument 2) [-Wnonnull]
    8 |   execve(argv[0], NULL, envp);
      |   ^~~~~~
argc0.c:11:3: warning: null argument where non-null required (argument 2) [-Wnonnull]
   11 |   execve(argv[0], NULL, NULL);
      |   ^~~~~~

$ ./argc0
execve(argv[0], NULL, envp);
execve: Invalid argument
execve(argv[0], NULL, NULL);
argc=1
argv[0]0x7fff1f577bd8=
argv[1]0x7fff1f577be0=(null)
envp[0]0x7fff1f577be8=(null)

$ dmesg | tail -n1
[   20.748467] process './argc0' used a NULL argv
Ariadne Conill Jan. 26, 2022, 9:13 p.m. UTC | #12
Hi,

On Wed, 26 Jan 2022, Kees Cook wrote:

> On Wed, Jan 26, 2022 at 02:23:59PM -0600, Ariadne Conill wrote:
>> Hi,
>>
>> On Wed, 26 Jan 2022, Kees Cook wrote:
>>
>>> On Wed, Jan 26, 2022 at 11:44:47AM +0000, Ariadne Conill wrote:
>>>> In several other operating systems, it is a hard requirement that the
>>>> first argument to execve(2) be the name of a program, thus prohibiting
>>>> a scenario where argc < 1.  POSIX 2017 also recommends this behaviour,
>>>> but it is not an explicit requirement[0]:
>>>>
>>>>     The argument arg0 should point to a filename string that is
>>>>     associated with the process being started by one of the exec
>>>>     functions.
>>>>
>>>> To ensure that execve(2) with argc < 1 is not a useful gadget for
>>>> shellcode to use, we can validate this in do_execveat_common() and
>>>> fail for this scenario, effectively blocking successful exploitation
>>>> of CVE-2021-4034 and similar bugs which depend on this gadget.
>>>>
>>>> The use of -EFAULT for this case is similar to other systems, such
>>>> as FreeBSD, OpenBSD and Solaris.  QNX uses -EINVAL for this case.
>>>>
>>>> Interestingly, Michael Kerrisk opened an issue about this in 2008[1],
>>>> but there was no consensus to support fixing this issue then.
>>>> Hopefully now that CVE-2021-4034 shows practical exploitative use
>>>> of this bug in a shellcode, we can reconsider.
>>>>
>>>> [0]: https://pubs.opengroup.org/onlinepubs/9699919799/functions/exec.html
>>>> [1]: https://bugzilla.kernel.org/show_bug.cgi?id=8408
>>>>
>>>> Changes from v1:
>>>> - Rework commit message significantly.
>>>> - Make the argv[0] check explicit rather than hijacking the error-check
>>>>   for count().
>>>>
>>>> Signed-off-by: Ariadne Conill <ariadne@dereferenced.org>
>>>> ---
>>>>  fs/exec.c | 4 ++++
>>>>  1 file changed, 4 insertions(+)
>>>>
>>>> diff --git a/fs/exec.c b/fs/exec.c
>>>> index 79f2c9483302..e52c41991aab 100644
>>>> --- a/fs/exec.c
>>>> +++ b/fs/exec.c
>>>> @@ -1899,6 +1899,10 @@ static int do_execveat_common(int fd, struct filename *filename,
>>>>  	retval = count(argv, MAX_ARG_STRINGS);
>>>>  	if (retval < 0)
>>>>  		goto out_free;
>>>> +	if (retval == 0) {
>>>> +		retval = -EFAULT;
>>>> +		goto out_free;
>>>> +	}
>>>>  	bprm->argc = retval;
>>>>
>>>>  	retval = count(envp, MAX_ARG_STRINGS);
>>>> --
>>>> 2.34.1
>>>
>>> Okay, so, the dangerous condition is userspace iterating through envp
>>> when it thinks it's iterating argv.
>>>
>>> Assuming it is not okay to break valgrind's test suite:
>>> https://sources.debian.org/src/valgrind/1:3.18.1-1/none/tests/execve.c/?hl=22#L22
>>> we cannot reject a NULL argv (test will fail), and we cannot mutate
>>> argc=0 into argc=1 (test will enter infinite loop).
>>>
>>> Perhaps we need to reject argv=NULL when envp!=NULL, and add a
>>> pr_warn_once() about using a NULL argv?
>>
>> Sure, I can rework the patch to do it for only the envp != NULL case.
>>
>> I think we should combine it with the {NULL, NULL} padding patch in this
>> case though, since it appears to work, that way the execve(..., NULL, NULL)
>> case gets some protection.
>
> I don't think the padding will actually work correctly, for the reason
> Jann pointed out. My testing shows that suddenly my envp becomes NULL,
> but libc is just counting argc to find envp to pass into main.
>
>>> I note that glibc already warns about NULL argv:
>>> argc0.c:7:3: warning: null argument where non-null required (argument 2)
>>> [-Wnonnull]
>>>    7 |   execve(argv[0], NULL, envp);
>>>      |   ^~~~~~
>>>
>>> in the future we could expand this to only looking at argv=NULL?
>>
>> I don't think musl's headers generate a diagnostic for this, but main(0,
>> {NULL}) is not a supported use-case at least as far as Alpine is concerned.
>> I am sure it is the same with the other musl distributions.
>>
>> Will send a v3 patch with this logic change and move to EINVAL shortly.
>
> I took a spin too. Refuses execve(..., NULL, !NULL), injects "" argv[0]
> for execve(..., NULL, NULL):
>
>
> diff --git a/fs/exec.c b/fs/exec.c
> index a098c133d8d7..0565089d5f9e 100644
> --- a/fs/exec.c
> +++ b/fs/exec.c
> @@ -1917,9 +1917,40 @@ static int do_execveat_common(int fd, struct filename *filename,
> 	if (retval < 0)
> 		goto out_free;
>
> -	retval = copy_strings(bprm->argc, argv, bprm);
> -	if (retval < 0)
> -		goto out_free;
> +	if (likely(bprm->argc > 0)) {
> +		retval = copy_strings(bprm->argc, argv, bprm);
> +		if (retval < 0)
> +			goto out_free;
> +	} else {
> +		const char * const argv0 = "";
> +
> +		/*
> +		 * Start making some noise about the argc == NULL case that
> +		 * POSIX doesn't like and other Unix-like systems refuse.
> +		 */
> +		pr_warn_once("process '%s' used a NULL argv\n", bprm->filename);
> +
> +		/*
> +		 * Refuse to execute when argc == 0 and envc > 0, since this
> +		 * can lead to userspace iterating envp if it fails to check
> +		 * for argc == 0.
> +		 *
> +		 * i.e. continue to allow: execve(path, NULL, NULL);
> +		 */
> +		if (bprm->envc > 0) {
> +			retval = -EINVAL;
> +			goto out_free;
> +		}
> +
> +		/*
> +		 * Force an argv of {"", NULL} if argc == 0 so that broken
> +		 * userspace that assumes argc != 0 will not be surprised.
> +		 */
> +		bprm->argc = 1;
> +		retval = copy_strings_kernel(bprm->argc, &argv0, bprm);
> +		if (retval < 0)
> +			goto out_free;
> +	}
>
> 	retval = bprm_execve(bprm, fd, filename, flags);
> out_free:

Looks good to me, but I wonder if we shouldn't set an argv of 
{bprm->filename, NULL} instead of {"", NULL}.  Discussion in IRC led to 
the realization that multicall programs will try to use argv[0] and might 
crash in this scenario.  If we're going to fake an argv, I guess we should 
try to do it right.

Ariadne
Kees Cook Jan. 26, 2022, 9:25 p.m. UTC | #13
On Wed, Jan 26, 2022 at 03:13:10PM -0600, Ariadne Conill wrote:
> Looks good to me, but I wonder if we shouldn't set an argv of
> {bprm->filename, NULL} instead of {"", NULL}.  Discussion in IRC led to the
> realization that multicall programs will try to use argv[0] and might crash
> in this scenario.  If we're going to fake an argv, I guess we should try to
> do it right.

They're crashing currently, though, yes? I think the goal is to move
toward making execve(..., NULL, NULL) just not work at all. Using the
{"", NULL} injection just gets us closer to protecting a bad userspace
program. I think things _should_ crash if they try to start depending
on this work-around.
Ariadne Conill Jan. 26, 2022, 9:30 p.m. UTC | #14
Hi,

On Wed, 26 Jan 2022, Kees Cook wrote:

> On Wed, Jan 26, 2022 at 03:13:10PM -0600, Ariadne Conill wrote:
>> Looks good to me, but I wonder if we shouldn't set an argv of
>> {bprm->filename, NULL} instead of {"", NULL}.  Discussion in IRC led to the
>> realization that multicall programs will try to use argv[0] and might crash
>> in this scenario.  If we're going to fake an argv, I guess we should try to
>> do it right.
>
> They're crashing currently, though, yes? I think the goal is to move
> toward making execve(..., NULL, NULL) just not work at all. Using the
> {"", NULL} injection just gets us closer to protecting a bad userspace
> program. I think things _should_ crash if they try to start depending
> on this work-around.

Is there a reason to spawn a program, just to have it crash, rather than 
just denying it to begin with, though?

I mean, it all seems fine enough, and perhaps I'm just a bit picky on the 
colors and flavors of my bikesheds, so if you want to go with this patch, 
I'll be glad to carry it in the Alpine security update I am doing to make 
sure the *other* GLib-using SUID programs people find don't get exploited 
in the same way.

Ariadne
Kees Cook Jan. 26, 2022, 10:49 p.m. UTC | #15
On Wed, Jan 26, 2022 at 03:30:13PM -0600, Ariadne Conill wrote:
> Hi,
> 
> On Wed, 26 Jan 2022, Kees Cook wrote:
> 
> > On Wed, Jan 26, 2022 at 03:13:10PM -0600, Ariadne Conill wrote:
> > > Looks good to me, but I wonder if we shouldn't set an argv of
> > > {bprm->filename, NULL} instead of {"", NULL}.  Discussion in IRC led to the
> > > realization that multicall programs will try to use argv[0] and might crash
> > > in this scenario.  If we're going to fake an argv, I guess we should try to
> > > do it right.
> > 
> > They're crashing currently, though, yes? I think the goal is to move
> > toward making execve(..., NULL, NULL) just not work at all. Using the
> > {"", NULL} injection just gets us closer to protecting a bad userspace
> > program. I think things _should_ crash if they try to start depending
> > on this work-around.
> 
> Is there a reason to spawn a program, just to have it crash, rather than
> just denying it to begin with, though?

I think the correct behavior here is to unconditionally reject a NULL
argv -- and I wish this had gotten fixed in 2008. :P Given the code we've
found that depends on NULL argv, I think we likely can't make the change
outright, so we're down this weird rabbit hole of trying to reject what we
can and create work-around behaviors for the cases that currently exist.
I think new users of the new work-around shouldn't be considered. We'd
prefer they get a rejection, etc.

> I mean, it all seems fine enough, and perhaps I'm just a bit picky on the
> colors and flavors of my bikesheds, so if you want to go with this patch,
> I'll be glad to carry it in the Alpine security update I am doing to make
> sure the *other* GLib-using SUID programs people find don't get exploited in
> the same way.

They "don't break userspace" guideline is really "don't break userspace
if someone notices". :P Since this is a mitigation (not strictly a
security flaw fix), changes to userspace behavior tend to be very
conservatively viewed by Linus. ;)

My preference is the earlier very simple version to fix this:

diff --git a/fs/exec.c b/fs/exec.c
index 79f2c9483302..aabadcf4a525 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -1897,6 +1897,8 @@ static int do_execveat_common(int fd, struct filename *filename,
 	}
 
 	retval = count(argv, MAX_ARG_STRINGS);
+	if (reval == 0)
+		retval = -EINVAL;
 	if (retval < 0)
 		goto out_free;
 	bprm->argc = retval;

So, I guess we should start there and send a patch to valgrind?
Ariadne Conill Jan. 26, 2022, 11:07 p.m. UTC | #16
Hi,

On Wed, 26 Jan 2022, Kees Cook wrote:

> On Wed, Jan 26, 2022 at 03:30:13PM -0600, Ariadne Conill wrote:
>> Hi,
>>
>> On Wed, 26 Jan 2022, Kees Cook wrote:
>>
>>> On Wed, Jan 26, 2022 at 03:13:10PM -0600, Ariadne Conill wrote:
>>>> Looks good to me, but I wonder if we shouldn't set an argv of
>>>> {bprm->filename, NULL} instead of {"", NULL}.  Discussion in IRC led to the
>>>> realization that multicall programs will try to use argv[0] and might crash
>>>> in this scenario.  If we're going to fake an argv, I guess we should try to
>>>> do it right.
>>>
>>> They're crashing currently, though, yes? I think the goal is to move
>>> toward making execve(..., NULL, NULL) just not work at all. Using the
>>> {"", NULL} injection just gets us closer to protecting a bad userspace
>>> program. I think things _should_ crash if they try to start depending
>>> on this work-around.
>>
>> Is there a reason to spawn a program, just to have it crash, rather than
>> just denying it to begin with, though?
>
> I think the correct behavior here is to unconditionally reject a NULL
> argv -- and I wish this had gotten fixed in 2008. :P Given the code we've
> found that depends on NULL argv, I think we likely can't make the change
> outright, so we're down this weird rabbit hole of trying to reject what we
> can and create work-around behaviors for the cases that currently exist.
> I think new users of the new work-around shouldn't be considered. We'd
> prefer they get a rejection, etc.
>
>> I mean, it all seems fine enough, and perhaps I'm just a bit picky on the
>> colors and flavors of my bikesheds, so if you want to go with this patch,
>> I'll be glad to carry it in the Alpine security update I am doing to make
>> sure the *other* GLib-using SUID programs people find don't get exploited in
>> the same way.
>
> They "don't break userspace" guideline is really "don't break userspace
> if someone notices". :P Since this is a mitigation (not strictly a
> security flaw fix), changes to userspace behavior tend to be very
> conservatively viewed by Linus. ;)
>
> My preference is the earlier very simple version to fix this:
>
> diff --git a/fs/exec.c b/fs/exec.c
> index 79f2c9483302..aabadcf4a525 100644
> --- a/fs/exec.c
> +++ b/fs/exec.c
> @@ -1897,6 +1897,8 @@ static int do_execveat_common(int fd, struct filename *filename,
> 	}
>
> 	retval = count(argv, MAX_ARG_STRINGS);
> +	if (reval == 0)
> +		retval = -EINVAL;
> 	if (retval < 0)
> 		goto out_free;
> 	bprm->argc = retval;
>
> So, I guess we should start there and send a patch to valgrind?

Yes, seems reasonable, though without the typo :)

Since you've already written the patch, do you want to proceed with it?
If so, I can work on the Valgrind tests.

Ariadne
diff mbox series

Patch

diff --git a/fs/exec.c b/fs/exec.c
index 79f2c9483302..e52c41991aab 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -1899,6 +1899,10 @@  static int do_execveat_common(int fd, struct filename *filename,
 	retval = count(argv, MAX_ARG_STRINGS);
 	if (retval < 0)
 		goto out_free;
+	if (retval == 0) {
+		retval = -EFAULT;
+		goto out_free;
+	}
 	bprm->argc = retval;
 
 	retval = count(envp, MAX_ARG_STRINGS);