[3/4] proc: io_accounting: Use new infrastructure to fix deadlocks in execve

Message ID AM6PR03MB5170BD2476E35068E182EFA4E4FF0@AM6PR03MB5170.eurprd03.prod.outlook.com (mailing list archive)
State New, archived
Series Use new infrastructure to fix deadlocks in execve

Commit Message

Bernd Edlinger March 10, 2020, 5:45 p.m. UTC
This changes do_io_accounting to use the new exec_update_mutex
instead of cred_guard_mutex.

This fixes possible deadlocks when the tracer is accessing
/proc/$pid/io, for instance.

This should be safe, as the credentials are only used for reading.

Signed-off-by: Bernd Edlinger <bernd.edlinger@hotmail.de>
---
 fs/proc/base.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

Comments

Eric W. Biederman March 10, 2020, 7:06 p.m. UTC | #1
Bernd Edlinger <bernd.edlinger@hotmail.de> writes:

> This changes do_io_accounting to use the new exec_update_mutex
> instead of cred_guard_mutex.
>
> This fixes possible deadlocks when the trace is accessing
> /proc/$pid/io for instance.
>
> This should be safe, as the credentials are only used for reading.

This is an improvement.

We probably want to do this just as an incremental step in making things
better.  Perhaps I am blind, but I do not find the reason for
guarding this with the cred_guard_mutex at all persuasive.

I think moving the ptrace_may_access check down to after the
unlock_task_sighand would be just as effective at addressing the
concerns raised in the original commit.  I think the task_lock provides
all of the barrier we need to make it safe to move the ptrace_may_access
checks.

The reason I say this is I don't see exec changing ->ioac.  Just
performing some I/O which would update the io accounting statistics.
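
One way to picture the reordering suggested above is the rough userspace
model below. It is only a sketch under stated assumptions, not kernel code:
sketch_task, sketch_may_access and sketch_io_accounting are hypothetical
names, a plain uid comparison stands in for ptrace_may_access(), and a
single counter stands in for ->ioac. The counters are copied first and the
permission check runs afterwards, so a copy taken from a task whose
credentials have since changed is discarded instead of being reported.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-ins; none of these names exist in the kernel. */
struct sketch_task {
	unsigned int uid;              /* stand-in for the task's credentials */
	unsigned long long read_bytes; /* stand-in for the ->ioac counters */
};

/* Stand-in for ptrace_may_access(): only a matching uid may read. */
static bool sketch_may_access(const struct sketch_task *t,
			      unsigned int reader_uid)
{
	return t->uid == reader_uid;
}

/* Gather into a private copy first, then decide whether to publish it. */
static int sketch_io_accounting(const struct sketch_task *t,
				unsigned int reader_uid,
				unsigned long long *out)
{
	unsigned long long snapshot = t->read_bytes; /* gathering step */

	if (!sketch_may_access(t, reader_uid))       /* check runs after it */
		return -EACCES;                      /* snapshot is dropped */

	*out = snapshot;
	return 0;
}
```

The model says nothing about which kernel lock would provide the ordering
between the two steps; that is exactly the open question in this thread.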

Can anyone see if I am wrong?

Eric


commit 293eb1e7772b25a93647c798c7b89bf26c2da2e0
Author: Vasiliy Kulikov <segoon@openwall.com>
Date:   Tue Jul 26 16:08:38 2011 -0700

    proc: fix a race in do_io_accounting()
    
    If an inode's mode permits opening /proc/PID/io and the resulting file
    descriptor is kept across execve() of a setuid or similar binary, the
    ptrace_may_access() check tries to prevent using this fd against the
    task with escalated privileges.
    
    Unfortunately, there is a race in the check against execve().  If
    execve() is processed after the ptrace check, but before the actual io
    information gathering, io statistics will be gathered from the
    privileged process.  At least in theory this might lead to gathering
    sensible information (like ssh/ftp password length) that wouldn't be
    available otherwise.
    
    Holding task->signal->cred_guard_mutex while gathering the io
    information should protect against the race.
    
    The order of locking is similar to the one inside of ptrace_attach():
    first goes cred_guard_mutex, then lock_task_sighand().
    
    Signed-off-by: Vasiliy Kulikov <segoon@openwall.com>
    Cc: Al Viro <viro@zeniv.linux.org.uk>
    Cc: <stable@kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>



> Signed-off-by: Bernd Edlinger <bernd.edlinger@hotmail.de>
> ---
>  fs/proc/base.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/fs/proc/base.c b/fs/proc/base.c
> index 4fdfe4f..529d0c6 100644
> --- a/fs/proc/base.c
> +++ b/fs/proc/base.c
> @@ -2770,7 +2770,7 @@ static int do_io_accounting(struct task_struct *task, struct seq_file *m, int wh
>  	unsigned long flags;
>  	int result;
>  
> -	result = mutex_lock_killable(&task->signal->cred_guard_mutex);
> +	result = mutex_lock_killable(&task->signal->exec_update_mutex);
>  	if (result)
>  		return result;
>  
> @@ -2806,7 +2806,7 @@ static int do_io_accounting(struct task_struct *task, struct seq_file *m, int wh
>  	result = 0;
>  
>  out_unlock:
> -	mutex_unlock(&task->signal->cred_guard_mutex);
> +	mutex_unlock(&task->signal->exec_update_mutex);
>  	return result;
>  }
Bernd Edlinger March 10, 2020, 8:19 p.m. UTC | #2
On 3/10/20 8:06 PM, Eric W. Biederman wrote:
> Bernd Edlinger <bernd.edlinger@hotmail.de> writes:
> 
>> This changes do_io_accounting to use the new exec_update_mutex
>> instead of cred_guard_mutex.
>>
>> This fixes possible deadlocks when the trace is accessing
>> /proc/$pid/io for instance.
>>
>> This should be safe, as the credentials are only used for reading.
> 
> This is an improvement.
> 
> We probably want to do this just as an incremental step in making things
> better but perhaps I am blind but I am not finding the reason for
> guarding this with the cred_guard_mutex to be at all persuasive.
> 
> I think moving the ptrace_may_access check down to after the
> unlock_task_sighand would be just as effective at addressing the
> concerns raised in the original commit.  I think the task_lock provides
> all of the barrier we need to make it safe to move the ptrace_may_access
> checks safe.
> 
> The reason I say this is I don't see exec changing ->ioac.  Just
> performing some I/O which would update the io accounting statistics.
> 

Maybe the suid executable does I/O while starting up, or maybe it does not,
and what the program does immediately at startup is a secret
that we want to keep, but that evil Eve wants to find out.
Eve is using /proc/alice/io to do that.

It is a bit contrived, but it seems like a security concern.
When we hold the exec_update_mutex while collecting the data, we
cannot see any I/O of the new process when the new credentials
don't allow that.


Bernd.

> Can anyone see if I am wrong?
> 
> Eric
> 
> 
> commit 293eb1e7772b25a93647c798c7b89bf26c2da2e0
> Author: Vasiliy Kulikov <segoon@openwall.com>
> Date:   Tue Jul 26 16:08:38 2011 -0700
> 
>     proc: fix a race in do_io_accounting()
>     
>     If an inode's mode permits opening /proc/PID/io and the resulting file
>     descriptor is kept across execve() of a setuid or similar binary, the
>     ptrace_may_access() check tries to prevent using this fd against the
>     task with escalated privileges.
>     
>     Unfortunately, there is a race in the check against execve().  If
>     execve() is processed after the ptrace check, but before the actual io
>     information gathering, io statistics will be gathered from the
>     privileged process.  At least in theory this might lead to gathering
>     sensible information (like ssh/ftp password length) that wouldn't be
>     available otherwise.
>     
>     Holding task->signal->cred_guard_mutex while gathering the io
>     information should protect against the race.
>     
>     The order of locking is similar to the one inside of ptrace_attach():
>     first goes cred_guard_mutex, then lock_task_sighand().
>     
>     Signed-off-by: Vasiliy Kulikov <segoon@openwall.com>
>     Cc: Al Viro <viro@zeniv.linux.org.uk>
>     Cc: <stable@kernel.org>
>     Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
>     Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
> 
> 
> 
>> Signed-off-by: Bernd Edlinger <bernd.edlinger@hotmail.de>
>> ---
>>  fs/proc/base.c | 4 ++--
>>  1 file changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/fs/proc/base.c b/fs/proc/base.c
>> index 4fdfe4f..529d0c6 100644
>> --- a/fs/proc/base.c
>> +++ b/fs/proc/base.c
>> @@ -2770,7 +2770,7 @@ static int do_io_accounting(struct task_struct *task, struct seq_file *m, int wh
>>  	unsigned long flags;
>>  	int result;
>>  
>> -	result = mutex_lock_killable(&task->signal->cred_guard_mutex);
>> +	result = mutex_lock_killable(&task->signal->exec_update_mutex);
>>  	if (result)
>>  		return result;
>>  
>> @@ -2806,7 +2806,7 @@ static int do_io_accounting(struct task_struct *task, struct seq_file *m, int wh
>>  	result = 0;
>>  
>>  out_unlock:
>> -	mutex_unlock(&task->signal->cred_guard_mutex);
>> +	mutex_unlock(&task->signal->exec_update_mutex);
>>  	return result;
>>  }
Eric W. Biederman March 10, 2020, 9:25 p.m. UTC | #3
Bernd Edlinger <bernd.edlinger@hotmail.de> writes:

> On 3/10/20 8:06 PM, Eric W. Biederman wrote:
>> Bernd Edlinger <bernd.edlinger@hotmail.de> writes:
>> 
>>> This changes do_io_accounting to use the new exec_update_mutex
>>> instead of cred_guard_mutex.
>>>
>>> This fixes possible deadlocks when the trace is accessing
>>> /proc/$pid/io for instance.
>>>
>>> This should be safe, as the credentials are only used for reading.
>> 
>> This is an improvement.
>> 
>> We probably want to do this just as an incremental step in making things
>> better but perhaps I am blind but I am not finding the reason for
>> guarding this with the cred_guard_mutex to be at all persuasive.
>> 
>> I think moving the ptrace_may_access check down to after the
>> unlock_task_sighand would be just as effective at addressing the
>> concerns raised in the original commit.  I think the task_lock provides
>> all of the barrier we need to make it safe to move the ptrace_may_access
>> checks safe.
>> 
>> The reason I say this is I don't see exec changing ->ioac.  Just
>> performing some I/O which would update the io accounting statistics.
>> 
>
> Maybe the suid executable is starting up and doing io or not,
> and what the program does immediately at startup is a secret,
> that we want to keep secret but evil eve want to find out.
> eve is using /proc/alice/io to do that.
>
> It is a bit constructed, but seems like a security concern.
> when we keep the exec_update_mutex while collecting the data, we
> cannot see any io of the new process when the new credentials
> don't allow that.

Jann Horn has convinced me we should just convert these to the
exec_update_mutex today.  While not 100% correct in theory, the
only really interesting case is exec.  So the code does something
interesting and worthwhile, and is mostly correct.  The last thing I want
to do is to cause an unnecessary regression.

Eric
Kees Cook March 11, 2020, 7:08 p.m. UTC | #4
On Tue, Mar 10, 2020 at 06:45:47PM +0100, Bernd Edlinger wrote:
> This changes do_io_accounting to use the new exec_update_mutex
> instead of cred_guard_mutex.
> 
> This fixes possible deadlocks when the trace is accessing
> /proc/$pid/io for instance.
> 
> This should be safe, as the credentials are only used for reading.

I'd like to see the rationale described better here for why it should be
safe. I'm still not seeing why this is safe here, as we might check
ptrace_may_access() with one cred and then iterate io accounting with a
different credential...

What am I missing?

-Kees

> 
> Signed-off-by: Bernd Edlinger <bernd.edlinger@hotmail.de>
> ---
>  fs/proc/base.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/fs/proc/base.c b/fs/proc/base.c
> index 4fdfe4f..529d0c6 100644
> --- a/fs/proc/base.c
> +++ b/fs/proc/base.c
> @@ -2770,7 +2770,7 @@ static int do_io_accounting(struct task_struct *task, struct seq_file *m, int wh
>  	unsigned long flags;
>  	int result;
>  
> -	result = mutex_lock_killable(&task->signal->cred_guard_mutex);
> +	result = mutex_lock_killable(&task->signal->exec_update_mutex);
>  	if (result)
>  		return result;
>  
> @@ -2806,7 +2806,7 @@ static int do_io_accounting(struct task_struct *task, struct seq_file *m, int wh
>  	result = 0;
>  
>  out_unlock:
> -	mutex_unlock(&task->signal->cred_guard_mutex);
> +	mutex_unlock(&task->signal->exec_update_mutex);
>  	return result;
>  }
>  
> -- 
> 1.9.1
Bernd Edlinger March 11, 2020, 7:48 p.m. UTC | #5
On 3/11/20 8:08 PM, Kees Cook wrote:
> On Tue, Mar 10, 2020 at 06:45:47PM +0100, Bernd Edlinger wrote:
>> This changes do_io_accounting to use the new exec_update_mutex
>> instead of cred_guard_mutex.
>>
>> This fixes possible deadlocks when the trace is accessing
>> /proc/$pid/io for instance.
>>
>> This should be safe, as the credentials are only used for reading.
> 
> I'd like to see the rationale described better here for why it should be
> safe. I'm still not seeing why this is safe here, as we might check
> ptrace_may_access() with one cred and then iterate io accounting with a
> different credential...
> 
> What am I missing?
> 

The same applies here: even if execve has already started, the credentials
are not actually changed until execve has acquired the exec_update_mutex.

The data flow is from task->cred to do_io_accounting.
If the data flow were from do_io_accounting to the task's no new privs,
you would see an entirely different patch.

I am open to suggestions on how to improve the description, or even
to adding a comment from time to time :)

Thanks
Bernd.

> -Kees
> 
>>
>> Signed-off-by: Bernd Edlinger <bernd.edlinger@hotmail.de>
>> ---
>>  fs/proc/base.c | 4 ++--
>>  1 file changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/fs/proc/base.c b/fs/proc/base.c
>> index 4fdfe4f..529d0c6 100644
>> --- a/fs/proc/base.c
>> +++ b/fs/proc/base.c
>> @@ -2770,7 +2770,7 @@ static int do_io_accounting(struct task_struct *task, struct seq_file *m, int wh
>>  	unsigned long flags;
>>  	int result;
>>  
>> -	result = mutex_lock_killable(&task->signal->cred_guard_mutex);
>> +	result = mutex_lock_killable(&task->signal->exec_update_mutex);
>>  	if (result)
>>  		return result;
>>  
>> @@ -2806,7 +2806,7 @@ static int do_io_accounting(struct task_struct *task, struct seq_file *m, int wh
>>  	result = 0;
>>  
>>  out_unlock:
>> -	mutex_unlock(&task->signal->cred_guard_mutex);
>> +	mutex_unlock(&task->signal->exec_update_mutex);
>>  	return result;
>>  }
>>  
>> -- 
>> 1.9.1
>
Eric W. Biederman March 11, 2020, 7:48 p.m. UTC | #6
Kees Cook <keescook@chromium.org> writes:

> On Tue, Mar 10, 2020 at 06:45:47PM +0100, Bernd Edlinger wrote:
>> This changes do_io_accounting to use the new exec_update_mutex
>> instead of cred_guard_mutex.
>> 
>> This fixes possible deadlocks when the trace is accessing
>> /proc/$pid/io for instance.
>> 
>> This should be safe, as the credentials are only used for reading.
>
> I'd like to see the rationale described better here for why it should be
> safe. I'm still not seeing why this is safe here, as we might check
> ptrace_may_access() with one cred and then iterate io accounting with a
> different credential...
>
> What am I missing?

The rationale for non-regression is that exec_update_mutex covers all
of the same tsk->cred changes as cred_guard_mutex.  Therefore we are not
any worse off, and we avoid the deadlock.

As for safety, Jann's argument that the only interesting credential
change is in exec applies.  All other credential changes that have any
effect on permission checks make the new cred non-dumpable (exceptions
apply; see the code).

So I think this is a non-regressing change.  A safe change.

I don't think either version of this code is fully correct.

Eric

>> Signed-off-by: Bernd Edlinger <bernd.edlinger@hotmail.de>
>> ---
>>  fs/proc/base.c | 4 ++--
>>  1 file changed, 2 insertions(+), 2 deletions(-)
>> 
>> diff --git a/fs/proc/base.c b/fs/proc/base.c
>> index 4fdfe4f..529d0c6 100644
>> --- a/fs/proc/base.c
>> +++ b/fs/proc/base.c
>> @@ -2770,7 +2770,7 @@ static int do_io_accounting(struct task_struct *task, struct seq_file *m, int wh
>>  	unsigned long flags;
>>  	int result;
>>  
>> -	result = mutex_lock_killable(&task->signal->cred_guard_mutex);
>> +	result = mutex_lock_killable(&task->signal->exec_update_mutex);
>>  	if (result)
>>  		return result;
>>  
>> @@ -2806,7 +2806,7 @@ static int do_io_accounting(struct task_struct *task, struct seq_file *m, int wh
>>  	result = 0;
>>  
>>  out_unlock:
>> -	mutex_unlock(&task->signal->cred_guard_mutex);
>> +	mutex_unlock(&task->signal->exec_update_mutex);
>>  	return result;
>>  }
>>  
>> -- 
>> 1.9.1
Patch

diff --git a/fs/proc/base.c b/fs/proc/base.c
index 4fdfe4f..529d0c6 100644
--- a/fs/proc/base.c
+++ b/fs/proc/base.c
@@ -2770,7 +2770,7 @@  static int do_io_accounting(struct task_struct *task, struct seq_file *m, int wh
 	unsigned long flags;
 	int result;
 
-	result = mutex_lock_killable(&task->signal->cred_guard_mutex);
+	result = mutex_lock_killable(&task->signal->exec_update_mutex);
 	if (result)
 		return result;
 
@@ -2806,7 +2806,7 @@  static int do_io_accounting(struct task_struct *task, struct seq_file *m, int wh
 	result = 0;
 
 out_unlock:
-	mutex_unlock(&task->signal->cred_guard_mutex);
+	mutex_unlock(&task->signal->exec_update_mutex);
 	return result;
 }
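
The locking pattern the patch relies on can be sketched in userspace as
follows. This is a minimal model under stated assumptions, not the kernel
implementation: model_task, model_may_access, model_io_accounting and
model_exec_setuid are hypothetical names, a pthread mutex stands in for
exec_update_mutex, and a uid comparison stands in for ptrace_may_access().
It shows the permission check and the counter read happening under the
same mutex that the exec side holds while it switches credentials, so the
two cannot be separated by a credential change.

```c
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for the parts of task_struct used here. */
struct model_task {
	pthread_mutex_t exec_update_mutex; /* held by exec while creds change */
	unsigned int uid;                  /* stand-in for task->cred */
	unsigned long long read_bytes;     /* stand-in for task->ioac */
};

/* Stand-in for ptrace_may_access(): only a matching uid may read. */
static bool model_may_access(const struct model_task *t,
			     unsigned int reader_uid)
{
	return t->uid == reader_uid;
}

/* Model of the patched reader: check and read under one lock. */
static int model_io_accounting(struct model_task *t, unsigned int reader_uid,
			       unsigned long long *out)
{
	int result;

	pthread_mutex_lock(&t->exec_update_mutex);
	if (!model_may_access(t, reader_uid)) {
		result = -EACCES;
		goto out_unlock;
	}
	*out = t->read_bytes;
	result = 0;

out_unlock:
	pthread_mutex_unlock(&t->exec_update_mutex);
	return result;
}

/* Model of a setuid exec: credentials switch under the same lock. */
static void model_exec_setuid(struct model_task *t, unsigned int new_uid)
{
	pthread_mutex_lock(&t->exec_update_mutex);
	t->uid = new_uid;
	pthread_mutex_unlock(&t->exec_update_mutex);
}
```

In this model a reader with uid 1000 can read the counter before
model_exec_setuid(&t, 0) runs and gets -EACCES afterwards. The real
function additionally takes lock_task_sighand() while summing per-thread
counters, which the sketch leaves out.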