
[v5,5/6] prctl: Allow checkpoint/restore capable processes to change exe link

Message ID 20200715144954.1387760-6-areber@redhat.com (mailing list archive)
State New, archived
Series capabilities: Introduce CAP_CHECKPOINT_RESTORE

Commit Message

Adrian Reber July 15, 2020, 2:49 p.m. UTC
From: Nicolas Viennot <Nicolas.Viennot@twosigma.com>

Allow CAP_CHECKPOINT_RESTORE capable users to change /proc/self/exe.

This commit also changes the permission error code from -EINVAL to
-EPERM for consistency with the rest of the prctl() syscall when
checking capabilities.

Signed-off-by: Nicolas Viennot <Nicolas.Viennot@twosigma.com>
Signed-off-by: Adrian Reber <areber@redhat.com>
---
 kernel/sys.c | 12 +++++++-----
 1 file changed, 7 insertions(+), 5 deletions(-)
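
Because the error code changes, a userspace caller that inspects errno after
a failed prctl(PR_SET_MM, PR_SET_MM_MAP, ...) with exe_fd set may want to
accept both values. The following is a minimal sketch, not part of the patch;
the enum and the helper name classify_exe_link_errno() are hypothetical and
exist only for illustration. It assumes EPERM most likely means a missing
capability, while EINVAL stays ambiguous because kernels without this change
use it for the capability check and for invalid arguments alike.

```c
#include <errno.h>

/*
 * Hypothetical classification of an errno value returned by a failed
 * attempt to change the exe link through PR_SET_MM_MAP.
 */
enum exe_link_denial {
	DENIAL_NO,      /* errno does not point at a capability problem */
	DENIAL_MAYBE,   /* EINVAL: old-style denial or bad arguments */
	DENIAL_LIKELY   /* EPERM: denial as reported with this patch */
};

/* Map errno to a denial class; assumption-based, see the notes above. */
static enum exe_link_denial classify_exe_link_errno(int err)
{
	switch (err) {
	case EPERM:
		return DENIAL_LIKELY;
	case EINVAL:
		return DENIAL_MAYBE;
	default:
		return DENIAL_NO;
	}
}
```

A caller would pass errno right after the failed prctl() call and treat
DENIAL_MAYBE as a hint to validate its arguments before concluding that a
capability is missing.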

Comments

Christian Brauner July 15, 2020, 3:20 p.m. UTC | #1
On Wed, Jul 15, 2020 at 04:49:53PM +0200, Adrian Reber wrote:
> From: Nicolas Viennot <Nicolas.Viennot@twosigma.com>
> 
> Allow CAP_CHECKPOINT_RESTORE capable users to change /proc/self/exe.
> 
> This commit also changes the permission error code from -EINVAL to
> -EPERM for consistency with the rest of the prctl() syscall when
> checking capabilities.

I agree that EINVAL seems weird here but this is a potentially user
visible change. Might be nice to have the EINVAL->EPERM change be an
additional patch on top after this one so we can revert it in case it
breaks someone (unlikely though). I can split this out myself though so
no need to resend for that alone.

What I would also prefer is to have some history in the commit message
tbh. The reason is that when we started discussing that specific change
I had to hunt down the history of changing /proc/self/exe and had to
dig up and read through ancient threads on lore to come up with the
explanation why this is placed under a capability. The commit message
should then also mention that there are other ways to change the
/proc/self/exe link that don't require capabilities and that
/proc/self/exe itself is not something userspace should rely on for
security. Mainly so that in a few months/years we can read through that
commit message and go "Weird, but ok.". :)

But maybe I can just rewrite this myself so you don't have to go through
the trouble. This is really not pedantry it's just that it's a lot of
work digging up the reasons for a piece of code existing when it's
really not obvious. :)

Christian

> 
> Signed-off-by: Nicolas Viennot <Nicolas.Viennot@twosigma.com>
> Signed-off-by: Adrian Reber <areber@redhat.com>
> ---
>  kernel/sys.c | 12 +++++++-----
>  1 file changed, 7 insertions(+), 5 deletions(-)
> 
> diff --git a/kernel/sys.c b/kernel/sys.c
> index 00a96746e28a..dd59b9142b1d 100644
> --- a/kernel/sys.c
> +++ b/kernel/sys.c
> @@ -2007,12 +2007,14 @@ static int prctl_set_mm_map(int opt, const void __user *addr, unsigned long data
>  
>  	if (prctl_map.exe_fd != (u32)-1) {
>  		/*
> -		 * Make sure the caller has the rights to
> -		 * change /proc/pid/exe link: only local sys admin should
> -		 * be allowed to.
> +		 * Check if the current user is checkpoint/restore capable.
> +		 * At the time of this writing, it checks for CAP_SYS_ADMIN
> +		 * or CAP_CHECKPOINT_RESTORE.
> +		 * Note that a user with access to ptrace can masquerade an
> +		 * arbitrary program as any executable, even setuid ones.
>  		 */
> -		if (!ns_capable(current_user_ns(), CAP_SYS_ADMIN))
> -			return -EINVAL;
> +		if (!checkpoint_restore_ns_capable(current_user_ns()))
> +			return -EPERM;
>  
>  		error = prctl_set_mm_exe_file(mm, prctl_map.exe_fd);
>  		if (error)
> -- 
> 2.26.2
>
Nicolas Viennot July 15, 2020, 3:49 p.m. UTC | #2
> On Wed, Jul 15, 2020 at 04:49:53PM +0200, Adrian Reber wrote:
> > From: Nicolas Viennot <Nicolas.Viennot@twosigma.com>
> > 
> > Allow CAP_CHECKPOINT_RESTORE capable users to change /proc/self/exe.
> > 
> > This commit also changes the permission error code from -EINVAL to 
> > -EPERM for consistency with the rest of the prctl() syscall when 
> > checking capabilities.
>
> I agree that EINVAL seems weird here but this is a potentially user
> visible change. Might be nice to have the EINVAL->EPERM change be an
> additional patch on top after this one so we can revert it in case it
> breaks someone (unlikely though). I can split this out myself though so
> no need to resend for that alone.
>
> What I would also prefer is to have some history in the commit message
> tbh. The reason is that when we started discussing that specific change
> I had to hunt down the history of changing /proc/self/exe and had to
> dig up and read through ancient threads on lore to come up with the
> explanation why this is placed under a capability. The commit message
> should then also mention that there are other ways to change the
> /proc/self/exe link that don't require capabilities and that
> /proc/self/exe itself is not something userspace should rely on for
> security. Mainly so that in a few months/years we can read through that
> commit message and go "Weird, but ok.". :)
>
> But maybe I can just rewrite this myself so you don't have to go through
> the trouble. This is really not pedantry it's just that it's a lot of
> work digging up the reasons for a piece of code existing when it's
> really not obvious. :)

Hello Christian,

I agree.
Thank you for suggesting doing the work, but you've done plenty already. So we'll come back to you with:
1) A separate commit for EINVAL->EPERM
2) A full history of discussions in the commit message related to /proc/self/exe capability check

Thanks,
Nico

Patch

diff --git a/kernel/sys.c b/kernel/sys.c
index 00a96746e28a..dd59b9142b1d 100644
--- a/kernel/sys.c
+++ b/kernel/sys.c
@@ -2007,12 +2007,14 @@  static int prctl_set_mm_map(int opt, const void __user *addr, unsigned long data
 
 	if (prctl_map.exe_fd != (u32)-1) {
 		/*
-		 * Make sure the caller has the rights to
-		 * change /proc/pid/exe link: only local sys admin should
-		 * be allowed to.
+		 * Check if the current user is checkpoint/restore capable.
+		 * At the time of this writing, it checks for CAP_SYS_ADMIN
+		 * or CAP_CHECKPOINT_RESTORE.
+		 * Note that a user with access to ptrace can masquerade an
+		 * arbitrary program as any executable, even setuid ones.
 		 */
-		if (!ns_capable(current_user_ns(), CAP_SYS_ADMIN))
-			return -EINVAL;
+		if (!checkpoint_restore_ns_capable(current_user_ns()))
+			return -EPERM;
 
 		error = prctl_set_mm_exe_file(mm, prctl_map.exe_fd);
 		if (error)
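
The new code comment says the check passes for CAP_SYS_ADMIN or
CAP_CHECKPOINT_RESTORE. As a rough illustration, that "either one" logic can
be modelled in userspace over a 64-bit effective capability mask, such as the
CapEff value from /proc/self/status. This is a sketch under stated
assumptions, not the kernel implementation: the helper names are
hypothetical, the capability numbers are taken from <linux/capability.h>,
and the model ignores user namespaces, which the kernel check takes into
account through current_user_ns().

```c
#include <stdbool.h>
#include <stdint.h>

/* Capability numbers as defined in <linux/capability.h>. */
#define MODEL_CAP_SYS_ADMIN           21
#define MODEL_CAP_CHECKPOINT_RESTORE  40

/* Hypothetical helper: test one bit of an effective capability mask. */
static bool mask_has_cap(uint64_t effective, unsigned int cap)
{
	return ((effective >> cap) & 1u) != 0;
}

/*
 * Userspace model of the check described in the patch comment:
 * either capability is sufficient. Namespaces are not modelled.
 */
static bool model_checkpoint_restore_capable(uint64_t effective)
{
	return mask_has_cap(effective, MODEL_CAP_CHECKPOINT_RESTORE) ||
	       mask_has_cap(effective, MODEL_CAP_SYS_ADMIN);
}
```

Keeping CAP_SYS_ADMIN in the model reflects that callers which were allowed
before the patch remain allowed after it.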