Message ID | 20210310181857.401675-2-mic@digikod.net (mailing list archive) |
---|---|
State | New, archived |
Series | Unprivileged chroot |
Mickaël Salaün <mic@digikod.net> writes:

> From: Mickaël Salaün <mic@linux.microsoft.com>
>
> Being able to easily change root directories enable to ease some
> development workflow and can be used as a tool to strengthen
> unprivileged security sandboxes. chroot(2) is not an access-control
> mechanism per se, but it can be used to limit the absolute view of the
> filesystem, and then limit ways to access data and kernel interfaces
> (e.g. /proc, /sys, /dev, etc.).
>
> Users may not wish to expose namespace complexity to potentially
> malicious processes, or limit their use because of limited resources.
> The chroot feature is much more simple (and limited) than the mount
> namespace, but can still be useful. As for containers, users of
> chroot(2) should take care of file descriptors or data accessible by
> other means (e.g. current working directory, leaked FDs, passed FDs,
> devices, mount points, etc.). There is a lot of literature that discuss
> the limitations of chroot, and users of this feature should be aware of
> the multiple ways to bypass it. Using chroot(2) for security purposes
> can make sense if it is combined with other features (e.g. dedicated
> user, seccomp, LSM access-controls, etc.).
>
> One could argue that chroot(2) is useless without a properly populated
> root hierarchy (i.e. without /dev and /proc). However, there are
> multiple use cases that don't require the chrooting process to create
> file hierarchies with special files nor mount points, e.g.:
> * A process sandboxing itself, once all its libraries are loaded, may
>   not need files other than regular files, or even no file at all.
> * Some pre-populated root hierarchies could be used to chroot into,
>   provided for instance by development environments or tailored
>   distributions.
> * Processes executed in a chroot may not require access to these special
>   files (e.g. with minimal runtimes, or by emulating some special files
>   with a LD_PRELOADed library or seccomp).
>
> Allowing a task to change its own root directory is not a threat to the
> system if we can prevent confused deputy attacks, which could be
> performed through execution of SUID-like binaries. This can be
> prevented if the calling task sets PR_SET_NO_NEW_PRIVS on itself with
> prctl(2). To only affect this task, its filesystem information must not
> be shared with other tasks, which can be achieved by not passing
> CLONE_FS to clone(2). A similar no_new_privs check is already used by
> seccomp to avoid the same kind of security issues. Furthermore, because
> of its security use and to avoid giving a new way for attackers to get
> out of a chroot (e.g. using /proc/<pid>/root), an unprivileged chroot is
> only allowed if the new root directory is the same or beneath the
> current one. This still allows a process to use a subset of its
> legitimate filesystem to chroot into and then further reduce its view of
> the filesystem.
>
> This change may not impact systems relying on other permission models
> than POSIX capabilities (e.g. Tomoyo). Being able to use chroot(2) on
> such systems may require to update their security policies.
>
> Only the chroot system call is relaxed with this no_new_privs check; the
> init_chroot() helper doesn't require such change.
>
> Allowing unprivileged users to use chroot(2) is one of the initial
> objectives of no_new_privs:
> https://www.kernel.org/doc/html/latest/userspace-api/no_new_privs.html
> This patch is a follow-up of a previous one sent by Andy Lutomirski, but
> with less limitations:
> https://lore.kernel.org/lkml/0e2f0f54e19bff53a3739ecfddb4ffa9a6dbde4d.1327858005.git.luto@amacapital.net/
>
> Cc: Al Viro <viro@zeniv.linux.org.uk>
> Cc: Andy Lutomirski <luto@amacapital.net>
> Cc: Christian Brauner <christian.brauner@ubuntu.com>
> Cc: Christoph Hellwig <hch@lst.de>
> Cc: David Howells <dhowells@redhat.com>
> Cc: Dominik Brodowski <linux@dominikbrodowski.net>
> Cc: Eric W. Biederman <ebiederm@xmission.com>
> Cc: James Morris <jmorris@namei.org>
> Cc: John Johansen <john.johansen@canonical.com>
> Cc: Kees Cook <keescook@chromium.org>
> Cc: Kentaro Takeda <takedakn@nttdata.co.jp>
> Cc: Serge Hallyn <serge@hallyn.com>
> Cc: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
> Signed-off-by: Mickaël Salaün <mic@linux.microsoft.com>
> Link: https://lore.kernel.org/r/20210310181857.401675-2-mic@digikod.net
> ---
>
> Changes since v1:
> * Replace custom is_path_beneath() with existing path_is_under().

Neither is_path_beneath nor path_is_under really help prevent escapes,
as except for open files and files accessible from proc chroot already
disallows going up. The reason is the path is resolved with the current
root before switching to it.

My brain was fuzzy. I had the classic escape scenario confused.
It isn't chroot("../../..");

The actual classic chroot escape is.
chdir("/");
chroot("/somedir");
chdir("../../../..");

Your change would disallow changing the root directory, but it doesn't
much help as all files in the mount namespace are visible anyway.

Furthermore changing chdir to ensure its path is at or beneath the
current root would cause regressions in existing userspace programs
so we can't do that.

Plus you are trying to rely on changing the definition of chroot to
make it safe (not just changing the permission checks). That is also
confusing and makes it difficult to analyze because people's previous
analysis gets confused.
Eric

> ---
>  fs/open.c | 23 ++++++++++++++++++++---
>  1 file changed, 20 insertions(+), 3 deletions(-)
>
> diff --git a/fs/open.c b/fs/open.c
> index e53af13b5835..280dbff25b25 100644
> --- a/fs/open.c
> +++ b/fs/open.c
> @@ -22,6 +22,7 @@
>  #include <linux/slab.h>
>  #include <linux/uaccess.h>
>  #include <linux/fs.h>
> +#include <linux/path.h>
>  #include <linux/personality.h>
>  #include <linux/pagemap.h>
>  #include <linux/syscalls.h>
> @@ -546,15 +547,31 @@ SYSCALL_DEFINE1(chroot, const char __user *, filename)
>  	if (error)
>  		goto dput_and_out;
>
> +	/*
> +	 * Changing the root directory for the calling task (and its future
> +	 * children) requires that this task has CAP_SYS_CHROOT in its
> +	 * namespace, or be running with no_new_privs and not sharing its
> +	 * fs_struct and not escaping its current root directory. As for
> +	 * seccomp, checking no_new_privs avoids scenarios where unprivileged
> +	 * tasks can affect the behavior of privileged children. Lock the path
> +	 * to protect against TOCTOU race between path_is_under() and
> +	 * set_fs_root(). No need to lock the root because it is not possible
> +	 * to rename it beneath itself.
> +	 */
>  	error = -EPERM;
> -	if (!ns_capable(current_user_ns(), CAP_SYS_CHROOT))
> -		goto dput_and_out;
> +	inode_lock(d_inode(path.dentry));
> +	if (!ns_capable(current_user_ns(), CAP_SYS_CHROOT) &&
> +			!(task_no_new_privs(current) && current->fs->users == 1
> +				&& path_is_under(&path, &current->fs->root)))
> +		goto unlock_and_out;
>  	error = security_path_chroot(&path);
>  	if (error)
> -		goto dput_and_out;
> +		goto unlock_and_out;
>
>  	set_fs_root(current->fs, &path);
>  	error = 0;
> +unlock_and_out:
> +	inode_unlock(d_inode(path.dentry));
>  dput_and_out:
>  	path_put(&path);
>  	if (retry_estale(error, lookup_flags)) {
On Wed, Mar 10, 2021 at 8:23 PM Eric W. Biederman <ebiederm@xmission.com> wrote:
> Mickaël Salaün <mic@digikod.net> writes:
[...]
> Neither is_path_beneath nor path_is_under really help prevent escapes,
> as except for open files and files accessible from proc chroot already
> disallows going up. The reason is the path is resolved with the current
> root before switching to it.

Yeah, this probably should use the same check as the CLONE_NEWUSER
logic, current_chrooted() from CLONE_NEWUSER; that check is already
used for guarding against the following syscall sequence, which has
similar security properties:

unshare(CLONE_NEWUSER); // gives the current process namespaced CAP_SYS_ADMIN
chroot("<...>"); // succeeds because of namespaced CAP_SYS_ADMIN

The current_chrooted() check in create_user_ns() is for the same
purpose as the check you're introducing here, so they should use the
same logic.
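
To make the difference between the two checks concrete, here is a minimal userspace model in C. It is only a sketch: `struct task_model` and both function names are hypothetical, the first function mirrors the condition in the posted diff, and reading the suggestion as "refuse an unprivileged chroot when current_chrooted() is true" is an assumption based on how the create_user_ns() check is described above, not something spelled out in the thread.

```c
#include <stdbool.h>

/* Hypothetical snapshot of the task state the kernel check would consult. */
struct task_model {
	bool has_cap_sys_chroot; /* ns_capable(current_user_ns(), CAP_SYS_CHROOT) */
	bool no_new_privs;       /* task_no_new_privs(current) */
	int fs_users;            /* current->fs->users */
	bool chrooted;           /* current_chrooted() */
};

/* Condition from the posted diff: the new root must be under the current root. */
static bool chroot_allowed_posted(const struct task_model *t,
				  bool new_root_is_under_root)
{
	return t->has_cap_sys_chroot ||
	       (t->no_new_privs && t->fs_users == 1 && new_root_is_under_root);
}

/* Assumed reading of the suggestion: refuse when the task is already chrooted. */
static bool chroot_allowed_suggested(const struct task_model *t)
{
	return t->has_cap_sys_chroot ||
	       (t->no_new_privs && t->fs_users == 1 && !t->chrooted);
}
```

Under this model, a task that is already chrooted and has no_new_privs set passes the posted check for any directory beneath its root, but is refused by the suggested one unless it holds CAP_SYS_CHROOT.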
From: Eric W. Biederman
> Sent: 10 March 2021 19:24
...
> The actual classic chroot escape is.
> chdir("/");
> chroot("/somedir");
> chdir("../../../..");

That one is easily checked.

I thought something like:
	chroot("/somedir");
	chdir("/somepath");
Friendly process:
	mvdir("/somedir/some_path", "/bar");
was the actual escape?

	David
On 10/03/2021 20:33, Jann Horn wrote:
> On Wed, Mar 10, 2021 at 8:23 PM Eric W. Biederman <ebiederm@xmission.com> wrote:
>> Mickaël Salaün <mic@digikod.net> writes:
> [...]
>> Neither is_path_beneath nor path_is_under really help prevent escapes,
>> as except for open files and files accessible from proc chroot already
>> disallows going up. The reason is the path is resolved with the current
>> root before switching to it.
>
> Yeah, this probably should use the same check as the CLONE_NEWUSER
> logic, current_chrooted() from CLONE_NEWUSER; that check is already
> used for guarding against the following syscall sequence, which has
> similar security properties:
>
> unshare(CLONE_NEWUSER); // gives the current process namespaced CAP_SYS_ADMIN
> chroot("<...>"); // succeeds because of namespaced CAP_SYS_ADMIN
>
> The current_chrooted() check in create_user_ns() is for the same
> purpose as the check you're introducing here, so they should use the
> same logic.
>

I don't know how I missed this, but current_chrooted() is definitely the
right approach.
diff --git a/fs/open.c b/fs/open.c
index e53af13b5835..280dbff25b25 100644
--- a/fs/open.c
+++ b/fs/open.c
@@ -22,6 +22,7 @@
 #include <linux/slab.h>
 #include <linux/uaccess.h>
 #include <linux/fs.h>
+#include <linux/path.h>
 #include <linux/personality.h>
 #include <linux/pagemap.h>
 #include <linux/syscalls.h>
@@ -546,15 +547,31 @@ SYSCALL_DEFINE1(chroot, const char __user *, filename)
 	if (error)
 		goto dput_and_out;
 
+	/*
+	 * Changing the root directory for the calling task (and its future
+	 * children) requires that this task has CAP_SYS_CHROOT in its
+	 * namespace, or be running with no_new_privs and not sharing its
+	 * fs_struct and not escaping its current root directory. As for
+	 * seccomp, checking no_new_privs avoids scenarios where unprivileged
+	 * tasks can affect the behavior of privileged children. Lock the path
+	 * to protect against TOCTOU race between path_is_under() and
+	 * set_fs_root(). No need to lock the root because it is not possible
+	 * to rename it beneath itself.
+	 */
 	error = -EPERM;
-	if (!ns_capable(current_user_ns(), CAP_SYS_CHROOT))
-		goto dput_and_out;
+	inode_lock(d_inode(path.dentry));
+	if (!ns_capable(current_user_ns(), CAP_SYS_CHROOT) &&
+			!(task_no_new_privs(current) && current->fs->users == 1
+				&& path_is_under(&path, &current->fs->root)))
+		goto unlock_and_out;
 	error = security_path_chroot(&path);
 	if (error)
-		goto dput_and_out;
+		goto unlock_and_out;
 
 	set_fs_root(current->fs, &path);
 	error = 0;
+unlock_and_out:
+	inode_unlock(d_inode(path.dentry));
 dput_and_out:
 	path_put(&path);
 	if (retry_estale(error, lookup_flags)) {
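
For illustration, a userspace caller of the relaxed chroot(2) might look roughly like the following C sketch. `sandbox_chroot()` is a hypothetical helper and not part of the patch. It assumes the calling task does not share its filesystem information with another task (no CLONE_FS), as the check in the diff requires. On a kernel without this change, the chroot() call is expected to fail with EPERM unless the task has CAP_SYS_CHROOT.

```c
#define _DEFAULT_SOURCE /* for chroot() on glibc */
#include <errno.h>
#include <sys/prctl.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of the patch): set no_new_privs on the
 * calling task, then try to chroot into new_root.
 * Returns 0 on success, or a negative errno value on failure.
 */
static int sandbox_chroot(const char *new_root)
{
	/* Cannot be unset afterwards; inherited across fork and execve. */
	if (prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0) != 0)
		return -errno;

	/* new_root is resolved against the current root before the switch. */
	if (chroot(new_root) != 0)
		return -errno;

	/* Move the working directory inside the new root as well. */
	if (chdir("/") != 0)
		return -errno;

	return 0;
}
```

The final chdir("/") matters because chroot(2) does not change the working directory, which is the basis of the classic escape sequence discussed in the thread.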