[RFC,v2,2/6] fs: protected project id

Message ID 20150310172206.23081.95005.stgit@buzz (mailing list archive)
State New, archived

Commit Message

Konstantin Khlebnikov March 10, 2015, 5:22 p.m. UTC
Historically, the XFS project id has had no permission control: a file owner
can set any project id. Later this was restricted using user namespaces: XFS
allows changing it only from the init user-ns. That works fine for isolated
containers, or when the user has no direct access to the filesystem (NFS/FTP).

This patch adds the sysctl fs.protected_projects, which makes changing a project
id a privileged operation that requires CAP_SYS_RESOURCE in the current
user namespace. Thus there are two levels of protection: the project id mapping
in the user-ns defines the set of permitted projects, and the capability
protects operations within this set.

Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
---
 Documentation/sysctl/fs.txt     |   16 ++++++++++++++++
 fs/ioctl.c                      |    6 +++++-
 include/linux/fs.h              |    1 +
 include/uapi/linux/capability.h |    1 +
 kernel/sysctl.c                 |    9 +++++++++
 kernel/user_namespace.c         |    4 ++--
 6 files changed, 34 insertions(+), 3 deletions(-)


--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Andy Lutomirski March 10, 2015, 5:32 p.m. UTC | #1
On Tue, Mar 10, 2015 at 10:22 AM, Konstantin Khlebnikov
<khlebnikov@yandex-team.ru> wrote:
> Historically, the XFS project id has had no permission control: a file owner
> can set any project id. Later this was restricted using user namespaces: XFS
> allows changing it only from the init user-ns. That works fine for isolated
> containers, or when the user has no direct access to the filesystem (NFS/FTP).
>
> This patch adds the sysctl fs.protected_projects, which makes changing a project
> id a privileged operation that requires CAP_SYS_RESOURCE in the current
> user namespace. Thus there are two levels of protection: the project id mapping
> in the user-ns defines the set of permitted projects, and the capability
> protects operations within this set.

If I understand this right, this doesn't work.  If I lack
CAP_SYS_RESOURCE but I have two projids mapped, then I can create a
new userns, map both projids, and get CAP_SYS_RESOURCE.

--Andy
Konstantin Khlebnikov March 10, 2015, 6:51 p.m. UTC | #2
On Tue, Mar 10, 2015 at 8:32 PM, Andy Lutomirski <luto@amacapital.net> wrote:
> On Tue, Mar 10, 2015 at 10:22 AM, Konstantin Khlebnikov
> <khlebnikov@yandex-team.ru> wrote:
>> Historically, the XFS project id has had no permission control: a file owner
>> can set any project id. Later this was restricted using user namespaces: XFS
>> allows changing it only from the init user-ns. That works fine for isolated
>> containers, or when the user has no direct access to the filesystem (NFS/FTP).
>>
>> This patch adds the sysctl fs.protected_projects, which makes changing a project
>> id a privileged operation that requires CAP_SYS_RESOURCE in the current
>> user namespace. Thus there are two levels of protection: the project id mapping
>> in the user-ns defines the set of permitted projects, and the capability
>> protects operations within this set.
>
> If I understand this right, this doesn't work.  If I lack
> CAP_SYS_RESOURCE but I have two projids mapped, then I can create a
> new userns, map both projids, and get CAP_SYS_RESOURCE.

Setting the project id mapping for a nested user namespace also requires
this capability in the parent namespace. This is the same as for setting the
uid/gid mapping, but without the special case for mapping the current uid/gid,
because a task has no "current" project id.

This is mentioned in the cover letter, but I forgot to include it here. Sorry.

>
> --Andy
Andy Lutomirski March 10, 2015, 6:57 p.m. UTC | #3
On Tue, Mar 10, 2015 at 11:51 AM, Konstantin Khlebnikov
<koct9i@gmail.com> wrote:
> On Tue, Mar 10, 2015 at 8:32 PM, Andy Lutomirski <luto@amacapital.net> wrote:
>> On Tue, Mar 10, 2015 at 10:22 AM, Konstantin Khlebnikov
>> <khlebnikov@yandex-team.ru> wrote:
>>> Historically, the XFS project id has had no permission control: a file owner
>>> can set any project id. Later this was restricted using user namespaces: XFS
>>> allows changing it only from the init user-ns. That works fine for isolated
>>> containers, or when the user has no direct access to the filesystem (NFS/FTP).
>>>
>>> This patch adds the sysctl fs.protected_projects, which makes changing a project
>>> id a privileged operation that requires CAP_SYS_RESOURCE in the current
>>> user namespace. Thus there are two levels of protection: the project id mapping
>>> in the user-ns defines the set of permitted projects, and the capability
>>> protects operations within this set.
>>
>> If I understand this right, this doesn't work.  If I lack
>> CAP_SYS_RESOURCE but I have two projids mapped, then I can create a
>> new userns, map both projids, and get CAP_SYS_RESOURCE.
>
> Setting the project id mapping for a nested user namespace also requires
> this capability in the parent namespace. This is the same as for setting the
> uid/gid mapping, but without the special case for mapping the current uid/gid,
> because a task has no "current" project id.
>
> This is mentioned in the cover letter, but I forgot to include it here. Sorry.

Right, sorry.  I'm still used to projid mappings being unprotected.

--Andy

>
>>
>> --Andy

Patch

diff --git a/Documentation/sysctl/fs.txt b/Documentation/sysctl/fs.txt
index 88152f214f48..9f6579b99be6 100644
--- a/Documentation/sysctl/fs.txt
+++ b/Documentation/sysctl/fs.txt
@@ -34,6 +34,7 @@  Currently, these files are in /proc/sys/fs:
 - overflowgid
 - protected_hardlinks
 - protected_symlinks
+- protected_projects
 - suid_dumpable
 - super-max
 - super-nr
@@ -199,6 +200,21 @@  This protection is based on the restrictions in Openwall and grsecurity.
 
 ==============================================================
 
+protected_projects:
+
+A project id makes it possible to enforce disk quotas for several subtrees or
+individual files on the filesystem. Historically, changing the project id
+was an unprivileged operation, and the file owner could set any project id.
+
+When set to "0", changing the project id is an unprivileged operation. The
+file owner can set any project id mapped in the current user namespace.
+
+When set to "1", changing the project id requires the capability
+CAP_SYS_RESOURCE in the current user namespace. Defining a project id mapping
+for a nested user namespace also requires it in the parent user namespace.
+
+==============================================================
+
 suid_dumpable:
 
 This value can be used to query and set the core dump mode for setuid
diff --git a/fs/ioctl.c b/fs/ioctl.c
index d351576d95c8..2acf5efbc045 100644
--- a/fs/ioctl.c
+++ b/fs/ioctl.c
@@ -565,6 +565,8 @@  static int ioctl_getproject(struct file *filp, projid_t __user *argp)
 	return put_user(projid, argp);
 }
 
+int sysctl_protected_projects;
+
 static int ioctl_setproject(struct file *filp, projid_t __user *argp)
 {
 	struct user_namespace *ns = current_user_ns();
@@ -576,7 +578,9 @@  static int ioctl_setproject(struct file *filp, projid_t __user *argp)
 
 	if (!sb->s_op->set_project)
 		return -EOPNOTSUPP;
-	if (ns != &init_user_ns)
+	if (sysctl_protected_projects ?
+	    !ns_capable(ns, CAP_SYS_RESOURCE) :
+	    (ns != &init_user_ns))
 		return -EPERM;
 	ret = get_user(projid, argp);
 	if (ret)
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 42156801739e..d3021feb3f7f 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -64,6 +64,7 @@  extern struct inodes_stat_t inodes_stat;
 extern int leases_enable, lease_break_time;
 extern int sysctl_protected_symlinks;
 extern int sysctl_protected_hardlinks;
+extern int sysctl_protected_projects;
 
 struct buffer_head;
 typedef int (get_block_t)(struct inode *inode, sector_t iblock,
diff --git a/include/uapi/linux/capability.h b/include/uapi/linux/capability.h
index 12c37a197d24..0292885567cc 100644
--- a/include/uapi/linux/capability.h
+++ b/include/uapi/linux/capability.h
@@ -278,6 +278,7 @@  struct vfs_cap_data {
 /* Override resource limits. Set resource limits. */
 /* Override quota limits. */
 /* Override reserved space on ext2 filesystem */
+/* Modify file project id if protected_projects = 1 */
 /* Modify data journaling mode on ext3 filesystem (uses journaling
    resources) */
 /* NOTE: ext2 honors fsuid when checking for resource overrides, so
diff --git a/kernel/sysctl.c b/kernel/sysctl.c
index 88ea2d6e0031..cb6f9fb13de3 100644
--- a/kernel/sysctl.c
+++ b/kernel/sysctl.c
@@ -1649,6 +1649,15 @@  static struct ctl_table fs_table[] = {
 		.extra2		= &one,
 	},
 	{
+		.procname	= "protected_projects",
+		.data		= &sysctl_protected_projects,
+		.maxlen		= sizeof(int),
+		.mode		= 0644,
+		.proc_handler	= proc_dointvec_minmax,
+		.extra1		= &zero,
+		.extra2		= &one,
+	},
+	{
 		.procname	= "suid_dumpable",
 		.data		= &suid_dumpable,
 		.maxlen		= sizeof(int),
diff --git a/kernel/user_namespace.c b/kernel/user_namespace.c
index 4109f8320684..88f66198b251 100644
--- a/kernel/user_namespace.c
+++ b/kernel/user_namespace.c
@@ -807,8 +807,8 @@  ssize_t proc_projid_map_write(struct file *file, const char __user *buf,
 	if ((seq_ns != ns) && (seq_ns != ns->parent))
 		return -EPERM;
 
-	/* Anyone can set any valid project id no capability needed */
-	return map_write(file, buf, size, ppos, -1,
+	return map_write(file, buf, size, ppos,
+			 sysctl_protected_projects ? CAP_SYS_RESOURCE : -1,
 			 &ns->projid_map, &ns->parent->projid_map);
 }
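
The two permission checks touched by this patch can be summarized with a small
userspace model. This is only a sketch for reasoning about the thread above, not
kernel code: struct projid_request and both function names are hypothetical, and
the model leaves out the other checks the kernel performs (filesystem support
for set_project, the seq_ns test in proc_projid_map_write(), and the
validation inside map_write()).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical inputs for the model; none of these names exist in the kernel. */
struct projid_request {
	bool protected_projects; /* value of fs.protected_projects (0 or 1) */
	bool in_init_user_ns;    /* caller's user namespace is init_user_ns */
	bool cap_in_current_ns;  /* CAP_SYS_RESOURCE in the current user namespace */
	bool cap_in_parent_ns;   /* CAP_SYS_RESOURCE in the parent user namespace */
};

/*
 * Models the condition added to ioctl_setproject().
 * Returns true when the patched check would not yield -EPERM.
 */
static bool may_set_project(const struct projid_request *req)
{
	if (req->protected_projects)
		return req->cap_in_current_ns;
	/* Old behaviour: only callers in the init user namespace may proceed. */
	return req->in_init_user_ns;
}

/*
 * Models the capability argument now passed by proc_projid_map_write():
 * CAP_SYS_RESOURCE when the sysctl is set, otherwise -1 (no capability).
 */
static bool may_write_projid_map(const struct projid_request *req)
{
	if (!req->protected_projects)
		return true;
	return req->cap_in_parent_ns;
}

/* Replays the scenario from comment #1 against the model. */
static void projid_model_selftest(void)
{
	struct projid_request req = {
		.protected_projects = true,
		.in_init_user_ns = false,
		.cap_in_current_ns = false,
		.cap_in_parent_ns = false,
	};

	/* Without the capability, neither operation is permitted. */
	assert(!may_set_project(&req));
	assert(!may_write_projid_map(&req));
}
```

Under these assumptions, the scenario raised in comment #1 is blocked at the
mapping step: a task without CAP_SYS_RESOURCE in the parent namespace cannot
define a project id mapping for the new user namespace, so the capability it
holds inside that namespace has no mapped project ids to act on.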