Message ID | 74a91362-247c-c749-5200-7bdce704ed9e@gmail.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | Allow restricting permissions in /proc/sys | expand |
On Sun, Nov 03, 2019 at 04:55:48PM +0200, Topi Miettinen wrote: > Several items in /proc/sys need not be accessible to unprivileged > tasks. Let the system administrator change the permissions, but only > to more restrictive modes than what the sysctl tables allow. > > Signed-off-by: Topi Miettinen <toiwoton@gmail.com> Why should restruct the system administrator from changing the permissions to one which is more lax than what the sysctl tables? The system administrator is already very much trusted. Why should we take that discretion away from the system administrator? - Ted
Topi Miettinen <toiwoton@gmail.com> writes: > Several items in /proc/sys need not be accessible to unprivileged > tasks. Let the system administrator change the permissions, but only > to more restrictive modes than what the sysctl tables allow. This looks quite buggy. You neither update table->mode nor do you ever read from table->mode to initialize the inode. I am missing something in my quick reading of your patch? The not updating table->mode almost certainly means that as soon as the cached inode is invalidated the mode changes will disappear. Not to mention they will fail to propogate between different instances of proc. Loosing all of your changes at cache invalidation seems to make this a useless feature. Eric > Signed-off-by: Topi Miettinen <toiwoton@gmail.com> > --- > fs/proc/proc_sysctl.c | 41 +++++++++++++++++++++++++++++++---------- > 1 file changed, 31 insertions(+), 10 deletions(-) > > diff --git a/fs/proc/proc_sysctl.c b/fs/proc/proc_sysctl.c > index d80989b6c344..88c4ca7d2782 100644 > --- a/fs/proc/proc_sysctl.c > +++ b/fs/proc/proc_sysctl.c > @@ -818,6 +818,10 @@ static int proc_sys_permission(struct inode *inode, int > mask) > if ((mask & MAY_EXEC) && S_ISREG(inode->i_mode)) > return -EACCES; > > + error = generic_permission(inode, mask); > + if (error) > + return error; > + > head = grab_header(inode); > if (IS_ERR(head)) > return PTR_ERR(head); > @@ -837,9 +841,35 @@ static int proc_sys_setattr(struct dentry *dentry, struct > iattr *attr) > struct inode *inode = d_inode(dentry); > int error; > > - if (attr->ia_valid & (ATTR_MODE | ATTR_UID | ATTR_GID)) > + if (attr->ia_valid & (ATTR_UID | ATTR_GID)) > return -EPERM; > > + if (attr->ia_valid & ATTR_MODE) { > + struct ctl_table_header *head = grab_header(inode); > + struct ctl_table *table = PROC_I(inode)->sysctl_entry; > + umode_t max_mode = 0777; /* Only these bits may change */ > + > + if (IS_ERR(head)) > + return PTR_ERR(head); > + > + if (!table) /* global root - r-xr-xr-x */ > + max_mode &= ~0222; > + else /* > + * Don't allow permissions to become less > + * restrictive than the sysctl table entry > + */ > + max_mode &= table->mode; > + > + sysctl_head_finish(head); > + > + /* Execute bits only allowed for directories */ > + if (!S_ISDIR(inode->i_mode)) > + max_mode &= ~0111; > + > + if (attr->ia_mode & ~S_IFMT & ~max_mode) > + return -EPERM; > + } > + > error = setattr_prepare(dentry, attr); > if (error) > return error; > @@ -853,17 +883,8 @@ static int proc_sys_getattr(const struct path *path, struct > kstat *stat, > u32 request_mask, unsigned int query_flags) > { > struct inode *inode = d_inode(path->dentry); > - struct ctl_table_header *head = grab_header(inode); > - struct ctl_table *table = PROC_I(inode)->sysctl_entry; > - > - if (IS_ERR(head)) > - return PTR_ERR(head); > > generic_fillattr(inode, stat); > - if (table) > - stat->mode = (stat->mode & S_IFMT) | table->mode; > - > - sysctl_head_finish(head); > return 0; > }
On 3.11.2019 19.56, Theodore Y. Ts'o wrote: > On Sun, Nov 03, 2019 at 04:55:48PM +0200, Topi Miettinen wrote: >> Several items in /proc/sys need not be accessible to unprivileged >> tasks. Let the system administrator change the permissions, but only >> to more restrictive modes than what the sysctl tables allow. >> >> Signed-off-by: Topi Miettinen <toiwoton@gmail.com> > > Why should restruct the system administrator from changing the > permissions to one which is more lax than what the sysctl tables? > > The system administrator is already very much trusted. Why should we > take that discretion away from the system administrator? That could make sense, in addition changing UID/GID would allow even more flexibility. The current checks and restrictions which prevent those changes were already present in original code in 2007. I didn't want to change the logic too much. Perhaps loosening the restrictions could be a follow-up patch, as it may give chance to use more of generic proc or fslib code and thus a larger restructuring. -Topi
On 3.11.2019 20.50, Eric W. Biederman wrote: > Topi Miettinen <toiwoton@gmail.com> writes: > >> Several items in /proc/sys need not be accessible to unprivileged >> tasks. Let the system administrator change the permissions, but only >> to more restrictive modes than what the sysctl tables allow. > > This looks quite buggy. You neither update table->mode nor > do you ever read from table->mode to initialize the inode. > I am missing something in my quick reading of your patch? inode->i_mode gets initialized in proc_sys_make_inode(). I didn't want to touch the table, so that the original permissions can be used to restrict the changes made. In case the restrictions are removed as suggested by Theodore Ts'o, table->mode could be changed. Otherwise I'd rather add a new field to store the current mode and the mode field can remain for reference. As the original author of the code from 2007, would you let the administrator to chmod/chown the items in /proc/sys without restrictions (e.g. 0400 -> 0777)? > The not updating table->mode almost certainly means that as soon as the > cached inode is invalidated the mode changes will disappear. Not to > mention they will fail to propogate between different instances of > proc. > > Loosing all of your changes at cache invalidation seems to make this a > useless feature. At least different proc instances seem to work just fine here (they show the same changes), but I suppose you are right about cache invalidation. -Topi
Topi Miettinen <toiwoton@gmail.com> writes: > On 3.11.2019 20.50, Eric W. Biederman wrote: >> Topi Miettinen <toiwoton@gmail.com> writes: >> >>> Several items in /proc/sys need not be accessible to unprivileged >>> tasks. Let the system administrator change the permissions, but only >>> to more restrictive modes than what the sysctl tables allow. >> >> This looks quite buggy. You neither update table->mode nor >> do you ever read from table->mode to initialize the inode. >> I am missing something in my quick reading of your patch? > > inode->i_mode gets initialized in proc_sys_make_inode(). > > I didn't want to touch the table, so that the original permissions can > be used to restrict the changes made. In case the restrictions are > removed as suggested by Theodore Ts'o, table->mode could be > changed. Otherwise I'd rather add a new field to store the current > mode and the mode field can remain for reference. As the original > author of the code from 2007, would you let the administrator to > chmod/chown the items in /proc/sys without restrictions (e.g. 0400 -> > 0777)? At an architectural level I think we need to do this carefully and have a compelling reason. The code has survived nearly the entire life of linux without this capability. I think right now the common solution is to mount another file over the file you are trying to hide/limit. Changing the permissions might be better but that is not at all clear. Do you have specific examples of the cases where you would like to change the permissions? >> The not updating table->mode almost certainly means that as soon as the >> cached inode is invalidated the mode changes will disappear. Not to >> mention they will fail to propogate between different instances of >> proc. >> >> Loosing all of your changes at cache invalidation seems to make this a >> useless feature. > > At least different proc instances seem to work just fine here (they > show the same changes), but I suppose you are right about cache > invalidation. It is going to take the creation of a pid namespace to see different proc instances. All mounts of the proc within the same pid_namespace return the same instance. Eric
On 4.11.2019 17.44, Eric W. Biederman wrote: > Topi Miettinen <toiwoton@gmail.com> writes: > >> On 3.11.2019 20.50, Eric W. Biederman wrote: >>> Topi Miettinen <toiwoton@gmail.com> writes: >>> >>>> Several items in /proc/sys need not be accessible to unprivileged >>>> tasks. Let the system administrator change the permissions, but only >>>> to more restrictive modes than what the sysctl tables allow. >>> >>> This looks quite buggy. You neither update table->mode nor >>> do you ever read from table->mode to initialize the inode. >>> I am missing something in my quick reading of your patch? >> >> inode->i_mode gets initialized in proc_sys_make_inode(). >> >> I didn't want to touch the table, so that the original permissions can >> be used to restrict the changes made. In case the restrictions are >> removed as suggested by Theodore Ts'o, table->mode could be >> changed. Otherwise I'd rather add a new field to store the current >> mode and the mode field can remain for reference. As the original >> author of the code from 2007, would you let the administrator to >> chmod/chown the items in /proc/sys without restrictions (e.g. 0400 -> >> 0777)? > > At an architectural level I think we need to do this carefully and have > a compelling reason. The code has survived nearly the entire life of > linux without this capability. I'd be happy with only allowing restrictions to access for now. Perhaps later with more analysis, also relaxing changes and maybe UID/GID changes can be allowed. > I think right now the common solution is to mount another file over the > file you are trying to hide/limit. Changing the permissions might be > better but that is not at all clear. > > Do you have specific examples of the cases where you would like to > change the permissions? Unprivileged applications typically do not need to access most items in /proc/sys, so I'd like to gradually find out which are needed. So far I've seen no problems with 0500 mode for directories abi, crypto, debug, dev, fs, user or vm. I'm also using systemd's InaccessiblePaths to limit access (which mounts an inaccessible directory over the path), but that's a bit too big hammer. For example there are over 100 files in /proc/sys/kernel, perhaps there will be issues when creating a mount for each, and that multiplied by a number of services. >>> The not updating table->mode almost certainly means that as soon as the >>> cached inode is invalidated the mode changes will disappear. Not to >>> mention they will fail to propogate between different instances of >>> proc. >>> >>> Loosing all of your changes at cache invalidation seems to make this a >>> useless feature. >> >> At least different proc instances seem to work just fine here (they >> show the same changes), but I suppose you are right about cache >> invalidation. > > It is going to take the creation of a pid namespace to see different > proc instances. All mounts of the proc within the same pid_namespace > return the same instance. I see no problems by using Firejail (which uses PID namespacing) with v2, the permissions in /proc/sys are the same as outside the namespace. -Topi
Topi Miettinen <toiwoton@gmail.com> writes: > On 4.11.2019 17.44, Eric W. Biederman wrote: >> Topi Miettinen <toiwoton@gmail.com> writes: >> >>> On 3.11.2019 20.50, Eric W. Biederman wrote: >>>> Topi Miettinen <toiwoton@gmail.com> writes: >>>> >>>>> Several items in /proc/sys need not be accessible to unprivileged >>>>> tasks. Let the system administrator change the permissions, but only >>>>> to more restrictive modes than what the sysctl tables allow. >>>> >>>> This looks quite buggy. You neither update table->mode nor >>>> do you ever read from table->mode to initialize the inode. >>>> I am missing something in my quick reading of your patch? >>> >>> inode->i_mode gets initialized in proc_sys_make_inode(). >>> >>> I didn't want to touch the table, so that the original permissions can >>> be used to restrict the changes made. In case the restrictions are >>> removed as suggested by Theodore Ts'o, table->mode could be >>> changed. Otherwise I'd rather add a new field to store the current >>> mode and the mode field can remain for reference. As the original >>> author of the code from 2007, would you let the administrator to >>> chmod/chown the items in /proc/sys without restrictions (e.g. 0400 -> >>> 0777)? >> >> At an architectural level I think we need to do this carefully and have >> a compelling reason. The code has survived nearly the entire life of >> linux without this capability. > > I'd be happy with only allowing restrictions to access for > now. Perhaps later with more analysis, also relaxing changes and maybe > UID/GID changes can be allowed. Let's find the use case where someone cares before we think about that. >> I think right now the common solution is to mount another file over the >> file you are trying to hide/limit. Changing the permissions might be >> better but that is not at all clear. >> >> Do you have specific examples of the cases where you would like to >> change the permissions? > > Unprivileged applications typically do not need to access most items > in /proc/sys, so I'd like to gradually find out which are needed. So > far I've seen no problems with 0500 mode for directories abi, crypto, > debug, dev, fs, user or vm. But if there is no problem in letting everyone access the information why reduce the permissions? > I'm also using systemd's InaccessiblePaths to limit access (which > mounts an inaccessible directory over the path), but that's a bit too > big hammer. For example there are over 100 files in /proc/sys/kernel, > perhaps there will be issues when creating a mount for each, and that > multiplied by a number of services. My sense is that if there is any kind of compelling reason to make world-readable values not world-readable, and it doesn't break anything (except malicious applications) than a kernel patch is probably the way to go. Policy knobs like this on proc tend to break in normal maintenance because they are not used enough so I am not a big fan of adding policy knobs just because we can. > I see no problems by using Firejail (which uses PID namespacing) with > v2, the permissions in /proc/sys are the same as outside the > namespace. Thank you for testing. Eric
On 5.11.2019 1.41, Eric W. Biederman wrote: > Topi Miettinen <toiwoton@gmail.com> writes: > >> On 4.11.2019 17.44, Eric W. Biederman wrote: >>> Do you have specific examples of the cases where you would like to >>> change the permissions? >> >> Unprivileged applications typically do not need to access most items >> in /proc/sys, so I'd like to gradually find out which are needed. So >> far I've seen no problems with 0500 mode for directories abi, crypto, >> debug, dev, fs, user or vm. > > But if there is no problem in letting everyone access the information > why reduce the permissions? Because information could be useful to an attacker. If there is no problem in not letting everyone access the information why not allow reducing the permissions? There certainly is no need to know. >> I'm also using systemd's InaccessiblePaths to limit access (which >> mounts an inaccessible directory over the path), but that's a bit too >> big hammer. For example there are over 100 files in /proc/sys/kernel, >> perhaps there will be issues when creating a mount for each, and that >> multiplied by a number of services. > > My sense is that if there is any kind of compelling reason to make > world-readable values not world-readable, and it doesn't break anything > (except malicious applications) than a kernel patch is probably the way > to go. With kernel patch, do you propose to change individual sysctls to not world-readable? That surely would help everybody instead of just those who care enough to change /proc/sys permissions. I guess it would also be more effort by an order of magnitude or two to convince each owner of a sysctl to accept the change. > Policy knobs like this on proc tend to break in normal maintenance > because they are not used enough so I am not a big fan of adding policy > knobs just because we can. But the rest of the /proc (except PID tree) allows changing permissions (and also UID and GID), just /proc/sys doesn't. This doesn't seem very logical to me. These code paths have not changed much or at all since the initial version in 2007, so I suppose the maintenance burden has not been overwhelming. By the way, /proc/sys still allows changing the {a,c,m}time. I think those are not backed anywhere, so they probably suffer from same caching problems as my first version of the patch. -Topi
On Sun, Nov 03, 2019 at 12:56:48PM -0500, Theodore Y. Ts'o wrote: > On Sun, Nov 03, 2019 at 04:55:48PM +0200, Topi Miettinen wrote: > > Several items in /proc/sys need not be accessible to unprivileged > > tasks. Let the system administrator change the permissions, but only > > to more restrictive modes than what the sysctl tables allow. > > > > Signed-off-by: Topi Miettinen <toiwoton@gmail.com> > > Why should restruct the system administrator from changing the > permissions to one which is more lax than what the sysctl tables? > > The system administrator is already very much trusted. Why should we > take that discretion away from the system administrator? Generally speaking, they're there to provide some sense of boundary between uid 0 and the kernel proper. I think it's correct to not allow weakening of these permissions (which is the current state: no change at all).
On Tue, Nov 05, 2019 at 09:35:46AM +0200, Topi Miettinen wrote: > On 5.11.2019 1.41, Eric W. Biederman wrote: > > My sense is that if there is any kind of compelling reason to make > > world-readable values not world-readable, and it doesn't break anything > > (except malicious applications) than a kernel patch is probably the way > > to go. > > With kernel patch, do you propose to change individual sysctls to not > world-readable? That surely would help everybody instead of just those who > care enough to change /proc/sys permissions. I guess it would also be more > effort by an order of magnitude or two to convince each owner of a sysctl to > accept the change. I would think of this as a two-stage process: provide a mechanism to tighten permissions arbitrarily so that it is easier to gather evidence about which could have their default changed in the future. > These code paths have not changed much or at all since the initial version > in 2007, so I suppose the maintenance burden has not been overwhelming. > > By the way, /proc/sys still allows changing the {a,c,m}time. I think those > are not backed anywhere, so they probably suffer from same caching problems > as my first version of the patch. Is a v2 of this patch needed? It wasn't clear to me if the inode modes were incorrectly cached...?
[Cc+ linux-api@vger.kernel.org] since that's potentially relevant to quite a few people. On Sun, Nov 03, 2019 at 04:55:48PM +0200, Topi Miettinen wrote: > Several items in /proc/sys need not be accessible to unprivileged > tasks. Let the system administrator change the permissions, but only > to more restrictive modes than what the sysctl tables allow. > > Signed-off-by: Topi Miettinen <toiwoton@gmail.com> > --- > fs/proc/proc_sysctl.c | 41 +++++++++++++++++++++++++++++++---------- > 1 file changed, 31 insertions(+), 10 deletions(-) > > diff --git a/fs/proc/proc_sysctl.c b/fs/proc/proc_sysctl.c > index d80989b6c344..88c4ca7d2782 100644 > --- a/fs/proc/proc_sysctl.c > +++ b/fs/proc/proc_sysctl.c > @@ -818,6 +818,10 @@ static int proc_sys_permission(struct inode *inode, int > mask) > if ((mask & MAY_EXEC) && S_ISREG(inode->i_mode)) > return -EACCES; > > + error = generic_permission(inode, mask); > + if (error) > + return error; > + > head = grab_header(inode); > if (IS_ERR(head)) > return PTR_ERR(head); > @@ -837,9 +841,35 @@ static int proc_sys_setattr(struct dentry *dentry, > struct iattr *attr) > struct inode *inode = d_inode(dentry); > int error; > > - if (attr->ia_valid & (ATTR_MODE | ATTR_UID | ATTR_GID)) > + if (attr->ia_valid & (ATTR_UID | ATTR_GID)) > return -EPERM; > > + if (attr->ia_valid & ATTR_MODE) { > + struct ctl_table_header *head = grab_header(inode); > + struct ctl_table *table = PROC_I(inode)->sysctl_entry; > + umode_t max_mode = 0777; /* Only these bits may change */ > + > + if (IS_ERR(head)) > + return PTR_ERR(head); > + > + if (!table) /* global root - r-xr-xr-x */ > + max_mode &= ~0222; > + else /* > + * Don't allow permissions to become less > + * restrictive than the sysctl table entry > + */ > + max_mode &= table->mode; > + > + sysctl_head_finish(head); > + > + /* Execute bits only allowed for directories */ > + if (!S_ISDIR(inode->i_mode)) > + max_mode &= ~0111; > + > + if (attr->ia_mode & ~S_IFMT & ~max_mode) > + return -EPERM; > + } > + > error = setattr_prepare(dentry, attr); > if (error) > return error; > @@ -853,17 +883,8 @@ static int proc_sys_getattr(const struct path *path, > struct kstat *stat, > u32 request_mask, unsigned int query_flags) > { > struct inode *inode = d_inode(path->dentry); > - struct ctl_table_header *head = grab_header(inode); > - struct ctl_table *table = PROC_I(inode)->sysctl_entry; > - > - if (IS_ERR(head)) > - return PTR_ERR(head); > > generic_fillattr(inode, stat); > - if (table) > - stat->mode = (stat->mode & S_IFMT) | table->mode; > - > - sysctl_head_finish(head); > return 0; > } > > -- > 2.24.0.rc1 >
On Tue, Nov 12, 2019 at 03:19:00PM -0800, Kees Cook wrote: > On Tue, Nov 05, 2019 at 09:35:46AM +0200, Topi Miettinen wrote: > Is a v2 of this patch needed? It wasn't clear to me if the inode modes > were incorrectly cached...? I provided a review for it just now. The patch is cleaner but I believe it can be split up, and also I am not yet sure it is correct. You were CC'd on it but the subject was not clear that it was a V2. Topi, if you send a V3 can you please prefix the subject with this? Luis
On Tue, Nov 12, 2019 at 3:22 PM Christian Brauner <christian.brauner@ubuntu.com> wrote: > > [Cc+ linux-api@vger.kernel.org] > > since that's potentially relevant to quite a few people. > > On Sun, Nov 03, 2019 at 04:55:48PM +0200, Topi Miettinen wrote: > > Several items in /proc/sys need not be accessible to unprivileged > > tasks. Let the system administrator change the permissions, but only > > to more restrictive modes than what the sysctl tables allow. > > > > Signed-off-by: Topi Miettinen <toiwoton@gmail.com> > > --- > > fs/proc/proc_sysctl.c | 41 +++++++++++++++++++++++++++++++---------- > > 1 file changed, 31 insertions(+), 10 deletions(-) > > > > diff --git a/fs/proc/proc_sysctl.c b/fs/proc/proc_sysctl.c > > index d80989b6c344..88c4ca7d2782 100644 > > --- a/fs/proc/proc_sysctl.c > > +++ b/fs/proc/proc_sysctl.c > > @@ -818,6 +818,10 @@ static int proc_sys_permission(struct inode *inode, int > > mask) > > if ((mask & MAY_EXEC) && S_ISREG(inode->i_mode)) > > return -EACCES; > > > > + error = generic_permission(inode, mask); > > + if (error) > > + return error; > > + > > head = grab_header(inode); > > if (IS_ERR(head)) > > return PTR_ERR(head); > > @@ -837,9 +841,35 @@ static int proc_sys_setattr(struct dentry *dentry, > > struct iattr *attr) > > struct inode *inode = d_inode(dentry); > > int error; > > > > - if (attr->ia_valid & (ATTR_MODE | ATTR_UID | ATTR_GID)) > > + if (attr->ia_valid & (ATTR_UID | ATTR_GID)) > > return -EPERM; Supporting at least ATTR_GID would make this much more useful. > > > > + if (attr->ia_valid & ATTR_MODE) { > > + struct ctl_table_header *head = grab_header(inode); > > + struct ctl_table *table = PROC_I(inode)->sysctl_entry; > > + umode_t max_mode = 0777; /* Only these bits may change */ > > + > > + if (IS_ERR(head)) > > + return PTR_ERR(head); > > + > > + if (!table) /* global root - r-xr-xr-x */ > > + max_mode &= ~0222; > > + else /* > > + * Don't allow permissions to become less > > + * restrictive than the sysctl table entry > > + */ > > + max_mode &= table->mode; Style nit: please put braces around multi-line if and else branches, even if they're only multi-line because of comments. > > + > > + sysctl_head_finish(head); > > + > > + /* Execute bits only allowed for directories */ > > + if (!S_ISDIR(inode->i_mode)) > > + max_mode &= ~0111; Why is this needed?
On 13.11.2019 6.50, Andy Lutomirski wrote: > On Tue, Nov 12, 2019 at 3:22 PM Christian Brauner > <christian.brauner@ubuntu.com> wrote: >> >> [Cc+ linux-api@vger.kernel.org] >> >> since that's potentially relevant to quite a few people. >> >> On Sun, Nov 03, 2019 at 04:55:48PM +0200, Topi Miettinen wrote: >>> Several items in /proc/sys need not be accessible to unprivileged >>> tasks. Let the system administrator change the permissions, but only >>> to more restrictive modes than what the sysctl tables allow. >>> >>> Signed-off-by: Topi Miettinen <toiwoton@gmail.com> >>> --- >>> fs/proc/proc_sysctl.c | 41 +++++++++++++++++++++++++++++++---------- >>> 1 file changed, 31 insertions(+), 10 deletions(-) >>> >>> diff --git a/fs/proc/proc_sysctl.c b/fs/proc/proc_sysctl.c >>> index d80989b6c344..88c4ca7d2782 100644 >>> --- a/fs/proc/proc_sysctl.c >>> +++ b/fs/proc/proc_sysctl.c >>> @@ -818,6 +818,10 @@ static int proc_sys_permission(struct inode *inode, int >>> mask) >>> if ((mask & MAY_EXEC) && S_ISREG(inode->i_mode)) >>> return -EACCES; >>> >>> + error = generic_permission(inode, mask); >>> + if (error) >>> + return error; >>> + >>> head = grab_header(inode); >>> if (IS_ERR(head)) >>> return PTR_ERR(head); >>> @@ -837,9 +841,35 @@ static int proc_sys_setattr(struct dentry *dentry, >>> struct iattr *attr) >>> struct inode *inode = d_inode(dentry); >>> int error; >>> >>> - if (attr->ia_valid & (ATTR_MODE | ATTR_UID | ATTR_GID)) >>> + if (attr->ia_valid & (ATTR_UID | ATTR_GID)) >>> return -EPERM; > > Supporting at least ATTR_GID would make this much more useful. Yes, also XATTR/ACL support would be useful. But so far I've tried to allow only tightening of permissions. >>> >>> + if (attr->ia_valid & ATTR_MODE) { >>> + struct ctl_table_header *head = grab_header(inode); >>> + struct ctl_table *table = PROC_I(inode)->sysctl_entry; >>> + umode_t max_mode = 0777; /* Only these bits may change */ >>> + >>> + if (IS_ERR(head)) >>> + return PTR_ERR(head); >>> + >>> + if (!table) /* global root - r-xr-xr-x */ >>> + max_mode &= ~0222; >>> + else /* >>> + * Don't allow permissions to become less >>> + * restrictive than the sysctl table entry >>> + */ >>> + max_mode &= table->mode; > > Style nit: please put braces around multi-line if and else branches, > even if they're only multi-line because of comments. OK, thanks. >>> + >>> + sysctl_head_finish(head); >>> + >>> + /* Execute bits only allowed for directories */ >>> + if (!S_ISDIR(inode->i_mode)) >>> + max_mode &= ~0111; > > Why is this needed? > In general, /proc/sys does not allow executable permissions for the files, so I've continued this policy. -Topi
On Wed, Nov 13, 2019 at 12:22 AM Christian Brauner <christian.brauner@ubuntu.com> wrote: > On Sun, Nov 03, 2019 at 04:55:48PM +0200, Topi Miettinen wrote: > > Several items in /proc/sys need not be accessible to unprivileged > > tasks. Let the system administrator change the permissions, but only > > to more restrictive modes than what the sysctl tables allow. > > > > Signed-off-by: Topi Miettinen <toiwoton@gmail.com> > > --- > > fs/proc/proc_sysctl.c | 41 +++++++++++++++++++++++++++++++---------- > > 1 file changed, 31 insertions(+), 10 deletions(-) > > > > diff --git a/fs/proc/proc_sysctl.c b/fs/proc/proc_sysctl.c > > index d80989b6c344..88c4ca7d2782 100644 > > --- a/fs/proc/proc_sysctl.c > > +++ b/fs/proc/proc_sysctl.c > > @@ -818,6 +818,10 @@ static int proc_sys_permission(struct inode *inode, int > > mask) > > if ((mask & MAY_EXEC) && S_ISREG(inode->i_mode)) > > return -EACCES; > > > > + error = generic_permission(inode, mask); > > + if (error) > > + return error; In kernel/ucount.c, the ->permissions handler set_permissions() grants access based on whether the caller has CAP_SYS_RESOURCE. And in net/sysctl_net.c, the handler net_ctl_permissions() grants access based on whether the caller has CAP_NET_ADMIN. This added check is going to break those, right?
On 13.11.2019 18.00, Jann Horn wrote: > On Wed, Nov 13, 2019 at 12:22 AM Christian Brauner > <christian.brauner@ubuntu.com> wrote: >> On Sun, Nov 03, 2019 at 04:55:48PM +0200, Topi Miettinen wrote: >>> Several items in /proc/sys need not be accessible to unprivileged >>> tasks. Let the system administrator change the permissions, but only >>> to more restrictive modes than what the sysctl tables allow. >>> >>> Signed-off-by: Topi Miettinen <toiwoton@gmail.com> >>> --- >>> fs/proc/proc_sysctl.c | 41 +++++++++++++++++++++++++++++++---------- >>> 1 file changed, 31 insertions(+), 10 deletions(-) >>> >>> diff --git a/fs/proc/proc_sysctl.c b/fs/proc/proc_sysctl.c >>> index d80989b6c344..88c4ca7d2782 100644 >>> --- a/fs/proc/proc_sysctl.c >>> +++ b/fs/proc/proc_sysctl.c >>> @@ -818,6 +818,10 @@ static int proc_sys_permission(struct inode *inode, int >>> mask) >>> if ((mask & MAY_EXEC) && S_ISREG(inode->i_mode)) >>> return -EACCES; >>> >>> + error = generic_permission(inode, mask); >>> + if (error) >>> + return error; > > In kernel/ucount.c, the ->permissions handler set_permissions() grants > access based on whether the caller has CAP_SYS_RESOURCE. And in > net/sysctl_net.c, the handler net_ctl_permissions() grants access > based on whether the caller has CAP_NET_ADMIN. This added check is > going to break those, right? > Right. The comment above seems then a bit misleading: /* * sysctl entries that are not writeable, * are _NOT_ writeable, capabilities or not. */ -Topi
On Wed, Nov 13, 2019 at 5:19 PM Topi Miettinen <toiwoton@gmail.com> wrote: > On 13.11.2019 18.00, Jann Horn wrote: > > On Wed, Nov 13, 2019 at 12:22 AM Christian Brauner > > <christian.brauner@ubuntu.com> wrote: > >> On Sun, Nov 03, 2019 at 04:55:48PM +0200, Topi Miettinen wrote: > >>> Several items in /proc/sys need not be accessible to unprivileged > >>> tasks. Let the system administrator change the permissions, but only > >>> to more restrictive modes than what the sysctl tables allow. [...] > > In kernel/ucount.c, the ->permissions handler set_permissions() grants > > access based on whether the caller has CAP_SYS_RESOURCE. And in > > net/sysctl_net.c, the handler net_ctl_permissions() grants access > > based on whether the caller has CAP_NET_ADMIN. This added check is > > going to break those, right? > > > > Right. The comment above seems then a bit misleading: > /* > * sysctl entries that are not writeable, > * are _NOT_ writeable, capabilities or not. > */ I don't see the problem. Those handlers never make a file writable that doesn't have one of the three write bits (0222) set.
From 14ad2d9034ecb43b60f59f6422e597a780c65cd9 Mon Sep 17 00:00:00 2001 From: Topi Miettinen <toiwoton@gmail.com> Date: Sun, 3 Nov 2019 16:36:43 +0200 Subject: [PATCH] Allow restricting permissions in /proc/sys Several items in /proc/sys need not be accessible to unprivileged tasks. Let the system administrator change the permissions, but only to more restrictive modes than what the sysctl tables allow. Signed-off-by: Topi Miettinen <toiwoton@gmail.com> --- fs/proc/proc_sysctl.c | 41 +++++++++++++++++++++++++++++++---------- 1 file changed, 31 insertions(+), 10 deletions(-) diff --git a/fs/proc/proc_sysctl.c b/fs/proc/proc_sysctl.c index d80989b6c344..88c4ca7d2782 100644 --- a/fs/proc/proc_sysctl.c +++ b/fs/proc/proc_sysctl.c @@ -818,6 +818,10 @@ static int proc_sys_permission(struct inode *inode, int mask) if ((mask & MAY_EXEC) && S_ISREG(inode->i_mode)) return -EACCES; + error = generic_permission(inode, mask); + if (error) + return error; + head = grab_header(inode); if (IS_ERR(head)) return PTR_ERR(head); @@ -837,9 +841,35 @@ static int proc_sys_setattr(struct dentry *dentry, struct iattr *attr) struct inode *inode = d_inode(dentry); int error; - if (attr->ia_valid & (ATTR_MODE | ATTR_UID | ATTR_GID)) + if (attr->ia_valid & (ATTR_UID | ATTR_GID)) return -EPERM; + if (attr->ia_valid & ATTR_MODE) { + struct ctl_table_header *head = grab_header(inode); + struct ctl_table *table = PROC_I(inode)->sysctl_entry; + umode_t max_mode = 0777; /* Only these bits may change */ + + if (IS_ERR(head)) + return PTR_ERR(head); + + if (!table) /* global root - r-xr-xr-x */ + max_mode &= ~0222; + else /* + * Don't allow permissions to become less + * restrictive than the sysctl table entry + */ + max_mode &= table->mode; + + sysctl_head_finish(head); + + /* Execute bits only allowed for directories */ + if (!S_ISDIR(inode->i_mode)) + max_mode &= ~0111; + + if (attr->ia_mode & ~S_IFMT & ~max_mode) + return -EPERM; + } + error = setattr_prepare(dentry, attr); if (error) return error; @@ -853,17 +883,8 @@ static int proc_sys_getattr(const struct path *path, struct kstat *stat, u32 request_mask, unsigned int query_flags) { struct inode *inode = d_inode(path->dentry); - struct ctl_table_header *head = grab_header(inode); - struct ctl_table *table = PROC_I(inode)->sysctl_entry; - - if (IS_ERR(head)) - return PTR_ERR(head); generic_fillattr(inode, stat); - if (table) - stat->mode = (stat->mode & S_IFMT) | table->mode; - - sysctl_head_finish(head); return 0; } -- 2.24.0.rc1
Several items in /proc/sys need not be accessible to unprivileged tasks. Let the system administrator change the permissions, but only to more restrictive modes than what the sysctl tables allow. Signed-off-by: Topi Miettinen <toiwoton@gmail.com> --- fs/proc/proc_sysctl.c | 41 +++++++++++++++++++++++++++++++---------- 1 file changed, 31 insertions(+), 10 deletions(-) return PTR_ERR(head); @@ -837,9 +841,35 @@ static int proc_sys_setattr(struct dentry *dentry, struct iattr *attr) struct inode *inode = d_inode(dentry); int error; - if (attr->ia_valid & (ATTR_MODE | ATTR_UID | ATTR_GID)) + if (attr->ia_valid & (ATTR_UID | ATTR_GID)) return -EPERM; + if (attr->ia_valid & ATTR_MODE) { + struct ctl_table_header *head = grab_header(inode); + struct ctl_table *table = PROC_I(inode)->sysctl_entry; + umode_t max_mode = 0777; /* Only these bits may change */ + + if (IS_ERR(head)) + return PTR_ERR(head); + + if (!table) /* global root - r-xr-xr-x */ + max_mode &= ~0222; + else /* + * Don't allow permissions to become less + * restrictive than the sysctl table entry + */ + max_mode &= table->mode; + + sysctl_head_finish(head); + + /* Execute bits only allowed for directories */ + if (!S_ISDIR(inode->i_mode)) + max_mode &= ~0111; + + if (attr->ia_mode & ~S_IFMT & ~max_mode) + return -EPERM; + } + error = setattr_prepare(dentry, attr); if (error) return error; @@ -853,17 +883,8 @@ static int proc_sys_getattr(const struct path *path, struct kstat *stat, u32 request_mask, unsigned int query_flags) { struct inode *inode = d_inode(path->dentry); - struct ctl_table_header *head = grab_header(inode); - struct ctl_table *table = PROC_I(inode)->sysctl_entry; - - if (IS_ERR(head)) - return PTR_ERR(head); generic_fillattr(inode, stat); - if (table) - stat->mode = (stat->mode & S_IFMT) | table->mode; - - sysctl_head_finish(head); return 0; }