[v2,0/3] Allow initializing the kernfs node's secctx based on its parent

Message ID 20190109162830.8309-1-omosnace@redhat.com (mailing list archive)

Ondrej Mosnacek Jan. 9, 2019, 4:28 p.m. UTC
Changes in v2:
- add docstring for the new hook in union security_list_options
- initialize *ctx to NULL and *ctxlen to 0 in case the hook is not
  implemented
v1: https://lore.kernel.org/selinux/20190109091028.24485-1-omosnace@redhat.com/T/

This series adds a new security hook that allows the security context of
kernfs nodes to be initialized properly, taking into account the parent
context. Kernfs nodes require special handling here, since they are not
bound to specific inodes/superblocks, but instead represent the backing
tree structure that is used to build the VFS tree when the kernfs tree is
mounted.

The kernfs nodes initially do not store any security context and rely on
the LSM to assign some default context to inodes created over them. Kernfs
inodes, however, allow setting an explicit context via the *setxattr(2)
syscalls, in which case the context is stored inside the kernfs node's
metadata.

SELinux (and possibly other LSMs) initialize the context of newly created
FS objects based on the parent object's context (usually the child inherits
the parent's context, unless the policy dictates otherwise). This is done
by hooking the creation of the new inode corresponding to the newly created
file/directory via security_inode_init_security() (most filesystems always
create a fresh inode when a new FS object is created). However, kernfs nodes
can be created "behind the scenes" while the filesystem is not mounted
anywhere and thus no inodes exist.

Therefore, to allow maintaining similar behavior for kernfs nodes, a new LSM
hook is needed, which would allow initializing the kernfs node's security
context based on the context stored in the parent's node (if any).
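
A minimal userspace sketch of the idea described above, not the code from
the patches: a child node's context is derived from the context stored in
its parent at creation time, with no inode involved. The struct and both
function names are hypothetical; the outputs are reset to NULL/0 first, as
the v2 changelog describes for the unimplemented-hook case.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a kernfs node; not the kernel's struct. */
struct node {
    struct node *parent;
    char *secctx;       /* NULL means "no explicit context stored" */
    size_t secctx_len;
};

/*
 * Hypothetical hook: derive the child's context from the parent's.
 * The outputs are reset first, so the caller sees NULL/0 whenever the
 * parent has nothing stored. Returns 0 on success, -1 on allocation
 * failure.
 */
int node_init_security(const struct node *parent, char **ctx, size_t *ctxlen)
{
    assert(ctx && ctxlen);
    *ctx = NULL;
    *ctxlen = 0;

    if (!parent || !parent->secctx || parent->secctx_len == 0)
        return 0;       /* nothing to inherit; default labeling applies */

    *ctx = malloc(parent->secctx_len);
    if (!*ctx)
        return -1;
    memcpy(*ctx, parent->secctx, parent->secctx_len);
    *ctxlen = parent->secctx_len;
    return 0;
}

/* Create a child node; this works even when no inode exists yet. */
struct node *node_create(struct node *parent)
{
    struct node *n = calloc(1, sizeof(*n));

    if (!n)
        return NULL;
    n->parent = parent;
    if (node_init_security(parent, &n->secctx, &n->secctx_len) != 0) {
        free(n);
        return NULL;
    }
    return n;
}
```

In this sketch the child simply copies the parent's context; a real LSM
would apply its policy at that point instead.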

The main motivation for this change is that the userspace users of cgroupfs
(which is built on kernfs) expect the usual security context inheritance
to work under SELinux (see [1] and [2]). This functionality is required for
better confinement of containers under SELinux.

The first patch adds the new LSM hook; the second patch implements the hook
in SELinux; and the third patch modifies kernfs to use the new hook to
initialize the security context of a kernfs node whenever its parent node
has a non-default context set.

Note: the patches are based on current selinux/next [3], but they seem to
apply cleanly on top of v5.0-rc1 as well.

Testing:
- passed SELinux testsuite on Fedora 29 (x86_64) when applied on top of
  current Rawhide kernel (5.0.0-0.rc1.git0.1) [4]
- passed the reproducer from the last patch

[1] https://github.com/SELinuxProject/selinux-kernel/issues/39
[2] https://bugzilla.redhat.com/show_bug.cgi?id=1553803
[3] https://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux.git/log/?h=selinux-pr-20181224
[4] https://copr.fedorainfracloud.org/coprs/omos/kernel-testing/build/842855/

Ondrej Mosnacek (3):
  LSM: Add new hook for generic node initialization
  selinux: Implement the object_init_security hook
  kernfs: Initialize security of newly created nodes

 fs/kernfs/dir.c             | 49 ++++++++++++++++++++++++++++++++++---
 fs/kernfs/inode.c           |  9 +++----
 fs/kernfs/kernfs-internal.h |  4 +++
 include/linux/lsm_hooks.h   | 30 +++++++++++++++++++++++
 include/linux/security.h    | 14 +++++++++++
 security/security.c         | 10 ++++++++
 security/selinux/hooks.c    | 41 +++++++++++++++++++++++++++++++
 7 files changed, 149 insertions(+), 8 deletions(-)

Comments

Casey Schaufler Jan. 9, 2019, 5:19 p.m. UTC | #1
On 1/9/2019 8:28 AM, Ondrej Mosnacek wrote:
> Changes in v2:
> - add docstring for the new hook in union security_list_options
> - initialize *ctx to NULL and *ctxlen to 0 in case the hook is not
>   implemented
> v1: https://lore.kernel.org/selinux/20190109091028.24485-1-omosnace@redhat.com/T/
>
> This series adds a new security hook that allows to initialize the security
> context of kernfs properly, taking into account the parent context. Kernfs
> nodes require special handling here, since they are not bound to specific
> inodes/superblocks, but instead represent the backing tree structure that
> is used to build the VFS tree when the kernfs tree is mounted.
>
> The kernfs nodes initially do not store any security context and rely on
> the LSM to assign some default context to inodes created over them. 

This seems like a bug in kernfs. Why doesn't kernfs adhere to the usual
and expected filesystem behavior?

> Kernfs
> inodes, however, allow setting an explicit context via the *setxattr(2)
> syscalls, in which case the context is stored inside the kernfs node's
> metadata.
>
> SELinux (and possibly other LSMs) initialize the context of newly created
> FS objects based on the parent object's context (usually the child inherits
> the parent's context, unless the policy dictates otherwise). 

An LSM might use information about the parent other than the "context".
Smack, for example, uses an attribute SMACK64TRANSMUTE from the parent
to determine whether the Smack label of the new object should be taken
from the parent or the process. Passing the "context" of the parent is
insufficient for Smack.
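
To illustrate the point, here is a deliberately simplified sketch of a
decision that needs two pieces of parent state rather than one. The struct
and function names are hypothetical, and the real Smack logic is more
involved than this; the sketch only models the "parent or process" choice
described above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical view of what a Smack-like module reads from the parent. */
struct parent_attrs {
    const char *label;  /* the parent's own label (its "context") */
    bool transmute;     /* set when the parent carries the transmute attribute */
};

/*
 * Pick the label for a new object: the parent's label when the parent
 * is marked as transmuting, otherwise the creating process's label.
 */
const char *pick_new_label(const struct parent_attrs *parent,
                           const char *process_label)
{
    assert(parent && process_label);
    if (parent->transmute && parent->label)
        return parent->label;
    return process_label;
}
```

A hook that receives only `label` cannot make this choice, because the
outcome depends on `transmute` as well.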

Stephen Smalley Jan. 9, 2019, 8:37 p.m. UTC | #2
On 1/9/19 12:19 PM, Casey Schaufler wrote:
> On 1/9/2019 8:28 AM, Ondrej Mosnacek wrote:
>> Changes in v2:
>> - add docstring for the new hook in union security_list_options
>> - initialize *ctx to NULL and *ctxlen to 0 in case the hook is not
>>    implemented
>> v1: https://lore.kernel.org/selinux/20190109091028.24485-1-omosnace@redhat.com/T/
>>
>> This series adds a new security hook that allows to initialize the security
>> context of kernfs properly, taking into account the parent context. Kernfs
>> nodes require special handling here, since they are not bound to specific
>> inodes/superblocks, but instead represent the backing tree structure that
>> is used to build the VFS tree when the kernfs tree is mounted.
>>
>> The kernfs nodes initially do not store any security context and rely on
>> the LSM to assign some default context to inodes created over them.
> 
> This seems like a bug in kernfs. Why doesn't kernfs adhere to the usual
> and expected filesystem behavior?

sysfs / kernfs didn't support xattrs at all when we first added support 
for setting security contexts to it, so originally all sysfs / kernfs 
inodes had a single security context, and we only required separate 
storage for the inodes that were explicitly labeled by userspace.

Later kernfs grew support for trusted.* xattrs using simple_xattrs but 
the existing security.* support was left mostly unchanged.

> 
>> Kernfs
>> inodes, however, allow setting an explicit context via the *setxattr(2)
>> syscalls, in which case the context is stored inside the kernfs node's
>> metadata.
>>
>> SELinux (and possibly other LSMs) initialize the context of newly created
>> FS objects based on the parent object's context (usually the child inherits
>> the parent's context, unless the policy dictates otherwise).
> 
> An LSM might use information about the parent other than the "context".
> Smack, for example, uses an attribute SMACK64TRANSMUTE from the parent
> to determine whether the Smack label of the new object should be taken
> from the parent or the process. Passing the "context" of the parent is
> insufficient for Smack.

IIUC, this would involve switching the handling of security.* xattrs in 
kernfs over to use simple_xattrs too (so that we can store multiple such 
attributes), and then pass the entire simple_xattrs list or at least 
anything with a security.* prefix when initializing a new node or 
refreshing an existing inode.  Then the security module could extract 
any security.* attributes of interest for use in determining the label 
of new inodes and in refreshing the label of an inode.
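
A rough userspace sketch of that filtering step, under the assumption that
the attributes are available as a flat array of name/value pairs. The
struct, the callback type, and the function name are all hypothetical and
are not the kernel's simple_xattrs API.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SECURITY_PREFIX "security."

/* Hypothetical flat representation of one extended attribute. */
struct xattr_entry {
    const char *name;
    const void *value;
    size_t size;
};

/* Callback a security module might supply to inspect each attribute. */
typedef void (*xattr_visitor)(const struct xattr_entry *x, void *data);

/*
 * Hand every security.* attribute in the list to the visitor and skip
 * everything else (for example trusted.*). Returns the number of
 * attributes that matched the prefix.
 */
size_t visit_security_xattrs(const struct xattr_entry *list, size_t n,
                             xattr_visitor visit, void *data)
{
    const size_t prefix_len = strlen(SECURITY_PREFIX);
    size_t seen = 0;

    assert(list || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (strncmp(list[i].name, SECURITY_PREFIX, prefix_len) != 0)
            continue;
        if (visit)
            visit(&list[i], data);
        seen++;
    }
    return seen;
}
```

Each security module would then pick out only the attributes it cares
about inside its own visitor.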

Casey Schaufler Jan. 9, 2019, 10:03 p.m. UTC | #3
On 1/9/2019 12:37 PM, Stephen Smalley wrote:
> On 1/9/19 12:19 PM, Casey Schaufler wrote:
>> On 1/9/2019 8:28 AM, Ondrej Mosnacek wrote:
>>> Changes in v2:
>>> - add docstring for the new hook in union security_list_options
>>> - initialize *ctx to NULL and *ctxlen to 0 in case the hook is not
>>>    implemented
>>> v1: https://lore.kernel.org/selinux/20190109091028.24485-1-omosnace@redhat.com/T/
>>>
>>> This series adds a new security hook that allows to initialize the security
>>> context of kernfs properly, taking into account the parent context. Kernfs
>>> nodes require special handling here, since they are not bound to specific
>>> inodes/superblocks, but instead represent the backing tree structure that
>>> is used to build the VFS tree when the kernfs tree is mounted.
>>>
>>> The kernfs nodes initially do not store any security context and rely on
>>> the LSM to assign some default context to inodes created over them.
>>
>> This seems like a bug in kernfs. Why doesn't kernfs adhere to the usual
>> and expected filesystem behavior?
>
> sysfs / kernfs didn't support xattrs at all when we first added support for setting security contexts to it, so originally all sysfs / kernfs inodes had a single security context, and we only required separate storage for the inodes that were explicitly labeled by userspace.
>
> Later kernfs grew support for trusted.* xattrs using simple_xattrs but the existing security.* support was left mostly unchanged.

OK, so as I said, this seems like a bug in kernfs.

>
>>
>>> Kernfs
>>> inodes, however, allow setting an explicit context via the *setxattr(2)
>>> syscalls, in which case the context is stored inside the kernfs node's
>>> metadata.
>>>
>>> SELinux (and possibly other LSMs) initialize the context of newly created
>>> FS objects based on the parent object's context (usually the child inherits
>>> the parent's context, unless the policy dictates otherwise).
>>
>> An LSM might use information about the parent other than the "context".
>> Smack, for example, uses an attribute SMACK64TRANSMUTE from the parent
>> to determine whether the Smack label of the new object should be taken
>> from the parent or the process. Passing the "context" of the parent is
>> insufficient for Smack.
>
> IIUC, this would involve switching the handling of security.* xattrs in kernfs over to use simple_xattrs too (so that we can store multiple such attributes), and then pass the entire simple_xattrs list or at least anything with a security.* prefix when initializing a new node or refreshing an existing inode.  Then the security module could extract any security.* attributes of interest for use in determining the label of new inodes and in refreshing the label of an inode.

Right. But I'll point out that there is nothing to prevent an
LSM from using inode information outside of the xattrs (e.g. uids)
to determine the security state it wants to give a new object.

I suggest that the better solution would be for kernfs to
use inodes like a real filesystem. Every special case like this
results in special cases like this special hook. It's hard
enough to keep track of the general case in the Linux kernel.

Stephen Smalley Jan. 10, 2019, 2:15 p.m. UTC | #4
On 1/9/19 5:03 PM, Casey Schaufler wrote:
> On 1/9/2019 12:37 PM, Stephen Smalley wrote:
>> On 1/9/19 12:19 PM, Casey Schaufler wrote:
>>> On 1/9/2019 8:28 AM, Ondrej Mosnacek wrote:
>>>> Changes in v2:
>>>> - add docstring for the new hook in union security_list_options
>>>> - initialize *ctx to NULL and *ctxlen to 0 in case the hook is not
>>>>     implemented
>>>> v1: https://lore.kernel.org/selinux/20190109091028.24485-1-omosnace@redhat.com/T/
>>>>
>>>> This series adds a new security hook that allows to initialize the security
>>>> context of kernfs properly, taking into account the parent context. Kernfs
>>>> nodes require special handling here, since they are not bound to specific
>>>> inodes/superblocks, but instead represent the backing tree structure that
>>>> is used to build the VFS tree when the kernfs tree is mounted.
>>>>
>>>> The kernfs nodes initially do not store any security context and rely on
>>>> the LSM to assign some default context to inodes created over them.
>>>
>>> This seems like a bug in kernfs. Why doesn't kernfs adhere to the usual
>>> and expected filesystem behavior?
>>
>> sysfs / kernfs didn't support xattrs at all when we first added support for setting security contexts to it, so originally all sysfs / kernfs inodes had a single security context, and we only required separate storage for the inodes that were explicitly labeled by userspace.
>>
>> Later kernfs grew support for trusted.* xattrs using simple_xattrs but the existing security.* support was left mostly unchanged.
> 
> OK, so as I said, this seems like a bug in kernfs.
> 
>>
>>>
>>>> Kernfs
>>>> inodes, however, allow setting an explicit context via the *setxattr(2)
>>>> syscalls, in which case the context is stored inside the kernfs node's
>>>> metadata.
>>>>
>>>> SELinux (and possibly other LSMs) initialize the context of newly created
>>>> FS objects based on the parent object's context (usually the child inherits
>>>> the parent's context, unless the policy dictates otherwise).
>>>
>>> An LSM might use information about the parent other than the "context".
>>> Smack, for example, uses an attribute SMACK64TRANSMUTE from the parent
>>> to determine whether the Smack label of the new object should be taken
>>> from the parent or the process. Passing the "context" of the parent is
>>> insufficient for Smack.
>>
>> IIUC, this would involve switching the handling of security.* xattrs in kernfs over to use simple_xattrs too (so that we can store multiple such attributes), and then pass the entire simple_xattrs list or at least anything with a security.* prefix when initializing a new node or refreshing an existing inode.  Then the security module could extract any security.* attributes of interest for use in determining the label of new inodes and in refreshing the label of an inode.
> 
> Right. But I'll point out that there is nothing to prevent an
> LSM from using inode information outside of the xattrs (e.g. uids)
> to determine the security state it wants to give a new object.

If that's a real concern, the hook could pass the ia_iattr structure in 
addition to the simple_xattrs list and the security module could use any 
inode attributes it likes in making the decision.  Effectively it would 
be passing the entire kernfs_iattrs structure, but probably not directly 
since that definition is presently private to kernfs.

> I suggest that the better solution would be for kernfs to
> use inodes like a real filesystem. Every special case like this
> results in special cases like this special hook. It's hard
> enough to keep track of the general case in the Linux kernel.

Feel free to propose an implementation if you like, but doing a complete 
rewrite of kernfs internals seems a bit out of scope.

Casey Schaufler Jan. 10, 2019, 5:54 p.m. UTC | #5
Resending after email configuration repair.

On 1/10/2019 6:15 AM, Stephen Smalley wrote:
> On 1/9/19 5:03 PM, Casey Schaufler wrote:
>> On 1/9/2019 12:37 PM, Stephen Smalley wrote:
>>> On 1/9/19 12:19 PM, Casey Schaufler wrote:
>>>> On 1/9/2019 8:28 AM, Ondrej Mosnacek wrote:
>>>>> Changes in v2:
>>>>> - add docstring for the new hook in union security_list_options
>>>>> - initialize *ctx to NULL and *ctxlen to 0 in case the hook is not
>>>>>     implemented
>>>>> v1: https://lore.kernel.org/selinux/20190109091028.24485-1-omosnace@redhat.com/T/
>>>>>
>>>>> This series adds a new security hook that allows to initialize the security
>>>>> context of kernfs properly, taking into account the parent context. Kernfs
>>>>> nodes require special handling here, since they are not bound to specific
>>>>> inodes/superblocks, but instead represent the backing tree structure that
>>>>> is used to build the VFS tree when the kernfs tree is mounted.
>>>>>
>>>>> The kernfs nodes initially do not store any security context and rely on
>>>>> the LSM to assign some default context to inodes created over them.
>>>>
>>>> This seems like a bug in kernfs. Why doesn't kernfs adhere to the usual
>>>> and expected filesystem behavior?
>>>
>>> sysfs / kernfs didn't support xattrs at all when we first added support for setting security contexts to it, so originally all sysfs / kernfs inodes had a single security context, and we only required separate storage for the inodes that were explicitly labeled by userspace.
>>>
>>> Later kernfs grew support for trusted.* xattrs using simple_xattrs but the existing security.* support was left mostly unchanged.
>>
>> OK, so as I said, this seems like a bug in kernfs.
>>
>>>
>>>>
>>>>> Kernfs
>>>>> inodes, however, allow setting an explicit context via the *setxattr(2)
>>>>> syscalls, in which case the context is stored inside the kernfs node's
>>>>> metadata.
>>>>>
>>>>> SELinux (and possibly other LSMs) initialize the context of newly created
>>>>> FS objects based on the parent object's context (usually the child inherits
>>>>> the parent's context, unless the policy dictates otherwise).
>>>>
>>>> An LSM might use information about the parent other than the "context".
>>>> Smack, for example, uses an attribute SMACK64TRANSMUTE from the parent
>>>> to determine whether the Smack label of the new object should be taken
>>>> from the parent or the process. Passing the "context" of the parent is
>>>> insufficient for Smack.
>>>
>>> IIUC, this would involve switching the handling of security.* xattrs in kernfs over to use simple_xattrs too (so that we can store multiple such attributes), and then pass the entire simple_xattrs list or at least anything with a security.* prefix when initializing a new node or refreshing an existing inode.  Then the security module could extract any security.* attributes of interest for use in determining the label of new inodes and in refreshing the label of an inode.
>>
>> Right. But I'll point out that there is nothing to prevent an
>> LSM from using inode information outside of the xattrs (e.g. uids)
>> to determine the security state it wants to give a new object.
>
> If that's a real concern, the hook could pass the ia_iattr structure in addition to the simple_xattrs list and the security module could use any inode attributes it likes in making the decision.  Effectively it would be passing the entire kernfs_iattrs structure, but probably not directly since that definition is presently private to kernfs.

Yes, it's a real concern. And no, just passing all of the kernfs internal data
out in j-random formats does not pass muster. Al Viro was commenting the other
day on how bad the LSM infrastructure interfaces are. The original proposal here
is already big, cluttered and inadequate. Adding more to it to make up for its
shortcomings should be sending up red flags.

I've been wallowing in the LSM infrastructure for the past seven years.
Interfaces like this one, that propagate the idiosyncrasies of both
the caller (kernfs) and one potential callee (SELinux) are much too
common. I understand that there is a problem that needs a solution.
This isn't it.


>> I suggest that the better solution would be for kernfs to
>> use inodes like a real filesystem. Every special case like this
>> results in special cases like this special hook. It's hard
>> enough to keep track of the general case in the Linux kernel.
>
> Feel free to propose an implementation if you like, but doing a complete rewrite of kernfs internals seems a bit out of scope.

If this issue points out a serious problem with the kernfs implementation
then I would expect that addressing the problem at its source would be in
everyone's best interest. Did anyone even look at the possibility? If I
said that I would do that, how long would you be willing to wait for it?
Stephen Smalley Jan. 10, 2019, 7:37 p.m. UTC | #6
On 1/10/19 12:54 PM, Casey Schaufler wrote:
> 
> Resending after email configuration repair.
> 
> On 1/10/2019 6:15 AM, Stephen Smalley wrote:
>> On 1/9/19 5:03 PM, Casey Schaufler wrote:
>>> On 1/9/2019 12:37 PM, Stephen Smalley wrote:
>>>> On 1/9/19 12:19 PM, Casey Schaufler wrote:
>>>>> On 1/9/2019 8:28 AM, Ondrej Mosnacek wrote:
>>>>>> Changes in v2:
>>>>>> - add docstring for the new hook in union security_list_options
>>>>>> - initialize *ctx to NULL and *ctxlen to 0 in case the hook is not
>>>>>>      implemented
>>>>>> v1: https://lore.kernel.org/selinux/20190109091028.24485-1-omosnace@redhat.com/T/
>>>>>>
>>>>>> This series adds a new security hook that allows to initialize the security
>>>>>> context of kernfs properly, taking into account the parent context. Kernfs
>>>>>> nodes require special handling here, since they are not bound to specific
>>>>>> inodes/superblocks, but instead represent the backing tree structure that
>>>>>> is used to build the VFS tree when the kernfs tree is mounted.
>>>>>>
>>>>>> The kernfs nodes initially do not store any security context and rely on
>>>>>> the LSM to assign some default context to inodes created over them.
>>>>>
>>>>> This seems like a bug in kernfs. Why doesn't kernfs adhere to the usual
>>>>> and expected filesystem behavior?
>>>>
>>>> sysfs / kernfs didn't support xattrs at all when we first added support for setting security contexts to it, so originally all sysfs / kernfs inodes had a single security context, and we only required separate storage for the inodes that were explicitly labeled by userspace.
>>>>
>>>> Later kernfs grew support for trusted.* xattrs using simple_xattrs but the existing security.* support was left mostly unchanged.
>>>
>>> OK, so as I said, this seems like a bug in kernfs.
>>>
>>>>
>>>>>
>>>>>> Kernfs
>>>>>> inodes, however, allow setting an explicit context via the *setxattr(2)
>>>>>> syscalls, in which case the context is stored inside the kernfs node's
>>>>>> metadata.
>>>>>>
>>>>>> SELinux (and possibly other LSMs) initialize the context of newly created
>>>>>> FS objects based on the parent object's context (usually the child inherits
>>>>>> the parent's context, unless the policy dictates otherwise).
>>>>>
>>>>> An LSM might use information about the parent other than the "context".
>>>>> Smack, for example, uses an attribute SMACK64TRANSMUTE from the parent
>>>>> to determine whether the Smack label of the new object should be taken
>>>>> from the parent or the process. Passing the "context" of the parent is
>>>>> insufficient for Smack.
>>>>
>>>> IIUC, this would involve switching the handling of security.* xattrs in kernfs over to use simple_xattrs too (so that we can store multiple such attributes), and then pass the entire simple_xattrs list or at least anything with a security.* prefix when initializing a new node or refreshing an existing inode.  Then the security module could extract any security.* attributes of interest for use in determining the label of new inodes and in refreshing the label of an inode.
>>>
>>> Right. But I'll point out that there is nothing to prevent an
>>> LSM from using inode information outside of the xattrs (e.g. uids)
>>> to determine the security state it wants to give a new object.
>>
>> If that's a real concern, the hook could pass the ia_iattr structure in addition to the simple_xattrs list and the security module could use any inode attributes it likes in making the decision.  Effectively it would be passing the entire kernfs_iattrs structure, but probably not directly since that definition is presently private to kernfs.
> 
> Yes, it's a real concern. And no, just passing all of the kernfs internal data
> out in j-random formats does not pass muster. Al Viro was commenting the other
> day on how bad the LSM infrastructure interfaces are. The original proposal here
> is already big, cluttered and inadequate. Adding more to it to make up for its
> shortcomings should be sending up red flags
I don't quite see how the original patch set or hook can be called big 
and cluttered.  Switching the handling of security xattrs in kernfs to 
use simple_xattrs (a natural and seemingly straightforward cleanup) and 
passing the entire simple_xattrs list to the hook interface would allow 
you to support SMACK64TRANSMUTE, which was the one actual inadequacy you 
identified.  You claim that someone might need/want the parent uid/gid 
too, but there are no in-tree security modules that do so nor any 
submitted AFAIK, and if that situation arises, all we need to do to 
support it is to add the iattrs.  Obviously they can all be wrapped up 
in some larger structure if desired. At that point the security modules 
would have access to all of the inode attributes supported by kernfs.
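
For illustration, here is a minimal user-space sketch of the idea above; it is not kernel code. The hook is modelled as receiving the parent node's xattr list, and the security module picks out the security.* attributes it cares about. All identifiers (struct model_xattr, model_find_xattr(), model_smack_child_label()) are hypothetical, and the Smack rule is reduced to what is described in this thread; the real Smack logic may apply further conditions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for one entry of a kernfs node's xattr list. */
struct model_xattr {
	const char *name;   /* e.g. "security.SMACK64" */
	const char *value;  /* NUL-terminated string, for simplicity */
};

/* Return the value of the named xattr, or NULL if the list lacks it. */
const char *model_find_xattr(const struct model_xattr *xattrs,
			     size_t count, const char *name)
{
	size_t i;

	assert(name != NULL);
	for (i = 0; i < count; i++)
		if (strcmp(xattrs[i].name, name) == 0)
			return xattrs[i].value;
	return NULL;
}

/*
 * Simplified Smack-like decision: if the parent carries
 * security.SMACK64TRANSMUTE set to "TRUE" and has a label, the child
 * takes the parent's label; otherwise it takes the creating task's label.
 */
const char *model_smack_child_label(const struct model_xattr *parent,
				    size_t count, const char *task_label)
{
	const char *transmute =
		model_find_xattr(parent, count, "security.SMACK64TRANSMUTE");
	const char *parent_label =
		model_find_xattr(parent, count, "security.SMACK64");

	if (transmute && strcmp(transmute, "TRUE") == 0 && parent_label)
		return parent_label;
	return task_label;
}
```

With a parent carrying both security.SMACK64 and security.SMACK64TRANSMUTE=TRUE, the function returns the parent's label; without the transmute attribute it falls back to the task label. A hook that passes only a single "context" string cannot express that distinction.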

> I've been wallowing in the LSM infrastructure for the past seven years.
> Interfaces like this one, that propagate the idiosyncrasies of both
> the caller (kernfs) and one potential callee (SELinux) are much too
> common. I understand that there is a problem that needs a solution.
> This isn't it.
The solution sketched above should be capable of supporting the needs of 
any current security modules and of being easily extended for others if 
the need arises.

> 
> 
>>> I suggest that the better solution would be for kernfs to
>>> use inodes like a real filesystem. Every special case like this
>>> results in special cases like this special hook. It's hard
>>> enough to keep track of the general case in the Linux kernel.
>>
>> Feel free to propose an implementation if you like, but doing a complete rewrite of kernfs internals seems a bit out of scope.
> 
> If this issue points out a serious problem with the kernfs implementation
> then I would expect that addressing the problem at its source would be in
> everyone's best interest. Did anyone even look at the possibility? If I
> said that I would do that, how long would you be willing to wait for it?

I don't know that it points to a serious problem with kernfs.  But I'll 
let you convince the kernfs maintainers of that. Meanwhile, we have a 
proposed solution that solves the problem for all in-tree security 
modules.  I see no reason to hold that up. Don't over-design.
Paul Moore Jan. 11, 2019, 2:20 a.m. UTC | #7
On Thu, Jan 10, 2019 at 2:36 PM Stephen Smalley <sds@tycho.nsa.gov> wrote:
> On 1/10/19 12:54 PM, Casey Schaufler wrote:
> > Resending after email configuration repair.
> >
> > On 1/10/2019 6:15 AM, Stephen Smalley wrote:
> >> On 1/9/19 5:03 PM, Casey Schaufler wrote:
> >>> On 1/9/2019 12:37 PM, Stephen Smalley wrote:
> >>>> On 1/9/19 12:19 PM, Casey Schaufler wrote:
> >>>>> On 1/9/2019 8:28 AM, Ondrej Mosnacek wrote:
> >>>>>> Changes in v2:
> >>>>>> - add docstring for the new hook in union security_list_options
> >>>>>> - initialize *ctx to NULL and *ctxlen to 0 in case the hook is not
> >>>>>>      implemented
> >>>>>> v1: https://lore.kernel.org/selinux/20190109091028.24485-1-omosnace@redhat.com/T/
> >>>>>>
> >>>>>> This series adds a new security hook that allows to initialize the security
> >>>>>> context of kernfs properly, taking into account the parent context. Kernfs
> >>>>>> nodes require special handling here, since they are not bound to specific
> >>>>>> inodes/superblocks, but instead represent the backing tree structure that
> >>>>>> is used to build the VFS tree when the kernfs tree is mounted.
> >>>>>>
> >>>>>> The kernfs nodes initially do not store any security context and rely on
> >>>>>> the LSM to assign some default context to inodes created over them.
> >>>>>
> >>>>> This seems like a bug in kernfs. Why doesn't kernfs adhere to the usual
> >>>>> and expected filesystem behavior?
> >>>>
> >>>> sysfs / kernfs didn't support xattrs at all when we first added support for setting security contexts to it, so originally all sysfs / kernfs inodes had a single security context, and we only required separate storage for the inodes that were explicitly labeled by userspace.
> >>>>
> >>>> Later kernfs grew support for trusted.* xattrs using simple_xattrs but the existing security.* support was left mostly unchanged.
> >>>
> >>> OK, so as I said, this seems like a bug in kernfs.
> >>>
> >>>>
> >>>>>
> >>>>>> Kernfs
> >>>>>> inodes, however, allow setting an explicit context via the *setxattr(2)
> >>>>>> syscalls, in which case the context is stored inside the kernfs node's
> >>>>>> metadata.
> >>>>>>
> >>>>>> SELinux (and possibly other LSMs) initialize the context of newly created
> >>>>>> FS objects based on the parent object's context (usually the child inherits
> >>>>>> the parent's context, unless the policy dictates otherwise).
> >>>>>
> >>>>> An LSM might use information about the parent other than the "context".
> >>>>> Smack, for example, uses an attribute SMACK64TRANSMUTE from the parent
> >>>>> to determine whether the Smack label of the new object should be taken
> >>>>> from the parent or the process. Passing the "context" of the parent is
> >>>>> insufficient for Smack.
> >>>>
> >>>> IIUC, this would involve switching the handling of security.* xattrs in kernfs over to use simple_xattrs too (so that we can store multiple such attributes), and then pass the entire simple_xattrs list or at least anything with a security.* prefix when initializing a new node or refreshing an existing inode.  Then the security module could extract any security.* attributes of interest for use in determining the label of new inodes and in refreshing the label of an inode.
> >>>
> >>> Right. But I'll point out that there is nothing to prevent an
> >>> LSM from using inode information outside of the xattrs (e.g. uids)
> >>> to determine the security state it wants to give a new object.
> >>
> >> If that's a real concern, the hook could pass the ia_iattr structure in addition to the simple_xattrs list and the security module could use any inode attributes it likes in making the decision.  Effectively it would be passing the entire kernfs_iattrs structure, but probably not directly since that definition is presently private to kernfs.
> >
> > Yes, it's a real concern. And no, just passing all of the kernfs internal data
> > out in j-random formats does not pass muster. Al Viro was commenting the other
> > day on how bad the LSM infrastructure interfaces are. The original proposal here
> > is already big, cluttered and inadequate. Adding more to it to make up for its
> > shortcomings should be sending up red flags
>
> I don't quite see how the original patch set or hook can be called big
> and cluttered.  Switching the handling of security xattrs in kernfs to
> use simple_xattrs (a natural and seemingly straightforward cleanup) and
> passing the entire simple_xattrs list to the hook interface would allow
> you to support SMACK64TRANSMUTE, which was the one actual inadequacy you
> identified.  You claim that someone might need/want the parent uid/gid
> too, but there are no in-tree security modules that do so nor any
> submitted AFAIK, and if that situation arises, all we need to do to
> support it is to add the iattrs.  Obviously they can all be wrapped up
> in some larger structure if desired. At that point the security modules
> would have access to all of the inode attributes supported by kernfs.

I'm with Stephen on this; if Ondrej changes it over to simple_xattrs
as described above so that Smack would have what it needs, I don't see
why we should hold off on this.

Everything we are talking about is a kernel internal issue, we can
change it as needed to take into account new LSMs or new functionality
in existing LSMs.

Ondrej, a gentle reminder that it would be nice to have a simple
selinux-testsuite test to make sure we are labeling
kernfs-based/cgroup files correctly.
Casey Schaufler Jan. 11, 2019, 6:22 p.m. UTC | #8
On 1/10/2019 11:37 AM, Stephen Smalley wrote:
> On 1/10/19 12:54 PM, Casey Schaufler wrote:
>>
>> Resending after email configuration repair.
>>
>> On 1/10/2019 6:15 AM, Stephen Smalley wrote:
>>> On 1/9/19 5:03 PM, Casey Schaufler wrote:
>>>> On 1/9/2019 12:37 PM, Stephen Smalley wrote:
>>>>> On 1/9/19 12:19 PM, Casey Schaufler wrote:
>>>>>> On 1/9/2019 8:28 AM, Ondrej Mosnacek wrote:
>>>>>>> Changes in v2:
>>>>>>> - add docstring for the new hook in union security_list_options
>>>>>>> - initialize *ctx to NULL and *ctxlen to 0 in case the hook is not
>>>>>>>      implemented
>>>>>>> v1: https://lore.kernel.org/selinux/20190109091028.24485-1-omosnace@redhat.com/T/
>>>>>>>
>>>>>>> This series adds a new security hook that allows to initialize the security
>>>>>>> context of kernfs properly, taking into account the parent context. Kernfs
>>>>>>> nodes require special handling here, since they are not bound to specific
>>>>>>> inodes/superblocks, but instead represent the backing tree structure that
>>>>>>> is used to build the VFS tree when the kernfs tree is mounted.
>>>>>>>
>>>>>>> The kernfs nodes initially do not store any security context and rely on
>>>>>>> the LSM to assign some default context to inodes created over them.
>>>>>>
>>>>>> This seems like a bug in kernfs. Why doesn't kernfs adhere to the usual
>>>>>> and expected filesystem behavior?
>>>>>
>>>>> sysfs / kernfs didn't support xattrs at all when we first added support for setting security contexts to it, so originally all sysfs / kernfs inodes had a single security context, and we only required separate storage for the inodes that were explicitly labeled by userspace.
>>>>>
>>>>> Later kernfs grew support for trusted.* xattrs using simple_xattrs but the existing security.* support was left mostly unchanged.
>>>>
>>>> OK, so as I said, this seems like a bug in kernfs.
>>>>
>>>>>
>>>>>>
>>>>>>> Kernfs
>>>>>>> inodes, however, allow setting an explicit context via the *setxattr(2)
>>>>>>> syscalls, in which case the context is stored inside the kernfs node's
>>>>>>> metadata.
>>>>>>>
>>>>>>> SELinux (and possibly other LSMs) initialize the context of newly created
>>>>>>> FS objects based on the parent object's context (usually the child inherits
>>>>>>> the parent's context, unless the policy dictates otherwise).
>>>>>>
>>>>>> An LSM might use information about the parent other than the "context".
>>>>>> Smack, for example, uses an attribute SMACK64TRANSMUTE from the parent
>>>>>> to determine whether the Smack label of the new object should be taken
>>>>>> from the parent or the process. Passing the "context" of the parent is
>>>>>> insufficient for Smack.
>>>>>
>>>>> IIUC, this would involve switching the handling of security.* xattrs in kernfs over to use simple_xattrs too (so that we can store multiple such attributes), and then pass the entire simple_xattrs list or at least anything with a security.* prefix when initializing a new node or refreshing an existing inode.  Then the security module could extract any security.* attributes of interest for use in determining the label of new inodes and in refreshing the label of an inode.
>>>>
>>>> Right. But I'll point out that there is nothing to prevent an
>>>> LSM from using inode information outside of the xattrs (e.g. uids)
>>>> to determine the security state it wants to give a new object.
>>>
>>> If that's a real concern, the hook could pass the ia_iattr structure in addition to the simple_xattrs list and the security module could use any inode attributes it likes in making the decision.  Effectively it would be passing the entire kernfs_iattrs structure, but probably not directly since that definition is presently private to kernfs.
>>
>> Yes, it's a real concern. And no, just passing all of the kernfs internal data
>> out in j-random formats does not pass muster. Al Viro was commenting the other
>> day on how bad the LSM infrastructure interfaces are. The original proposal here
>> is already big, cluttered and inadequate. Adding more to it to make up for its
>> shortcomings should be sending up red flags
> I don't quite see how the original patch set or hook can be called big and cluttered.  Switching the handling of security xattrs in kernfs to use simple_xattrs (a natural and seemingly straightforward cleanup) and passing the entire simple_xattrs list to the hook interface would allow you to support SMACK64TRANSMUTE, which was the one actual inadequacy you identified.  You claim that someone might need/want the parent uid/gid too, but there are no in-tree security modules that do so nor any submitted AFAIK, and if that situation arises, all we need to do to support it is to add the iattrs.  Obviously they can all be wrapped up in some larger structure if desired. At that point the security modules would have access to all of the inode attributes supported by kernfs.

We already have a structure to wrap all this in. It's an inode.

But, as you point out, there are no in-tree LSMs that use
anything beyond the xattrs. So the change you're suggesting
is arguably sufficient, and considerably easier.

>
>> I've been wallowing in the LSM infrastructure for the past seven years.
>> Interfaces like this one, that propagate the idiosyncrasies of both
>> the caller (kernfs) and one potential callee (SELinux) are much too
>> common. I understand that there is a problem that needs a solution.
>> This isn't it.
> The solution sketched above should be capable of supporting the needs of any current security modules and of being easily extended for others if the need arises.

It, like security_inode_init_security(), is going to be
challenging to get right in the stacking environment.
But, that's my problem.

>
>>
>>
>>>> I suggest that the better solution would be for kernfs to
>>>> use inodes like a real filesystem. Every special case like this
>>>> results in special cases like this special hook. It's hard
>>>> enough to keep track of the general case in the Linux kernel.
>>>
>>> Feel free to propose an implementation if you like, but doing a complete rewrite of kernfs internals seems a bit out of scope.
>>
>> If this issue points out a serious problem with the kernfs implementation
>> then I would expect that addressing the problem at its source would be in
>> everyone's best interest. Did anyone even look at the possibility? If I
>> said that I would do that, how long would you be willing to wait for it?
>
> I don't know that it points to a serious problem with kernfs.  But I'll let you convince the kernfs maintainers of that. Meanwhile, we have a proposed solution that solves the problem for all in-tree security modules.  I see no reason to hold that up. Don't over-design.

I think the patch presented was hasty. It clearly didn't
account for any LSM but SELinux. I understand why. Adding an
LSM interface needs to account for the entire security sub-system.
Ondrej Mosnacek Jan. 14, 2019, 9:01 a.m. UTC | #9
On Thu, Jan 10, 2019 at 6:55 PM Casey Schaufler <casey@schaufler-ca.com> wrote:
> Resending after email configuration repair.
>
> On 1/10/2019 6:15 AM, Stephen Smalley wrote:
> > On 1/9/19 5:03 PM, Casey Schaufler wrote:
> >> On 1/9/2019 12:37 PM, Stephen Smalley wrote:
> >>> On 1/9/19 12:19 PM, Casey Schaufler wrote:
> >>>> On 1/9/2019 8:28 AM, Ondrej Mosnacek wrote:
> >>>>> Changes in v2:
> >>>>> - add docstring for the new hook in union security_list_options
> >>>>> - initialize *ctx to NULL and *ctxlen to 0 in case the hook is not
> >>>>>     implemented
> >>>>> v1: https://lore.kernel.org/selinux/20190109091028.24485-1-omosnace@redhat.com/T/
> >>>>>
> >>>>> This series adds a new security hook that allows to initialize the security
> >>>>> context of kernfs properly, taking into account the parent context. Kernfs
> >>>>> nodes require special handling here, since they are not bound to specific
> >>>>> inodes/superblocks, but instead represent the backing tree structure that
> >>>>> is used to build the VFS tree when the kernfs tree is mounted.
> >>>>>
> >>>>> The kernfs nodes initially do not store any security context and rely on
> >>>>> the LSM to assign some default context to inodes created over them.
> >>>>
> >>>> This seems like a bug in kernfs. Why doesn't kernfs adhere to the usual
> >>>> and expected filesystem behavior?
> >>>
> >>> sysfs / kernfs didn't support xattrs at all when we first added support for setting security contexts to it, so originally all sysfs / kernfs inodes had a single security context, and we only required separate storage for the inodes that were explicitly labeled by userspace.
> >>>
> >>> Later kernfs grew support for trusted.* xattrs using simple_xattrs but the existing security.* support was left mostly unchanged.
> >>
> >> OK, so as I said, this seems like a bug in kernfs.
> >>
> >>>
> >>>>
> >>>>> Kernfs
> >>>>> inodes, however, allow setting an explicit context via the *setxattr(2)
> >>>>> syscalls, in which case the context is stored inside the kernfs node's
> >>>>> metadata.
> >>>>>
> >>>>> SELinux (and possibly other LSMs) initialize the context of newly created
> >>>>> FS objects based on the parent object's context (usually the child inherits
> >>>>> the parent's context, unless the policy dictates otherwise).
> >>>>
> >>>> An LSM might use information about the parent other than the "context".
> >>>> Smack, for example, uses an attribute SMACK64TRANSMUTE from the parent
> >>>> to determine whether the Smack label of the new object should be taken
> >>>> from the parent or the process. Passing the "context" of the parent is
> >>>> insufficient for Smack.
> >>>
> >>> IIUC, this would involve switching the handling of security.* xattrs in kernfs over to use simple_xattrs too (so that we can store multiple such attributes), and then pass the entire simple_xattrs list or at least anything with a security.* prefix when initializing a new node or refreshing an existing inode.  Then the security module could extract any security.* attributes of interest for use in determining the label of new inodes and in refreshing the label of an inode.

I actually had a patch to do just that at one point because I thought
for a while that it would be required to call
security_inode_init_security() (which I had tried to somehow force
into the kernfs node creation at some point), but then I realized it
is not actually needed (although it would make things a bit nicer) and put
it away... I will try to dig it out and reuse here.

> >>
> >> Right. But I'll point out that there is nothing to prevent an
> >> LSM from using inode information outside of the xattrs (e.g. uids)
> >> to determine the security state it wants to give a new object.
> >
> > If that's a real concern, the hook could pass the ia_iattr structure in addition to the simple_xattrs list and the security module could use any inode attributes it likes in making the decision.  Effectively it would be passing the entire kernfs_iattrs structure, but probably not directly since that definition is presently private to kernfs.
>
> Yes, it's a real concern. And no, just passing all of the kernfs internal data
> out in j-random formats does not pass muster. Al Viro was commenting the other
> day on how bad the LSM infrastructure interfaces are. The original proposal here
> is already big, cluttered and inadequate. Adding more to it to make up for its
> shortcomings should be sending up red flags.

I understand the concern about cluttering up things, but I just don't
see any nicer solution right now...

>
> I've been wallowing in the LSM infrastructure for the past seven years.
> Interfaces like this one, that propagate the idiosyncrasies of both
> the caller (kernfs) and one potential callee (SELinux) are much too
> common. I understand that there is a problem that needs a solution.
> This isn't it.
>
>
> >> I suggest that the better solution would be for kernfs to
> >> use inodes like a real filesystem. Every special case like this
> >> results in special cases like this special hook. It's hard
> >> enough to keep track of the general case in the Linux kernel.
> >
> > Feel free to propose an implementation if you like, but doing a complete rewrite of kernfs internals seems a bit out of scope.
>
> If this issue points out a serious problem with the kernfs implementation
> then I would expect that addressing the problem at its source would be in
> everyone's best interest. Did anyone even look at the possibility? If I
> said that I would do that, how long would you be willing to wait for it?

Granted, the "inodeless" abstractions in kernfs have perhaps gone too
far, but I believe that trying to undo it would just shift the complexity
into kernfs and its users... IMHO this solution (with the changes
proposed by Stephen) is not overly invasive and does not make the
potential future rework of kernfs and its handling by LSMs much more
difficult than it would be now. I'd prefer to apply an imperfect but
noninvasive solution to a practical problem now and leave extensive
refactoring as a separate task/discussion.

(Anyway, I don't want to rush this. I'll keep sending patches and
hopefully we'll eventually converge to some solution acceptable to
everyone.)

--
Ondrej Mosnacek <omosnace at redhat dot com>
Associate Software Engineer, Security Technologies
Red Hat, Inc.
Ondrej Mosnacek Jan. 14, 2019, 9:01 a.m. UTC | #10
On Fri, Jan 11, 2019 at 3:44 AM Paul Moore <paul@paul-moore.com> wrote:
> [...]
>
> Ondrej, a gentle reminder that it would be nice to have a simple
> selinux-testsuite test to make sure we are labeling
> kernfs-based/cgroup files correctly.

OK, I'll see if I can adapt the reproducer from the last patch into
the testsuite.



--
Ondrej Mosnacek <omosnace at redhat dot com>
Associate Software Engineer, Security Technologies
Red Hat, Inc.
Ondrej Mosnacek Jan. 22, 2019, 8:49 a.m. UTC | #11
On Mon, Jan 14, 2019 at 10:01 AM Ondrej Mosnacek <omosnace@redhat.com> wrote:
> On Thu, Jan 10, 2019 at 6:55 PM Casey Schaufler <casey@schaufler-ca.com> wrote:
> > Resending after email configuration repair.
> >
> > On 1/10/2019 6:15 AM, Stephen Smalley wrote:
> > > On 1/9/19 5:03 PM, Casey Schaufler wrote:
> > >> On 1/9/2019 12:37 PM, Stephen Smalley wrote:
> > >>> On 1/9/19 12:19 PM, Casey Schaufler wrote:
> > >>>> On 1/9/2019 8:28 AM, Ondrej Mosnacek wrote:
> > >>>>> Changes in v2:
> > >>>>> - add docstring for the new hook in union security_list_options
> > >>>>> - initialize *ctx to NULL and *ctxlen to 0 in case the hook is not
> > >>>>>     implemented
> > >>>>> v1: https://lore.kernel.org/selinux/20190109091028.24485-1-omosnace@redhat.com/T/
> > >>>>>
> > >>>>> This series adds a new security hook that allows to initialize the security
> > >>>>> context of kernfs properly, taking into account the parent context. Kernfs
> > >>>>> nodes require special handling here, since they are not bound to specific
> > >>>>> inodes/superblocks, but instead represent the backing tree structure that
> > >>>>> is used to build the VFS tree when the kernfs tree is mounted.
> > >>>>>
> > >>>>> The kernfs nodes initially do not store any security context and rely on
> > >>>>> the LSM to assign some default context to inodes created over them.
> > >>>>
> > >>>> This seems like a bug in kernfs. Why doesn't kernfs adhere to the usual
> > >>>> and expected filesystem behavior?
> > >>>
> > >>> sysfs / kernfs didn't support xattrs at all when we first added support for setting security contexts to it, so originally all sysfs / kernfs inodes had a single security context, and we only required separate storage for the inodes that were explicitly labeled by userspace.
> > >>>
> > >>> Later kernfs grew support for trusted.* xattrs using simple_xattrs but the existing security.* support was left mostly unchanged.
> > >>
> > >> OK, so as I said, this seems like a bug in kernfs.
> > >>
> > >>>
> > >>>>
> > >>>>> Kernfs
> > >>>>> inodes, however, allow setting an explicit context via the *setxattr(2)
> > >>>>> syscalls, in which case the context is stored inside the kernfs node's
> > >>>>> metadata.
> > >>>>>
> > >>>>> SELinux (and possibly other LSMs) initialize the context of newly created
> > >>>>> FS objects based on the parent object's context (usually the child inherits
> > >>>>> the parent's context, unless the policy dictates otherwise).
> > >>>>
> > >>>> An LSM might use information about the parent other than the "context".
> > >>>> Smack, for example, uses an attribute SMACK64TRANSMUTE from the parent
> > >>>> to determine whether the Smack label of the new object should be taken
> > >>>> from the parent or the process. Passing the "context" of the parent is
> > >>>> insufficient for Smack.
> > >>>
> > >>> IIUC, this would involve switching the handling of security.* xattrs in kernfs over to use simple_xattrs too (so that we can store multiple such attributes), and then pass the entire simple_xattrs list or at least anything with a security.* prefix when initializing a new node or refreshing an existing inode.  Then the security module could extract any security.* attributes of interest for use in determining the label of new inodes and in refreshing the label of an inode.
>
> I actually had a patch to do just that at one point because I thought
> for a while that it would be required to call
> security_inode_init_security() (which I had tried to somehow force
> into the kernfs node creation at some point), but then I realized it
> is not actually needed (although it would make things a bit nicer) and put
> it away... I will try to dig it out and reuse here.

Okay, now that I tried to do this with full xattr support I ran into a
problem. Along with converting kernfs to use simple_xattrs for
security attributes, I removed the call to
security_inode_notifysecctx() from kernfs_refresh_inode(), as it no
longer makes sense (kernfs doesn't know which attribute contains the
context; the LSM should now be able to pull it out via
vfs_getxattr()). However, SELinux now doesn't set the right security
context in the selinux_d_instantiate() hook, because the policy tells
it to use genfs, not xattr.

So... I'm not sure how to fix this. Setting fs_use_xattr for cgroupfs
in the policy won't work, because then all nodes will be unlabeled_t
by default. Maybe we could patch the genfs case in
inode_doinit_with_dentry() to try fetching the xattr first? I'm not
very confident about touching that part of the code, so I would
welcome some advice here.
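
For illustration only, the labeling behaviours discussed here can be modelled in user space as follows. This is a sketch, not the SELinux implementation and not the code linked below: the enum, model_pick_context() and the context strings are hypothetical, and the "xattr first" mode is just the idea floated above.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative fallback context; the real string depends on the policy. */
const char model_unlabeled[] = "system_u:object_r:unlabeled_t:s0";

enum model_labeling {
	MODEL_FS_USE_XATTR,       /* label comes from the xattr only */
	MODEL_GENFS,              /* label comes from the policy's genfs rules only */
	MODEL_GENFS_XATTR_FIRST,  /* hypothetical: xattr if present, else genfs */
};

/*
 * Pick the context for a new inode. xattr_ctx is the context stored on
 * the kernfs node (NULL if none was ever set); genfs_ctx is what the
 * policy's genfs rules would assign.
 */
const char *model_pick_context(enum model_labeling how,
			       const char *xattr_ctx,
			       const char *genfs_ctx)
{
	assert(genfs_ctx != NULL);

	switch (how) {
	case MODEL_FS_USE_XATTR:
		/* Nodes without a stored context end up unlabeled. */
		return xattr_ctx ? xattr_ctx : model_unlabeled;
	case MODEL_GENFS_XATTR_FIRST:
		/* Prefer an explicitly stored context, else the genfs one. */
		return xattr_ctx ? xattr_ctx : genfs_ctx;
	case MODEL_GENFS:
	default:
		/* The stored context is ignored entirely. */
		return genfs_ctx;
	}
}
```

In this model, MODEL_GENFS ignores a context stored on the node and MODEL_FS_USE_XATTR leaves nodes without a stored context unlabeled; those are the two failure modes described above. The sketch does not model any side effects of actually fetching the xattr.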

This is the code I have so far, in case it helps:
https://gitlab.com/omos/linux-public/compare/selinux-next...selinux-fix-cgroupfs-v8

Thanks,
Stephen Smalley Jan. 22, 2019, 2:17 p.m. UTC | #12
On 1/22/19 3:49 AM, Ondrej Mosnacek wrote:
> On Mon, Jan 14, 2019 at 10:01 AM Ondrej Mosnacek <omosnace@redhat.com> wrote:
>> On Thu, Jan 10, 2019 at 6:55 PM Casey Schaufler <casey@schaufler-ca.com> wrote:
>>> Resending after email configuration repair.
>>>
>>> On 1/10/2019 6:15 AM, Stephen Smalley wrote:
>>>> On 1/9/19 5:03 PM, Casey Schaufler wrote:
>>>>> On 1/9/2019 12:37 PM, Stephen Smalley wrote:
>>>>>> On 1/9/19 12:19 PM, Casey Schaufler wrote:
>>>>>>> On 1/9/2019 8:28 AM, Ondrej Mosnacek wrote:
>>>>>>>> Changes in v2:
>>>>>>>> - add docstring for the new hook in union security_list_options
>>>>>>>> - initialize *ctx to NULL and *ctxlen to 0 in case the hook is not
>>>>>>>>      implemented
>>>>>>>> v1: https://lore.kernel.org/selinux/20190109091028.24485-1-omosnace@redhat.com/T/
>>>>>>>>
>>>>>>>> This series adds a new security hook that allows to initialize the security
>>>>>>>> context of kernfs properly, taking into account the parent context. Kernfs
>>>>>>>> nodes require special handling here, since they are not bound to specific
>>>>>>>> inodes/superblocks, but instead represent the backing tree structure that
>>>>>>>> is used to build the VFS tree when the kernfs tree is mounted.
>>>>>>>>
>>>>>>>> The kernfs nodes initially do not store any security context and rely on
>>>>>>>> the LSM to assign some default context to inodes created over them.
>>>>>>>
>>>>>>> This seems like a bug in kernfs. Why doesn't kernfs adhere to the usual
>>>>>>> and expected filesystem behavior?
>>>>>>
>>>>>> sysfs / kernfs didn't support xattrs at all when we first added support for setting security contexts to it, so originally all sysfs / kernfs inodes had a single security context, and we only required separate storage for the inodes that were explicitly labeled by userspace.
>>>>>>
>>>>>> Later kernfs grew support for trusted.* xattrs using simple_xattrs but the existing security.* support was left mostly unchanged.
>>>>>
>>>>> OK, so as I said, this seems like a bug in kernfs.
>>>>>
>>>>>>
>>>>>>>
>>>>>>>> Kernfs
>>>>>>>> inodes, however, allow setting an explicit context via the *setxattr(2)
>>>>>>>> syscalls, in which case the context is stored inside the kernfs node's
>>>>>>>> metadata.
>>>>>>>>
>>>>>>>> SELinux (and possibly other LSMs) initialize the context of newly created
>>>>>>>> FS objects based on the parent object's context (usually the child inherits
>>>>>>>> the parent's context, unless the policy dictates otherwise).
>>>>>>>
>>>>>>> An LSM might use information about the parent other than the "context".
>>>>>>> Smack, for example, uses an attribute SMACK64TRANSMUTE from the parent
>>>>>>> to determine whether the Smack label of the new object should be taken
>>>>>>> from the parent or the process. Passing the "context" of the parent is
>>>>>>> insufficient for Smack.
>>>>>>
>>>>>> IIUC, this would involve switching the handling of security.* xattrs in kernfs over to use simple_xattrs too (so that we can store multiple such attributes), and then pass the entire simple_xattrs list or at least anything with a security.* prefix when initializing a new node or refreshing an existing inode.  Then the security module could extract any security.* attributes of interest for use in determining the label of new inodes and in refreshing the label of an inode.
>>
>> I actually had a patch to do just that at one point because I thought
>> for a while that it would be required to call
>> security_inode_init_security() (which I had tried to somehow force
>> into the kernfs node creation at some point), but then I realized it
>> is not actually needed (although it would make things a bit nicer) and put
>> it away... I will try to dig it out and reuse here.
> 
> Okay, now that I tried to do this with full xattr support I ran into a
> problem. Along with converting kernfs to use simple_xattrs for
> security attributes, I removed the call to
> security_inode_notifysecctx() from kernfs_refresh_inode(), as it no
> longer makes sense (kernfs doesn't know which attribute contains the
> context; the LSM should now be able to pull it out via
> vfs_getxattr()). However, SELinux now doesn't set the right security
> context in the selinux_d_instantiate() hook, because the policy tells
> it to use genfs, not xattr.
> 
> So... I'm not sure how to fix this. Setting fs_use_xattr for cgroupfs
> in the policy won't work, because then all nodes will be unlabeled_t
> by default. Maybe we could patch the genfs case in
> inode_doinit_with_dentry() to try fetching the xattr first? I'm not
> very confident about touching that part of the code, so I would
> welcome some advice here.
> 
> This is the code I have so far, in case it helps:
> https://gitlab.com/omos/linux-public/compare/selinux-next...selinux-fix-cgroupfs-v8

I would have left security_inode_notifysecctx() or an equivalent that 
passes all of the xattrs to push the security attributes to the security 
module.

Blindly calling __vfs_getxattr() on genfs could be a problem; IIRC, 
doing so on fuse filesystems can create a deadlock during mount.  Or at 
least that was the issue with switching fuse to fs_use_xattr in the past.
Stephen Smalley Jan. 22, 2019, 3:26 p.m. UTC | #13
On 1/22/19 9:17 AM, Stephen Smalley wrote:
> On 1/22/19 3:49 AM, Ondrej Mosnacek wrote:
>> On Mon, Jan 14, 2019 at 10:01 AM Ondrej Mosnacek <omosnace@redhat.com> 
>> wrote:
>>> On Thu, Jan 10, 2019 at 6:55 PM Casey Schaufler 
>>> <casey@schaufler-ca.com> wrote:
>>>> Resending after email configuration repair.
>>>>
>>>> On 1/10/2019 6:15 AM, Stephen Smalley wrote:
>>>>> On 1/9/19 5:03 PM, Casey Schaufler wrote:
>>>>>> On 1/9/2019 12:37 PM, Stephen Smalley wrote:
>>>>>>> On 1/9/19 12:19 PM, Casey Schaufler wrote:
>>>>>>>> On 1/9/2019 8:28 AM, Ondrej Mosnacek wrote:
>>>>>>>>> Changes in v2:
>>>>>>>>> - add docstring for the new hook in union security_list_options
>>>>>>>>> - initialize *ctx to NULL and *ctxlen to 0 in case the hook is not
>>>>>>>>>      implemented
>>>>>>>>> v1: 
>>>>>>>>> https://lore.kernel.org/selinux/20190109091028.24485-1-omosnace@redhat.com/T/ 
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> This series adds a new security hook that allows to initialize 
>>>>>>>>> the security
>>>>>>>>> context of kernfs properly, taking into account the parent 
>>>>>>>>> context. Kernfs
>>>>>>>>> nodes require special handling here, since they are not bound 
>>>>>>>>> to specific
>>>>>>>>> inodes/superblocks, but instead represent the backing tree 
>>>>>>>>> structure that
>>>>>>>>> is used to build the VFS tree when the kernfs tree is mounted.
>>>>>>>>>
>>>>>>>>> The kernfs nodes initially do not store any security context 
>>>>>>>>> and rely on
>>>>>>>>> the LSM to assign some default context to inodes created over 
>>>>>>>>> them.
>>>>>>>>
>>>>>>>> This seems like a bug in kernfs. Why doesn't kernfs adhere to 
>>>>>>>> the usual
>>>>>>>> and expected filesystem behavior?
>>>>>>>
>>>>>>> sysfs / kernfs didn't support xattrs at all when we first added 
>>>>>>> support for setting security contexts to it, so originally all 
>>>>>>> sysfs / kernfs inodes had a single security context, and we only 
>>>>>>> required separate storage for the inodes that were explicitly 
>>>>>>> labeled by userspace.
>>>>>>>
>>>>>>> Later kernfs grew support for trusted.* xattrs using 
>>>>>>> simple_xattrs but the existing security.* support was left mostly 
>>>>>>> unchanged.
>>>>>>
>>>>>> OK, so as I said, this seems like a bug in kernfs.
>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>>> Kernfs
>>>>>>>>> inodes, however, allow setting an explicit context via the 
>>>>>>>>> *setxattr(2)
>>>>>>>>> syscalls, in which case the context is stored inside the kernfs 
>>>>>>>>> node's
>>>>>>>>> metadata.
>>>>>>>>>
>>>>>>>>> SELinux (and possibly other LSMs) initialize the context of 
>>>>>>>>> newly created
>>>>>>>>> FS objects based on the parent object's context (usually the 
>>>>>>>>> child inherits
>>>>>>>>> the parent's context, unless the policy dictates otherwise).
>>>>>>>>
>>>>>>>> An LSM might use information about the parent other than the 
>>>>>>>> "context".
>>>>>>>> Smack, for example, uses an attribute SMACK64TRANSMUTE from the 
>>>>>>>> parent
>>>>>>>> to determine whether the Smack label of the new object should be 
>>>>>>>> taken
>>>>>>>> from the parent or the process. Passing the "context" of the 
>>>>>>>> parent is
>>>>>>>> insufficient for Smack.
>>>>>>>
>>>>>>> IIUC, this would involve switching the handling of security.* 
>>>>>>> xattrs in kernfs over to use simple_xattrs too (so that we can 
>>>>>>> store multiple such attributes), and then pass the entire 
>>>>>>> simple_xattrs list or at least anything with a security.* prefix 
>>>>>>> when initializing a new node or refreshing an existing inode.  
>>>>>>> Then the security module could extract any security.* attributes 
>>>>>>> of interest for use in determining the label of new inodes and in 
>>>>>>> refreshing the label of an inode.
>>>
>>> I actually had a patch to do just that at one point because I thought
>>> for a while that it would be required to call
>>> security_inode_init_security() (which I had tried to somehow force
>>> into the kernfs node creation at some point), but then I realized it
>>> is not actually needed (although would make things a bit nicer) and put
>>> it away... I will try to dig it out and reuse here.
>>
>> Okay, now that I tried to do this with full xattr support I ran into a
>> problem. Along with converting kernfs to use simple_xattrs for
>> security attributes, I removed the call to
>> security_inode_notifysecctx() from kernfs_refresh_inode(), as it no
>> longer makes sense (kernfs doesn't know which attribute contains the
>> context; the LSM should now be able to pull it out via
>> vfs_getxattr()). However, SELinux now doesn't set the right security
>> context in the selinux_d_instantiate() hook, because the policy tells
>> it to use genfs, not xattr.
>>
>> So... I'm not sure how to fix this. Setting fs_use_xattr for cgroupfs
>> in the policy won't work, because then all nodes will be unlabeled_t
>> by default. Maybe we could patch the genfs case in
>> inode_doinit_with_dentry() to try fetching the xattr first? I'm not
>> very confident about touching that part of the code, so I would
>> welcome some advice here.
>>
>> This is the code I have so far, in case it helps:
>> https://gitlab.com/omos/linux-public/compare/selinux-next...selinux-fix-cgroupfs-v8 
>>
> 
> I would have left security_inode_notifysecctx() or an equivalent that 
> passes all of the xattrs to push the security attributes to the security 
> module.
> 
> Blindly calling __vfs_getxattr() on genfs could be a problem; IIRC, 
> doing so on fuse filesystems can create a deadlock during mount.  Or at 
> least that was the issue with switching fuse to fs_use_xattr in the past.

See commits 4d546f81717d253ab67643bf072c6d8821a9249c, 
102aefdda4d8275ce7d7100bc16c88c74272b260, 
089be43e403a78cd6889cde2fba164fefe9dfd89, 
811f3799279e567aa354c649ce22688d949ac7a9 and
https://bugzilla.redhat.com/show_bug.cgi?id=1256635#c34 for some prior 
work and discussions in this area.
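
For illustration only (this is not code from the thread or from the kernel), the idea quoted above of handing the security module "anything with a security.* prefix" from the parent node's attribute list can be sketched in plain userspace C. The struct xattr_entry type and the filter_security_xattrs() helper are hypothetical stand-ins; the real kernfs code would work on its simple_xattrs storage, and the actual hook interface is still under discussion.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical userspace stand-in for one stored xattr (name/value pair).
 * This is NOT the kernel's struct simple_xattr. */
struct xattr_entry {
    const char *name;
    const char *value;
};

#define SECURITY_PREFIX "security."

/*
 * Collect pointers to the entries whose name starts with "security.".
 * At most out_cap pointers are written to out; the number written is
 * returned. An LSM-like consumer could then pick out the attributes it
 * cares about (a label, a transmute flag, ...) from this subset.
 */
size_t filter_security_xattrs(const struct xattr_entry *all, size_t n,
                              const struct xattr_entry **out, size_t out_cap)
{
    const size_t prefix_len = strlen(SECURITY_PREFIX);
    size_t found = 0;

    assert(all != NULL || n == 0);

    for (size_t i = 0; i < n && found < out_cap; i++) {
        if (strncmp(all[i].name, SECURITY_PREFIX, prefix_len) == 0)
            out[found++] = &all[i];
    }
    return found;
}
```

Passing the whole filtered subset, rather than a single "context" blob, is what would let a module such as Smack see more than one parent attribute when it decides on the label for a new node.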