[v6,5/5] kernfs: initialize security of newly created nodes

Message ID 20190214095015.16032-6-omosnace@redhat.com (mailing list archive)
State Superseded
Series Allow initializing the kernfs node's secctx based on its parent

Commit Message

Ondrej Mosnacek Feb. 14, 2019, 9:50 a.m. UTC
Use the new security_kernfs_init_security() hook to allow LSMs to
possibly assign a non-default security context to a newly created kernfs
node based on the attributes of the new node and also its parent node.

This fixes an issue with cgroupfs under SELinux, where newly created
cgroup subdirectories/files would not inherit their parent's context if
it had been set explicitly to a non-default value (other than the genfs
context specified by the policy). This can be reproduced as follows (on
Fedora/RHEL):

    # mkdir /sys/fs/cgroup/unified/test
    # # Need permissive to change the label under Fedora policy:
    # setenforce 0
    # chcon -t container_file_t /sys/fs/cgroup/unified/test
    # ls -lZ /sys/fs/cgroup/unified
    total 0
    -r--r--r--.  1 root root system_u:object_r:cgroup_t:s0         0 Jan 29 03:06 cgroup.controllers
    -rw-r--r--.  1 root root system_u:object_r:cgroup_t:s0         0 Jan 29 03:06 cgroup.max.depth
    -rw-r--r--.  1 root root system_u:object_r:cgroup_t:s0         0 Jan 29 03:06 cgroup.max.descendants
    -rw-r--r--.  1 root root system_u:object_r:cgroup_t:s0         0 Jan 29 03:06 cgroup.procs
    -r--r--r--.  1 root root system_u:object_r:cgroup_t:s0         0 Jan 29 03:06 cgroup.stat
    -rw-r--r--.  1 root root system_u:object_r:cgroup_t:s0         0 Jan 29 03:06 cgroup.subtree_control
    -rw-r--r--.  1 root root system_u:object_r:cgroup_t:s0         0 Jan 29 03:06 cgroup.threads
    drwxr-xr-x.  2 root root system_u:object_r:cgroup_t:s0         0 Jan 29 03:06 init.scope
    drwxr-xr-x. 26 root root system_u:object_r:cgroup_t:s0         0 Jan 29 03:21 system.slice
    drwxr-xr-x.  3 root root system_u:object_r:container_file_t:s0 0 Jan 29 03:15 test
    drwxr-xr-x.  3 root root system_u:object_r:cgroup_t:s0         0 Jan 29 03:06 user.slice
    # mkdir /sys/fs/cgroup/unified/test/subdir

Actual result:

    # ls -ldZ /sys/fs/cgroup/unified/test/subdir
    drwxr-xr-x. 2 root root system_u:object_r:cgroup_t:s0 0 Jan 29 03:15 /sys/fs/cgroup/unified/test/subdir

Expected result:

    # ls -ldZ /sys/fs/cgroup/unified/test/subdir
    drwxr-xr-x. 2 root root unconfined_u:object_r:container_file_t:s0 0 Jan 29 03:15 /sys/fs/cgroup/unified/test/subdir
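
The intended inheritance rule can be illustrated with a small userspace
model. This is only a sketch, not the kernel code from this series: the
struct and the node_*() helpers are hypothetical, the label is reduced to
the type field, and SELinux user/role/level handling and policy type
transitions are ignored.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define LABEL_MAX 64

/* Hypothetical stand-in for a kernfs node; not the real struct kernfs_node. */
struct node {
	const struct node *parent;
	char label[LABEL_MAX];	/* empty string: no explicit label was set */
};

/* Assumed policy-wide default, standing in for the genfs context. */
static const char *const default_label = "cgroup_t";

/* Label that is in effect for a node. */
static const char *node_label(const struct node *n)
{
	return n->label[0] ? n->label : default_label;
}

/* Explicit relabel, the analogue of chcon in the reproducer. */
static void node_set_label(struct node *n, const char *label)
{
	assert(label != NULL);
	snprintf(n->label, sizeof(n->label), "%s", label);
}

/* Create a node; copy the parent's label only if it was set explicitly. */
static void node_init(struct node *n, const struct node *parent)
{
	memset(n, 0, sizeof(*n));
	n->parent = parent;
	if (parent && parent->label[0])
		node_set_label(n, parent->label);
}
```

In this model a child of a relabeled directory ends up with the parent's
label, while children of untouched directories keep the default.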

Link: https://github.com/SELinuxProject/selinux-kernel/issues/39
Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com>
---
 fs/kernfs/dir.c             | 57 +++++++++++++++++++++++++++++++++++--
 fs/kernfs/inode.c           | 25 +++++++++-------
 fs/kernfs/kernfs-internal.h |  2 ++
 include/linux/xattr.h       | 15 ++++++++++
 4 files changed, 86 insertions(+), 13 deletions(-)

Comments

Tejun Heo Feb. 14, 2019, 3:48 p.m. UTC | #1
On Thu, Feb 14, 2019 at 10:50:15AM +0100, Ondrej Mosnacek wrote:
> +static int kernfs_node_init_security(struct kernfs_node *parent,
> +				     struct kernfs_node *kn)

Can we skip the whole thing if security is not enabled?

Thanks.
Ondrej Mosnacek Feb. 15, 2019, 3:45 p.m. UTC | #2
On Thu, Feb 14, 2019 at 4:49 PM Tejun Heo <tj@kernel.org> wrote:
> On Thu, Feb 14, 2019 at 10:50:15AM +0100, Ondrej Mosnacek wrote:
> > +static int kernfs_node_init_security(struct kernfs_node *parent,
> > +                                  struct kernfs_node *kn)
>
> Can we skip the whole thing if security is not enabled?

Do you mean just skipping the whole part when CONFIG_SECURITY=n? That
is easy to do and I can add it in the next respin (although the
compiler should be able to optimize most of it out in that case).

If you mean dynamically checking if calling the hook would actually do
anything (i.e. if any LSM actually registered that particular hook),
then that should also be possible (just check if
security_hook_heads.kernfs_init_security is a non-empty list), but I'm
not sure if that wouldn't be too hacky...
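
For reference, the compile-time variant could look roughly like the
userspace sketch below. MODEL_CONFIG_SECURITY and all model_*() names are
made up for illustration and merely stand in for CONFIG_SECURITY and the
real hook; with the macro undefined, the call collapses to a constant.

```c
#include <assert.h>
#include <stddef.h>

/* MODEL_CONFIG_SECURITY is a hypothetical stand-in for CONFIG_SECURITY. */
#ifdef MODEL_CONFIG_SECURITY
static int model_init_security(const char *parent_ctx, const char **ctx)
{
	*ctx = parent_ctx;	/* inherit the parent's context */
	return 0;
}
#define model_security_enabled() 1
#else
/* Stub: a no-op that the compiler can inline and discard. */
static inline int model_init_security(const char *parent_ctx, const char **ctx)
{
	(void)parent_ctx;
	(void)ctx;
	return 0;
}
#define model_security_enabled() 0
#endif

/* Caller in the style of node creation: no preparation when compiled out. */
static int model_new_node(const char *parent_ctx, const char **ctx)
{
	assert(ctx != NULL);
	*ctx = NULL;
	if (!model_security_enabled())
		return 0;	/* constant condition when the option is off */
	return model_init_security(parent_ctx, ctx);
}
```

Built without -DMODEL_CONFIG_SECURITY, model_new_node() always returns 0
and leaves the context unset.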

--
Ondrej Mosnacek <omosnace at redhat dot com>
Associate Software Engineer, Security Technologies
Red Hat, Inc.
Tejun Heo Feb. 15, 2019, 3:50 p.m. UTC | #3
On Fri, Feb 15, 2019 at 04:45:44PM +0100, Ondrej Mosnacek wrote:
> On Thu, Feb 14, 2019 at 4:49 PM Tejun Heo <tj@kernel.org> wrote:
> > On Thu, Feb 14, 2019 at 10:50:15AM +0100, Ondrej Mosnacek wrote:
> > > +static int kernfs_node_init_security(struct kernfs_node *parent,
> > > +                                  struct kernfs_node *kn)
> >
> > Can we skip the whole thing if security is not enabled?
> 
> Do you mean just skipping the whole part when CONFIG_SECURITY=n? That
> is easy to do and I can add it in the next respin (although the
> compiler should be able to optimize most of it out in that case).

So the goal is allowing folks who don't use this to not pay.  It'd be
better if the evaluation can be as late as possible but obviously there's
a point where that'd be too complicated.  Maybe "ever enabled in this
boot" is good and simple enough at the same time?

Thanks.
Ondrej Mosnacek Feb. 18, 2019, 10:03 a.m. UTC | #4
On Fri, Feb 15, 2019 at 4:50 PM Tejun Heo <tj@kernel.org> wrote:
> On Fri, Feb 15, 2019 at 04:45:44PM +0100, Ondrej Mosnacek wrote:
> > On Thu, Feb 14, 2019 at 4:49 PM Tejun Heo <tj@kernel.org> wrote:
> > > On Thu, Feb 14, 2019 at 10:50:15AM +0100, Ondrej Mosnacek wrote:
> > > > +static int kernfs_node_init_security(struct kernfs_node *parent,
> > > > +                                  struct kernfs_node *kn)
> > >
> > > Can we skip the whole thing if security is not enabled?
> >
> > Do you mean just skipping the whole part when CONFIG_SECURITY=n? That
> > is easy to do and I can add it in the next respin (although the
> > compiler should be able to optimize most of it out in that case).
>
> So the goal is allowing folks who don't use this to not pay.  It'd be
> better if the evaluation can be as late as possible but obviously there's
> a point where that'd be too complicated.  Maybe "ever enabled in this
> boot" is good and simple enough at the same time?

I don't think there is a way currently to check whether some LSM has
been enabled at boot or not. I suppose we could add such a function for
this kind of heuristic, but I'm not sure how it would interplay with
the plans to allow multiple LSMs to be enabled simultaneously...
Perhaps it would be better/easier to just add a
security_kernfs_needs_init() function, which would simply check if the
list of registered kernfs_init_security hooks is empty.

I propose something like the patch below (the whitespace is mangled -
intended just for visual review). I plan to fold it into the next
respin if there are no objections to this approach.

diff --git a/fs/kernfs/dir.c b/fs/kernfs/dir.c
index 735a6d382d9d..5b99205da919 100644
--- a/fs/kernfs/dir.c
+++ b/fs/kernfs/dir.c
@@ -625,6 +625,9 @@ static int kernfs_node_init_security(struct kernfs_node *parent,
        struct qstr q;
        int ret;

+       if (!security_kernfs_needs_init() || !parent)
+               return 0;
+
        if (!parent->iattr) {
                kernfs_iattr_init(&iattr_parent, parent);
                simple_xattrs_init(&xattr_parent);
@@ -720,11 +723,9 @@ static struct kernfs_node *__kernfs_new_node(struct kernfs_root *root,
                        goto err_out3;
        }

-       if (parent) {
-               ret = kernfs_node_init_security(parent, kn);
-               if (ret)
-                       goto err_out3;
-       }
+       ret = kernfs_node_init_security(parent, kn);
+       if (ret)
+               goto err_out3;

        return kn;

diff --git a/include/linux/security.h b/include/linux/security.h
index 581944d1e61e..49a083dbc464 100644
--- a/include/linux/security.h
+++ b/include/linux/security.h
@@ -292,6 +292,7 @@ int security_inode_listsecurity(struct inode *inode, char *buffer, size_t buffer
 void security_inode_getsecid(struct inode *inode, u32 *secid);
 int security_inode_copy_up(struct dentry *src, struct cred **new);
 int security_inode_copy_up_xattr(const char *name);
+int security_kernfs_needs_init(void);
 int security_kernfs_init_security(const struct qstr *qstr,
                                  const struct iattr *dir_iattr,
                                  struct simple_xattrs *dir_secattr,
@@ -789,6 +790,11 @@ static inline int security_inode_copy_up(struct dentry *src, struct cred **new)
        return 0;
 }

+static inline int security_kernfs_needs_init(void)
+{
+       return 0;
+}
+
 static inline int security_kernfs_init_security(
                const struct qstr *qstr, const struct iattr *dir_iattr,
                struct simple_xattrs *dir_secattr, const struct iattr *iattr,
diff --git a/security/security.c b/security/security.c
index 836e0822874a..3c8b9b5baabc 100644
--- a/security/security.c
+++ b/security/security.c
@@ -892,6 +892,11 @@ int security_inode_copy_up_xattr(const char *name)
 }
 EXPORT_SYMBOL(security_inode_copy_up_xattr);

+int security_kernfs_needs_init(void)
+{
+       return !hlist_empty(&security_hook_heads.kernfs_init_security);
+}
+
 int security_kernfs_init_security(const struct qstr *qstr,
                                  const struct iattr *dir_iattr,
                                  struct simple_xattrs *dir_secattr,
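
The effect of the check can be modelled in userspace as follows. This is a
sketch under stated assumptions, not kernel code: the singly linked list
stands in for the kernel's hlist of registered hooks, and every model_*()
name and the call counter are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Minimal hook list; a stand-in for the kernel's hlist of LSM hooks. */
struct model_hook {
	struct model_hook *next;
	int (*init_security)(const char *parent_ctx, const char **ctx);
};

static struct model_hook *model_hook_head;	/* NULL: hook not registered */
static int model_prepare_calls;			/* counts the costly setup */

static void model_register_hook(struct model_hook *h)
{
	assert(h != NULL && h->init_security != NULL);
	h->next = model_hook_head;
	model_hook_head = h;
}

/* Analogue of the proposed security_kernfs_needs_init(). */
static bool model_needs_init(void)
{
	return model_hook_head != NULL;
}

static int model_node_init_security(const char *parent_ctx, const char **ctx)
{
	struct model_hook *h;
	int ret;

	*ctx = NULL;
	if (!model_needs_init() || !parent_ctx)
		return 0;	/* nobody listens, or root node: skip it all */

	model_prepare_calls++;	/* stands in for building iattr/xattr copies */
	for (h = model_hook_head; h; h = h->next) {
		ret = h->init_security(parent_ctx, ctx);
		if (ret)
			return ret;
	}
	return 0;
}

/* Example hook: inherit the parent's context. */
static int model_inherit(const char *parent_ctx, const char **ctx)
{
	*ctx = parent_ctx;
	return 0;
}
```

With no hook registered, the preparation step is never reached; once a
hook is registered, it runs on every non-root node creation.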

--
Ondrej Mosnacek <omosnace at redhat dot com>
Associate Software Engineer, Security Technologies
Red Hat, Inc.
Tejun Heo Feb. 18, 2019, 9:02 p.m. UTC | #5
Hello,

On Mon, Feb 18, 2019 at 11:03:58AM +0100, Ondrej Mosnacek wrote:
> I don't think there is a way currently to check whether some LSM has
> been enabled at boot or not. I suppose we could add such a function for
> this kind of heuristic, but I'm not sure how it would interplay with
> the plans to allow multiple LSMs to be enabled simultaneously...
> Perhaps it would be better/easier to just add a
> security_kernfs_needs_init() function, which would simply check if the
> list of registered kernfs_init_security hooks is empty.
> 
> I propose something like the patch below (the whitespace is mangled -
> intended just for visual review). I plan to fold it into the next
> respin if there are no objections to this approach.

Sounds good to me.

Thanks.
Casey Schaufler Feb. 19, 2019, 12:28 a.m. UTC | #6
On 2/18/2019 2:03 AM, Ondrej Mosnacek wrote:
> On Fri, Feb 15, 2019 at 4:50 PM Tejun Heo <tj@kernel.org> wrote:
>> On Fri, Feb 15, 2019 at 04:45:44PM +0100, Ondrej Mosnacek wrote:
>>> On Thu, Feb 14, 2019 at 4:49 PM Tejun Heo <tj@kernel.org> wrote:
>>>> On Thu, Feb 14, 2019 at 10:50:15AM +0100, Ondrej Mosnacek wrote:
>>>>> +static int kernfs_node_init_security(struct kernfs_node *parent,
>>>>> +                                  struct kernfs_node *kn)
>>>> Can we skip the whole thing if security is not enabled?
>>> Do you mean just skipping the whole part when CONFIG_SECURITY=n? That
>>> is easy to do and I can add it in the next respin (although the
>>> compiler should be able to optimize most of it out in that case).
>> So the goal is allowing folks who don't use this to not pay.  It'd be
>> better if the evaluation can be as late as possible but obviously there's
>> a point where that'd be too complicated.  Maybe "ever enabled in this
>> boot" is good and simple enough at the same time?
> I don't think there is a way currently to check whether some LSM has
> been enabled at boot or not. I suppose we could add such a function for
> this kind of heuristic, but I'm not sure how it would interplay with
> the plans to allow multiple LSMs to be enabled simultaneously...
> Perhaps it would be better/easier to just add a
> security_kernfs_needs_init() function, which would simply check if the
> list of registered kernfs_init_security hooks is empty.
>
> I propose something like the patch below (the whitespace is mangled -
> intended just for visual review). I plan to fold it into the next
> respin if there are no objections to this approach.
>
> diff --git a/fs/kernfs/dir.c b/fs/kernfs/dir.c
> index 735a6d382d9d..5b99205da919 100644
> --- a/fs/kernfs/dir.c
> +++ b/fs/kernfs/dir.c
> @@ -625,6 +625,9 @@ static int kernfs_node_init_security(struct
> kernfs_node *parent,
>          struct qstr q;
>          int ret;
>
> +       if (!security_kernfs_needs_init() || !parent)
> +               return 0;
> +
>          if (!parent->iattr) {
>                  kernfs_iattr_init(&iattr_parent, parent);
>                  simple_xattrs_init(&xattr_parent);
> @@ -720,11 +723,9 @@ static struct kernfs_node
> *__kernfs_new_node(struct kernfs_root *root,
>                          goto err_out3;
>          }
>
> -       if (parent) {
> -               ret = kernfs_node_init_security(parent, kn);
> -               if (ret)
> -                       goto err_out3;
> -       }
> +       ret = kernfs_node_init_security(parent, kn);
> +       if (ret)
> +               goto err_out3;
>
>          return kn;
>
> diff --git a/include/linux/security.h b/include/linux/security.h
> index 581944d1e61e..49a083dbc464 100644
> --- a/include/linux/security.h
> +++ b/include/linux/security.h
> @@ -292,6 +292,7 @@ int security_inode_listsecurity(struct inode
> *inode, char *buffer, size_t buffer
>   void security_inode_getsecid(struct inode *inode, u32 *secid);
>   int security_inode_copy_up(struct dentry *src, struct cred **new);
>   int security_inode_copy_up_xattr(const char *name);
> +int security_kernfs_needs_init(void);
>   int security_kernfs_init_security(const struct qstr *qstr,
>                                    const struct iattr *dir_iattr,
>                                    struct simple_xattrs *dir_secattr,
> @@ -789,6 +790,11 @@ static inline int security_inode_copy_up(struct
> dentry *src, struct cred **new)
>          return 0;
>   }
>
> +static inline int security_kernfs_needs_init(void)
> +{
> +       return 0;
> +}
> +
>   static inline int security_kernfs_init_security(
>                  const struct qstr *qstr, const struct iattr *dir_iattr,
>                  struct simple_xattrs *dir_secattr, const struct iattr *iattr,
> diff --git a/security/security.c b/security/security.c
> index 836e0822874a..3c8b9b5baabc 100644
> --- a/security/security.c
> +++ b/security/security.c
> @@ -892,6 +892,11 @@ int security_inode_copy_up_xattr(const char *name)
>   }
>   EXPORT_SYMBOL(security_inode_copy_up_xattr);
>
> +int security_kernfs_needs_init(void)
> +{
> +       return !hlist_empty(&security_hook_heads.kernfs_init_security);
> +}
> +

Yuck. That's an awful lot of infrastructure just to track
that state. May I suggest that instead you have the
security_kernfs_init_security() hook return -EOPNOTSUPP
in the no-LSM case (2nd argument to call_int_hook). You could
then have a state flag in kernfs that you can set to indicate
you don't need to call security_kernfs_init_security() again.
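
A rough userspace model of that idea is below. It is a sketch only: the
sketch_*() names, the single function pointer standing in for the hook
list, and the call counter are all hypothetical. Note that the flag is
sticky, so the model assumes hooks are registered before the first node
is created.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

typedef int (*lsm_hook_fn)(const char *parent_ctx, const char **ctx);

static lsm_hook_fn registered_hook;	/* NULL models the no-LSM case */
static int hook_calls;			/* how often the dispatcher ran */

/* Dispatcher: returns the default code when nothing is registered. */
static int sketch_init_security(const char *parent_ctx, const char **ctx)
{
	hook_calls++;
	if (!registered_hook)
		return -EOPNOTSUPP;
	return registered_hook(parent_ctx, ctx);
}

/* State flag owned by the kernfs side of the model. */
static bool skip_security_init;

static int sketch_new_node(const char *parent_ctx, const char **ctx)
{
	int ret;

	assert(ctx != NULL);
	*ctx = NULL;
	if (skip_security_init || !parent_ctx)
		return 0;

	ret = sketch_init_security(parent_ctx, ctx);
	if (ret == -EOPNOTSUPP) {
		skip_security_init = true;	/* never ask again */
		return 0;
	}
	return ret;
}

/* Example hook: inherit the parent's context. */
static int sketch_inherit(const char *parent_ctx, const char **ctx)
{
	*ctx = parent_ctx;
	return 0;
}
```

In the no-LSM case only the first node creation pays for a hook call;
every later one returns early on the flag.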

>   int security_kernfs_init_security(const struct qstr *qstr,
>                                    const struct iattr *dir_iattr,
>                                    struct simple_xattrs *dir_secattr,
>
> --
> Ondrej Mosnacek <omosnace at redhat dot com>
> Associate Software Engineer, Security Technologies
> Red Hat, Inc.
Ondrej Mosnacek Feb. 19, 2019, 2:10 p.m. UTC | #7
On Tue, Feb 19, 2019 at 1:28 AM Casey Schaufler <casey@schaufler-ca.com> wrote:
> On 2/18/2019 2:03 AM, Ondrej Mosnacek wrote:
> > On Fri, Feb 15, 2019 at 4:50 PM Tejun Heo <tj@kernel.org> wrote:
> >> On Fri, Feb 15, 2019 at 04:45:44PM +0100, Ondrej Mosnacek wrote:
> >>> On Thu, Feb 14, 2019 at 4:49 PM Tejun Heo <tj@kernel.org> wrote:
> >>>> On Thu, Feb 14, 2019 at 10:50:15AM +0100, Ondrej Mosnacek wrote:
> >>>>> +static int kernfs_node_init_security(struct kernfs_node *parent,
> >>>>> +                                  struct kernfs_node *kn)
> >>>> Can we skip the whole thing if security is not enabled?
> >>> Do you mean just skipping the whole part when CONFIG_SECURITY=n? That
> >>> is easy to do and I can add it in the next respin (although the
> >>> compiler should be able to optimize most of it out in that case).
> >> So the goal is allowing folks who don't use this to not pay.  It'd be
> >> better if the evaluation can be as late as possible but obviously there's
> >> a point where that'd be too complicated.  Maybe "ever enabled in this
> >> boot" is good and simple enough at the same time?
> > I don't think there is a way currently to check whether some LSM has
> > been enabled at boot or not. I suppose we could add such a function for
> > this kind of heuristic, but I'm not sure how it would interplay with
> > the plans to allow multiple LSMs to be enabled simultaneously...
> > Perhaps it would be better/easier to just add a
> > security_kernfs_needs_init() function, which would simply check if the
> > list of registered kernfs_init_security hooks is empty.
> >
> > I propose something like the patch below (the whitespace is mangled -
> > intended just for visual review). I plan to fold it into the next
> > respin if there are no objections to this approach.
> >
> > diff --git a/fs/kernfs/dir.c b/fs/kernfs/dir.c
> > index 735a6d382d9d..5b99205da919 100644
> > --- a/fs/kernfs/dir.c
> > +++ b/fs/kernfs/dir.c
> > @@ -625,6 +625,9 @@ static int kernfs_node_init_security(struct
> > kernfs_node *parent,
> >          struct qstr q;
> >          int ret;
> >
> > +       if (!security_kernfs_needs_init() || !parent)
> > +               return 0;
> > +
> >          if (!parent->iattr) {
> >                  kernfs_iattr_init(&iattr_parent, parent);
> >                  simple_xattrs_init(&xattr_parent);
> > @@ -720,11 +723,9 @@ static struct kernfs_node
> > *__kernfs_new_node(struct kernfs_root *root,
> >                          goto err_out3;
> >          }
> >
> > -       if (parent) {
> > -               ret = kernfs_node_init_security(parent, kn);
> > -               if (ret)
> > -                       goto err_out3;
> > -       }
> > +       ret = kernfs_node_init_security(parent, kn);
> > +       if (ret)
> > +               goto err_out3;
> >
> >          return kn;
> >
> > diff --git a/include/linux/security.h b/include/linux/security.h
> > index 581944d1e61e..49a083dbc464 100644
> > --- a/include/linux/security.h
> > +++ b/include/linux/security.h
> > @@ -292,6 +292,7 @@ int security_inode_listsecurity(struct inode
> > *inode, char *buffer, size_t buffer
> >   void security_inode_getsecid(struct inode *inode, u32 *secid);
> >   int security_inode_copy_up(struct dentry *src, struct cred **new);
> >   int security_inode_copy_up_xattr(const char *name);
> > +int security_kernfs_needs_init(void);
> >   int security_kernfs_init_security(const struct qstr *qstr,
> >                                    const struct iattr *dir_iattr,
> >                                    struct simple_xattrs *dir_secattr,
> > @@ -789,6 +790,11 @@ static inline int security_inode_copy_up(struct
> > dentry *src, struct cred **new)
> >          return 0;
> >   }
> >
> > +static inline int security_kernfs_needs_init(void)
> > +{
> > +       return 0;
> > +}
> > +
> >   static inline int security_kernfs_init_security(
> >                  const struct qstr *qstr, const struct iattr *dir_iattr,
> >                  struct simple_xattrs *dir_secattr, const struct iattr *iattr,
> > diff --git a/security/security.c b/security/security.c
> > index 836e0822874a..3c8b9b5baabc 100644
> > --- a/security/security.c
> > +++ b/security/security.c
> > @@ -892,6 +892,11 @@ int security_inode_copy_up_xattr(const char *name)
> >   }
> >   EXPORT_SYMBOL(security_inode_copy_up_xattr);
> >
> > +int security_kernfs_needs_init(void)
> > +{
> > +       return !hlist_empty(&security_hook_heads.kernfs_init_security);
> > +}
> > +
>
> Yuck. That's an awful lot of infrastructure just to track
> that state. May I suggest that instead you have the
> security_kernfs_init_security() hook return -EOPNOTSUPP
> in the no-LSM case (2nd argument to call_int_hook). You could
> then have a state flag in kernfs that you can set to indicate
> you don't need to call security_kernfs_init_security() again.

Well, maintaining a global variable sounds even more yucky to me...
And I don't understand why you'd consider a simple one-line function
to be "an awful lot of infrastructure" :) But at the end of the day it
is up to the maintainers - Greg/Tejun and James/Serge (who I forgot to
Cc on these patches, sorry) - what works better for them.

>
> >   int security_kernfs_init_security(const struct qstr *qstr,
> >                                    const struct iattr *dir_iattr,
> >                                    struct simple_xattrs *dir_secattr,
> >
> > --
> > Ondrej Mosnacek <omosnace at redhat dot com>
> > Associate Software Engineer, Security Technologies
> > Red Hat, Inc.

--
Ondrej Mosnacek <omosnace at redhat dot com>
Associate Software Engineer, Security Technologies
Red Hat, Inc.
Tejun Heo Feb. 19, 2019, 2:21 p.m. UTC | #8
Hello,

On Tue, Feb 19, 2019 at 03:10:30PM +0100, Ondrej Mosnacek wrote:
> Well, maintaining a global variable sounds even more yucky to me...
> And I don't understand why you'd consider a simple one-line function
> to be "an awful lot of infrastructure" :) But at the end of the day it
> is up to the maintainers - Greg/Tejun and James/Serge (who I forgot to
> Cc on these patches, sorry) - what works better for them.

As long as the cost can be avoided for folks who don't use the
relevant features, I don't have a strong opinion on how that's done.

Thanks.
Casey Schaufler Feb. 19, 2019, 4:43 p.m. UTC | #9
On 2/19/2019 6:10 AM, Ondrej Mosnacek wrote:
> On Tue, Feb 19, 2019 at 1:28 AM Casey Schaufler <casey@schaufler-ca.com> wrote:
>> On 2/18/2019 2:03 AM, Ondrej Mosnacek wrote:
>>> On Fri, Feb 15, 2019 at 4:50 PM Tejun Heo <tj@kernel.org> wrote:
>>>> On Fri, Feb 15, 2019 at 04:45:44PM +0100, Ondrej Mosnacek wrote:
>>>>> On Thu, Feb 14, 2019 at 4:49 PM Tejun Heo <tj@kernel.org> wrote:
>>>>>> On Thu, Feb 14, 2019 at 10:50:15AM +0100, Ondrej Mosnacek wrote:
>>>>>>> +static int kernfs_node_init_security(struct kernfs_node *parent,
>>>>>>> +                                  struct kernfs_node *kn)
>>>>>> Can we skip the whole thing if security is not enabled?
>>>>> Do you mean just skipping the whole part when CONFIG_SECURITY=n? That
>>>>> is easy to do and I can add it in the next respin (although the
>>>>> compiler should be able to optimize most of it out in that case).
>>>> So the goal is allowing folks who don't use this to not pay.  It'd be
>>>> better if the evaluation can be as late as possible but obviously there's
>>>> a point where that'd be too complicated.  Maybe "ever enabled in this
>>>> boot" is good and simple enough at the same time?
>>> I don't think there is a way currently to check whether some LSM has
>>> been enabled at boot or not. I suppose we could add such a function for
>>> this kind of heuristic, but I'm not sure how it would interplay with
>>> the plans to allow multiple LSMs to be enabled simultaneously...
>>> Perhaps it would be better/easier to just add a
>>> security_kernfs_needs_init() function, which would simply check if the
>>> list of registered kernfs_init_security hooks is empty.
>>>
>>> I propose something like the patch below (the whitespace is mangled -
>>> intended just for visual review). I plan to fold it into the next
>>> respin if there are no objections to this approach.
>>>
>>> diff --git a/fs/kernfs/dir.c b/fs/kernfs/dir.c
>>> index 735a6d382d9d..5b99205da919 100644
>>> --- a/fs/kernfs/dir.c
>>> +++ b/fs/kernfs/dir.c
>>> @@ -625,6 +625,9 @@ static int kernfs_node_init_security(struct
>>> kernfs_node *parent,
>>>           struct qstr q;
>>>           int ret;
>>>
>>> +       if (!security_kernfs_needs_init() || !parent)
>>> +               return 0;
>>> +
>>>           if (!parent->iattr) {
>>>                   kernfs_iattr_init(&iattr_parent, parent);
>>>                   simple_xattrs_init(&xattr_parent);
>>> @@ -720,11 +723,9 @@ static struct kernfs_node
>>> *__kernfs_new_node(struct kernfs_root *root,
>>>                           goto err_out3;
>>>           }
>>>
>>> -       if (parent) {
>>> -               ret = kernfs_node_init_security(parent, kn);
>>> -               if (ret)
>>> -                       goto err_out3;
>>> -       }
>>> +       ret = kernfs_node_init_security(parent, kn);
>>> +       if (ret)
>>> +               goto err_out3;
>>>
>>>           return kn;
>>>
>>> diff --git a/include/linux/security.h b/include/linux/security.h
>>> index 581944d1e61e..49a083dbc464 100644
>>> --- a/include/linux/security.h
>>> +++ b/include/linux/security.h
>>> @@ -292,6 +292,7 @@ int security_inode_listsecurity(struct inode
>>> *inode, char *buffer, size_t buffer
>>>    void security_inode_getsecid(struct inode *inode, u32 *secid);
>>>    int security_inode_copy_up(struct dentry *src, struct cred **new);
>>>    int security_inode_copy_up_xattr(const char *name);
>>> +int security_kernfs_needs_init(void);
>>>    int security_kernfs_init_security(const struct qstr *qstr,
>>>                                     const struct iattr *dir_iattr,
>>>                                     struct simple_xattrs *dir_secattr,
>>> @@ -789,6 +790,11 @@ static inline int security_inode_copy_up(struct
>>> dentry *src, struct cred **new)
>>>           return 0;
>>>    }
>>>
>>> +static inline int security_kernfs_needs_init(void)
>>> +{
>>> +       return 0;
>>> +}
>>> +
>>>    static inline int security_kernfs_init_security(
>>>                   const struct qstr *qstr, const struct iattr *dir_iattr,
>>>                   struct simple_xattrs *dir_secattr, const struct iattr *iattr,
>>> diff --git a/security/security.c b/security/security.c
>>> index 836e0822874a..3c8b9b5baabc 100644
>>> --- a/security/security.c
>>> +++ b/security/security.c
>>> @@ -892,6 +892,11 @@ int security_inode_copy_up_xattr(const char *name)
>>>    }
>>>    EXPORT_SYMBOL(security_inode_copy_up_xattr);
>>>
>>> +int security_kernfs_needs_init(void)
>>> +{
>>> +       return !hlist_empty(&security_hook_heads.kernfs_init_security);
>>> +}
>>> +
>> Yuck. That's an awful lot of infrastructure just to track
>> that state. May I suggest that instead you have the
>> security_kernfs_init_security() hook return -EOPNOTSUPP
>> in the no-LSM case (2nd argument to call_int_hook). You could
>> then have a state flag in kernfs that you can set to indicate
>> you don't need to call security_kernfs_init_security() again.
> Well, maintaining a global variable sounds even more yucky to me...

The state you're maintaining is kernfs state, not LSM
infrastructure state. The state should be maintained in
kernfs, not in the LSM infrastructure.

> And I don't understand why you'd consider a simple one-line function
> to be "an awful lot of infrastructure" :)

That is because you haven't been working with the LSM
infrastructure very long. If you'll recall, I have already
objected to kernfs specific LSM hooks. Now, when you find
that your approach for using a hook has issues, you want
to add another function that does nothing but tell you
that there's nothing to do. You can get that by calling
the security_kernfs_init_security() hook. The LSM infrastructure
is set up to be as painless as possible when you don't
want to use it.

>   But at the end of the day it
> is up to the maintainers - Greg/Tejun and James/Serge (who I forgot to
> Cc on these patches, sorry)

Yes, James and Serge are the maintainers for the
security sub-system, but I have done the transition
from hook vector to hook lists and am currently doing
the transition from module to infrastructure blob management.
I know it better, and have more invested in it, than anyone
else just now.

>   - what works better for them.

Kernfs is an important component of the kernel. So is
the security infrastructure. I would hope you don't want
to turn this into a contest to see which maintainer has
the most clout.
Ondrej Mosnacek Feb. 21, 2019, 9:13 a.m. UTC | #10
On Tue, Feb 19, 2019 at 5:43 PM Casey Schaufler <casey@schaufler-ca.com> wrote:
> On 2/19/2019 6:10 AM, Ondrej Mosnacek wrote:
> > On Tue, Feb 19, 2019 at 1:28 AM Casey Schaufler <casey@schaufler-ca.com> wrote:
> >> On 2/18/2019 2:03 AM, Ondrej Mosnacek wrote:
> >>> On Fri, Feb 15, 2019 at 4:50 PM Tejun Heo <tj@kernel.org> wrote:
> >>>> On Fri, Feb 15, 2019 at 04:45:44PM +0100, Ondrej Mosnacek wrote:
> >>>>> On Thu, Feb 14, 2019 at 4:49 PM Tejun Heo <tj@kernel.org> wrote:
> >>>>>> On Thu, Feb 14, 2019 at 10:50:15AM +0100, Ondrej Mosnacek wrote:
> >>>>>>> +static int kernfs_node_init_security(struct kernfs_node *parent,
> >>>>>>> +                                  struct kernfs_node *kn)
> >>>>>> Can we skip the whole thing if security is not enabled?
> >>>>> Do you mean just skipping the whole part when CONFIG_SECURITY=n? That
> >>>>> is easy to do and I can add it in the next respin (although the
> >>>>> compiler should be able to optimize most of it out in that case).
> >>>> So the goal is allowing folks who don't use this to not pay.  It'd be
> >>>> better if the evaluation can be as late as possible but obviously there's
> >>>> a point where that'd be too complicated.  Maybe "ever enabled in this
> >>>> boot" is good and simple enough at the same time?
> >>> I don't think there is a way currently to check whether some LSM has
> >>> been enabled at boot or not. I suppose we could add such a function for
> >>> this kind of heuristic, but I'm not sure how it would interplay with
> >>> the plans to allow multiple LSMs to be enabled simultaneously...
> >>> Perhaps it would be better/easier to just add a
> >>> security_kernfs_needs_init() function, which would simply check if the
> >>> list of registered kernfs_init_security hooks is empty.
> >>>
> >>> I propose something like the patch below (the whitespace is mangled -
> >>> intended just for visual review). I plan to fold it into the next
> >>> respin if there are no objections to this approach.
> >>>
> >>> diff --git a/fs/kernfs/dir.c b/fs/kernfs/dir.c
> >>> index 735a6d382d9d..5b99205da919 100644
> >>> --- a/fs/kernfs/dir.c
> >>> +++ b/fs/kernfs/dir.c
> >>> @@ -625,6 +625,9 @@ static int kernfs_node_init_security(struct
> >>> kernfs_node *parent,
> >>>           struct qstr q;
> >>>           int ret;
> >>>
> >>> +       if (!security_kernfs_needs_init() || !parent)
> >>> +               return 0;
> >>> +
> >>>           if (!parent->iattr) {
> >>>                   kernfs_iattr_init(&iattr_parent, parent);
> >>>                   simple_xattrs_init(&xattr_parent);
> >>> @@ -720,11 +723,9 @@ static struct kernfs_node
> >>> *__kernfs_new_node(struct kernfs_root *root,
> >>>                           goto err_out3;
> >>>           }
> >>>
> >>> -       if (parent) {
> >>> -               ret = kernfs_node_init_security(parent, kn);
> >>> -               if (ret)
> >>> -                       goto err_out3;
> >>> -       }
> >>> +       ret = kernfs_node_init_security(parent, kn);
> >>> +       if (ret)
> >>> +               goto err_out3;
> >>>
> >>>           return kn;
> >>>
> >>> diff --git a/include/linux/security.h b/include/linux/security.h
> >>> index 581944d1e61e..49a083dbc464 100644
> >>> --- a/include/linux/security.h
> >>> +++ b/include/linux/security.h
> >>> @@ -292,6 +292,7 @@ int security_inode_listsecurity(struct inode
> >>> *inode, char *buffer, size_t buffer
> >>>    void security_inode_getsecid(struct inode *inode, u32 *secid);
> >>>    int security_inode_copy_up(struct dentry *src, struct cred **new);
> >>>    int security_inode_copy_up_xattr(const char *name);
> >>> +int security_kernfs_needs_init(void);
> >>>    int security_kernfs_init_security(const struct qstr *qstr,
> >>>                                     const struct iattr *dir_iattr,
> >>>                                     struct simple_xattrs *dir_secattr,
> >>> @@ -789,6 +790,11 @@ static inline int security_inode_copy_up(struct
> >>> dentry *src, struct cred **new)
> >>>           return 0;
> >>>    }
> >>>
> >>> +static inline int security_kernfs_needs_init(void)
> >>> +{
> >>> +       return 0;
> >>> +}
> >>> +
> >>>    static inline int security_kernfs_init_security(
> >>>                   const struct qstr *qstr, const struct iattr *dir_iattr,
> >>>                   struct simple_xattrs *dir_secattr, const struct iattr *iattr,
> >>> diff --git a/security/security.c b/security/security.c
> >>> index 836e0822874a..3c8b9b5baabc 100644
> >>> --- a/security/security.c
> >>> +++ b/security/security.c
> >>> @@ -892,6 +892,11 @@ int security_inode_copy_up_xattr(const char *name)
> >>>    }
> >>>    EXPORT_SYMBOL(security_inode_copy_up_xattr);
> >>>
> >>> +int security_kernfs_needs_init(void)
> >>> +{
> >>> +       return !hlist_empty(&security_hook_heads.kernfs_init_security);
> >>> +}
> >>> +
> >> Yuck. That's an awful lot of infrastructure just to track
> >> that state. May I suggest that instead you have the
> >> security_kernfs_init_security() hook return -EOPNOTSUPP
> >> in the no-LSM case (2nd argument to call_int_hook). You could
> >> then have a state flag in kernfs that you can set to indicate
> >> you don't need to call security_kernfs_init_security() again.
> > Well, maintaining a global variable sounds even more yucky to me...
>
> The state you're maintaining is kernfs state, not LSM
> infrastructure state. The state should be maintained in
> kernfs, not in the LSM infrastructure.

But I'm not maintaining any state. I'm merely trying to answer the
query "Is there anything that will handle this hook? Do I need to
prepare stuff for it?", which is obviously a query about the LSM
state. Granted, ideally we wouldn't need to do any preparatory work at
all, but that would require exposing more of the kernfs internals
(which brings its own issues, but maybe I'll need to look into that
approach more...).
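
The two options debated above can be compared in a small user-space sketch. Everything in it is hypothetical (the mock_* names, the fixed-size hook table, the prepare_calls counter); it is not the kernel API, only an illustration of an explicit "is anyone listening?" query versus a hook that returns -EOPNOTSUPP, combined with a caller-side flag that skips the preparatory work afterwards.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one registered LSM callback. */
typedef int (*init_security_fn)(const char *name);

#define MAX_HOOKS 4
static init_security_fn hook_list[MAX_HOOKS];
static size_t hook_count;

static int mock_register(init_security_fn fn)
{
	assert(fn != NULL);
	if (hook_count == MAX_HOOKS)
		return -ENOSPC;
	hook_list[hook_count++] = fn;
	return 0;
}

/* Example callback that accepts every node. */
static int mock_accept_all(const char *name)
{
	(void)name;
	return 0;
}

/* Variant A: explicit query, in the spirit of the patch under review. */
static bool mock_needs_init(void)
{
	return hook_count != 0;
}

/* Variant B: the hook itself reports -EOPNOTSUPP when nothing is registered. */
static int mock_init_security(const char *name)
{
	if (hook_count == 0)
		return -EOPNOTSUPP;
	for (size_t i = 0; i < hook_count; i++) {
		int ret = hook_list[i](name);

		if (ret)
			return ret;
	}
	return 0;
}

/* Caller-side state for variant B: set once the hook says "unsupported". */
static bool skip_security_init;
static int prepare_calls;	/* counts how often the costly setup ran */

static int mock_node_init_security(const char *name)
{
	int ret;

	if (skip_security_init)
		return 0;
	prepare_calls++;	/* stands in for building iattr/xattr copies */
	ret = mock_init_security(name);
	if (ret == -EOPNOTSUPP) {
		skip_security_init = true;
		return 0;
	}
	return ret;
}
```

In this sketch variant B still pays for the setup once before the flag is set, and the flag would go stale if a callback were registered later; variant A avoids both at the cost of one extra exported function.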

>
> > And I don't understand why you'd consider a simple one-line function
> > to be "an awful lot of infrastructure" :)
>
> That is because you haven't been working with the LSM
> infrastructure very long. If you'll recall, I have already
> objected to kernfs specific LSM hooks. Now, when you find
> that your approach for using a hook has issues, you want
> to add another function that does nothing but tell you
> that there's nothing to do. You can get that by calling
> the security_kernfs_init_security() hook. The LSM infrastructure
> is set up to be as painless as possible when you don't
> want to use it.
>
> >   But at the end of the day it
> > is up to the maintainers - Greg/Tejun and James/Serge (who I forgot to
> > Cc on these patches, sorry)
>
> Yes, James and Serge are the maintainers for the
> security sub-system, but I have done the transition
> from hook vector to hook lists and am currently doing
> the transition from module to infrastructure blob management.
> I know it better, and have more invested in it, than anyone
> else just now.

Fair enough.

>
> >   - what works better for them.
>
> Kernfs is an important component of the kernel. So is
> the security infrastructure. I would hope you don't want
> to turn this into a contest to see which maintainer has
> the biggest clout.

Oh, no, you misunderstood my intention. I just got a feeling that this
thread was turning into a discussion about perceived code ugliness
(and about which subsystem that ugliness ends up in), which is
naturally a very subjective topic, so I wanted to know what is the
opinion of the people that have the final decision about whether the
code should get in or not. Anyway, I'll try to find a more elegant
variant of the solution once again, hopefully I manage to get to
something less controversial.

--
Ondrej Mosnacek <omosnace at redhat dot com>
Associate Software Engineer, Security Technologies
Red Hat, Inc.
Casey Schaufler Feb. 21, 2019, 4:52 p.m. UTC | #11
On 2/21/2019 1:13 AM, Ondrej Mosnacek wrote:
> On Tue, Feb 19, 2019 at 5:43 PM Casey Schaufler <casey@schaufler-ca.com> wrote:
> .....
>> The state you're maintaining is kernfs state, not LSM
>> infrastructure state. The state should be maintained in
>> kernfs, not in the LSM infrastructure.
> But I'm not maintaining any state. I'm merely trying to answer the
> query "Is there anything that will handle this hook? Do I need to
> prepare stuff for it?", which is obviously a query about the LSM
> state. Granted, ideally we wouldn't need to do any preparatory work at
> all, but that would require exposing more of the kernfs internals
> (which brings its own issues, but maybe I'll need to look into that
> approach more...).

It sounds like you're bumping up against the limitations
of the finely honed optimized implementation of kernfs. :(
If it were still the pre-Android era, when using an LSM
was rare, the check for an LSM might have made sense. Today,
with the vast majority of systems using LSMs*, optimizing for
the no LSM case is nonsensical.

---
* Android, Tizen, Fedora/RHEL, Ubuntu

> ...
>> Kernfs is an important component of the kernel. So is
>> the security infrastructure. I would hope you don't want
>> to turn this into a contest to see which maintainer has
>> the biggest clout.
> Oh, no, you misunderstood my intention. I just got a feeling that this
> thread was turning into a discussion about perceived code ugliness
> (and about which subsystem that ugliness ends up in), which is
> naturally a very subjective topic, so I wanted to know what is the
> opinion of the people that have the final decision about whether the
> code should get in or not. Anyway, I'll try to find a more elegant
> variant of the solution once again, hopefully I manage to get to
> something less controversial.

Thank you. I believe (which, of course, doesn't make it true)
that when a component goes outside the general system architecture
the way that kernfs does *even for performance reasons* that it is
responsible for the edge cases it encounters. I know that I've had
to do a good bit of that in Smack.
Ondrej Mosnacek Feb. 22, 2019, 12:52 p.m. UTC | #12
On Thu, Feb 21, 2019 at 5:52 PM Casey Schaufler <casey@schaufler-ca.com> wrote:
> On 2/21/2019 1:13 AM, Ondrej Mosnacek wrote:
> > On Tue, Feb 19, 2019 at 5:43 PM Casey Schaufler <casey@schaufler-ca.com> wrote:
> > .....
> >> The state you're maintaining is kernfs state, not LSM
> >> infrastructure state. The state should be maintained in
> >> kernfs, not in the LSM infrastructure.
> > But I'm not maintaining any state. I'm merely trying to answer the
> > query "Is there anything that will handle this hook? Do I need to
> > prepare stuff for it?", which is obviously a query about the LSM
> > state. Granted, ideally we wouldn't need to do any preparatory work at
> > all, but that would require exposing more of the kernfs internals
> > (which brings its own issues, but maybe I'll need to look into that
> > approach more...).
>
> It sounds like you're bumping up against the limitations
> of the finely honed optimized implementation of kernfs. :(
> If it were still the pre-Android era, when using an LSM
> was rare, the check for an LSM might have made sense. Today,
> with the vast majority of systems using LSMs*, optimizing for
> the no LSM case is nonsensical.

I imagine it might make sense on some very minimal embedded systems
(microcontrollers?), where you almost certainly need kernfs (via sysfs
or debugfs), but you don't necessarily need advanced security controls
(you might just have a single monolithic userspace process running
there and very limited memory/CPU resources). I'm hardly an expert on
embedded platforms, but it sounds like a reasonable use case.
(Although in such cases I'd expect CONFIG_SECURITY to be simply set to
'n', so no need for runtime checks anyway...).

>
> ---
> * Android, Tizen, Fedora/RHEL, Ubuntu
>
> > ...
> >> Kernfs is an important component of the kernel. So is
> >> the security infrastructure. I would hope you don't want
> >> to turn this into a contest to see which maintainer has
> >> the biggest clout.
> > Oh, no, you misunderstood my intention. I just got a feeling that this
> > thread was turning into a discussion about perceived code ugliness
> > (and about which subsystem that ugliness ends up in), which is
> > naturally a very subjective topic, so I wanted to know what is the
> > opinion of the people that have the final decision about whether the
> > code should get in or not. Anyway, I'll try to find a more elegant
> > variant of the solution once again, hopefully I manage to get to
> > something less controversial.
>
> Thank you. I believe (which, of course, doesn't make it true)
> that when a component goes outside the general system architecture
> the way that kernfs does *even for performance reasons* that it is
> responsible for the edge cases it encounters. I know that I've had
> to do a good bit of that in Smack.

OK, so I tried taking the other approach (pass kernfs nodes and expose
necessary kernfs internals) and I'm quite happy with the result. It
turns out the only thing I actually need to expose (at least for
SELinux) is getting and setting the security xattrs, which is just two
simple functions. The rest of the attributes (uid, gid, and access
times) don't seem important and they can always be exposed by adding
more helper functions as needed. With this design the last patch now
becomes embarrassingly simple - there is just a single call that will
either call an LSM hook or do nothing at all.
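
A rough user-space sketch of the shape described above, assuming a hook that receives the parent and child nodes directly and two helpers for reading and writing the security label. All names (struct mock_node, mock_secctx_get, mock_secctx_set, mock_hook) are hypothetical and greatly simplified; the label inheritance rule is a stand-in, not what SELinux actually computes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MOCK_CTX_MAX 64

/* Hypothetical, heavily reduced stand-in for a kernfs node. */
struct mock_node {
	const char *name;
	char secctx[MOCK_CTX_MAX];	/* empty string: no explicit label */
};

/* Helper 1: read the node's security label, NULL if none was set. */
static const char *mock_secctx_get(const struct mock_node *kn)
{
	assert(kn != NULL);
	return kn->secctx[0] ? kn->secctx : NULL;
}

/* Helper 2: store a security label on the node. */
static int mock_secctx_set(struct mock_node *kn, const char *ctx)
{
	if (strlen(ctx) >= MOCK_CTX_MAX)
		return -1;
	strcpy(kn->secctx, ctx);
	return 0;
}

/* Hypothetical LSM callback: copy the parent's explicit label, if any. */
static int mock_lsm_init_security(struct mock_node *parent,
				  struct mock_node *kn)
{
	const char *ctx = mock_secctx_get(parent);

	if (!ctx)
		return 0;	/* nothing explicit: keep the default */
	return mock_secctx_set(kn, ctx);
}

typedef int (*mock_hook_fn)(struct mock_node *, struct mock_node *);
static mock_hook_fn mock_hook;	/* NULL when no LSM is registered */

/* Node creation: a single call that runs the hook or does nothing. */
static int mock_new_node(struct mock_node *parent, struct mock_node *kn,
			 const char *name)
{
	kn->name = name;
	kn->secctx[0] = '\0';
	if (!parent || !mock_hook)
		return 0;
	return mock_hook(parent, kn);
}
```

With this layout the creation path needs no temporary iattr or xattr copies: when no callback is registered, mock_new_node() returns before doing any security-related work.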

I still need to do some testing on the new patches before posting. In
the meantime they are available here for the curious:

https://gitlab.com/omos/linux-public/compare/selinux-next...selinux-fix-cgroupfs-v12

--
Ondrej Mosnacek <omosnace at redhat dot com>
Associate Software Engineer, Security Technologies
Red Hat, Inc.

Patch

diff --git a/fs/kernfs/dir.c b/fs/kernfs/dir.c
index ad7e3356bcc5..735a6d382d9d 100644
--- a/fs/kernfs/dir.c
+++ b/fs/kernfs/dir.c
@@ -15,6 +15,7 @@ 
 #include <linux/slab.h>
 #include <linux/security.h>
 #include <linux/hash.h>
+#include <linux/stringhash.h>
 
 #include "kernfs-internal.h"
 
@@ -616,7 +617,53 @@  struct kernfs_node *kernfs_node_from_dentry(struct dentry *dentry)
 	return NULL;
 }
 
+static int kernfs_node_init_security(struct kernfs_node *parent,
+				     struct kernfs_node *kn)
+{
+	struct simple_xattrs xattr_child, xattr_parent, *pxattr_parent;
+	struct iattr iattr_child, iattr_parent, *piattr_parent;
+	struct qstr q;
+	int ret;
+
+	if (!parent->iattr) {
+		kernfs_iattr_init(&iattr_parent, parent);
+		simple_xattrs_init(&xattr_parent);
+		piattr_parent = &iattr_parent;
+		pxattr_parent = &xattr_parent;
+	} else {
+		piattr_parent = &parent->iattr->ia_iattr;
+		pxattr_parent = &parent->iattr->xattrs_security;
+	}
+
+	kernfs_iattr_init(&iattr_child, kn);
+	simple_xattrs_init(&xattr_child);
+
+	q.name = kn->name;
+	q.hash_len = hashlen_string(parent, kn->name);
+
+	ret = security_kernfs_init_security(&q, piattr_parent, pxattr_parent,
+					    &iattr_child, &xattr_child);
+	if (pxattr_parent == &xattr_parent)
+		simple_xattrs_free(&xattr_parent);
+	if (!ret && !simple_xattrs_empty(&xattr_child)) {
+		/*
+		 * Child has new security xattrs, allocate its kernfs_iattrs
+		 * and put our local xattrs in there.
+		 */
+		struct kernfs_iattrs *attrs = kernfs_iattrs(kn);
+
+		if (!attrs) {
+			simple_xattrs_free(&xattr_child);
+			return -ENOMEM;
+		}
+		simple_xattrs_move(&attrs->xattrs_security, &xattr_child);
+	}
+	simple_xattrs_free(&xattr_child);
+	return ret;
+}
+
 static struct kernfs_node *__kernfs_new_node(struct kernfs_root *root,
+					     struct kernfs_node *parent,
 					     const char *name, umode_t mode,
 					     kuid_t uid, kgid_t gid,
 					     unsigned flags)
@@ -673,6 +720,12 @@  static struct kernfs_node *__kernfs_new_node(struct kernfs_root *root,
 			goto err_out3;
 	}
 
+	if (parent) {
+		ret = kernfs_node_init_security(parent, kn);
+		if (ret)
+			goto err_out3;
+	}
+
 	return kn;
 
  err_out3:
@@ -691,7 +744,7 @@  struct kernfs_node *kernfs_new_node(struct kernfs_node *parent,
 {
 	struct kernfs_node *kn;
 
-	kn = __kernfs_new_node(kernfs_root(parent),
+	kn = __kernfs_new_node(kernfs_root(parent), parent,
 			       name, mode, uid, gid, flags);
 	if (kn) {
 		kernfs_get(parent);
@@ -961,7 +1014,7 @@  struct kernfs_root *kernfs_create_root(struct kernfs_syscall_ops *scops,
 	INIT_LIST_HEAD(&root->supers);
 	root->next_generation = 1;
 
-	kn = __kernfs_new_node(root, "", S_IFDIR | S_IRUGO | S_IXUGO,
+	kn = __kernfs_new_node(root, NULL, "", S_IFDIR | S_IRUGO | S_IXUGO,
 			       GLOBAL_ROOT_UID, GLOBAL_ROOT_GID,
 			       KERNFS_DIR);
 	if (!kn) {
diff --git a/fs/kernfs/inode.c b/fs/kernfs/inode.c
index f0e2cb4379c0..6a9084aecbe5 100644
--- a/fs/kernfs/inode.c
+++ b/fs/kernfs/inode.c
@@ -31,11 +31,22 @@  static const struct inode_operations kernfs_iops = {
 	.listxattr	= kernfs_iop_listxattr,
 };
 
-static struct kernfs_iattrs *kernfs_iattrs(struct kernfs_node *kn)
+void kernfs_iattr_init(struct iattr *iattrs, struct kernfs_node *kn)
+{
+	/* assign default attributes */
+	iattrs->ia_mode = kn->mode;
+	iattrs->ia_uid = GLOBAL_ROOT_UID;
+	iattrs->ia_gid = GLOBAL_ROOT_GID;
+
+	ktime_get_real_ts64(&iattrs->ia_atime);
+	iattrs->ia_mtime = iattrs->ia_atime;
+	iattrs->ia_ctime = iattrs->ia_atime;
+}
+
+struct kernfs_iattrs *kernfs_iattrs(struct kernfs_node *kn)
 {
 	static DEFINE_MUTEX(iattr_mutex);
 	struct kernfs_iattrs *ret;
-	struct iattr *iattrs;
 
 	mutex_lock(&iattr_mutex);
 
@@ -45,16 +56,8 @@  static struct kernfs_iattrs *kernfs_iattrs(struct kernfs_node *kn)
 	kn->iattr = kzalloc(sizeof(struct kernfs_iattrs), GFP_KERNEL);
 	if (!kn->iattr)
 		goto out_unlock;
-	iattrs = &kn->iattr->ia_iattr;
-
-	/* assign default attributes */
-	iattrs->ia_mode = kn->mode;
-	iattrs->ia_uid = GLOBAL_ROOT_UID;
-	iattrs->ia_gid = GLOBAL_ROOT_GID;
 
-	ktime_get_real_ts64(&iattrs->ia_atime);
-	iattrs->ia_mtime = iattrs->ia_atime;
-	iattrs->ia_ctime = iattrs->ia_atime;
+	kernfs_iattr_init(&kn->iattr->ia_iattr, kn);
 
 	simple_xattrs_init(&kn->iattr->xattrs_trusted);
 	simple_xattrs_init(&kn->iattr->xattrs_security);
diff --git a/fs/kernfs/kernfs-internal.h b/fs/kernfs/kernfs-internal.h
index 93bf1dcd0306..ad80f438d8d4 100644
--- a/fs/kernfs/kernfs-internal.h
+++ b/fs/kernfs/kernfs-internal.h
@@ -90,6 +90,8 @@  int kernfs_iop_getattr(const struct path *path, struct kstat *stat,
 		       u32 request_mask, unsigned int query_flags);
 ssize_t kernfs_iop_listxattr(struct dentry *dentry, char *buf, size_t size);
 int __kernfs_setattr(struct kernfs_node *kn, const struct iattr *iattr);
+void kernfs_iattr_init(struct iattr *iattrs, struct kernfs_node *kn);
+struct kernfs_iattrs *kernfs_iattrs(struct kernfs_node *kn);
 
 /*
  * dir.c
diff --git a/include/linux/xattr.h b/include/linux/xattr.h
index 6dad031be3c2..05fc6812d554 100644
--- a/include/linux/xattr.h
+++ b/include/linux/xattr.h
@@ -108,4 +108,19 @@  ssize_t simple_xattr_list(struct inode *inode, struct simple_xattrs *xattrs, cha
 void simple_xattr_list_add(struct simple_xattrs *xattrs,
 			   struct simple_xattr *new_xattr);
 
+static inline int simple_xattrs_empty(struct simple_xattrs *xattrs)
+{
+	return list_empty(&xattrs->head);
+}
+
+/**
+ * Move the xattr list from @src to @dst, leaving @src empty.
+ */
+static inline void simple_xattrs_move(struct simple_xattrs *dst,
+				      struct simple_xattrs *src)
+{
+	simple_xattrs_free(dst);
+	list_replace_init(&src->head, &dst->head);
+}
+
 #endif	/* _LINUX_XATTR_H */
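
The move semantics of simple_xattrs_move() above (drop whatever @dst held, splice the entries from @src, leave @src empty) can be illustrated with a user-space sketch. The mock_* types and functions below are hypothetical stand-ins for the kernel's list_head helpers, written only to show why releasing the source list again after the move, as kernfs_node_init_security() does, touches no entries.

```c
#include <assert.h>
#include <stdlib.h>

/* Minimal circular doubly linked list head, modelled on list_head. */
struct mock_list {
	struct mock_list *next, *prev;
};

/* Hypothetical xattr entry; the list node is deliberately the first member. */
struct mock_xattr {
	struct mock_list node;
	int id;
};

static void mock_init(struct mock_list *h)
{
	h->next = h;
	h->prev = h;
}

static int mock_empty(const struct mock_list *h)
{
	return h->next == h;
}

/* Append a newly allocated entry at the tail of the list. */
static int mock_add(struct mock_list *h, int id)
{
	struct mock_xattr *x = malloc(sizeof(*x));

	if (!x)
		return -1;
	x->id = id;
	x->node.next = h;
	x->node.prev = h->prev;
	h->prev->next = &x->node;
	h->prev = &x->node;
	return 0;
}

/* Free every entry and reset the head to the empty state. */
static void mock_free_all(struct mock_list *h)
{
	struct mock_list *p = h->next;

	while (p != h) {
		struct mock_list *next = p->next;

		free((struct mock_xattr *)p);	/* node is the first member */
		p = next;
	}
	mock_init(h);
}

/* Drop dst's entries, take over src's entries, leave src empty. */
static void mock_move(struct mock_list *dst, struct mock_list *src)
{
	assert(dst != src);
	mock_free_all(dst);
	dst->next = src->next;
	dst->next->prev = dst;
	dst->prev = src->prev;
	dst->prev->next = dst;
	mock_init(src);
}

static int mock_count(const struct mock_list *h)
{
	int n = 0;

	for (const struct mock_list *p = h->next; p != h; p = p->next)
		n++;
	return n;
}
```

Because the source head is reinitialised by the move, a later mock_free_all() on it is a no-op, which mirrors the unconditional cleanup call at the end of kernfs_node_init_security().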