diff mbox

[RESEND,v2,11/18] fs: Ensure the mounter of a filesystem is privileged towards its inodes

Message ID 1451930639-94331-12-git-send-email-seth.forshee@canonical.com (mailing list archive)
State Superseded
Headers show

Commit Message

Seth Forshee Jan. 4, 2016, 6:03 p.m. UTC
The mounter of a filesystem should be privileged towards the
inodes of that filesystem. Extend the checks in
inode_owner_or_capable() and capable_wrt_inode_uidgid() to
permit access by users priviliged in the user namespace of the
inode's superblock.

Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
---
 fs/inode.c          |  3 +++
 kernel/capability.c | 13 +++++++++----
 2 files changed, 12 insertions(+), 4 deletions(-)

Comments

Seth Forshee March 3, 2016, 5:02 p.m. UTC | #1
On Mon, Jan 04, 2016 at 12:03:50PM -0600, Seth Forshee wrote:
> The mounter of a filesystem should be privileged towards the
> inodes of that filesystem. Extend the checks in
> inode_owner_or_capable() and capable_wrt_inode_uidgid() to
> permit access by users priviliged in the user namespace of the
> inode's superblock.

Eric - I've discovered a problem related to this patch. The patches
you've already applied to your testing branch make it so that s_user_ns
can be an unprivileged user for proc and kernfs-based mounts. In some
cases DAC is the only thing protecting files in these mounts (ignoring
MAC), and with this patch an unprivileged user could bypass DAC.

There's a simple solution - always set s_user_ns to &init_user_ns for
those filesystems. I think this is the right thing to do, since the
backing store behind these filesystems are really kernel objects.  But
this would break the assumption behind your patch "userns: Simpilify
MNT_NODEV handling" and cause a regression in mounting behavior.

I've come up with several possible solutions for this conflict.

 1. Drop this patch and keep on setting s_user_ns to unprivilged users.
    This would be unfortunate because I think this patch does make sense
    for most filesystems.
 2. Restrict this patch so that a user privileged towards s_user_ns is
    only privileged towards the super blocks inodes if s_user_ns has a
    mapping for both i_uid and i_gid. This is better than (1) but still
    not ideal in my mind.
 3. Drop your patch and maintain the current MNT_NODEV behavior.
 4. Add a new s_iflags flag to indicate a super block is from an
    unprivileged mount, and use this in your patch instead of s_user_ns.

Any preference, or any other ideas?

Thanks,
Seth
Eric W. Biederman March 4, 2016, 10:43 p.m. UTC | #2
Seth Forshee <seth.forshee@canonical.com> writes:

> On Mon, Jan 04, 2016 at 12:03:50PM -0600, Seth Forshee wrote:
>> The mounter of a filesystem should be privileged towards the
>> inodes of that filesystem. Extend the checks in
>> inode_owner_or_capable() and capable_wrt_inode_uidgid() to
>> permit access by users priviliged in the user namespace of the
>> inode's superblock.
>
> Eric - I've discovered a problem related to this patch. The patches
> you've already applied to your testing branch make it so that s_user_ns
> can be an unprivileged user for proc and kernfs-based mounts. In some
> cases DAC is the only thing protecting files in these mounts (ignoring
> MAC), and with this patch an unprivileged user could bypass DAC.
>
> There's a simple solution - always set s_user_ns to &init_user_ns for
> those filesystems. I think this is the right thing to do, since the
> backing store behind these filesystems are really kernel objects.  But
> this would break the assumption behind your patch "userns: Simpilify
> MNT_NODEV handling" and cause a regression in mounting behavior.
>
> I've come up with several possible solutions for this conflict.
>
>  1. Drop this patch and keep on setting s_user_ns to unprivilged users.
>     This would be unfortunate because I think this patch does make sense
>     for most filesystems.
>  2. Restrict this patch so that a user privileged towards s_user_ns is
>     only privileged towards the super blocks inodes if s_user_ns has a
>     mapping for both i_uid and i_gid. This is better than (1) but still
>     not ideal in my mind.
>  3. Drop your patch and maintain the current MNT_NODEV behavior.
>  4. Add a new s_iflags flag to indicate a super block is from an
>     unprivileged mount, and use this in your patch instead of s_user_ns.
>
> Any preference, or any other ideas?

In general this is only an issue if uids and gids on the filesystem
do not map into the user namespace.

Therefore the general fix is to limit the logic of checking for
capabilities in s_user_ns if we are dealing with INVALID_UID and
INVALID_GID.  For proc and kernfs that should never be the case
so the problem becomes a non-issue.

Further I would look at limiting that relaxation to just
inode_change_ok.  So that we can easily wrap that check per filesystem
and deny the relaxation for proc and kernfs.  proc and kernfs already
have wrappers for .setattr so denying changes when !uid_vaid and
!gid_valid would be a trivial addition, and ensure calamity does
not ensure.

Furthmore by limiting any additional to inode_change_ok we keep
the work of the additional tests off of the fast paths.

Eric
Seth Forshee March 6, 2016, 3:48 p.m. UTC | #3
On Fri, Mar 04, 2016 at 04:43:06PM -0600, Eric W. Biederman wrote:
> Seth Forshee <seth.forshee@canonical.com> writes:
> 
> > On Mon, Jan 04, 2016 at 12:03:50PM -0600, Seth Forshee wrote:
> >> The mounter of a filesystem should be privileged towards the
> >> inodes of that filesystem. Extend the checks in
> >> inode_owner_or_capable() and capable_wrt_inode_uidgid() to
> >> permit access by users priviliged in the user namespace of the
> >> inode's superblock.
> >
> > Eric - I've discovered a problem related to this patch. The patches
> > you've already applied to your testing branch make it so that s_user_ns
> > can be an unprivileged user for proc and kernfs-based mounts. In some
> > cases DAC is the only thing protecting files in these mounts (ignoring
> > MAC), and with this patch an unprivileged user could bypass DAC.
> >
> > There's a simple solution - always set s_user_ns to &init_user_ns for
> > those filesystems. I think this is the right thing to do, since the
> > backing store behind these filesystems are really kernel objects.  But
> > this would break the assumption behind your patch "userns: Simpilify
> > MNT_NODEV handling" and cause a regression in mounting behavior.
> >
> > I've come up with several possible solutions for this conflict.
> >
> >  1. Drop this patch and keep on setting s_user_ns to unprivilged users.
> >     This would be unfortunate because I think this patch does make sense
> >     for most filesystems.
> >  2. Restrict this patch so that a user privileged towards s_user_ns is
> >     only privileged towards the super blocks inodes if s_user_ns has a
> >     mapping for both i_uid and i_gid. This is better than (1) but still
> >     not ideal in my mind.
> >  3. Drop your patch and maintain the current MNT_NODEV behavior.
> >  4. Add a new s_iflags flag to indicate a super block is from an
> >     unprivileged mount, and use this in your patch instead of s_user_ns.
> >
> > Any preference, or any other ideas?
> 
> In general this is only an issue if uids and gids on the filesystem
> do not map into the user namespace.

Yes, both capable_wrt_inode_uidgid and inode_owner_or_capable will
return true for a privileged user in the current namespace if the ids
map into that namespace.

> Therefore the general fix is to limit the logic of checking for
> capabilities in s_user_ns if we are dealing with INVALID_UID and
> INVALID_GID.  For proc and kernfs that should never be the case
> so the problem becomes a non-issue.
> 
> Further I would look at limiting that relaxation to just
> inode_change_ok.  So that we can easily wrap that check per filesystem
> and deny the relaxation for proc and kernfs.  proc and kernfs already
> have wrappers for .setattr so denying changes when !uid_vaid and
> !gid_valid would be a trivial addition, and ensure calamity does
> not ensure.
> 
> Furthmore by limiting any additional to inode_change_ok we keep
> the work of the additional tests off of the fast paths.

So then the inode would need to be chowned before a privileged user in a
non-init namespace would be capable towards it. That seems workable. It
looks like INVALID_UID and INVALID_GID do map into init_user_ns (which
seems a bit odd) so real root remains capable towards those indoes.

That seems okay to me then.

Thanks,
Seth
Eric W. Biederman March 6, 2016, 10:07 p.m. UTC | #4
Seth Forshee <seth.forshee@canonical.com> writes:

> On Fri, Mar 04, 2016 at 04:43:06PM -0600, Eric W. Biederman wrote:
>> Seth Forshee <seth.forshee@canonical.com> writes:
>> 
>> > On Mon, Jan 04, 2016 at 12:03:50PM -0600, Seth Forshee wrote:
>> >> The mounter of a filesystem should be privileged towards the
>> >> inodes of that filesystem. Extend the checks in
>> >> inode_owner_or_capable() and capable_wrt_inode_uidgid() to
>> >> permit access by users priviliged in the user namespace of the
>> >> inode's superblock.
>> >
>> > Eric - I've discovered a problem related to this patch. The patches
>> > you've already applied to your testing branch make it so that s_user_ns
>> > can be an unprivileged user for proc and kernfs-based mounts. In some
>> > cases DAC is the only thing protecting files in these mounts (ignoring
>> > MAC), and with this patch an unprivileged user could bypass DAC.
>> >
>> > There's a simple solution - always set s_user_ns to &init_user_ns for
>> > those filesystems. I think this is the right thing to do, since the
>> > backing store behind these filesystems are really kernel objects.  But
>> > this would break the assumption behind your patch "userns: Simpilify
>> > MNT_NODEV handling" and cause a regression in mounting behavior.
>> >
>> > I've come up with several possible solutions for this conflict.
>> >
>> >  1. Drop this patch and keep on setting s_user_ns to unprivilged users.
>> >     This would be unfortunate because I think this patch does make sense
>> >     for most filesystems.
>> >  2. Restrict this patch so that a user privileged towards s_user_ns is
>> >     only privileged towards the super blocks inodes if s_user_ns has a
>> >     mapping for both i_uid and i_gid. This is better than (1) but still
>> >     not ideal in my mind.
>> >  3. Drop your patch and maintain the current MNT_NODEV behavior.
>> >  4. Add a new s_iflags flag to indicate a super block is from an
>> >     unprivileged mount, and use this in your patch instead of s_user_ns.
>> >
>> > Any preference, or any other ideas?
>> 
>> In general this is only an issue if uids and gids on the filesystem
>> do not map into the user namespace.
>
> Yes, both capable_wrt_inode_uidgid and inode_owner_or_capable will
> return true for a privileged user in the current namespace if the ids
> map into that namespace.
>
>> Therefore the general fix is to limit the logic of checking for
>> capabilities in s_user_ns if we are dealing with INVALID_UID and
>> INVALID_GID.  For proc and kernfs that should never be the case
>> so the problem becomes a non-issue.
>> 
>> Further I would look at limiting that relaxation to just
>> inode_change_ok.  So that we can easily wrap that check per filesystem
>> and deny the relaxation for proc and kernfs.  proc and kernfs already
>> have wrappers for .setattr so denying changes when !uid_vaid and
>> !gid_valid would be a trivial addition, and ensure calamity does
>> not ensure.
>> 
>> Furthmore by limiting any additional to inode_change_ok we keep
>> the work of the additional tests off of the fast paths.
>
> So then the inode would need to be chowned before a privileged user in a
> non-init namespace would be capable towards it. That seems workable. It
> looks like INVALID_UID and INVALID_GID do map into init_user_ns (which
> seems a bit odd) so real root remains capable towards those indoes.
>
> That seems okay to me then.

If I was not clear I was suggesting that we allow a sufficiently
privileged user in the filesysteme's s_user_ns to allow chowning files
with INVALID_UID and INVALID_GID.

The global root user would always be able to do that because unless
capabilities are dropped it is sufficiently privileged in ever user
namespace.

Eric
Seth Forshee March 7, 2016, 1:32 p.m. UTC | #5
On Sun, Mar 06, 2016 at 04:07:49PM -0600, Eric W. Biederman wrote:
> Seth Forshee <seth.forshee@canonical.com> writes:
> 
> > On Fri, Mar 04, 2016 at 04:43:06PM -0600, Eric W. Biederman wrote:
> >> Seth Forshee <seth.forshee@canonical.com> writes:
> >> 
> >> > On Mon, Jan 04, 2016 at 12:03:50PM -0600, Seth Forshee wrote:
> >> >> The mounter of a filesystem should be privileged towards the
> >> >> inodes of that filesystem. Extend the checks in
> >> >> inode_owner_or_capable() and capable_wrt_inode_uidgid() to
> >> >> permit access by users priviliged in the user namespace of the
> >> >> inode's superblock.
> >> >
> >> > Eric - I've discovered a problem related to this patch. The patches
> >> > you've already applied to your testing branch make it so that s_user_ns
> >> > can be an unprivileged user for proc and kernfs-based mounts. In some
> >> > cases DAC is the only thing protecting files in these mounts (ignoring
> >> > MAC), and with this patch an unprivileged user could bypass DAC.
> >> >
> >> > There's a simple solution - always set s_user_ns to &init_user_ns for
> >> > those filesystems. I think this is the right thing to do, since the
> >> > backing store behind these filesystems are really kernel objects.  But
> >> > this would break the assumption behind your patch "userns: Simpilify
> >> > MNT_NODEV handling" and cause a regression in mounting behavior.
> >> >
> >> > I've come up with several possible solutions for this conflict.
> >> >
> >> >  1. Drop this patch and keep on setting s_user_ns to unprivilged users.
> >> >     This would be unfortunate because I think this patch does make sense
> >> >     for most filesystems.
> >> >  2. Restrict this patch so that a user privileged towards s_user_ns is
> >> >     only privileged towards the super blocks inodes if s_user_ns has a
> >> >     mapping for both i_uid and i_gid. This is better than (1) but still
> >> >     not ideal in my mind.
> >> >  3. Drop your patch and maintain the current MNT_NODEV behavior.
> >> >  4. Add a new s_iflags flag to indicate a super block is from an
> >> >     unprivileged mount, and use this in your patch instead of s_user_ns.
> >> >
> >> > Any preference, or any other ideas?
> >> 
> >> In general this is only an issue if uids and gids on the filesystem
> >> do not map into the user namespace.
> >
> > Yes, both capable_wrt_inode_uidgid and inode_owner_or_capable will
> > return true for a privileged user in the current namespace if the ids
> > map into that namespace.
> >
> >> Therefore the general fix is to limit the logic of checking for
> >> capabilities in s_user_ns if we are dealing with INVALID_UID and
> >> INVALID_GID.  For proc and kernfs that should never be the case
> >> so the problem becomes a non-issue.
> >> 
> >> Further I would look at limiting that relaxation to just
> >> inode_change_ok.  So that we can easily wrap that check per filesystem
> >> and deny the relaxation for proc and kernfs.  proc and kernfs already
> >> have wrappers for .setattr so denying changes when !uid_vaid and
> >> !gid_valid would be a trivial addition, and ensure calamity does
> >> not ensure.
> >> 
> >> Furthmore by limiting any additional to inode_change_ok we keep
> >> the work of the additional tests off of the fast paths.
> >
> > So then the inode would need to be chowned before a privileged user in a
> > non-init namespace would be capable towards it. That seems workable. It
> > looks like INVALID_UID and INVALID_GID do map into init_user_ns (which
> > seems a bit odd) so real root remains capable towards those indoes.
> >
> > That seems okay to me then.
> 
> If I was not clear I was suggesting that we allow a sufficiently
> privileged user in the filesysteme's s_user_ns to allow chowning files
> with INVALID_UID and INVALID_GID.

Right, I got that.

> The global root user would always be able to do that because unless
> capabilities are dropped it is sufficiently privileged in ever user
> namespace.

Sure. I was just commenting on one result - that ns-root has to chown
the file before being privileged wrt that file but global root does not,
on account of the fact that the invalid ids are mapped in init_user_ns.

Seth
Eric W. Biederman March 30, 2016, 1:36 a.m. UTC | #6
Seth Forshee <seth.forshee@canonical.com> writes:

> On Fri, Mar 04, 2016 at 04:43:06PM -0600, Eric W. Biederman wrote:
>> In general this is only an issue if uids and gids on the filesystem
>> do not map into the user namespace.
>> 
>> Therefore the general fix is to limit the logic of checking for
>> capabilities in s_user_ns if we are dealing with INVALID_UID and
>> INVALID_GID.  For proc and kernfs that should never be the case
>> so the problem becomes a non-issue.
>> 
>> Further I would look at limiting that relaxation to just
>> inode_change_ok. 
>
> Finally got around to implementing this today; is the patch below what
> you had in mind?

Pretty much.

For the same reason that capble_wrt_inode_uidgid(inode) had to look
at both inode->i_uid and inode->i_gid I think we need to look at
both inode->i_uid and inode->i_gid in those case.

I am worried about chgrp_ok in cases such as inode->i_uid is valid
but unmapped.  I have a similiar worry about chown_ok where
inode->i_gid is valid but unmapped (although that worry is less
serious).

>> So that we can easily wrap that check per filesystem
>> and deny the relaxation for proc and kernfs.  proc and kernfs already
>> have wrappers for .setattr so denying changes when !uid_vaid and
>> !gid_valid would be a trivial addition, and ensure calamity does
>> not ensure.
>
> I'm confused about this part though. As you say above, proc and kernfs
> will never have inodes with invalid ids, so it's not an issue. Do you
> just mean this to be extra insurance against problems?

I meant two things.
1) As filesystems explicitly have to call inode_change_ok they can
   over ride the default if it is possible.

2) Because being paranoid about backward compatibility matters, it
   almost certainly workth add adding a check:
   "if (!uid_valid(inode->i_uid) ||!gid_valid(inode->i_gid)) return -EPERM"
   To proc and sysfs just before they call inode_change_ok just so we
   don't need to analyze them and confirm that they don't use
   INVALID_UID.

   That just makes the patch more robust.

   The we could leave removing that code for a follow on patch where
   someone takes the time to read through and audit all of the proc and
   sysfs code to ensure that the case does not arise, instead of just
   implicitily assuming it.

   That is the usual pattern when pushing down changes.  Do something
   that is easily guaranteed to work, and leave the careful looking for
   a patch all of it's own.

Eric


> Thanks,
> Seth
>
> ---
>
> diff --git a/fs/attr.c b/fs/attr.c
> index 3cfaaac4a18e..f2bcd3f7dfbb 100644
> --- a/fs/attr.c
> +++ b/fs/attr.c
> @@ -16,6 +16,31 @@
>  #include <linux/evm.h>
>  #include <linux/ima.h>
>  
> +static bool chown_ok(const struct inode *inode, kuid_t uid)
> +{
> +	if (uid_eq(current_fsuid(), inode->i_uid) && uid_eq(uid, inode->i_uid))
> +		return true;
> +	if (capable_wrt_inode_uidgid(inode, CAP_CHOWN))
> +		return true;
> +	if (!uid_valid(inode->i_uid) &&
> +	    ns_capable(inode->i_sb->s_user_ns, CAP_CHOWN))
> +		return true;
> +	return false;
> +}
> +
> +static bool chgrp_ok(const struct inode *inode, kgid_t gid)
> +{
> +	if (uid_eq(current_fsuid(), inode->i_uid) &&
> +	    (in_group_p(gid) || gid_eq(gid, inode->i_gid)))
> +		return true;
> +	if (capable_wrt_inode_uidgid(inode, CAP_CHOWN))
> +		return true;
> +	if (!gid_valid(inode->i_gid) &&
> +	    ns_capable(inode->i_sb->s_user_ns, CAP_CHOWN))
> +		return true;
> +	return false;
> +}
> +
>  /**
>   * inode_change_ok - check if attribute changes to an inode are allowed
>   * @inode:	inode to check
> @@ -58,17 +83,11 @@ int inode_change_ok(const struct inode *inode, struct iattr *attr)
>  		return 0;
>  
>  	/* Make sure a caller can chown. */
> -	if ((ia_valid & ATTR_UID) &&
> -	    (!uid_eq(current_fsuid(), inode->i_uid) ||
> -	     !uid_eq(attr->ia_uid, inode->i_uid)) &&
> -	    !capable_wrt_inode_uidgid(inode, CAP_CHOWN))
> +	if ((ia_valid & ATTR_UID) && !chown_ok(inode, attr->ia_uid))
>  		return -EPERM;
>  
>  	/* Make sure caller can chgrp. */
> -	if ((ia_valid & ATTR_GID) &&
> -	    (!uid_eq(current_fsuid(), inode->i_uid) ||
> -	    (!in_group_p(attr->ia_gid) && !gid_eq(attr->ia_gid, inode->i_gid))) &&
> -	    !capable_wrt_inode_uidgid(inode, CAP_CHOWN))
> +	if ((ia_valid & ATTR_GID) && !chgrp_ok(inode, attr->ia_gid))
>  		return -EPERM;
>  
>  	/* Make sure a caller can chmod. */
Seth Forshee March 30, 2016, 2:58 p.m. UTC | #7
On Tue, Mar 29, 2016 at 08:36:09PM -0500, Eric W. Biederman wrote:
> Seth Forshee <seth.forshee@canonical.com> writes:
> 
> > On Fri, Mar 04, 2016 at 04:43:06PM -0600, Eric W. Biederman wrote:
> >> In general this is only an issue if uids and gids on the filesystem
> >> do not map into the user namespace.
> >> 
> >> Therefore the general fix is to limit the logic of checking for
> >> capabilities in s_user_ns if we are dealing with INVALID_UID and
> >> INVALID_GID.  For proc and kernfs that should never be the case
> >> so the problem becomes a non-issue.
> >> 
> >> Further I would look at limiting that relaxation to just
> >> inode_change_ok. 
> >
> > Finally got around to implementing this today; is the patch below what
> > you had in mind?
> 
> Pretty much.
> 
> For the same reason that capble_wrt_inode_uidgid(inode) had to look
> at both inode->i_uid and inode->i_gid I think we need to look at
> both inode->i_uid and inode->i_gid in those case.
> 
> I am worried about chgrp_ok in cases such as inode->i_uid is valid
> but unmapped.  I have a similiar worry about chown_ok where
> inode->i_gid is valid but unmapped (although that worry is less
> serious).

That makes sense.

So then what is wanted is to check that the other id is either invalid,
or else it maps into s_user_ns. So for chown_ok() something like this:

    if (!uid_valid(inode->i_uid) &&
        (!gid_valid(inode->i_gid) || kgid_has_mapping(inode->i_sb->s_user_ns, inode->i_gid)) &&
        ns_capable(inode->i_sb->s_user_ns, CAP_CHOWN))
            return true;

and likewise for chgrp_ok(). Does that satisfy your concerns?

> >> So that we can easily wrap that check per filesystem
> >> and deny the relaxation for proc and kernfs.  proc and kernfs already
> >> have wrappers for .setattr so denying changes when !uid_vaid and
> >> !gid_valid would be a trivial addition, and ensure calamity does
> >> not ensure.
> >
> > I'm confused about this part though. As you say above, proc and kernfs
> > will never have inodes with invalid ids, so it's not an issue. Do you
> > just mean this to be extra insurance against problems?
> 
> I meant two things.
> 1) As filesystems explicitly have to call inode_change_ok they can
>    over ride the default if it is possible.
> 
> 2) Because being paranoid about backward compatibility matters, it
>    almost certainly workth add adding a check:
>    "if (!uid_valid(inode->i_uid) ||!gid_valid(inode->i_gid)) return -EPERM"
>    To proc and sysfs just before they call inode_change_ok just so we
>    don't need to analyze them and confirm that they don't use
>    INVALID_UID.
> 
>    That just makes the patch more robust.
> 
>    The we could leave removing that code for a follow on patch where
>    someone takes the time to read through and audit all of the proc and
>    sysfs code to ensure that the case does not arise, instead of just
>    implicitily assuming it.
> 
>    That is the usual pattern when pushing down changes.  Do something
>    that is easily guaranteed to work, and leave the careful looking for
>    a patch all of it's own.

Okay, I'll add checks.

Thanks,
Seth
Eric W. Biederman March 30, 2016, 8:18 p.m. UTC | #8
Seth Forshee <seth.forshee@canonical.com> writes:

> On Tue, Mar 29, 2016 at 08:36:09PM -0500, Eric W. Biederman wrote:
>> Seth Forshee <seth.forshee@canonical.com> writes:
>> 
>> > On Fri, Mar 04, 2016 at 04:43:06PM -0600, Eric W. Biederman wrote:
>> >> In general this is only an issue if uids and gids on the filesystem
>> >> do not map into the user namespace.
>> >> 
>> >> Therefore the general fix is to limit the logic of checking for
>> >> capabilities in s_user_ns if we are dealing with INVALID_UID and
>> >> INVALID_GID.  For proc and kernfs that should never be the case
>> >> so the problem becomes a non-issue.
>> >> 
>> >> Further I would look at limiting that relaxation to just
>> >> inode_change_ok. 
>> >
>> > Finally got around to implementing this today; is the patch below what
>> > you had in mind?
>> 
>> Pretty much.
>> 
>> For the same reason that capble_wrt_inode_uidgid(inode) had to look
>> at both inode->i_uid and inode->i_gid I think we need to look at
>> both inode->i_uid and inode->i_gid in those case.
>> 
>> I am worried about chgrp_ok in cases such as inode->i_uid is valid
>> but unmapped.  I have a similiar worry about chown_ok where
>> inode->i_gid is valid but unmapped (although that worry is less
>> serious).
>
> That makes sense.
>
> So then what is wanted is to check that the other id is either invalid,
> or else it maps into s_user_ns. So for chown_ok() something like this:
>
>     if (!uid_valid(inode->i_uid) &&
>         (!gid_valid(inode->i_gid) || kgid_has_mapping(inode->i_sb->s_user_ns, inode->i_gid)) &&
>         ns_capable(inode->i_sb->s_user_ns, CAP_CHOWN))
>             return true;
>
> and likewise for chgrp_ok(). Does that satisfy your concerns?

Yes it does.

Eric
diff mbox

Patch

diff --git a/fs/inode.c b/fs/inode.c
index 1be5f9003eb3..01c036fe1950 100644
--- a/fs/inode.c
+++ b/fs/inode.c
@@ -1962,6 +1962,9 @@  bool inode_owner_or_capable(const struct inode *inode)
 	ns = current_user_ns();
 	if (ns_capable(ns, CAP_FOWNER) && kuid_has_mapping(ns, inode->i_uid))
 		return true;
+
+	if (ns_capable(inode->i_sb->s_user_ns, CAP_FOWNER))
+		return true;
 	return false;
 }
 EXPORT_SYMBOL(inode_owner_or_capable);
diff --git a/kernel/capability.c b/kernel/capability.c
index 45432b54d5c6..5137a38a5670 100644
--- a/kernel/capability.c
+++ b/kernel/capability.c
@@ -437,13 +437,18 @@  EXPORT_SYMBOL(file_ns_capable);
  *
  * Return true if the current task has the given capability targeted at
  * its own user namespace and that the given inode's uid and gid are
- * mapped into the current user namespace.
+ * mapped into the current user namespace, or if the current task has
+ * the capability towards the user namespace of the inode's superblock.
  */
 bool capable_wrt_inode_uidgid(const struct inode *inode, int cap)
 {
-	struct user_namespace *ns = current_user_ns();
+	struct user_namespace *ns;
 
-	return ns_capable(ns, cap) && kuid_has_mapping(ns, inode->i_uid) &&
-		kgid_has_mapping(ns, inode->i_gid);
+	ns = current_user_ns();
+	if (ns_capable(ns, cap) && kuid_has_mapping(ns, inode->i_uid) &&
+	    kgid_has_mapping(ns, inode->i_gid))
+		return true;
+
+	return ns_capable(inode->i_sb->s_user_ns, cap);
 }
 EXPORT_SYMBOL(capable_wrt_inode_uidgid);