[0/7] Initial support for user namespace owned mounts

Message ID	87615k7pyu.fsf@x220.int.ebiederm.org (mailing list archive)
State	New, archived
Headers	From: ebiederm@xmission.com (Eric W. Biederman)
To: Seth Forshee <seth.forshee@canonical.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>, linux-fsdevel@vger.kernel.org, linux-security-module@vger.kernel.org, selinux@tycho.nsa.gov, Serge Hallyn <serge.hallyn@canonical.com>, Andy Lutomirski <luto@amacapital.net>, linux-kernel@vger.kernel.org, Casey Schaufler <casey@schaufler-ca.com>
References: <1436989569-69582-1-git-send-email-seth.forshee@canonical.com>
Date: Wed, 15 Jul 2015 22:15:21 -0500
In-Reply-To: <1436989569-69582-1-git-send-email-seth.forshee@canonical.com> (Seth Forshee's message of "Wed, 15 Jul 2015 14:46:01 -0500")
Message-ID: <87615k7pyu.fsf@x220.int.ebiederm.org>
Subject: Re: [PATCH 0/7] Initial support for user namespace owned mounts

Eric W. Biederman July 16, 2015, 3:15 a.m. UTC

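Seth I think for the LSMs we should start with:

diff --git a/security/security.c b/security/security.c
index 062f3c997fdc..5b6ece92a8e5 100644
--- a/security/security.c
+++ b/security/security.c
@@ -310,6 +310,8 @@ int security_sb_statfs(struct dentry *dentry)
 int security_sb_mount(const char *dev_name, struct path *path,
                        const char *type, unsigned long flags, void *data)
 {
+       if (current_user_ns() != &init_user_ns)
+               return -EPERM;
        return call_int_hook(sb_mount, 0, dev_name, path, type, flags, data);
 }
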
Then we should push this down into all of the lsms.
Then we should remove or relax or change the check as appropriate
in each lsm.

The point is this is good enough to see that it is trivially safe,
and this allows us to focus on the core issues, and stop worrying about
the lsms for a bit.

Then we can focus on each lsm one at a time and take the time to really
understand them and talk with their maintainers etc to make certain
we get things correct.

This should remove the need for your patches 5, 6 and 7. For the
immediate future.

Eric
--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Seth Forshee July 16, 2015, 1:59 p.m. UTC | #1

On Wed, Jul 15, 2015 at 10:15:21PM -0500, Eric W. Biederman wrote:
> 
> Seth I think for the LSMs we should start with:
> 
> diff --git a/security/security.c b/security/security.c
> index 062f3c997fdc..5b6ece92a8e5 100644
> --- a/security/security.c
> +++ b/security/security.c
> @@ -310,6 +310,8 @@ int security_sb_statfs(struct dentry *dentry)
>  int security_sb_mount(const char *dev_name, struct path *path,
>                         const char *type, unsigned long flags, void *data)
>  {
> +       if (current_user_ns() != &init_user_ns)
> +               return -EPERM;
>         return call_int_hook(sb_mount, 0, dev_name, path, type, flags, data);
>  }

This just makes it impossible to mount from a user namespace. Every
mount from current_user_ns() != &init_user_ns will fail.
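
To make the effect concrete, here is a minimal user-space sketch of the check in the hunk above. It is not kernel code: `user_ns_model`, `init_ns_model` and `sb_mount_check_model` are hypothetical stand-ins for `struct user_namespace`, `init_user_ns` and the top of `security_sb_mount()`. What it tries to show is that the result depends only on the caller's namespace, so no filesystem type can get through.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's struct user_namespace. */
struct user_ns_model {
    const struct user_ns_model *parent; /* NULL for the initial namespace */
};

/* Plays the role of init_user_ns in this model. */
static const struct user_ns_model init_ns_model = { NULL };

/*
 * Models the two lines added to security_sb_mount(): the decision looks
 * only at the caller's namespace, never at the filesystem being mounted.
 */
static int sb_mount_check_model(const struct user_ns_model *caller_ns)
{
    if (caller_ns != &init_ns_model)
        return -EPERM;
    return 0; /* the real hook would go on to call the LSM hooks here */
}
```

Under this model a caller in any child namespace gets -EPERM whether it asks for ext4, proc or sysfs.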

> Then we should push this down into all of the lsms.
> Then we should remove or relax or change the check as appropriate
> in each lsm.
> 
> The point is this is good enough to see that it is trivially safe,
> and this allows us to focus on the core issues, and stop worrying about
> the lsms for a bit.
> 
> Then we can focus on each lsm one at a time and take the time to really
> understand them and talk with their maintainers etc to make certain
> we get things correct.
> 
> This should remove the need for your patches 5, 6 and 7. For the
> immediate future.

I'm still not entirely sure what you were trying to do, maybe refuse to
mount whenever a security module is loaded? I think this could be a good
option to start, but couldn't we restrict it to only the LSMs which use
xattrs for security labels? In situations where the filesystem cannot
supply security policy metadata I can't think of any reason to disallow
the mounts.

Seth

Casey Schaufler July 16, 2015, 3:09 p.m. UTC | #2

On 7/16/2015 6:59 AM, Seth Forshee wrote:
> On Wed, Jul 15, 2015 at 10:15:21PM -0500, Eric W. Biederman wrote:
>> Seth I think for the LSMs we should start with:
>>
>> diff --git a/security/security.c b/security/security.c
>> index 062f3c997fdc..5b6ece92a8e5 100644
>> --- a/security/security.c
>> +++ b/security/security.c
>> @@ -310,6 +310,8 @@ int security_sb_statfs(struct dentry *dentry)
>>  int security_sb_mount(const char *dev_name, struct path *path,
>>                         const char *type, unsigned long flags, void *data)
>>  {
>> +       if (current_user_ns() != &init_user_ns)
>> +               return -EPERM;
>>         return call_int_hook(sb_mount, 0, dev_name, path, type, flags, data);
>>  }
> This just makes it impossible to mount from a user namespace. Every
> mount from current_user_ns() != &init_user_ns will fail.
>
>> Then we should push this down into all of the lsms.
>> Then we should remove or relax or change the check as appropriate
>> in each lsm.
>>
>> The point is this is good enough to see that it is trivially safe,
>> and this allows us to focus on the core issues, and stop worrying about
>> the lsms for a bit.

Given the extent to which LSMs are deployed I find it a bit
worrisome that they might not be considered a "core issue".

>> Then we can focus on each lsm one at a time and take the time to really
>> understand them and talk with their maintainers etc to make certain
>> we get things correct.

The "Do the easy stuff, fix the hard stuff after we've sold the product"
approach works really well until you get to the point of fixing the hard
stuff. This is the origin of the 90/90 rule of software development.

>>
>> This should remove the need for your patches 5, 6 and 7. For the
>> immediate future.
> I'm still not entirely sure what you were trying to do, maybe refuse to
> mount whenever a security module is loaded? I think this could be a good
> option to start, but couldn't we restrict it to only the LSMs which use
> xattrs for security labels? In situations where the filesystem cannot
> supply security policy metadata I can't think of any reason to disallow
> the mounts.

This whole notion of mounting a generic filesystem (e.g. ext4) that
is "owned" by a user (as opposed to the system) has lots of implications,
and I seriously doubt that many of them have been accounted for.

Think back to the "negative group access" issue. You can't just
ignore issues that are inconvenient, or claim that you have a reasonable
system just because *you* can't think of a problem.

> Seth
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/
>


Seth Forshee July 16, 2015, 3:59 p.m. UTC | #3

On Thu, Jul 16, 2015 at 08:59:47AM -0500, Seth Forshee wrote:
> On Wed, Jul 15, 2015 at 10:15:21PM -0500, Eric W. Biederman wrote:
> > 
> > Seth I think for the LSMs we should start with:
> > 
> > diff --git a/security/security.c b/security/security.c
> > index 062f3c997fdc..5b6ece92a8e5 100644
> > --- a/security/security.c
> > +++ b/security/security.c
> > @@ -310,6 +310,8 @@ int security_sb_statfs(struct dentry *dentry)
> >  int security_sb_mount(const char *dev_name, struct path *path,
> >                         const char *type, unsigned long flags, void *data)
> >  {
> > +       if (current_user_ns() != &init_user_ns)
> > +               return -EPERM;
> >         return call_int_hook(sb_mount, 0, dev_name, path, type, flags, data);
> >  }
> 
> This just makes it impossible to mount from a user namespace. Every
> mount from current_user_ns() != &init_user_ns will fail.

What might work instead is to add a check in security_sb_kern_mount.
Then it would need to check s_user_ns, that way if proc, sysfs, etc.
use sget_userns(..., &init_user_ns) they can still be mounted in
containers.

It would be nicer to have a hook after sget but before fill_super so
that a bunch of work doesn't have to be done and then undone. Right now
there doesn't seem to be any suitable hook.
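
As a rough illustration of the alternative described above, the following user-space sketch keys the decision on the superblock's owning namespace rather than on the caller. The types and the function name are hypothetical; only the `s_user_ns` field name is taken from the discussion, and the real check would live in or near `security_sb_kern_mount()`.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for struct user_namespace. */
struct user_ns_model {
    int id;
};

/* Plays the role of init_user_ns in this model. */
static const struct user_ns_model init_ns_model = { 0 };

/* Minimal stand-in for struct super_block: only the owning namespace. */
struct sb_model {
    const struct user_ns_model *s_user_ns;
};

/*
 * Refuse superblocks owned by a non-initial user namespace. A superblock
 * created with the initial namespace as owner (as proc or sysfs would be
 * under the scheme above) passes even when mounted from a container.
 */
static int sb_kern_mount_check_model(const struct sb_model *sb)
{
    if (sb->s_user_ns != &init_ns_model)
        return -EPERM;
    return 0;
}
```

The difference from the caller-based check is that the caller's own namespace never enters the decision.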

> > Then we should push this down into all of the lsms.
> > Then we should remove or relax or change the check as appropriate
> > in each lsm.
> > 
> > The point is this is good enough to see that it is trivially safe,
> > and this allows us to focus on the core issues, and stop worrying about
> > the lsms for a bit.
> > 
> > Then we can focus on each lsm one at a time and take the time to really
> > understand them and talk with their maintainers etc to make certain
> > we get things correct.
> > 
> > This should remove the need for your patches 5, 6 and 7. For the
> > immediate future.
> 
> I'm still not entirely sure what you were trying to do, maybe refuse to
> mount whenever a security module is loaded? I think this could be a good
> option to start, but couldn't we restrict it to only the LSMs which use
> xattrs for security labels? In situations where the filesystem cannot
> supply security policy metadata I can't think of any reason to disallow
> the mounts.
> 
> Seth

Seth Forshee July 16, 2015, 6:57 p.m. UTC | #4

On Thu, Jul 16, 2015 at 08:09:20AM -0700, Casey Schaufler wrote:
> On 7/16/2015 6:59 AM, Seth Forshee wrote:
> > On Wed, Jul 15, 2015 at 10:15:21PM -0500, Eric W. Biederman wrote:
> >> Seth I think for the LSMs we should start with:
> >>
> >> diff --git a/security/security.c b/security/security.c
> >> index 062f3c997fdc..5b6ece92a8e5 100644
> >> --- a/security/security.c
> >> +++ b/security/security.c
> >> @@ -310,6 +310,8 @@ int security_sb_statfs(struct dentry *dentry)
> >>  int security_sb_mount(const char *dev_name, struct path *path,
> >>                         const char *type, unsigned long flags, void *data)
> >>  {
> >> +       if (current_user_ns() != &init_user_ns)
> >> +               return -EPERM;
> >>         return call_int_hook(sb_mount, 0, dev_name, path, type, flags, data);
> >>  }
> > This just makes it impossible to mount from a user namespace. Every
> > mount from current_user_ns() != &init_user_ns will fail.
> >
> >> Then we should push this down into all of the lsms.
> >> Then we should remove or relax or change the check as appropriate
> >> in each lsm.
> >>
> >> The point is this is good enough to see that it is trivially safe,
> >> and this allows us to focus on the core issues, and stop worrying about
> >> the lsms for a bit.
> 
> Given the extent to which LSMs are deployed I find it a bit
> worrisome that they might not be considered a "core issue".
> 
> >> Then we can focus on each lsm one at a time and take the time to really
> >> understand them and talk with their maintainers etc to make certain
> >> we get things correct.
> 
> The "Do the easy stuff, fix the hard stuff after we've sold the product"
> approach works really well until you get to the point of fixing the hard
> stuff. This is the origin of the 90/90 rule of software development.
> 
> >>
> >> This should remove the need for your patches 5, 6 and 7. For the
> >> immediate future.
> > I'm still not entirely sure what you were trying to do, maybe refuse to
> > mount whenever a security module is loaded? I think this could be a good
> > option to start, but couldn't we restrict it to only the LSMs which use
> > xattrs for security labels? In situations where the filesystem cannot
> > supply security policy metadata I can't think of any reason to disallow
> > the mounts.
> 
> This whole notion of mounting a generic filesystem (e.g. ext4) that
> is "owned" by a user (as opposed to the system) has lots of implications,
> and I seriously doubt that many of them have been accounted for.
> 
> Think back to the "negative group access" issue. You can't just
> ignore issues that are inconvenient, or claim that you have a reasonable
> system just because *you* can't think of a problem.

I've spent a lot of time considering the implications and previous
vulnerabilities, and I've addressed everything I turned up. Now I'm
asking for review from those with more experience with and expertise of
the code in question. I'm not sure what more I should be doing.

I welcome feedback about anything I've missed, but stating generally
that you think I probably missed something isn't very helpful.

The LSM issue is thornier than the rest of it though, which is why I
specifically asked for review there in the cover letter. There's a lot
of complexity and nuance, and I still don't have a grasp on all the
subtleties. One such subtlety is the full impact of simply ignoring the
security labels on disk (but I am still confused as to why this is
different from filesystems which don't support xattrs at all).

I was unaware of Lukasz's patches until yesterday, and I will have a
look at them. But since we don't have the LSM support for user
namespaces yet, I don't see the problem with doing something safe for
LSMs initially and evolving the LSM integration for user ns mounts along
with the rest of the user ns integration.

Your point is taken about my less-than-expert opinion about the other
security modules. We should at minimum get acks from the maintainers of
those modules that unprivileged mounts will not compromise MAC.

For Smack specifically, I believe my only concern was the SMACK64EXEC
attribute, as all the other attributes only affected subjects' access to
the files. So maybe it would be possible to simply ignore this attribute
in unprivileged mounts and respect the others, even lacking more
complete LSM support for user namespaces.

Seth

Casey Schaufler July 16, 2015, 9:42 p.m. UTC | #5

On 7/16/2015 11:57 AM, Seth Forshee wrote:
> On Thu, Jul 16, 2015 at 08:09:20AM -0700, Casey Schaufler wrote:
>> On 7/16/2015 6:59 AM, Seth Forshee wrote:
>>> On Wed, Jul 15, 2015 at 10:15:21PM -0500, Eric W. Biederman wrote:
>>>> Seth I think for the LSMs we should start with:
>>>>
>>>> diff --git a/security/security.c b/security/security.c
>>>> index 062f3c997fdc..5b6ece92a8e5 100644
>>>> --- a/security/security.c
>>>> +++ b/security/security.c
>>>> @@ -310,6 +310,8 @@ int security_sb_statfs(struct dentry *dentry)
>>>>  int security_sb_mount(const char *dev_name, struct path *path,
>>>>                         const char *type, unsigned long flags, void *data)
>>>>  {
>>>> +       if (current_user_ns() != &init_user_ns)
>>>> +               return -EPERM;
>>>>         return call_int_hook(sb_mount, 0, dev_name, path, type, flags, data);
>>>>  }
>>> This just makes it impossible to mount from a user namespace. Every
>>> mount from current_user_ns() != &init_user_ns will fail.
>>>
>>>> Then we should push this down into all of the lsms.
>>>> Then we should remove or relax or change the check as appropriate
>>>> in each lsm.
>>>>
>>>> The point is this is good enough to see that it is trivially safe,
>>>> and this allows us to focus on the core issues, and stop worrying about
>>>> the lsms for a bit.
>> Given the extent to which LSMs are deployed I find it a bit
>> worrisome that they might not be considered a "core issue".
>>
>>>> Then we can focus on each lsm one at a time and take the time to really
>>>> understand them and talk with their maintainers etc to make certain
>>>> we get things correct.
>> The "Do the easy stuff, fix the hard stuff after we've sold the product"
>> approach works really well until you get to the point of fixing the hard
>> stuff. This is the origin of the 90/90 rule of software development.
>>
>>>> This should remove the need for your patches 5, 6 and 7. For the
>>>> immediate future.
>>> I'm still not entirely sure what you were trying to do, maybe refuse to
>>> mount whenever a security module is loaded? I think this could be a good
>>> option to start, but couldn't we restrict it to only the LSMs which use
>>> xattrs for security labels? In situations where the filesystem cannot
>>> supply security policy metadata I can't think of any reason to disallow
>>> the mounts.
>> This whole notion of mounting a generic filesystem (e.g. ext4) that
>> is "owned" by a user (as opposed to the system) has lots of implications,
>> and I seriously doubt that many of them have been accounted for.
>>
>> Think back to the "negative group access" issue. You can't just
>> ignore issues that are inconvenient, or claim that you have a reasonable
>> system just because *you* can't think of a problem.
> I've spent a lot of time considering the implications and previous
> vulnerabilities, and I've addressed everything I turned up. Now I'm
> asking for review from those with more experience with and expertise of
> the code in question. I'm not sure what more I should be doing.

Part of the problem I see is that you're looking at the details
when there's an architectural issue. That's OK, it happens all
the time, but we have to pull the issue up slightly higher in
order to address the underlying difficulties.

You want to provide a mechanism whereby an unprivileged user (Seth)
can mount a filesystem for his own use. You want full filesystem
semantics, but you're willing to accept restrictions on certain
filesystem features to avoid opening security holes. You are not
willing to accept restrictions that make the filesystem unusable,
such as making it read-only.

I am going to present a suggestion. Feel free to correct my
assumptions and my reasoning. For simplicity let's use loop-back
mounting of a filesystem contained in a file as an example. The
principles should apply to newly created memory based filesystems
or disk partitions "owned" by Seth.

Seth wants to mount a file (~seth/myfs) which contains an ext4
filesystem. There is already a filesystem object, with security
attributes, that the system knows how to deal with. If Seth mounts
this as a filesystem he, and potentially other people, will be
able to access the content of this object without accessing the
object itself.

	seth$ mount --justforme -t ext4 ~seth/myfs /tmp/seth
	seth$ chmod 777 /tmp/seth
	seth$ ls -la /tmp/seth
	drwxrwxrwx.  3  seth     seth   260 Jul 16 12:59 .
	drwxrwxrwxt 18  root     root  4069 Jul 16 11:13 ..
	seth$

Everything's fine at this point. Wilma is also using the system,
being the sort who likes to hide things in out of the way places

	wilma$ cp ~/scandals /tmp/seth
	wilma$ chmod 600 /tmp/seth/scandals

puts her list of scandals on the unsuspecting filesystem, and changes
the mode to ensure that no one can find out what went on after the
office party.

Seth unmounts /tmp/seth. He looks in ~seth/myfs, finds out what really
happened at the office party, and the story goes from there.

Wilma did everything correctly according to the system security policy,
but the system security policy did not protect her as advertised. The
system was tricked into behaving as if it was in control of the content
of the filesystem when in fact it was not.

One way to fix this problem is for unprivileged mounts to recognize the
attributes of the object mounted and to propagate those attributes to all
the objects they present. All files on /tmp/seth would be owned by seth
and protected by the mode bits, ACL and LSM requirements of ~seth/myfs.
Opening a file on /tmp/seth would require the same permissions as opening
the file containing the mounted filesystem. These attributes would have to
be immutable, or at least demonstrably more restrictive (chmod might be
allowed in some cases, but chown would never be) when changed. I don't see
how a user other than seth could create a new file, as you'd either have
a magical change in ownership or a false sense of security.
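
The "demonstrably more restrictive" rule for chmod could be sketched as a plain bit test. This is only an illustration under the assumption that "more restrictive" means "grants no bit the old mode lacked"; the helper name `chmod_only_restricts` is hypothetical, and chown is not modeled because the proposal rejects it unconditionally.

```c
#include <stdbool.h>
#include <sys/types.h>

/*
 * True when new_mode sets no permission or special bit (setuid, setgid,
 * sticky) that old_mode did not already have, i.e. the change can only
 * take access away.
 */
static bool chmod_only_restricts(mode_t old_mode, mode_t new_mode)
{
    mode_t added = (new_mode & 07777) & ~(old_mode & 07777);

    return added == 0;
}
```

With this test, going from 0644 to 0600 is accepted, while going from 0600 to 0644, or adding the setuid bit to 0755, is rejected.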

I don't see that the presence of user namespaces changes anything. You
may reduce the set of uids available, but the problem with putting a
uid into someone else's file is just as real.

> I welcome feedback about anything I've missed, but stating generally
> that you think I probably missed something isn't very helpful.

True enough. I hope I've explained myself above.

> The LSM issue is thornier than the rest of it though, which is why I
> specifically asked for review there in the cover letter. There's a lot
> of complexity and nuance, and I still don't have a grasp on all the
> subtleties. One such subtlety is the full impact of simply ignoring the
> security labels on disk (but I am still confused as to why this is
> different from filesystems which don't support xattrs at all).

If you can mount a filesystem such that the labels are ignored you
are effectively specifying that the Smack label on the files be 
determined by the defaulting rules. With CAP_MAC_ADMIN that's fine.
Without it, it's not.

> I was unaware of Lukasz's patches until yesterday, and I will have a
> look at them. But since we don't have the LSM support for user
> namespaces yet, I don't see the problem with doing something safe for
> LSMs initially and evolving the LSM integration for user ns mounts along
> with the rest of the user ns integration.

Ignoring the security attributes is not safe!

> Your point is taken about my less-than-expert opinion about the other
> security modules. We should at minimum get acks from the maintainers of
> those modules that unprivileged mounts will not compromise MAC.

I am the Smack maintainer. Unprivileged mounts as you have
described them compromise MAC. They compromise DAC, too.

> For Smack specifically, I believe my only concern was the SMACK64EXEC
> attribute, as all the other attributes only affected subjects' access to
> the files. So maybe it would be possible to simply ignore this attribute
> in unprivileged mounts and respect the others, even lacking more
> complete LSM support for user namespaces.

SMACK64EXEC is analogous to the setuid bit, but I would rather see
exec() of programs with this attribute refused than for it to be
blindly ignored.
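
The two treatments of SMACK64EXEC under discussion (ignore the attribute, or refuse the exec) can be contrasted in a small user-space model. Everything here is hypothetical scaffolding, including the names, the struct and the choice of -EPERM as the error; it is not how Smack implements either behavior.

```c
#include <errno.h>
#include <stdbool.h>

/* Two candidate treatments of SMACK64EXEC on an unprivileged mount. */
enum exec_label_policy {
    EXEC_LABEL_IGNORE,  /* run the program, but drop the label */
    EXEC_LABEL_REFUSE   /* fail the exec() outright */
};

/* Hypothetical summary of the file being executed. */
struct exec_file_model {
    bool on_unprivileged_mount;
    bool has_smack64exec;
};

/*
 * Returns 0 if the exec may proceed, or -EPERM if it is refused.
 * *label_applied reports whether the SMACK64EXEC label takes effect.
 */
static int exec_check_model(const struct exec_file_model *f,
                            enum exec_label_policy policy,
                            bool *label_applied)
{
    *label_applied = false;

    if (!f->has_smack64exec)
        return 0;               /* nothing special about this file */

    if (!f->on_unprivileged_mount) {
        *label_applied = true;  /* trusted mount: honor the attribute */
        return 0;
    }

    /* Untrusted mount: the label is never applied; the policies differ
     * only in whether the program is allowed to run at all. */
    return policy == EXEC_LABEL_REFUSE ? -EPERM : 0;
}
```

In both policies the label from the untrusted filesystem never takes effect; refusing simply makes the failure visible instead of running the program with a label it did not expect.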

> Seth
>


Andy Lutomirski July 16, 2015, 10:27 p.m. UTC | #6

On Thu, Jul 16, 2015 at 2:42 PM, Casey Schaufler <casey@schaufler-ca.com> wrote:
> You want to provide a mechanism whereby an unprivileged user (Seth)
> can mount a filesystem for his own use. You want full filesystem
> semantics, but you're willing to accept restrictions on certain
> filesystem features to avoid opening security holes. You are not
> willing to accept restrictions that make the filesystem unusable,
> such as making it read-only.
>
> I am going to present a suggestion. Feel free to correct my
> assumptions and my reasoning. For simplicity let's use loop-back
> mounting of a filesystem contained in a file as an example. The
> principles should apply to newly created memory based filesystems
> or disk partitions "owned" by Seth.
>
> Seth wants to mount a file (~seth/myfs) which contains an ext4
> filesystem. There is already a filesystem object, with security
> attributes, that the system knows how to deal with. If Seth mounts
> this as a filesystem he, and potentially other people, will be
> able to access the content of this object without accessing the
> object itself.
>
>         seth$ mount --justforme -t ext4 ~seth/myfs /tmp/seth
>         seth$ chmod 777 /tmp/seth
>         seth$ ls -la /tmp/seth
>         drwxrwxrwx.  3  seth     seth   260 Jul 16 12:59 .
>         drwxrwxrwxt 18  root     root  4069 Jul 16 11:13 ..
>         seth$
>
> Everything's fine at this point. Wilma is also using the system,
> being the sort who likes to hide things in out of the way places
>
>         wilma$ cp ~/scandals /tmp/seth
>         wilma$ chmod 600 /tmp/seth/scandals

This is already impossible as described.  Seth can only mount the
filesystem in a private mount namespace inside a user namespace that
he created.  Wilma can't see it unless Seth passes an fd to Wilma and
Wilma accepts and uses it.

>
> puts her list of scandals on the unsuspecting filesystem, and changes
> the mode to ensure that no one can find out what went on after the
> office party.
>
> Seth unmounts /tmp/seth. He looks in ~seth/myfs, finds out what really
> happened at the office party, and the story goes from there.
>
> Wilma did everything correctly according to the system security policy,
> but the system security policy did not protect her as advertised. The
> system was tricked into behaving as if it was in control of the content
> of the filesystem when in fact it was not.


I would argue that, if Wilma writes to some place described by an fd
and doesn't verify where she's writing to, then she has no expectation
of privacy.  After all, she could just *tell* Seth directly whatever
she wants (assuming she can communicate with Seth in the first place).

>
> One way to fix this problem is for unprivileged mounts to recognize the
> attributes of the object mounted and to propagate those attributes to all
> the objects they present. All files on /tmp/seth would be owned by seth
> and protected by the mode bits, ACL and LSM requirements of ~seth/myfs.

This is impossible to enforce, because Seth could use FUSE instead of ext4.

> Opening a file on /tmp/seth would require the same permissions as opening
> the file containing the mounted filesystem. These attributes would have to
> be immutable, or at least demonstrably more restrictive (chmod might be
> allowed in some cases, but chown would never be) when changed. I don't see
> how a user other than seth could create a new file, as you'd either have
> a magical change in ownership or a false sense of security.

This would be a very harsh restriction.  Seth might legitimately want
to give a user access to a file on backing store he owns without
giving that user access to the backing store.  Root on a normal system
does that all the time.

> If you can mount a filesystem such that the labels are ignored you
> are effectively specifying that the Smack label on the files be
> determined by the defaulting rules. With CAP_MAC_ADMIN that's fine.
> Without it, it's not.

Can you explain what the threat model is here?  I don't see what it is
that you're trying to prevent.

>> Your point is taken about my less-than-expert opinion about the other
>> security modules. We should at minimum get acks from the maintainers of
>> those modules that unprivileged mounts will not compromise MAC.
>
> I am the Smack maintainer. Unprivileged mounts as you have
> described them compromise MAC. They compromise DAC, too.
>

How do they compromise DAC?

--Andy

Casey Schaufler July 16, 2015, 11:08 p.m. UTC | #7

On 7/16/2015 3:27 PM, Andy Lutomirski wrote:
> On Thu, Jul 16, 2015 at 2:42 PM, Casey Schaufler <casey@schaufler-ca.com> wrote:
>> You want to provide a mechanism whereby an unprivileged user (Seth)
>> can mount a filesystem for his own use. You want full filesystem
>> semantics, but you're willing to accept restrictions on certain
>> filesystem features to avoid opening security holes. You are not
>> willing to accept restrictions that make the filesystem unusable,
>> such as making it read-only.
>>
>> I am going to present a suggestion. Feel free to correct my
>> assumptions and my reasoning. For simplicity let's use loop-back
>> mounting of a filesystem contained in a file as an example. The
>> principles should apply to newly created memory based filesystems
>> or disk partitions "owned" by Seth.
>>
>> Seth wants to mount a file (~seth/myfs) which contains an ext4
>> filesystem. There is already a filesystem object, with security
>> attributes, that the system knows how to deal with. If Seth mounts
>> this as a filesystem he, and potentially other people, will be
>> able to access the content of this object without accessing the
>> object itself.
>>
>>         seth$ mount --justforme -t ext4 ~seth/myfs /tmp/seth
>>         seth$ chmod 777 /tmp/seth
>>         seth$ ls -la /tmp/seth
>>         drwxrwxrwx.  3  seth     seth   260 Jul 16 12:59 .
>>         drwxrwxrwxt 18  root     root  4069 Jul 16 11:13 ..
>>         seth$
>>
>> Everything's fine at this point. Wilma is also using the system,
>> being the sort who likes to hide things in out of the way places
>>
>>         wilma$ cp ~/scandals /tmp/seth
>>         wilma$ chmod 600 /tmp/seth/scandals
> This is already impossible as described.  Seth can only mount the
> filesystem in a private mount namespace inside a user namespace that
> he created.  Wilma can't see it unless Seth passes an fd to Wilma and
> Wilma accepts and uses it.

But you do have multiple UIDs within your user namespace, right?
There are processes running as someone other than seth, right?

>
>> puts her list of scandals on the unsuspecting filesystem, and changes
>> the mode to ensure that no one can find out what went on after the
>> office party.
>>
>> Seth unmounts /tmp/seth. He looks in ~seth/myfs, finds out what really
>> happened at the office party, and the story goes from there.
>>
>> Wilma did everything correctly according to the system security policy,
>> but the system security policy did not protect her as advertised. The
>> system was tricked into behaving as if it was in control of the content
>> of the filesystem when in fact it was not.
>
> I would argue that, if Wilma writes to some place described by an fd
> and doesn't verify where she's writing to, then she has no expectation
> of privacy.  After all, she could just *tell* Seth directly whatever
> she wants (assuming she can communicate with Seth in the first place).

Don't ascribe either wisdom or good intentions to Wilma.

>> One way to fix this problem is for unprivileged mounts to recognize the
>> attributes of the object mounted and to propagate those attributes to all
>> the objects they present. All files on /tmp/seth would be owned by seth
>> and protected by the mode bits, ACL and LSM requirements of ~seth/myfs.
> This is impossible to enforce, because Seth could use FUSE instead of ext4.

I never said that things aren't already broken. And, if you want
to ignore the potential DAC issues (read, negative groups) just
do it for the LSM xattrs.


>
>> Opening a file on /tmp/seth would require the same permissions as opening
>> the file containing the mounted filesystem. These attributes would have to
>> be immutable, or at least demonstrably more restrictive (chmod might be
>> allowed in some cases, but chown would never be) when changed. I don't see
>> how a user other than seth could create a new file, as you'd either have
>> a magical change in ownership or a false sense of security.
> This would be a very harsh restriction.  Seth might legitimately want
> to give a user access to a file on backing store he owns without
> giving that user access to the backing store.  Root on a normal system
> does that all the time.

You already said that it was impossible for Wilma to get
access, so how is this more restrictive? Besides, Seth can
always set the mode on ~/seth so that Wilma can't read the
files it contains. This isn't an old problem or a novel
solution.

>> If you can mount a filesystem such that the labels are ignored you
>> are effectively specifying that the Smack label on the files be
>> determined by the defaulting rules. With CAP_MAC_ADMIN that's fine.
>> Without it, it's not.
> Can you explain what the threat model is here?  I don't see what it is
> that you're trying to prevent.

Um, OK.
The filesystem has files with a hundred different Smack labels on it.
I mount it as an unlabeled filesystem and everything is readable by
everyone. Bad jojo.

>
>>> Your point is taken about my less-than-expert opinion about the other
>>> security modules. We should at minimum get acks from the maintainers of
>>> those modules that unprivileged mounts will not compromise MAC.
>> I am the Smack maintainer. Unprivileged mounts as you have
>> described them compromise MAC. They compromise DAC, too.
>>
> How do they compromise DAC?

Wilma's expectation (or the application running with a mapped UID)
that chmod will keep Seth out of the file.

> --Andy
>


Andy Lutomirski July 16, 2015, 11:29 p.m. UTC | #8

On Thu, Jul 16, 2015 at 4:08 PM, Casey Schaufler <casey@schaufler-ca.com> wrote:
> On 7/16/2015 3:27 PM, Andy Lutomirski wrote:
>> On Thu, Jul 16, 2015 at 2:42 PM, Casey Schaufler <casey@schaufler-ca.com> wrote:
>>> You want to provide a mechanism whereby an unprivileged user (Seth)
>>> can mount a filesystem for his own use. You want full filesystem
>>> semantics, but you're willing to accept restrictions on certain
>>> filesystem features to avoid opening security holes. You are not
>>> willing to accept restrictions that make the filesystem unusable,
>>> such as making it read-only.
>>>
>>> I am going to present a suggestion. Feel free to correct my
>>> assumptions and my reasoning. For simplicity let's use loop-back
>>> mounting of a filesystem contained in a file as an example. The
>>> principles should apply to newly created memory based filesystems
>>> or disk partitions "owned" by Seth.
>>>
>>> Seth wants to mount a file (~seth/myfs) which contains an ext4
>>> filesystem. There is already a filesystem object, with security
>>> attributes, that the system knows how to deal with. If Seth mounts
>>> this as a filesystem he, and potentially other people, will be
>>> able to access the content of this object without accessing the
>>> object itself.
>>>
>>>         seth$ mount --justforme -t ext4 ~seth/myfs /tmp/seth
>>>         seth$ chmod 777 /tmp/seth
>>>         seth$ ls -la /tmp/seth
>>>         drwxrwxrwx.  3  seth     seth   260 Jul 16 12:59 .
>>>         drwxrwxrwxt 18  root     root  4069 Jul 16 11:13 ..
>>>         seth$
>>>
>>> Everything's fine at this point. Wilma is also using the system,
>>> being the sort who likes to hide things in out of the way places
>>>
>>>         wilma$ cp ~/scandals /tmp/seth
>>>         wilma$ chmod 600 /tmp/seth/scandals
>> This is already impossible as described.  Seth can only mount the
>> filesystem in a private mount namespace inside a user namespace that
>> he created.  Wilma can't see it unless Seth passes an fd to Wilma and
>> Wilma accepts and uses it.
>
> But you do have multiple UIDs within your user namespace, right?
> There are processes running as someone other than seth, right?
>

Only if root set it up that way.  For example, root could set up
"subuids" (this is a userspace concept) that belong to Seth.  These
would be uids that Seth controls and that represent subsets of Seth's
authority. Wilma wouldn't be one of these subuids unless she was
somehow part of Seth (or if root completely screwed up).
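
As a rough illustration of the subuid point, the Python sketch below
models the translation that a user namespace's uid_map performs. The
three-column entries follow the /proc/<pid>/uid_map layout (first uid
inside, first uid outside, range length). The specific uids, the
65536-wide subuid range and the helper names are made up for the
example and are not taken from the patches.

```python
from typing import List, Optional, Tuple

# One tuple per uid_map line:
# (first uid inside the namespace, first uid outside, length of the range)
UidMap = List[Tuple[int, int, int]]

SETH_UID = 1000    # assumed uid for Seth in the initial namespace
WILMA_UID = 1001   # assumed uid for Wilma in the initial namespace

# Assumed mapping: root inside the namespace is Seth outside, and uids
# 1..65536 inside map onto a subuid range that root delegated to Seth.
SETH_MAP: UidMap = [(0, SETH_UID, 1), (1, 100000, 65536)]


def to_outside(uid_map: UidMap, inside_uid: int) -> Optional[int]:
    """Translate a uid inside the namespace to the uid seen outside it."""
    for inside, outside, count in uid_map:
        if inside <= inside_uid < inside + count:
            return outside + (inside_uid - inside)
    return None  # unmapped uid


def is_mapped(uid_map: UidMap, outside_uid: int) -> bool:
    """True if some uid inside the namespace corresponds to outside_uid."""
    return any(outside <= outside_uid < outside + count
               for _, outside, count in uid_map)
```

With this mapping, is_mapped(SETH_MAP, WILMA_UID) is false: no process
in Seth's namespace can run as Wilma unless root hands Seth a range
that contains her uid.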

>>
>>> puts her list of scandals on the unsuspecting filesystem, and changes
>>> the mode to ensure that no one can find out what went on after the
>>> office party.
>>>
>>> Seth unmounts /tmp/seth. He looks in ~seth/myfs, finds out what really
>>> happened at the office party, and the story goes from there.
>>>
>>> Wilma did everything correctly according to the system security policy,
>>> but the system security policy did not protect her as advertised. The
>>> system was tricked into behaving as if it was in control of the content
>>> of the filesystem when in fact it was not.
>>
>> I would argue that, if Wilma writes to some place described by an fd
>> and doesn't verify where she's writing to, then she has no expectation
>> of privacy.  After all, she could just *tell* Seth directly whatever
>> she wants (assuming she can communicate with Seth in the first place).
>
> Don't ascribe either wisdom or good intentions to Wilma.

In that case, I'll mention the futility of solving the problem, even
without user namespaces.  If Wilma tells Seth something, he's going to
find out.  If Wilma pokes it (in whatever form) into an fd provided by
Seth, then Seth is extremely likely to find out, regardless of what
root or the MAC owner tries to do.

If Wilma writes to a path that's mounted in her namespace, then, sure,
overall policy associated with her namespace (which, in your example,
is the root namespace) must apply.  But Seth can't mount things into
Wilma's namespace without having CAP_SYS_ADMIN in that namespace and,
if he has CAP_SYS_ADMIN, it's already game over.

>
>>> One way to fix this problem is for unprivileged mounts to recognize the
>>> attributes of the object mounted and to propagate those attributes to all
>>> the objects they present. All files on /tmp/seth would be owned by seth
>>> and protected by the mode bits, ACL and LSM requirements of ~/seth/myfs.
>> This is impossible to enforce, because Seth could use FUSE instead of ext4.
>
> I never said that things aren't already broken. And, if you want
> to ignore the potential DAC issues (read, negative groups) just
> do it for the LSM xattrs.
>

Negative groups are a solved problem, I believe.

>
>>
>>> opening a file on /tmp/seth would require the same permissions as opening
>>> the file containing the mounted filesystem. These attributes would have to
>>> be immutable, or at least demonstrably more restrictive (chmod might be
>>> allowed in some cases, but chown would never be) when changed. I don't see
>>> how a user other than seth could create a new file, as you'd either have
>>> a magical change in ownership or a false sense of security.
>> This would be a very harsh restriction.  Seth might legitimately want
>> to give a user access to a file on backing store he owns without
>> giving that user access to the backing store.  Root on a normal system
>> does that all the time.
>
> You already said that it was impossible for Wilma to get
> access, so how is this more restrictive? Besides, Seth can
> always set the mode on ~/seth so that Wilma can't read the
> files it contains. This isn't an old problem or a novel
> solution.

Seth can pass an fd around.  This is actually a plausible thing to do:
Seth creates a userns to sandbox himself, mounts some FUSE thing in
there, and passes an fd out for the benefit of some daemon.  That
daemon had better validate the thing before using it, though.

I really don't see the benefit of making up extra rules that apply to
users outside a userns who try to access specifically a filesystem
with backing store.  They wouldn't make sense for filesystems without
backing store.

>
>>> If you can mount a filesystem such that the labels are ignored you
>>> are effectively specifying that the Smack label on the files be
>>> determined by the defaulting rules. With CAP_MAC_ADMIN that's fine.
>>> Without it, it's not.
>> Can you explain what the threat model is here?  I don't see what it is
>> that you're trying to prevent.
>
> Um, OK.
> The filesystem has files with a hundred different Smack labels on it.
> I mount it as an unlabeled filesystem and everything is readable by
> everyone. Bad jojo.

I still don't understand.  If it's a filesystem backed by a file that
Seth has RW access to, then Seth can read everything on it, full stop.
The security labels in the filesystem are irrelevant.

This is like saying that, if you put restrictive labels in the
filesystem that lives on /dev/sda2 and give Seth ownership of
/dev/sda2, then you expect Seth to be unable to bypass the policy
specified by your labels.

Or maybe I'm misunderstanding you.

>
>>
>>>> Your point is taken about my less-than-expert opinion about the other
>>>> security modules. We should at minimum get acks from the maintainers of
>>>> those modules that unprivileged mounts will not compromise MAC.
>>> I am the Smack maintainer. Unprivileged mounts as you have
>>> described them compromise MAC. They compromise DAC, too.
>>>
>> How do they compromise DAC?
>
> Wilma's expectation (or the application running with a mapped UID)
> that chmod will keep Seth out of the file.

That was never true.  If Seth has an open fd, Wilma can chmod all day
and it won't matter.  In this example, Seth owns the entire filesystem
along with its backing store.
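
The open-fd point can be checked from userspace. A minimal Python
sketch, assuming an ordinary local filesystem and a throwaway temporary
file (the function name is hypothetical): permission is checked when
the file is opened, so a later chmod does not revoke a descriptor that
already exists.

```python
import os
import tempfile


def read_after_chmod(payload: bytes) -> bytes:
    """Write payload, open it read-only, strip all mode bits, then read
    the data back through the descriptor opened before the chmod."""
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, payload)
        os.close(fd)
        reader = os.open(path, os.O_RDONLY)  # access is checked here
        try:
            os.chmod(path, 0o000)            # too late to affect `reader`
            return os.read(reader, len(payload))
        finally:
            os.close(reader)
    finally:
        os.unlink(path)
```

Calling read_after_chmod(b"scandals") still returns the data, even
though a fresh open() by an unprivileged user would now fail.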

--Andy

Casey Schaufler July 17, 2015, 12:45 a.m. UTC | #9

On 7/16/2015 4:29 PM, Andy Lutomirski wrote:
> On Thu, Jul 16, 2015 at 4:08 PM, Casey Schaufler <casey@schaufler-ca.com> wrote:
>> On 7/16/2015 3:27 PM, Andy Lutomirski wrote:
>>> On Thu, Jul 16, 2015 at 2:42 PM, Casey Schaufler <casey@schaufler-ca.com> wrote:
>>>> You want to provide a mechanism whereby an unprivileged user (Seth)
>>>> can mount a filesystem for his own use. You want full filesystem
>>>> semantics, but you're willing to accept restrictions on certain
>>>> filesystem features to avoid opening security holes. You are not
>>>> willing to accept restrictions that make the filesystem unusable,
>>>> such as making it read-only.
>>>>
>>>> I am going to present a suggestion. Feel free to correct my
>>>> assumptions and my reasoning. For simplicity let's use loop-back
>>>> mounting of a filesystem contained in a file as an example. The
>>>> principles should apply to newly created memory based filesystems
>>>> or disk partitions "owned" by Seth.
>>>>
>>>> Seth wants to mount a file (~seth/myfs) which contains an ext4
>>>> filesystem. There is already a filesystem object, with security
>>>> attributes, that the system knows how to deal with. If Seth mounts
>>>> this as a filesystem he, and potentially other people, will be
>>>> able to access the content of this object without accessing the
>>>> object itself.
>>>>
>>>>         seth$ mount --justforme -t ext4 ~seth/myfs /tmp/seth
>>>>         seth$ chmod 777 /tmp/seth
>>>>         seth$ ls -la /tmp/seth
>>>>         drwxrwxrwx.  3  seth     seth   260 Jul 16 12:59 .
>>>>         drwxrwxrwxt 18  root     root  4069 Jul 16 11:13 ..
>>>>         seth$
>>>>
>>>> Everything's fine at this point. Wilma is also using the system,
>>>> being the sort who likes to hide things in out of the way places
>>>>
>>>>         wilma$ cp ~/scandals /tmp/seth
>>>>         wilma$ chmod 600 /tmp/seth/scandals
>>> This is already impossible as described.  Seth can only mount the
>>> filesystem in a private mount namespace inside a user namespace that
>>> he created.  Wilma can't see it unless Seth passes an fd to Wilma and
>>> Wilma accepts and uses it.
>> But you do have multiple UIDs within your user namespace, right?
>> There are processes running as someone other than seth, right?
>>
> Only if root set it up that way.  For example, root could set up
> "subuids" (this is a userspace concept) that belong to Seth.  These
> would be uids that Seth controls and that represent subsets of Seth's
> authority. Wilma wouldn't be one of these subuids unless she was
> somehow part of Seth (or if root completely screwed up).

Or if root had some really unexpected and inappropriate ideas
on what qualifies as "clever". But I'll back off. It looks like
this particular objection of mine is covered.

>
>>>> puts her list of scandals on the unsuspecting filesystem, and changes
>>>> the mode to ensure that no one can find out what went on after the
>>>> office party.
>>>>
>>>> Seth unmounts /tmp/seth. He looks in ~seth/myfs, finds out what really
>>>> happened at the office party, and the story goes from there.
>>>>
>>>> Wilma did everything correctly according to the system security policy,
>>>> but the system security policy did not protect her as advertised. The
>>>> system was tricked into behaving as if it was in control of the content
>>>> of the filesystem when in fact it was not.
>>> I would argue that, if Wilma writes to some place described by an fd
>>> and doesn't verify where she's writing to, then she has no expectation
>>> of privacy.  After all, she could just *tell* Seth directly whatever
>>> she wants (assuming she can communicate with Seth in the first place).
>> Don't ascribe either wisdom or good intentions to Wilma.
> In that case, I'll mention the futility of solving the problem, even
> without user namespaces.  If Wilma tells Seth something, he's going to
> find out.  If Wilma pokes it (in whatever form) into an fd provided by
> Seth, then Seth is extremely likely to find out, regardless of what
> root or the MAC owner tries to do.

I'll buy that, too. I still get queasy every time someone
tells me that passing file descriptors is a security feature.

> If Wilma writes to a path that's mounted in her namespace, then, sure,
> overall policy associated with her namespace (which, in your example,
> is the root namespace) must apply.  But Seth can't mount things into
> Wilma's namespace without having CAP_SYS_ADMIN in that namespace and,
> if he has CAP_SYS_ADMIN, it's already game over.

And so long as it's restricted to the namespace ...
I'm starting to get it now.

>>>> One way to fix this problem is for unprivileged mounts to recognize the
>>>> attributes of the object mounted and to propagate those attributes to all
>>>> the objects they present. All files on /tmp/seth would be owned by seth
>>>> and protected by the mode bits, ACL and LSM requirements of ~/seth/myfs.
>>> This is impossible to enforce, because Seth could use FUSE instead of ext4.
>> I never said that things aren't already broken. And, if you want
>> to ignore the potential DAC issues (read, negative groups) just
>> do it for the LSM xattrs.
>>
> Negative groups are a solved problem, I believe.

My position is that there's a workaround but that the
design is still fundamentally flawed. 

>
>>>> opening a file on /tmp/seth would require the same permissions as opening
>>>> the file containing the mounted filesystem. These attributes would have to
>>>> be immutable, or at least demonstrably more restrictive (chmod might be
>>>> allowed in some cases, but chown would never be) when changed. I don't see
>>>> how a user other than seth could create a new file, as you'd either have
>>>> a magical change in ownership or a false sense of security.
>>> This would be a very harsh restriction.  Seth might legitimately want
>>> to give a user access to a file on backing store he owns without
>>> giving that user access to the backing store.  Root on a normal system
>>> does that all the time.
>> You already said that it was impossible for Wilma to get
>> access, so how is this more restrictive? Besides, Seth can
>> always set the mode on ~/seth so that Wilma can't read the
>> files it contains. This isn't an old problem or a novel
>> solution.
> Seth can pass an fd around.  This is actually a plausible thing to do:
> Seth creates a userns to sandbox himself, mounts some FUSE thing in
> there, and passes an fd out for the benefit of some daemon.  That
> daemon had better validate the thing before using it, though.

Point. It won't, but it should.

> I really don't see the benefit of making up extra rules that apply to
> users outside a userns who try to access specifically a filesystem
> with backing store.  They wouldn't make sense for filesystems without
> backing store.

Sure it would. For Smack, it would be the label a file would be
created with, which would be the label of the process creating
the memory based filesystem. For SELinux the rules are a
touch more sophisticated, but I'm sure that Paul or Stephen could
come up with how to determine it.

The point, looping all the way back to the beginning, where we
were talking about just ignoring the labels on the filesystem,
is that if you use the same Smack label on the files in the
filesystem as the backing store file has, we'll all be happy.
If that label isn't something the user can write to, he won't be
able to write to the mounted objects, either. If there is no
backing store then use the label of the process creating the
filesystem, which will be the user, which will mean everything
will work hunky dory.

Yes, there's work involved, but I doubt there's a lot. Getting
the label from the backing store or the creating process is
simple enough.
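
The labeling rule described above can be sketched in a few lines of
Python. This is a simplified model rather than Smack code: the function
names are hypothetical, the label strings are invented, and the access
check covers only the "same label always passes" case plus explicit
rules, leaving out Smack's special labels.

```python
from typing import Optional, Set, Tuple

# (subject label, object label, access string such as "rw")
Rule = Tuple[str, str, str]


def mount_label(backing_store_label: Optional[str], creator_label: str) -> str:
    """Label given to every object exposed by an unprivileged mount:
    the backing store's label if there is one, else the creator's."""
    if backing_store_label is not None:
        return backing_store_label
    return creator_label


def may_write(subject: str, obj: str, rules: Set[Rule]) -> bool:
    """Simplified check: equal labels always pass; otherwise an explicit
    rule granting 'w' from subject to obj is required."""
    if subject == obj:
        return True
    return any(s == subject and o == obj and "w" in access
               for s, o, access in rules)
```

For a memory based filesystem, mount_label(None, "User") yields "User",
so the creating user can write to it. For a backing store labeled
"Secret" with no rule granting "User" write access, the mounted files
are equally unwritable.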


>>>> If you can mount a filesystem such that the labels are ignored you
>>>> are effectively specifying that the Smack label on the files be
>>>> determined by the defaulting rules. With CAP_MAC_ADMIN that's fine.
>>>> Without it, it's not.
>>> Can you explain what the threat model is here?  I don't see what it is
>>> that you're trying to prevent.
>> Um, OK.
>> The filesystem has files with a hundred different Smack labels on it.
>> I mount it as an unlabeled filesystem and everything is readable by
>> everyone. Bad jojo.
> I still don't understand.  If it's a filesystem backed by a file that
> Seth has RW access to, then Seth can read everything on it, full stop.
> The security labels in the filesystem are irrelevant.

Well, they can't be trusted, if that's what you mean.
That's why I'm saying that the objects exposed by mounting
this backing store need to be treated with the same security
attributes as the backing store. Fudge it for DAC if you are
so inclined, but I think it's the right way to go for MAC.

> This is like saying that, if you put restrictive labels in the
> filesystem that lives on /dev/sda2 and give Seth ownership of
> /dev/sda2, then you expect Seth to be unable to bypass the policy
> specified by your labels.

Consider the Smack label on /dev/sda2. Smack does not care
who owns it, just what the Smack label is. Just like on
~/seth/myfs. The backing store "object" is /dev/sda2 in the
one case, ~/seth/myfs in the other, and something in the ether
for a memory based filesystem. So long as the labels of the
files exposed on the mount point match those of the backing
store "object", Smack is going to be happy. Since you're
running without privilege, you can't change the labels on
the files.

Now Seth, being the sneaky person that he is, could change
the Smack labels on the files in the backing store while it's
offline. Since he has access to the backing store, he can't
give himself more access by changing the labels within the
filesystem. He can give himself less, but I'm OK with that.

> Or maybe I'm misunderstanding you.

Probably, but I'm undoubtedly doing the same.

If you're going to be at LinuxCon in Seattle we should
continue this discussion over the beverage of your choice.

>>>>> Your point is taken about my less-than-expert opinion about the other
>>>>> security modules. We should at minimum get acks from the maintainers of
>>>>> those modules that unprivileged mounts will not compromise MAC.
>>>> I am the Smack maintainer. Unprivileged mounts as you have
>>>> described them compromise MAC. They compromise DAC, too.
>>>>
>>> How do they compromise DAC?
>> Wilma's expectation (or the application running with a mapped UID)
>> that chmod will keep Seth out of the file.
> That was never true.  If Seth has an open fd, Wilma can chmod all day
> and it won't matter.  In this example, Seth owns the entire filesystem
> along with its backing store.
>
> --Andy


Andy Lutomirski July 17, 2015, 12:59 a.m. UTC | #10

On Thu, Jul 16, 2015 at 5:45 PM, Casey Schaufler <casey@schaufler-ca.com> wrote:
> On 7/16/2015 4:29 PM, Andy Lutomirski wrote:
>> I really don't see the benefit of making up extra rules that apply to
>> users outside a userns who try to access specifically a filesystem
>> with backing store.  They wouldn't make sense for filesystems without
>> backing store.
>
> Sure it would. For Smack, it would be the label a file would be
> created with, which would be the label of the process creating
> the memory based filesystem. For SELinux the rules are a
> touch more sophisticated, but I'm sure that Paul or Stephen could
> come up with how to determine it.
>
> The point, looping all the way back to the beginning, where we
> were talking about just ignoring the labels on the filesystem,
> is that if you use the same Smack label on the files in the
> filesystem as the backing store file has, we'll all be happy.
> If that label isn't something the user can write to, he won't be
> able to write to the mounted objects, either. If there is no
> backing store then use the label of the process creating the
> filesystem, which will be the user, which will mean everything
> will work hunky dory.
>
> Yes, there's work involved, but I doubt there's a lot. Getting
> the label from the backing store or the creating process is
> simple enough.
>

So what if Smack used the label of the user creating the filesystem
even for filesystems with backing store?  IMO this ought to be doable
with the LSM hooks -- it certainly seems reasonable for the LSM to be
aware of who created a filesystem.  In fact, I'd argue that if Smack
can't do this with the proposed LSM hooks, then the hooks are
insufficient.

Presumably Smack could also figure out what was mounted, but keep in
mind that there are filesystems like ntfs-3g out there.  While ntfs-3g
logically has backing store, I don't think the kernel actually knows
about it.

>
>>>>> If you can mount a filesystem such that the labels are ignored you
>>>>> are effectively specifying that the Smack label on the files be
>>>>> determined by the defaulting rules. With CAP_MAC_ADMIN that's fine.
>>>>> Without it, it's not.
>>>> Can you explain what the threat model is here?  I don't see what it is
>>>> that you're trying to prevent.
>>> Um, OK.
>>> The filesystem has files with a hundred different Smack labels on it.
>>> I mount it as an unlabeled filesystem and everything is readable by
>>> everyone. Bad jojo.
>> I still don't understand.  If it's a filesystem backed by a file that
>> Seth has RW access to, then Seth can read everything on it, full stop.
>> The security labels in the filesystem are irrelevant.
>
> Well, they can't be trusted, if that's what you mean.
> That's why I'm saying that the objects exposed by mounting
> this backing store need to be treated with the same security
> attributes as the backing store. Fudge it for DAC if you are
> so inclined, but I think it's the right way to go for MAC.
>
>> This is like saying that, if you put restrictive labels in the
>> filesystem that lives on /dev/sda2 and give Seth ownership of
>> /dev/sda2, then you expect Seth to be unable to bypass the policy
>> specified by your labels.
>
> Consider the Smack label on /dev/sda2. Smack does not care
> who owns it, just what the Smack label is. Just like on
> ~/seth/myfs. The backing store "object" is /dev/sda2 in the
> one case, ~/seth/myfs in the other, and something in the ether
> for a memory based filesystem. So long as the labels of the
> files exposed on the mount point match those of the backing
> store "object", Smack is going to be happy. Since you're
> running without privilege, you can't change the labels on
> the files.
>
> Now Seth, being the sneaky person that he is, could change
> the Smack labels on the files in the backing store while it's
> offline. Since he has access to the backing store, he can't
> give himself more access by changing the labels within the
> filesystem. He can give himself less, but I'm OK with that.
>
>> Or maybe I'm misunderstanding you.
>
> Probably, but I'm undoubtedly doing the same.
>
> If you're going to be at LinuxCon in Seattle we should
> continue this discussion over the beverage of your choice.

There's a small but not quite zero chance I'll be there.  I'll
probably be in Seoul.  It's too bad that LSS and KS are in different
places this year.

--Andy

Seth Forshee July 17, 2015, 1:21 p.m. UTC | #11

On Thu, Jul 16, 2015 at 02:42:22PM -0700, Casey Schaufler wrote:

<snip>

> > I welcome feedback about anything I've missed, but stating generally
> > that you think I probably missed something isn't very helpful.
> 
> True enough. I hope I've explained myself above.

Thanks, that definitely clarified where we were having a disconnect.
Andy's done a fantastic job explaining how those concerns are addressed.

> > The LSM issue is thornier than the rest of it though, which is why I
> > specifically asked for review there in the cover letter. There's a lot
> > of complexity and nuance, and I still don't have a grasp on all the
> > subtleties. One such subtlety is the full impact of simply ignoring the
> > security labels on disk (but I am still confused as to why this is
> > different from filesystems which don't support xattrs at all).
> 
> If you can mount a filesystem such that the labels are ignored you
> are effectively specifying that the Smack label on the files be 
> determined by the defaulting rules. With CAP_MAC_ADMIN that's fine.
> Without it, it's not.
> 
> > I was unaware of Lukasz's patches until yesterday, and I will have a
> > look at them. But since we don't have the LSM support for user
> > namespaces yet, I don't see the problem with doing something safe for
> > LSMs initially and evolving the LSM integration for user ns mounts along
> > with the rest of the user ns integration.
> 
> Ignoring the security attributes is not safe!

Understood. It's surely safe for each LSM to deny such mounts until it
has a way to handle them safely though.

I'm not trying to completely punt on the issue of security modules, just
break this down into more manageable chunks. You've given good guidance
for Smack (thanks very much for that), so I can plan to work on that
soon.

> > Your point is taken about my less-than-expert opinion about the other
> > security modules. We should at minimum get acks from the maintainers of
> > those modules that unprivileged mounts will not compromise MAC.
> 
> I am the Smack maintainer. Unprivileged mounts as you have
> described them compromise MAC. They compromise DAC, too.

It looks like Andy's more or less convinced you that DAC isn't
(additionally?) compromised. And there's a plan for MAC, that the
security module can deny mounts from user namespaces until it has a
solution for allowing them safely.

> > For Smack specifically, I believe my only concern was the SMACK64EXEC
> > attribute, as all the other attributes only affected subjects' access to
> > the files. So maybe it would be possible to simply ignore this attribute
> > in unprivileged mounts and respect the others, even lacking more
> > complete LSM support for user namespaces.
> 
> SMACK64EXEC is analogous to the setuid bit, but I would rather see
> exec() of programs with this attribute refused than for it to be
> blindly ignored.

That's fine, it's your call.
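
A small Python sketch of the "refuse rather than ignore" behavior
requested above. The xattr name security.SMACK64EXEC is the real
attribute; the function, its arguments and the dictionary of xattrs are
hypothetical stand-ins for whatever the kernel-side check would look
like.

```python
from typing import Mapping

SMACK_EXEC_XATTR = "security.SMACK64EXEC"


def exec_allowed(xattrs: Mapping[str, bytes], unprivileged_mount: bool) -> bool:
    """Refuse exec() of a file carrying SMACK64EXEC when it comes from a
    mount made without privilege, instead of silently dropping the
    attribute and running the program under the caller's label."""
    if unprivileged_mount and SMACK_EXEC_XATTR in xattrs:
        return False
    return True
```

Files without the attribute are unaffected, and privileged mounts keep
their current behavior.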

Thanks,
Seth

Serge E. Hallyn July 17, 2015, 2:28 p.m. UTC | #12

On Thu, Jul 16, 2015 at 05:59:22PM -0700, Andy Lutomirski wrote:
> On Thu, Jul 16, 2015 at 5:45 PM, Casey Schaufler <casey@schaufler-ca.com> wrote:
> > On 7/16/2015 4:29 PM, Andy Lutomirski wrote:
> >> I really don't see the benefit of making up extra rules that apply to
> >> users outside a userns who try to access specifically a filesystem
> >> with backing store.  They wouldn't make sense for filesystems without
> >> backing store.
> >
> > Sure it would. For Smack, it would be the label a file would be
> > created with, which would be the label of the process creating
> > the memory based filesystem. For SELinux the rules are a
> > touch more sophisticated, but I'm sure that Paul or Stephen could
> > come up with how to determine it.
> >
> > The point, looping all the way back to the beginning, where we
> > were talking about just ignoring the labels on the filesystem,
> > is that if you use the same Smack label on the files in the
> > filesystem as the backing store file has, we'll all be happy.
> > If that label isn't something the user can write to, he won't be
> > able to write to the mounted objects, either. If there is no
> > backing store then use the label of the process creating the
> > filesystem, which will be the user, which will mean everything
> > will work hunky dory.
> >
> > Yes, there's work involved, but I doubt there's a lot. Getting
> > the label from the backing store or the creating process is
> > simple enough.
> >
> 
> So what if Smack used the label of the user creating the filesystem
> even for filesystems with backing store?  IMO this ought to be doable

The more usual LSM-ish way to handle this would be to ask the LSM, at
mount time, with a new security_mount_bdev_in_userns() hook, passing
it the user's label and the backing store's label (if any), and storing
the label to be used for the files.  Even more LSM-ish (though risking
performance hit) would be to then have the LSM at each inode_init_security
decide whether to use that label or trust what's in the fs anyway (or
do something else).  That could allow the LSM to use policy to decide
that.

Because I don't know that for all LSMs it makes sense for a 'subject'
label to be assigned to an object.
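
The two-stage flow can be modeled roughly as follows. This is a Python
sketch of the idea only: security_mount_bdev_in_userns() is a proposed
hook that does not exist yet, so the callback signature, the "return
None to refuse" convention and the trust_on_disk flag are all
assumptions made for illustration.

```python
from typing import Callable, Optional

# Assumed hook shape: (mounter's label, backing store's label or None)
# -> label to remember for the mount, or None to refuse the mount.
MountHook = Callable[[str, Optional[str]], Optional[str]]


def deny_all(mounter_label: str, backing_label: Optional[str]) -> Optional[str]:
    """An LSM that has no policy for user-namespace mounts yet."""
    return None


def prefer_backing_label(mounter_label: str,
                         backing_label: Optional[str]) -> Optional[str]:
    """Use the backing store's label, falling back to the mounter's."""
    return backing_label if backing_label is not None else mounter_label


def do_mount(hook: MountHook, mounter_label: str,
             backing_label: Optional[str]) -> str:
    """Mount-time stage: ask the LSM once and store its answer."""
    stored = hook(mounter_label, backing_label)
    if stored is None:
        raise PermissionError("LSM refused the user-namespace mount")
    return stored


def inode_label(stored_label: str, on_disk_label: Optional[str],
                trust_on_disk: bool) -> str:
    """Per-inode stage: policy decides whether the on-disk label counts."""
    if trust_on_disk and on_disk_label is not None:
        return on_disk_label
    return stored_label
```

The per-inode stage is where the performance cost mentioned above
would show up, since it runs for every inode rather than once per
mount.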

> with the LSM hooks -- it certainly seems reasonable for the LSM to be
> aware of who created a filesystem.  In fact, I'd argue that if Smack
> can't do this with the proposed LSM hooks, then the hooks are
> insufficient.
> 
> Presumably Smack could also figure out what was mounted, but keep in
> mind that there are filesystems like ntfs-3g out there.  While ntfs-3g
> logically has backing store, I don't think the kernel actually knows
> about it.
> 
> >
> >>>>> If you can mount a filesystem such that the labels are ignored you
> >>>>> are effectively specifying that the Smack label on the files be
> >>>>> determined by the defaulting rules. With CAP_MAC_ADMIN that's fine.
> >>>>> Without it, it's not.
> >>>> Can you explain what the threat model is here?  I don't see what it is
> >>>> that you're trying to prevent.
> >>> Um, OK.
> >>> The filesystem has files with a hundred different Smack labels on it.
> >>> I mount it as an unlabeled filesystem and everything is readable by
> >>> everyone. Bad jojo.
> >> I still don't understand.  If it's a filesystem backed by a file that
> >> Seth has RW access to, then Seth can read everything on it, full stop.
> >> The security labels in the filesystem are irrelevant.
> >
> > Well, they can't be trusted, if that's what you mean.
> > That's why I'm saying that the objects exposed by mounting
> > this backing store need to be treated with the same security
> > attributes as the backing store. Fudge it for DAC if you are
> > so inclined, but I think it's the right way to go for MAC.
> >
> >> This is like saying that, if you put restrictive labels in the
> >> filesystem that lives on /dev/sda2 and give Seth ownership of
> >> /dev/sda2, then you expect Seth to be unable to bypass the policy
> >> specified by your labels.
> >
> > Consider the Smack label on /dev/sda2. Smack does not care
> > who owns it, just what the Smack label is. Just like on
> > ~/seth/myfs. The backing store "object" is /dev/sda2 in the
> > one case, ~/seth/myfs in the other, and something in the ether
> > for a memory based filesystem. So long as the labels of the
> > files exposed on the mount point match those of the backing
> > store "object", Smack is going to be happy. Since you're
> > running without privilege, you can't change the labels on
> > the files.
> >
> > Now Seth, being the sneaky person that he is, could change
> > the Smack labels on the files in the backing store while it's
> > offline. Since he has access to the backing store, he can't
> > give himself more access by changing the labels within the
> > filesystem. He can give himself less, but I'm OK with that.
> >
> >> Or maybe I'm misunderstanding you.
> >
> > Probably, but I'm undoubtedly doing the same.
> >
> > If you're going to be at LinuxCon in Seattle we should
> > continue this discussion over the beverage of your choice.
> 
> There's a small but not quite zero chance I'll be there.  I'll
> probably be in Seoul.  It's too bad that LSS and KS are in different
> places this year.

FWIW I'll be there and happy to discuss.

-serge

Seth Forshee July 17, 2015, 2:56 p.m. UTC | #13

On Fri, Jul 17, 2015 at 09:28:32AM -0500, Serge E. Hallyn wrote:
> > > If you're going to be at LinuxCon in Seattle we should
> > > continue this discussion over the beverage of your choice.
> > 
> > There's a small but not quite zero chance I'll be there.  I'll
> > probably be in Seoul.  It's too bad that LSS and KS are in different
> > places this year.
> 
> FWIW I'll be there and happy to discuss.

I'll also be in Seattle and happy to discuss.

Seth

Casey Schaufler July 17, 2015, 5:14 p.m. UTC | #14

On 7/17/2015 6:21 AM, Seth Forshee wrote:
> On Thu, Jul 16, 2015 at 02:42:22PM -0700, Casey Schaufler wrote:
>
> <snip>
>
>>> I welcome feedback about anything I've missed, but stating generally
>>> that you think I probably missed something isn't very helpful.
>> True enough. I hope I've explained myself above.
> Thanks, that definitely clarified where we were having a disconnect.
> Andy's done a fantastic job explaining how those concerns are addressed.
>
>>> The LSM issue is thornier than the rest of it though, which is why I
>>> specifically asked for review there in the cover letter. There's a lot
>>> of complexity and nuance, and I still don't have a grasp on all the
>>> subtleties. One such subtlety is the full impact of simply ignoring the
>>> security labels on disk (but I am still confused as to why this is
>>> different from filesystems which don't support xattrs at all).
>> If you can mount a filesystem such that the labels are ignored you
>> are effectively specifying that the Smack label on the files be 
>> determined by the defaulting rules. With CAP_MAC_ADMIN that's fine.
>> Without it, it's not.
>>
>>> I was unaware of Lukasz's patches until yesterday, and I will have a
>>> look at them. But since we don't have the LSM support for user
>>> namespaces yet, I don't see the problem with doing something safe for
>>> LSMs initially and evolving the LSM integration for user ns mounts along
>>> with the rest of the user ns integration.
>> Ignoring the security attributes is not safe!
> Understood. It's surely safe for each LSM to deny such mounts until it
> has a way to handle them safely though.
>
> I'm not trying to completely punt on the issue of security modules, just
> break this down into more manageable chunks. You've given good guidance
> for Smack (thanks very much for that), so I can plan to work on that
> soon.
>
>>> Your point is taken about my less-than-expert opinion about the other
>>> security modules. We should at minimum get acks from the maintainers of
>>> those modules that unprivileged mounts will not compromise MAC.
>> I am the Smack maintainer. Unprivileged mounts as you have
>> described them compromise MAC. They compromise DAC, too.
> It looks like Andy's more or less convinced you that DAC isn't
> (additionally?) compromised. And there's a plan for MAC, that the
> security module can deny mounts from user namespaces until it has a
> solution for allowing them safely.

I wouldn't say that Andy has me convinced on DAC. I would say that
he's taken me deeper into the details of namespaces than I feel
comfortable making arguments about. I don't know that he's right,
I just don't know how to argue that he isn't. Part of what bothers
me is the dependence on namespaces. If you could come up with a
mechanism that wasn't dependent on namespaces it would be much
easier for dinosaurs like me to comprehend.

As far as declaring that MAC and namespace owned mounts are
incompatible goes, I think that I said early on that wasn't
going to fly. Too much of the Linux population (Fedora, Android,
Tizen, ...) uses MAC for the feature to be considered ready
for general consumption without it. And no, I don't believe in
partial implementations. You wouldn't get away with putting this
in if it only worked on s370 processors.

>>> For Smack specifically, I believe my only concern was the SMACK64EXEC
>>> attribute, as all the other attributes only affected subjects' access to
>>> the files. So maybe it would be possible to simply ignore this attribute
>>> in unprivileged mounts and respect the others, even lacking more
>>> complete LSM support for user namespaces.
>> SMACK64EXEC is analogous to the setuid bit, but I would rather see
>> exec() of programs with this attribute refused than for it to be
>> blindly ignored.
> That's fine, it's your call.

I said it, but on reflection the current NOSETUID behavior is
as you described it, so I wouldn't change that.
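
To spell out the behavior I mean, here is a small userspace model. It
is not the Smack code. The struct and the label_after_exec() function
are made up for the example, and it assumes SMACK64EXEC is treated the
way the setuid bit is on a nosuid mount: ignored, with the exec still
allowed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of an executable file; not a kernel structure. */
struct exec_file {
    const char *smack64exec;  /* SMACK64EXEC value, NULL if unset */
    bool on_nosuid_mount;     /* the mount carries the nosuid flag */
};

/*
 * Return the label the new program runs with. On a nosuid mount the
 * attribute is ignored rather than refused, so the exec goes ahead
 * with the caller's label unchanged.
 */
static const char *label_after_exec(const char *task_label,
                                    const struct exec_file *f)
{
    assert(task_label != NULL && f != NULL);
    if (f->smack64exec == NULL || f->on_nosuid_mount)
        return task_label;
    return f->smack64exec;
}
```

Refusing the exec instead would mean returning an error in the nosuid
case, which is the alternative I first asked for.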

>
> Thanks,
> Seth
