Message ID | 58736B2E.90201@huawei.com (mailing list archive) |
---|---|
State | Rejected |
I want to use SELinux in a system container, and I am only concerned
with how it functions inside the container. The system container runs
in a VM, and every VM has only one system container.

How do I use it now?

docker run ... system-contaier /sbin/init

After init is running, the following service is also started:

# This is the part of the service file that runs in the container
# after the container starts.
...
semodule -R    # use the policy in the container
restorecon /   # if needed
...

This method seems to work if the host OS and the Docker image use the
same content for the rootfs. But if the host uses redhat7 and the
Docker image uses centos7, it denies many normal operations, and this
stops some host services from working.

If SELinux were permissive on the host and enforcing in the container,
that would solve my problem. Unfortunately, there is no namespace for
SELinux.

Isolating SELinux is difficult and a lot of work, but isolating
selinux_enforcing is easier.

What do you think?

Thank you very much.
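For readers who want to observe the single enforcing flag the message refers to, here is a minimal user-space sketch in C. It assumes selinuxfs is mounted at /sys/fs/selinux (the usual location on current distributions; a container may not have it mounted at all). The function and helper names are made up for illustration. Because the kernel keeps one enforcing state, a host process and a container process that can both read this file would be expected to see the same value.

```c
#include <assert.h>
#include <stdio.h>

/* Usual selinuxfs location; assumed here, it may differ or be absent. */
#define SELINUX_ENFORCE_PATH "/sys/fs/selinux/enforce"

/*
 * Read an enforcing flag from a selinuxfs-style file.
 * Returns 1 (enforcing), 0 (permissive), or -1 if the file cannot be
 * opened or parsed (for example, selinuxfs is not mounted).
 */
int read_enforce_flag(const char *path)
{
    assert(path != NULL);

    FILE *f = fopen(path, "r");
    if (f == NULL)
        return -1;

    int value = -1;
    if (fscanf(f, "%d", &value) != 1)
        value = -1;

    fclose(f);
    return value;
}

/* Hypothetical helper: print the flag as seen by the calling process. */
void print_enforce_flag(void)
{
    int v = read_enforce_flag(SELINUX_ENFORCE_PATH);

    if (v < 0)
        printf("enforce: unavailable\n");
    else
        printf("enforce: %d\n", v);
}
```

Calling print_enforce_flag() once on the host and once inside the container is a quick way to confirm that the two share one mode.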
On Thu, 2017-03-09 at 17:03 +0800, yangshukui wrote:
> I want to use SELinux in a system container, and I am only concerned
> with how it functions inside the container. The system container runs
> in a VM, and every VM has only one system container.
>
> How do I use it now?
>
> docker run ... system-contaier /sbin/init
>
> After init is running, the following service is also started:
>
> # This is the part of the service file that runs in the container
> # after the container starts.
> ...
> semodule -R    # use the policy in the container
> restorecon /   # if needed
> ...
>
> This method seems to work if the host OS and the Docker image use the
> same content for the rootfs. But if the host uses redhat7 and the
> Docker image uses centos7, it denies many normal operations, and this
> stops some host services from working.
>
> If SELinux were permissive on the host and enforcing in the container,
> that would solve my problem. Unfortunately, there is no namespace for
> SELinux.
>
> Isolating SELinux is difficult and a lot of work, but isolating
> selinux_enforcing is easier.
>
> What do you think?

I'd rather see proper SELinux policy namespace support implemented.
Admittedly, that won't be straightforward.

FWIW, ChromiumOS appears to have done something similar to what you
suggest for supporting Android containers (i.e. SELinux enforcing for
the Android container, permissive for ChromiumOS processes outside the
container), but they never discussed it with upstream SELinux
developers AFAIK. My only knowledge of what they have done comes from
their kernel repository [1]. It appears that they experimented with a
hack to narrow the scope of selinux_enforcing to a PID namespace [2],
then reverted that change later and just implemented an option to
suppress audit denials for permissive domains [3] (evidently they are
running the Chromium OS processes in a permissive domain; I haven't
seen their policy). I wouldn't recommend either approach; the former
won't properly handle permission checks that occur outside of process
context or certain permission checks where the source context is not
the current task context (e.g. an inter-object relationship check),
while the latter requires leaving a permissive domain in the production
policy (which seemingly would violate CTS; not sure why that gets a
pass, and if that is ok, then why didn't they just create a domain
allowed all permissions and use that outside the container instead -
then they won't need to suppress audit at all?) and further requires
use of a separate kernel for policy development/debugging. Note btw
that they could have silenced the permissive denials via dontaudit
rules instead (as Android does for its su domain) but chose not to do
so to avoid taking the slow path.

[1] https://chromium.googlesource.com/chromiumos/third_party/kernel
[2] https://chromium-review.googlesource.com/c/361464/
[3] https://chromium-review.googlesource.com/c/424948/
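The objection to scoping selinux_enforcing by PID namespace can be pictured with a small user-space toy model in C. This is not the ChromiumOS change [2] and not kernel code; every type and function name below is hypothetical. It only shows the shape of the problem: if the mode is looked up through the task performing the check, then a check with no task at all, or a check made by a host task on behalf of some other context, does not get the container's enforcing mode.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-PID-namespace enforcing flag. */
struct pid_ns_model {
    int enforcing;
};

/* Hypothetical task: only records which PID namespace it lives in. */
struct task_model {
    const struct pid_ns_model *ns;
};

/* Global flag, used when there is no task to consult. */
static int global_enforcing = 0;

/* Pick the enforcing mode from the task doing the check, if there is one. */
int effective_enforcing(const struct task_model *current_task)
{
    if (current_task == NULL)   /* e.g. a check made outside process context */
        return global_enforcing;
    assert(current_task->ns != NULL);
    return current_task->ns->enforcing;
}

/*
 * Returns 1 if the access is refused, 0 otherwise. The mode comes from
 * current_task even when the subject of the check is another context,
 * which is exactly the case the reply above warns about.
 */
int access_refused(const struct task_model *current_task, int policy_allows)
{
    if (policy_allows)
        return 0;
    return effective_enforcing(current_task);
}
```

In this model a denied access is refused only when the checking task happens to live in the enforcing namespace; the same denial evaluated with no current task, or from a host task, falls through as permissive.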
On Thu, 2017-03-09 at 10:28 -0500, Stephen Smalley wrote:
> On Thu, 2017-03-09 at 17:03 +0800, yangshukui wrote:
> > I want to use SELinux in a system container, and I am only
> > concerned with how it functions inside the container. The system
> > container runs in a VM, and every VM has only one system container.
> >
> > How do I use it now?
> >
> > docker run ... system-contaier /sbin/init
> >
> > After init is running, the following service is also started:
> >
> > # This is the part of the service file that runs in the container
> > # after the container starts.
> > ...
> > semodule -R    # use the policy in the container
> > restorecon /   # if needed
> > ...
> >
> > This method seems to work if the host OS and the Docker image use
> > the same content for the rootfs. But if the host uses redhat7 and
> > the Docker image uses centos7, it denies many normal operations,
> > and this stops some host services from working.
> >
> > If SELinux were permissive on the host and enforcing in the
> > container, that would solve my problem. Unfortunately, there is no
> > namespace for SELinux.
> >
> > Isolating SELinux is difficult and a lot of work, but isolating
> > selinux_enforcing is easier.
> >
> > What do you think?
>
> I'd rather see proper SELinux policy namespace support implemented.
> Admittedly, that won't be straightforward.
>
> FWIW, ChromiumOS appears to have done something similar to what you
> suggest for supporting Android containers (i.e. SELinux enforcing for
> the Android container, permissive for ChromiumOS processes outside
> the container), but they never discussed it with upstream SELinux
> developers AFAIK. My only knowledge of what they have done comes from
> their kernel repository [1]. It appears that they experimented with a
> hack to narrow the scope of selinux_enforcing to a PID namespace [2],
> then reverted that change later and just implemented an option to
> suppress audit denials for permissive domains [3] (evidently they are
> running the Chromium OS processes in a permissive domain; I haven't
> seen their policy). I wouldn't recommend either approach; the former
> won't properly handle permission checks that occur outside of process
> context or certain permission checks where the source context is not
> the current task context (e.g. an inter-object relationship check),
> while the latter requires leaving a permissive domain in the
> production policy (which seemingly would violate CTS; not sure why
> that gets a pass, and if that is ok, then why didn't they just create
> a domain allowed all permissions and use that outside the container
> instead - then they won't need to suppress audit at all?) and further
> requires use of a separate kernel for policy development/debugging.
> Note btw that they could have silenced the permissive denials via
> dontaudit rules instead (as Android does for its su domain) but chose
> not to do so to avoid taking the slow path.

Sorry, should have looked more closely at their actual change - that
last part of their rationale is bogus; a dontaudit rule would have
prevented calling slow_avc_audit() at all, whereas their change merely
returns early from slow_avc_audit(). So I really don't understand why
they didn't just define dontaudit rules for all permissions (if using a
permissive domain) or allow rules for all permissions (if using an
enforcing, allow-all domain). Neither one is especially hard to write,
and they could have just looked at the su domain in Android for an
example of the former.

> [1] https://chromium.googlesource.com/chromiumos/third_party/kernel
> [2] https://chromium-review.googlesource.com/c/361464/
> [3] https://chromium-review.googlesource.com/c/424948/
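The difference between the two ways of silencing denials can be sketched as a simplified user-space model in C. The structure loosely mirrors the kernel's access vector decision (allowed bits plus an auditdeny mask), but this is an illustration under that assumption, not the kernel's AVC code, and the function names and the call counter are invented. A dontaudit rule clears the relevant auditdeny bits, so the decision is made on the fast path; an early return inside the slow path still pays for the call.

```c
#include <assert.h>

/* Simplified access vector decision: permission bits only. */
struct av_decision_model {
    unsigned int allowed;    /* permissions granted by policy */
    unsigned int auditdeny;  /* denials that should be audited */
};

/* Counts how often the (expensive) slow path is entered. */
static unsigned int slow_path_calls;

/* Slow path: returns 1 if an audit record would be emitted. */
static int slow_audit_model(int suppress_early)
{
    slow_path_calls++;
    if (suppress_early)      /* models "return early from the slow path" */
        return 0;
    return 1;                /* a real implementation would build a record */
}

/* Fast path: decide whether a denial needs auditing at all. */
int audit_denial_model(const struct av_decision_model *avd,
                       unsigned int requested, int suppress_in_slow_path)
{
    assert(avd != 0);

    unsigned int denied = requested & ~avd->allowed;
    unsigned int audited = denied & avd->auditdeny;

    if (!audited)            /* dontaudit cleared the bits: stop here */
        return 0;
    return slow_audit_model(suppress_in_slow_path);
}
```

With auditdeny cleared, audit_denial_model() never touches slow_audit_model(); with suppression done inside the slow path, no record is emitted but the counter still goes up, which is the point made in the reply above.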
On 3/9/2017 1:03 AM, yangshukui wrote:
> I want to use SELinux in a system container, and I am only concerned
> with how it functions inside the container. The system container runs
> in a VM, and every VM has only one system container.
>
> How do I use it now?
>
> docker run ... system-contaier /sbin/init
>
> After init is running, the following service is also started:
>
> # This is the part of the service file that runs in the container
> # after the container starts.
> ..
> semodule -R    # use the policy in the container
> restorecon /   # if needed
> ..
>
> This method seems to work if the host OS and the Docker image use the
> same content for the rootfs. But if the host uses redhat7 and the
> Docker image uses centos7, it denies many normal operations, and this
> stops some host services from working.
>
> If SELinux were permissive on the host and enforcing in the container,
> that would solve my problem. Unfortunately, there is no namespace for
> SELinux.

The LSM infrastructure is essentially a set of lists.
These lists are rooted globally, but there's no reason*
they couldn't be rooted in a namespace. That would give
each namespace the option of using whatever security
scheme was deemed appropriate. There are a number of
issues, such as namespacing policy, that would have to
be addressed, but the mechanism could work fine. I would
look at patches.

---
* Other than the sheer insanity of making security
  claims about such a system. I would not expect that
  minor issue to slow demand or deployment any more
  than it has in the past.

> Isolating SELinux is difficult and a lot of work, but isolating
> selinux_enforcing is easier.
>
> What do you think?
>
> Thank you very much.
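The idea of rooting the hook lists in a namespace rather than globally can be sketched in user-space C. This is a toy model under stated assumptions: the kernel's real hook lists, hook signatures, and namespace structures look different, and lsm_ns_model, hook_entry, and the two sample hooks are hypothetical names. The walk stops at the first hook that returns non-zero, which is how integer-returning LSM hooks are generally combined.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical hook signature: decide whether sending a signal is allowed. */
typedef int (*task_kill_hook)(int sig);

/* One registered hook in a singly linked list. */
struct hook_entry {
    task_kill_hook fn;
    const struct hook_entry *next;
};

/* Hypothetical namespace object that roots its own hook list. */
struct lsm_ns_model {
    const struct hook_entry *task_kill;
};

/* Sample hook: refuse SIGKILL (9), allow everything else. */
int deny_sigkill(int sig)
{
    return sig == 9 ? -EPERM : 0;
}

/* Sample hook: allow everything. */
int allow_all(int sig)
{
    (void)sig;
    return 0;
}

/* Walk the list rooted in the given namespace; first refusal wins. */
int call_task_kill(const struct lsm_ns_model *ns, int sig)
{
    assert(ns != NULL);

    for (const struct hook_entry *e = ns->task_kill; e != NULL; e = e->next) {
        int rc = e->fn(sig);
        if (rc != 0)
            return rc;
    }
    return 0;   /* an empty list means no module objects */
}
```

Two namespaces with different list roots then give different answers to the same request, which is the mechanism described above; it leaves open the policy questions the message mentions.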
Casey Schaufler <casey@schaufler-ca.com> writes:

> On 3/9/2017 1:03 AM, yangshukui wrote:
>> I want to use SELinux in a system container, and I am only concerned
>> with how it functions inside the container. The system container runs
>> in a VM, and every VM has only one system container.
>>
>> How do I use it now?
>>
>> docker run ... system-contaier /sbin/init
>>
>> After init is running, the following service is also started:
>>
>> # This is the part of the service file that runs in the container
>> # after the container starts.
>> ..
>> semodule -R    # use the policy in the container
>> restorecon /   # if needed
>> ..
>>
>> This method seems to work if the host OS and the Docker image use the
>> same content for the rootfs. But if the host uses redhat7 and the
>> Docker image uses centos7, it denies many normal operations, and this
>> stops some host services from working.
>>
>> If SELinux were permissive on the host and enforcing in the container,
>> that would solve my problem. Unfortunately, there is no namespace for
>> SELinux.

This is mostly a SELinux problem.

> The LSM infrastructure is essentially a set of lists.
> These lists are rooted globally, but there's no reason*
> they couldn't be rooted in a namespace. That would give
> each namespace the option of using whatever security
> scheme was deemed appropriate. There are a number of
> issues, such as namespacing policy, that would have to
> be addressed, but the mechanism could work fine. I would
> look at patches.
>
> ---
> * Other than the sheer insanity of making security
>   claims about such a system. I would not expect that
>   minor issue to slow demand or deployment any more
>   than it has in the past.

I would tend to insist that the container local policy stacks inside
the global policy. So that at the least the global security claims
would not be reduced.

My expectation is that a container would run as essentially all one
label from a global perspective.

To implement this would require a revision on the selinux labels xattrs
so that they can be marked as being part of a container... But having
the labels look ordinary inside the container.

We almost have a patch that implements something like that for the
capability xattr.

Eric
On Thu, Mar 9, 2017 at 3:49 PM, Eric W. Biederman <ebiederm@xmission.com> wrote:
> Casey Schaufler <casey@schaufler-ca.com> writes:
>
>> On 3/9/2017 1:03 AM, yangshukui wrote:
>>> I want to use SELinux in a system container, and I am only concerned
>>> with how it functions inside the container. The system container runs
>>> in a VM, and every VM has only one system container.
>>>
>>> How do I use it now?
>>>
>>> docker run ... system-contaier /sbin/init
>>>
>>> After init is running, the following service is also started:
>>>
>>> # This is the part of the service file that runs in the container
>>> # after the container starts.
>>> ..
>>> semodule -R    # use the policy in the container
>>> restorecon /   # if needed
>>> ..
>>>
>>> This method seems to work if the host OS and the Docker image use the
>>> same content for the rootfs. But if the host uses redhat7 and the
>>> Docker image uses centos7, it denies many normal operations, and this
>>> stops some host services from working.
>>>
>>> If SELinux were permissive on the host and enforcing in the container,
>>> that would solve my problem. Unfortunately, there is no namespace for
>>> SELinux.
>
> This is mostly a SELinux problem.
>
>> The LSM infrastructure is essentially a set of lists.
>> These lists are rooted globally, but there's no reason*
>> they couldn't be rooted in a namespace. That would give
>> each namespace the option of using whatever security
>> scheme was deemed appropriate. There are a number of
>> issues, such as namespacing policy, that would have to
>> be addressed, but the mechanism could work fine. I would
>> look at patches.
>>
>> ---
>> * Other than the sheer insanity of making security
>>   claims about such a system. I would not expect that
>>   minor issue to slow demand or deployment any more
>>   than it has in the past.
>
> I would tend to insist that the container local policy stacks inside the
> global policy. So that at the least the global security claims would
> not be reduced.

My current thinking is that namespacing is best left to the individual
LSMs, as it is unlikely we will all want to solve it the same way. With
SELinux we already have some basic support for what Eric describes via
bounded domains, but that alone isn't likely to solve SELinux inside
containers in a sense that most would expect; for that you will need
what Stephen already described.
On Thu, 9 Mar 2017, Eric W. Biederman wrote:

> My expectation is that a container would run as essentially all one
> label from a global perspective.

Keep in mind that different classes of objects may have distinct
labeling in SELinux, e.g. a process and a file typically have different
labels (say, sshd_t vs. sshd_key_t).

Also, I think you will want to have the global namespace always use the
original security labels. If accessing an object from outside the
container, the original global policy should always apply. Really, this
needs to be an invariant property.

I'd suggest implementing an orthogonal 2nd set of security labels which
are only ever used within the container.

> To implement this would require a revision on the selinux labels xattrs
> so that they can be marked as being part of a container... But having
> the labels look ordinary inside the container.
>
> We almost have a patch that implements something like that for the
> capability xattr.

It'll be interesting to see.
On 3/13/2017 12:06 AM, James Morris wrote:
> On Thu, 9 Mar 2017, Eric W. Biederman wrote:
>
>> My expectation is that a container would run as essentially all one
>> label from a global perspective.
>
> Keep in mind that a different classes of objects may have distinct
> labeling in SELinux. e.g. a process and a file typically have different
> labels (say, sshd_t vs. sshd_key_t).
>
> Also, I think you will want to have the global namespace always use the
> original security labels. If accessing an object from outside the
> container, the original global policy should always apply. Really, this
> needs to be an invariant property.
>
> I'd suggest implementing an orthogonal 2nd set of security labels which
> are only ever used within the container.

The work that's been done for Smack namespaces

	https://lwn.net/Articles/652320

may come in handy during your deliberations for SELinux.
Conceptually you can create aliases for your base labels, and use
those within the container. Very much like the UID mapping of
user namespaces. Labels that don't have an alias can't be accessed
within the namespace.

>> To implement this would require a revision on the selinux labels xattrs
>> so that they can be marked as being part of a container... But having
>> the labels look ordinary inside the container.
>>
>> We almost have a patch that implements something like that for the
>> capability xattr.
> It'll be interesting to see.
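The alias idea can be illustrated with a short user-space C sketch. It is only a conceptual model of "map base labels to aliases, and hide anything unmapped", in the spirit of UID mapping; it is not taken from the Smack namespace patches, and the struct names, function names, and labels are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One mapping entry: a base (global) label and its alias in the namespace. */
struct label_alias {
    const char *global_label;
    const char *ns_label;
};

/* Hypothetical label namespace: just a table of aliases. */
struct label_ns {
    const struct label_alias *map;
    size_t count;
};

/* Global label to alias; NULL means the label is not visible inside. */
const char *label_to_ns(const struct label_ns *ns, const char *global_label)
{
    assert(ns != NULL && global_label != NULL);

    for (size_t i = 0; i < ns->count; i++) {
        if (strcmp(ns->map[i].global_label, global_label) == 0)
            return ns->map[i].ns_label;
    }
    return NULL;
}

/* Alias to global label; NULL means the alias is unknown. */
const char *label_from_ns(const struct label_ns *ns, const char *ns_label)
{
    assert(ns != NULL && ns_label != NULL);

    for (size_t i = 0; i < ns->count; i++) {
        if (strcmp(ns->map[i].ns_label, ns_label) == 0)
            return ns->map[i].global_label;
    }
    return NULL;
}
```

For example, with a table mapping "ctr1_web" to "web", code inside the namespace sees and sets "web", the global policy keeps seeing "ctr1_web", and a label with no entry in the table cannot be named from inside at all.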
diff --git a/security/selinux/hooks.c b/security/selinux/hooks.c
index 57a2020..c10c58c 100644
--- a/security/selinux/hooks.c
+++ b/security/selinux/hooks.c
@@ -3596,6 +3596,9 @@ static int selinux_task_kill(struct task_struct *p, struct siginfo *info,
 static int selinux_task_wait(struct task_struct *p)
 {
+	if (pid_vnr(task_tgid(current)) == 1){
+		return 0;
+	}
 	return task_has_perm(p, current, PROCESS__SIGCHLD);
 }