Message ID | 17576.1477412418@warthog.procyon.org.uk (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On Tue, Oct 25, 2016 at 05:20:18PM +0100, David Howells wrote: > I have some questions about user namespacing, with regard to making keyrings > namespaced. My current idea is to follow the following method: > > (1) A new key/keyring records the user_namespace active when it is created. > > (2) If a process's user_namespace doesn't match that recorded in a key then > it gets ENOKEY if it tries to refer to it or access it and can't see it > in /proc/keys. > > (3) A process's keyring subscriptions are cleared if CLONE_NEWUSER is passed > to clone() or to unshare() so that it doesn't retain a keyring it can't > access. > > (4) Each user_namespace has its own separate register of persistent keyrings > and KEYCTL_GET_PERSISTENT can only get from the register of the currently > caller's user_namespace. This is already upstream > > as this seems the simplest solution. I don't want to add a new CLONE_xxx flag > as there isn't exactly a whole lot of room left. I think this sounds sensible. > Whilst I've got this partially working, there is a problem because the > user_struct contains pointers to the user's user-keyring and user-session > keyrings - and these would need replacing when entering a new user_namespace. > > However, the active user_struct is *not* replaced by create_user_ns(). Should > it be? > > I'm not sure whether there's a need to use the user_struct inherited from > before the unsharing - certainly setresuid(), for example, doesn't seem to > keep the values. Would it be possible to create a new user_struct with the > same kuid_t as the old one, but in the context of the new user_struct in case > it gets mapped? Most things in user_struct should not be per-namespace, at least not without the consent of root in the init namespace or so. Otherwise, someone could e.g. create a bunch of new namespaces and lock the maximum permitted amount of memory in each one, effectively allowing an arbitrary amount of memory to be pinned. However, I think that there are people who would benefit from being able to have the user_struct be per-namespace because they actually want separate limits for multiple containers, but have too many containers to be able to allocate separate kuid ranges for all of them. (I think James Bottomley said that in his Security Summit talk?) Therefore, I think it might make sense to have separate user_structs for user namespaces, but let them have a "struct user_struct *parent_pointer" to the user_struct belonging to the same kuid in the parent namespace and a resource_tracking_parent pointer that normally points to the user with the same kuid in the init namespace. Then, you could use the keyring stuff in the user_struct belonging to the current namespace, all the existing users of user_struct could follow the ->parent_pointer for now, and if someone wants to have per-ns resource tracking, they can add it in later relatively easily. But maybe I'm just overthinking this? -- To unsubscribe from this list: send the line "unsubscribe linux-security-module" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, 2016-10-25 at 17:20 +0100, David Howells wrote: > I have some questions about user namespacing, with regard to making > keyrings > namespaced. My current idea is to follow the following method: > > (1) A new key/keyring records the user_namespace active when it is > created. > > (2) If a process's user_namespace doesn't match that recorded in a > key then > it gets ENOKEY if it tries to refer to it or access it and can't > see it > in /proc/keys. > > (3) A process's keyring subscriptions are cleared if CLONE_NEWUSER > is passed > to clone() or to unshare() so that it doesn't retain a keyring > it can't > access. > > (4) Each user_namespace has its own separate register of persistent > keyrings > and KEYCTL_GET_PERSISTENT can only get from the register of the > currently > caller's user_namespace. This is already upstream > > as this seems the simplest solution. I don't want to add a new > CLONE_xxx flag as there isn't exactly a whole lot of room left. OK, so the specific problem with this is that it fits perfectly the container orchestration system use case but pretty much fails for any non-orchestration use case. For me it would fail in the architecture emulation use case: I actually want an administerable container running as me, which means I use a user_ns to escalate my privilege over the constrained system. However, doing this in your world would now divorce me from any keyring I had access to, which isn't what I want. Why not make it the job of the orchestration system to drop keyrings? So by default CLONE_NEWUSER subscribes to the same keyrings the owning user does? That would mean it's configurable and you can add an extra KEYCTL_ to give fine grained access, so subscriptions can be dropped before the container is populated by its eventual end user, but the default case is likely what someone using creation of a user_ns to elevate their privilege wants. James -- To unsubscribe from this list: send the line "unsubscribe linux-security-module" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
David Howells <dhowells@redhat.com> wrote: > (2) If a process's user_namespace doesn't match that recorded in a key then > it gets ENOKEY if it tries to refer to it or access it and can't see it > in /proc/keys. There's another possibility here - since user_namespaces are hierarchical, does it make sense to let a process see keys that are in an ancestral namespace? David -- To unsubscribe from this list: send the line "unsubscribe linux-security-module" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, 2016-10-25 at 17:53 +0100, David Howells wrote: > David Howells <dhowells@redhat.com> wrote: > > > (2) If a process's user_namespace doesn't match that recorded in a > > key then > > it gets ENOKEY if it tries to refer to it or access it and > > can't see it > > in /proc/keys. > > There's another possibility here - since user_namespaces are > hierarchical, does it make sense to let a process see keys that are > in an ancestral namespace? I think that should be the decision of the owner. If you're creating a userns to de-privilege the next user, likely you don't want this, but if you're creating a userns to enhance it, then you do. I think you want to behave exactly as the mount namespace does: on initial clone, you get a fully cloned mount tree and then you customise it by unmounting pieces. James -- To unsubscribe from this list: send the line "unsubscribe linux-security-module" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, Oct 25, 2016 at 09:56:45AM -0700, James Bottomley wrote: > On Tue, 2016-10-25 at 17:53 +0100, David Howells wrote: > > David Howells <dhowells@redhat.com> wrote: > > > > > (2) If a process's user_namespace doesn't match that recorded in a > > > key then > > > it gets ENOKEY if it tries to refer to it or access it and > > > can't see it > > > in /proc/keys. > > > > There's another possibility here - since user_namespaces are > > hierarchical, does it make sense to let a process see keys that are > > in an ancestral namespace? > > I think that should be the decision of the owner. If you're creating a > userns to de-privilege the next user, likely you don't want this, but > if you're creating a userns to enhance it, then you do. > > I think you want to behave exactly as the mount namespace does: on > initial clone, you get a fully cloned mount tree and then you customise > it by unmounting pieces. The issue with keys is that their interface is pretty broken when exposed to user namespaces with different UID mappings, and this also causes security issues where you can get access to other users' keyrings and stuff like that. The proposed fix works around that brokenness by simply never exposing keys to other namespaces. Your suggestion would probably be more flexible, but it'll require some more, moderately ugly kernel code where various keyname prefixes are special-cased so that names with these prefixes are dynamically translated into strings containing the KUID or so. -- To unsubscribe from this list: send the line "unsubscribe linux-security-module" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
James Bottomley <James.Bottomley@HansenPartnership.com> wrote: > > There's another possibility here - since user_namespaces are > > hierarchical, does it make sense to let a process see keys that are > > in an ancestral namespace? > > I think that should be the decision of the owner. If you're creating a > userns to de-privilege the next user, likely you don't want this, but > if you're creating a userns to enhance it, then you do. Maybe the simplest is to put a 'stop here' flag on a user_namespace. Then when we look to see if a key is in the caller's namespace, we go up the tree until we hit the flag. If you don't find the key's ns within the caller's permitted subtree, you don't get to see the key. > I think you want to behave exactly as the mount namespace does: on > initial clone, you get a fully cloned mount tree and then you customise > it by unmounting pieces. That adds quite a bit of complexity, given that: (1) we don't have an automatic updating view of a keyring as the cloned mount tree does; (2) there are quotas on the number of keys a user is permitted; (3) you may not have permission to 'dismount' part of the tree. That said, it might be enough to just clone the *keyrings* and share the non-keyring keys. David -- To unsubscribe from this list: send the line "unsubscribe linux-security-module" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, 2016-10-25 at 19:06 +0200, Jann Horn wrote: > On Tue, Oct 25, 2016 at 09:56:45AM -0700, James Bottomley wrote: > > On Tue, 2016-10-25 at 17:53 +0100, David Howells wrote: > > > David Howells <dhowells@redhat.com> wrote: > > > > > > > (2) If a process's user_namespace doesn't match that recorded > > > > in a > > > > key then > > > > it gets ENOKEY if it tries to refer to it or access it and > > > > can't see it > > > > in /proc/keys. > > > > > > There's another possibility here - since user_namespaces are > > > hierarchical, does it make sense to let a process see keys that > > > are > > > in an ancestral namespace? > > > > I think that should be the decision of the owner. If you're > > creating a > > userns to de-privilege the next user, likely you don't want this, > > but > > if you're creating a userns to enhance it, then you do. > > > > I think you want to behave exactly as the mount namespace does: on > > initial clone, you get a fully cloned mount tree and then you > > customise > > it by unmounting pieces. > > The issue with keys is that their interface is pretty broken when > exposed to user namespaces with different UID mappings, and this also > causes security issues where you can get access to other users' > keyrings and stuff like that. The proposed fix works around that > brokenness by simply never exposing keys to other namespaces. Define how ... we can fix the keyring interface if it's hugely problematic. If you're simply thinking that it's a uid range mapping leak, then the way I think of this is the same way the shadow utils do: the owning user owns the exterior range of ids. This seems to be the assumption the unprivileged container systems are working on as well, so it doesn't matter that there's a theoretical uid leak to the exterior mapped ids. > Your suggestion would probably be more flexible, but it'll require > some more, moderately ugly kernel code where various keyname prefixes > are special-cased so that names with these prefixes are dynamically > translated into strings containing the KUID or so. Can we not simply do it like mnt? have a refcounted pointer structure (struct mnt in its case) that points to a keyring. This would represent the subscription. Once you detach it, you wouldn't be able to recover it again. So a user namespace wouldn't really be part of the keyring prefix namespace. James -- To unsubscribe from this list: send the line "unsubscribe linux-security-module" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, 2016-10-25 at 18:30 +0100, David Howells wrote: > James Bottomley <James.Bottomley@HansenPartnership.com> wrote: > > > > There's another possibility here - since user_namespaces are > > > hierarchical, does it make sense to let a process see keys that > > > are > > > in an ancestral namespace? > > > > I think that should be the decision of the owner. If you're > > creating a > > userns to de-privilege the next user, likely you don't want this, > > but > > if you're creating a userns to enhance it, then you do. > > Maybe the simplest is to put a 'stop here' flag on a user_namespace. > Then when we look to see if a key is in the caller's namespace, we > go up the tree until we hit the flag. If you don't find the key's ns > within the caller's permitted subtree, you don't get to see the key. That sounds a bit nasty. Even for nested user namespaces, what we traditionally see is the interior uid and the exterior kuid; we don't usually see the intermediary mappings unless you specifically traverse the userns tree. > > I think you want to behave exactly as the mount namespace does: on > > initial clone, you get a fully cloned mount tree and then you > > customise it by unmounting pieces. > > That adds quite a bit of complexity, given that: > > (1) we don't have an automatic updating view of a keyring as the > cloned mount tree does; You're thinking of propagation? The keyrings aren't currently hierarchical in the same sense are they, so we really only need a top level cloned, refcounted pointer. Any change to the underlying keyring could either COW the keyring or simply operate on the current one (shared mount behaviour). I'm betting the latter is far easier and if you want the former, the orchestration system could simply copy all the keys to a new keyring, release the old one and then rename the new one to what the old one was, making it an orchestration problem to sort out. > (2) there are quotas on the number of keys a user is permitted; If this is counted per-id, then yes, mapping a range of ids with an owning exterior non root user does allow you to escape the count ... how important is it? Could we not simply hand wave it away as yet another consequence you have to think about when giving a user a range of exterior ids (because if there are N, their key count could get up to N * the limit). > (3) you may not have permission to 'dismount' part of the tree. "root" in the userns should have. I presume there's also some capability we can give a non-root user. > That said, it might be enough to just clone the *keyrings* and share > the non-keyring keys. Right ... do a refcounted clone structure and you can release the keyring if the refcount goes to zero. > David > _______________________________________________ > Containers mailing list > Containers@lists.linux-foundation.org > https://lists.linuxfoundation.org/mailman/listinfo/containers > -- To unsubscribe from this list: send the line "unsubscribe linux-security-module" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, Oct 25, 2016 at 11:05:08AM -0700, James Bottomley wrote: > On Tue, 2016-10-25 at 19:06 +0200, Jann Horn wrote: > > On Tue, Oct 25, 2016 at 09:56:45AM -0700, James Bottomley wrote: > > > On Tue, 2016-10-25 at 17:53 +0100, David Howells wrote: > > > > David Howells <dhowells@redhat.com> wrote: > > > > > > > > > (2) If a process's user_namespace doesn't match that recorded > > > > > in a > > > > > key then > > > > > it gets ENOKEY if it tries to refer to it or access it and > > > > > can't see it > > > > > in /proc/keys. > > > > > > > > There's another possibility here - since user_namespaces are > > > > hierarchical, does it make sense to let a process see keys that > > > > are > > > > in an ancestral namespace? > > > > > > I think that should be the decision of the owner. If you're > > > creating a > > > userns to de-privilege the next user, likely you don't want this, > > > but > > > if you're creating a userns to enhance it, then you do. > > > > > > I think you want to behave exactly as the mount namespace does: on > > > initial clone, you get a fully cloned mount tree and then you > > > customise > > > it by unmounting pieces. > > > > The issue with keys is that their interface is pretty broken when > > exposed to user namespaces with different UID mappings, and this also > > causes security issues where you can get access to other users' > > keyrings and stuff like that. The proposed fix works around that > > brokenness by simply never exposing keys to other namespaces. > > Define how ... we can fix the keyring interface if it's hugely > problematic. See http://lkml.iu.edu/hypermail/linux/kernel/1606.2/06794.html . Here's a copy of my explanation in that message: | Not just questionable, completely wrong. The gist is that there is a | *global* name -> key mapping for accessing keys by name, and user | keyrings are stored in there under the name "_uid.%u", where %u | refers to the *namespaced* UID. (See install_user_keyrings().) | The result is that, if e.g. the user with UID 1000 has no running | processes, a local attacker can enter a new user namespace, map UID | 1000 in the namespace to some KUID he controls, do | setresuid(1000, 1000, 1000), and now he owns user 1000's keyring. | This ends up permitting the attacker to dump the contents of KUID | 1000's keys after KUID 1000 signs in. I discovered this while going | through the kuid->uid conversions when I thought about writing this | feature the first time. > If you're simply thinking that it's a uid range mapping leak, then the > way I think of this is the same way the shadow utils do: the owning > user owns the exterior range of ids. This seems to be the assumption > the unprivileged container systems are working on as well, so it > doesn't matter that there's a theoretical uid leak to the exterior > mapped ids. Nope, that's not the issue. -- To unsubscribe from this list: send the line "unsubscribe linux-security-module" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, 2016-10-25 at 20:17 +0200, Jann Horn wrote: > On Tue, Oct 25, 2016 at 11:05:08AM -0700, James Bottomley wrote: > > On Tue, 2016-10-25 at 19:06 +0200, Jann Horn wrote: > > > On Tue, Oct 25, 2016 at 09:56:45AM -0700, James Bottomley wrote: > > > > On Tue, 2016-10-25 at 17:53 +0100, David Howells wrote: > > > > > David Howells <dhowells@redhat.com> wrote: > > > > > > > > > > > (2) If a process's user_namespace doesn't match that > > > > > > recorded > > > > > > in a > > > > > > key then > > > > > > it gets ENOKEY if it tries to refer to it or access it > > > > > > and > > > > > > can't see it > > > > > > in /proc/keys. > > > > > > > > > > There's another possibility here - since user_namespaces are > > > > > hierarchical, does it make sense to let a process see keys > > > > > that > > > > > are > > > > > in an ancestral namespace? > > > > > > > > I think that should be the decision of the owner. If you're > > > > creating a > > > > userns to de-privilege the next user, likely you don't want > > > > this, > > > > but > > > > if you're creating a userns to enhance it, then you do. > > > > > > > > I think you want to behave exactly as the mount namespace does: > > > > on > > > > initial clone, you get a fully cloned mount tree and then you > > > > customise > > > > it by unmounting pieces. > > > > > > The issue with keys is that their interface is pretty broken when > > > exposed to user namespaces with different UID mappings, and this > > > also > > > causes security issues where you can get access to other users' > > > keyrings and stuff like that. The proposed fix works around that > > > brokenness by simply never exposing keys to other namespaces. > > > > Define how ... we can fix the keyring interface if it's hugely > > problematic. > > See http://lkml.iu.edu/hypermail/linux/kernel/1606.2/06794.html . > Here's a copy of my explanation in that message: > > > Not just questionable, completely wrong. The gist is that there is > > a > > *global* name -> key mapping for accessing keys by name, and user > > keyrings are stored in there under the name "_uid.%u", where %u > > refers to the *namespaced* UID. (See install_user_keyrings().) > > The result is that, if e.g. the user with UID 1000 has no running > > processes, a local attacker can enter a new user namespace, map UID > > 1000 in the namespace to some KUID he controls, do > > setresuid(1000, 1000, 1000), and now he owns user 1000's keyring. > > This ends up permitting the attacker to dump the contents of KUID > > 1000's keys after KUID 1000 signs in. I discovered this while going > > through the kuid->uid conversions when I thought about writing this > > feature the first time. Well, that's obviously wrong and we fix it before adding namespaced views of keyrings. Originally I would say it needs to be kuid, like filesystems, but now Eric is on with an additional namespace mapping at the filesystem boundary, I'll defer to him whether he thinks we should have something like it at the key naming one as well. This is starting to sound like a useful late arriving discussion topic for the Plumbers Containers MC ... James -- To unsubscribe from this list: send the line "unsubscribe linux-security-module" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, Oct 25, 2016 at 11:13:24AM -0700, James Bottomley wrote: > On Tue, 2016-10-25 at 18:30 +0100, David Howells wrote: > > James Bottomley <James.Bottomley@HansenPartnership.com> wrote: > > > I think you want to behave exactly as the mount namespace does: on > > > initial clone, you get a fully cloned mount tree and then you > > > customise it by unmounting pieces. > > > > That adds quite a bit of complexity, given that: [...] > > (2) there are quotas on the number of keys a user is permitted; > > If this is counted per-id, then yes, mapping a range of ids with an > owning exterior non root user does allow you to escape the count ... > how important is it? Could we not simply hand wave it away as yet > another consequence you have to think about when giving a user a range > of exterior ids (because if there are N, their key count could get up > to N * the limit). Wouldn't it be N * normal_limit * number_of_namespaces, where number_of_namespaces is how many namespaces you can create (afaik unbounded)? -- To unsubscribe from this list: send the line "unsubscribe linux-security-module" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, 2016-10-25 at 20:22 +0200, Jann Horn wrote: > On Tue, Oct 25, 2016 at 11:13:24AM -0700, James Bottomley wrote: > > On Tue, 2016-10-25 at 18:30 +0100, David Howells wrote: > > > James Bottomley <James.Bottomley@HansenPartnership.com> wrote: > > > > I think you want to behave exactly as the mount namespace does: > > > > on > > > > initial clone, you get a fully cloned mount tree and then you > > > > customise it by unmounting pieces. > > > > > > That adds quite a bit of complexity, given that: > [...] > > > (2) there are quotas on the number of keys a user is permitted; > > > > If this is counted per-id, then yes, mapping a range of ids with an > > owning exterior non root user does allow you to escape the count > > ... how important is it? Could we not simply hand wave it away as > > yet another consequence you have to think about when giving a user > > a range of exterior ids (because if there are N, their key count > > could get up to N * the limit). > > Wouldn't it be N * normal_limit * number_of_namespaces, where > number_of_namespaces is how many namespaces you can create (afaik > unbounded)? I need to understand why we have a limit first. The above was assuming it's something simple like an arbitrary limit to prevent resource exhaustion. but, assuming that again ... I was thinking enforce the limit by exterior uid (kuid), so it becomes number of uids a user owns * normal_limit. It wouldn't vary based on the number of namespaces you create. James -- To unsubscribe from this list: send the line "unsubscribe linux-security-module" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, Oct 25, 2016 at 11:17 AM, Jann Horn <jann@thejh.net> wrote: > On Tue, Oct 25, 2016 at 11:05:08AM -0700, James Bottomley wrote: >> On Tue, 2016-10-25 at 19:06 +0200, Jann Horn wrote: >> > On Tue, Oct 25, 2016 at 09:56:45AM -0700, James Bottomley wrote: >> > > On Tue, 2016-10-25 at 17:53 +0100, David Howells wrote: >> > > > David Howells <dhowells@redhat.com> wrote: >> > > > >> > > > > (2) If a process's user_namespace doesn't match that recorded >> > > > > in a >> > > > > key then >> > > > > it gets ENOKEY if it tries to refer to it or access it and >> > > > > can't see it >> > > > > in /proc/keys. >> > > > >> > > > There's another possibility here - since user_namespaces are >> > > > hierarchical, does it make sense to let a process see keys that >> > > > are >> > > > in an ancestral namespace? >> > > >> > > I think that should be the decision of the owner. If you're >> > > creating a >> > > userns to de-privilege the next user, likely you don't want this, >> > > but >> > > if you're creating a userns to enhance it, then you do. >> > > >> > > I think you want to behave exactly as the mount namespace does: on >> > > initial clone, you get a fully cloned mount tree and then you >> > > customise >> > > it by unmounting pieces. >> > >> > The issue with keys is that their interface is pretty broken when >> > exposed to user namespaces with different UID mappings, and this also >> > causes security issues where you can get access to other users' >> > keyrings and stuff like that. The proposed fix works around that >> > brokenness by simply never exposing keys to other namespaces. >> >> Define how ... we can fix the keyring interface if it's hugely >> problematic. > > See http://lkml.iu.edu/hypermail/linux/kernel/1606.2/06794.html . > Here's a copy of my explanation in that message: > > | Not just questionable, completely wrong. The gist is that there is a > | *global* name -> key mapping for accessing keys by name, and user > | keyrings are stored in there under the name "_uid.%u", where %u > | refers to the *namespaced* UID. (See install_user_keyrings().) > | The result is that, if e.g. the user with UID 1000 has no running > | processes, a local attacker can enter a new user namespace, map UID > | 1000 in the namespace to some KUID he controls, do > | setresuid(1000, 1000, 1000), and now he owns user 1000's keyring. > | This ends up permitting the attacker to dump the contents of KUID > | 1000's keys after KUID 1000 signs in. I discovered this while going > | through the kuid->uid conversions when I thought about writing this > | feature the first time. I found this issue, too, several years ago. ;-/ Can we use the namespacing of keys as an opportunity to fix things like this? Perhaps we could simply *remove* the concept of named keys and keyrings. -- To unsubscribe from this list: send the line "unsubscribe linux-security-module" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Andy Lutomirski <luto@amacapital.net> wrote:
> ... Perhaps we could simply *remove* the concept of named keys and keyrings.
See Linus's dictum about breaking userspace.
The problem isn't named keys: keys have to be named - the description is how
they're looked up typically. Further, non-keyring keys can't be looked up
directly by name - you have to search for them in a keyring.
The issue here is named keyrings and keyctl_join_session_keyring(). It might
well have been a bad idea - though I've seen some people arguing for a single
session keyring shared across all a user's logins, in which case, we might
want this after all (or use the user-default session).
One thing we perhaps do want to do, though, is restrict the names of keyrings
to the user_namespace in which the keyring was created.
David
--
To unsubscribe from this list: send the line "unsubscribe linux-security-module" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, 2016-10-25 at 21:41 +0100, David Howells wrote: > Andy Lutomirski <luto@amacapital.net> wrote: > > > ... Perhaps we could simply *remove* the concept of named keys and > > keyrings. > > See Linus's dictum about breaking userspace. We wouldn't have to ... we'd simply declare the current interface to be a legacy one, which means it doesn't work with containers and namespaces, give it a new compile time option so the distros could remove it if they had no actual users, and then present a v2 which does what we need and works with namespaces. > The problem isn't named keys: keys have to be named - the description > is how they're looked up typically. Further, non-keyring keys can't > be looked up directly by name - you have to search for them in a > keyring. > > The issue here is named keyrings and keyctl_join_session_keyring(). > It might well have been a bad idea - though I've seen some people > arguing for a single session keyring shared across all a user's > logins, in which case, we might want this after all (or use the user > -default session). > > One thing we perhaps do want to do, though, is restrict the names of > keyrings to the user_namespace in which the keyring was created. Sounds fine to me since most things created within a namespace either get destroyed or placed in the parent when it is destroyed. Perhaps we want to agree on what semantics we want first before we start inventing the new API. James -- To unsubscribe from this list: send the line "unsubscribe linux-security-module" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
David Howells <dhowells@redhat.com> writes: > I have some questions about user namespacing, with regard to making keyrings > namespaced. My current idea is to follow the following method: I am not certain your stated goal is a proper description of what you are trying to achieve. > (1) A new key/keyring records the user_namespace active when it is created. > > (2) If a process's user_namespace doesn't match that recorded in a key then > it gets ENOKEY if it tries to refer to it or access it and can't see it > in /proc/keys. > > (3) A process's keyring subscriptions are cleared if CLONE_NEWUSER is passed > to clone() or to unshare() so that it doesn't retain a keyring it can't > access. > > (4) Each user_namespace has its own separate register of persistent keyrings > and KEYCTL_GET_PERSISTENT can only get from the register of the currently > caller's user_namespace. This is already upstream > > as this seems the simplest solution. I don't want to add a new CLONE_xxx flag > as there isn't exactly a whole lot of room left. > > Whilst I've got this partially working, there is a problem because the > user_struct contains pointers to the user's user-keyring and user-session > keyrings - and these would need replacing when entering a new > user_namespace. The session keyring should change when someone changes their session. It is a deliberate choice for a session to span namespaces. Similarly upon entering a user namespace the user does not immediately change. So I expect the user mechanism by which the session and user keyrings are changed should suffice when entering a user namespace. Also keep in mind that users are fundamentally global. A given kuid always maps to the same user_struct. > However, the active user_struct is *not* replaced by create_user_ns(). Should > it be? No. The user has not changed, so changing the user_struct does not make sense. > I'm not sure whether there's a need to use the user_struct inherited from > before the unsharing - certainly setresuid(), for example, doesn't seem to > keep the values. Would it be possible to create a new user_struct with the > same kuid_t as the old one, but in the context of the new user_struct in case > it gets mapped? That would not make much sense. Are you being asked for different user keys for the same user in different user namespaces? Or do you possibly have the impression that users in different user namespaces are disjoint? (Users in different user namespaces are not disjoint, they are just the users outside of the user namespace that they are mapped to). I have seen a mistaken notion that it is safe to map users in different containers to the same user outside of the container and there won't be problems. Are people with that notion asking you to fix the code to conform to their mistaken notion of reality? Now if it does make sense to change the user key to be fundamentally per user per user namespace we can do that, but we need to be very clear about what we are trying to accomplish and why. > I've attached the patch showing my current changes for reference. Thank you. The current design if for only the keys owned by users in the current user namespace to be accessible. Something that key_task_permission appears to enforce quite well. There is a legitimate criticism that the CRIU folks can bring that we have a global space of key_serial_t/key inode numbers in the kernel. It might make sense to move key_serial_tree into struct user_namespace to fix that. In general I don't think it matters because the number is a random number that is essentially unpredicatable. Elsewhere it has been mentioned there is an implementation bug or two that needs to be corrected, but that is separate from the bulk of the implementation that I am not seeing any issues jump out at me when I look at it. Eric p.s. Apologies for the delay it has taken a bit for my state on the key subsystem to page back in. p.p.s. In case I was not clear. I think your patch below is very much the wrong approach. > David > --- > diff --git a/include/linux/key.h b/include/linux/key.h > index 722914798f37..8d785279f7b4 100644 > --- a/include/linux/key.h > +++ b/include/linux/key.h > @@ -142,6 +142,7 @@ struct key { > struct rb_node serial_node; > }; > struct rw_semaphore sem; /* change vs change sem */ > + struct user_namespace *ns; /* Owning namespace */ > struct key_user *user; /* owner of this key */ > void *security; /* security data for this key */ > union { > diff --git a/kernel/user_namespace.c b/kernel/user_namespace.c > index 68f594212759..43f5c47de3a3 100644 > --- a/kernel/user_namespace.c > +++ b/kernel/user_namespace.c > @@ -42,6 +42,12 @@ static void set_cred_user_ns(struct cred *cred, struct user_namespace *user_ns) > cred->cap_ambient = CAP_EMPTY_SET; > cred->cap_bset = CAP_FULL_SET; > #ifdef CONFIG_KEYS > + key_put(cred->session_keyring); > + cred->session_keyring = NULL; > + key_put(cred->process_keyring); > + cred->process_keyring = NULL; > + key_put(cred->thread_keyring); > + cred->thread_keyring = NULL; > key_put(cred->request_key_auth); > cred->request_key_auth = NULL; > #endif > diff --git a/security/keys/gc.c b/security/keys/gc.c > index addf060399e0..0f0c589bf49d 100644 > --- a/security/keys/gc.c > +++ b/security/keys/gc.c > @@ -12,6 +12,7 @@ > #include <linux/module.h> > #include <linux/slab.h> > #include <linux/security.h> > +#include <linux/user_namespace.h> > #include <keys/keyring-type.h> > #include "internal.h" > > @@ -155,6 +156,7 @@ static noinline void key_gc_unused_keys(struct list_head *keys) > atomic_dec(&key->user->nikeys); > > key_user_put(key->user); > + put_user_ns(key->ns); > > kfree(key->description); > > diff --git a/security/keys/key.c b/security/keys/key.c > index 346fbf201c22..09df168907fd 100644 > --- a/security/keys/key.c > +++ b/security/keys/key.c > @@ -15,6 +15,7 @@ > #include <linux/sched.h> > #include <linux/slab.h> > #include <linux/security.h> > +#include <linux/user_namespace.h> > #include <linux/workqueue.h> > #include <linux/random.h> > #include <linux/err.h> > @@ -289,6 +290,7 @@ struct key *key_alloc(struct key_type *type, const char *desc, > init_rwsem(&key->sem); > lockdep_set_class(&key->sem, &type->lock_class); > key->index_key.type = type; > + key->ns = get_user_ns(current_user_ns()); > key->user = user; > key->quotalen = quotalen; > key->datalen = type->def_datalen; > diff --git a/security/keys/keyring.c b/security/keys/keyring.c > index c91e4e0cea08..70a399bd572c 100644 > --- a/security/keys/keyring.c > +++ b/security/keys/keyring.c > @@ -1016,6 +1016,9 @@ struct key *find_keyring_by_name(const char *name, bool skip_perm_check) > &keyring_name_hash[bucket], > name_link > ) { > + if (keyring->ns != current_user_ns()) > + continue; > + > if (!kuid_has_mapping(current_user_ns(), keyring->user->uid)) > continue; > > diff --git a/security/keys/permission.c b/security/keys/permission.c > index 732cc0beffdf..3c1bd572d9da 100644 > --- a/security/keys/permission.c > +++ b/security/keys/permission.c > @@ -24,8 +24,9 @@ > * > * The caller must hold either a ref on cred or must hold the RCU readlock. > * > - * Returns 0 if successful, -EACCES if access is denied based on the > - * permissions bits or the LSM check. > + * Returns 0 if successful, -ENOKEY if the key is outside of the caller's user > + * namespace, -EACCES if access is denied based on the permissions bits or the > + * LSM check. > */ > int key_task_permission(const key_ref_t key_ref, const struct cred *cred, > unsigned perm) > @@ -36,6 +37,9 @@ int key_task_permission(const key_ref_t key_ref, const struct cred *cred, > > key = key_ref_to_ptr(key_ref); > > + if (key->ns != current_user_ns()) > + return -ENOKEY; > + > /* use the second 8-bits of permissions for keys the caller owns */ > if (uid_eq(key->uid, cred->fsuid)) { > kperm = key->perm >> 16; > diff --git a/security/keys/process_keys.c b/security/keys/process_keys.c > index 40a885239782..52866e90d51a 100644 > --- a/security/keys/process_keys.c > +++ b/security/keys/process_keys.c > @@ -691,6 +694,11 @@ try_again: > break; > } > > + /* We don't see keys that are outside the caller's user namespace */ > + ret = -ENOKEY; > + if (key->ns != current_user_ns()) > + goto invalid_key; > + > /* unlink does not use the nominated key in any way, so can skip all > * the permission checks as it is only concerned with the keyring */ > if (lflags & KEY_LOOKUP_FOR_UNLINK) { -- To unsubscribe from this list: send the line "unsubscribe linux-security-module" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
David Howells <dhowells@redhat.com> writes: > James Bottomley <James.Bottomley@HansenPartnership.com> wrote: > >> > There's another possibility here - since user_namespaces are >> > hierarchical, does it make sense to let a process see keys that are >> > in an ancestral namespace? >> >> I think that should be the decision of the owner. If you're creating a >> userns to de-privilege the next user, likely you don't want this, but >> if you're creating a userns to enhance it, then you do. > > Maybe the simplest is to put a 'stop here' flag on a user_namespace. Then > when we look to see if a key is in the caller's namespace, we go up the tree > until we hit the flag. If you don't find the key's ns within the caller's > permitted subtree, you don't get to see the key. Let me just say we already have all of that (in a much nicer format) by limiting the set of keys we can access to the set of users visible in the user namespace. Eric -- To unsubscribe from this list: send the line "unsubscribe linux-security-module" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Le mardi 25 octobre 2016 à 17:53 +0100, David Howells a écrit : > David Howells <dhowells@redhat.com> wrote: > > > > > (2) If a process's user_namespace doesn't match that recorded in a > > key then > > it gets ENOKEY if it tries to refer to it or access it and > > can't see it > > in /proc/keys. > > There's another possibility here - since user_namespaces are > hierarchical, > does it make sense to let a process see keys that are in an ancestral > namespace? > > David > -- Hello all, I used this approach for PTags. Let me explain how it works for PTags in few words. - each ptags is in fact a couple (user-namespace, ptag-name, flags) - when looking to tags in a given namespace: o either it is found with the given current name-space o or the ancestor-hierarchy of name-spaces is looked to get the containing ancestor - in both case when a tag is found, the flag indicates if it is alive or was removed in the namespace It works well (from what I sew 8) Technical details: - assumption is done that ancestors wait their children to die - shadow-ghost pointers are used to weakly handle user namespaces Best regards José Bollo > To unsubscribe from this list: send the line "unsubscribe linux- > security-module" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line "unsubscribe linux-security-module" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Eric W. Biederman <ebiederm@xmission.com> wrote: > David Howells <dhowells@redhat.com> writes: > > > James Bottomley <James.Bottomley@HansenPartnership.com> wrote: > > > >> > There's another possibility here - since user_namespaces are > >> > hierarchical, does it make sense to let a process see keys that are > >> > in an ancestral namespace? > >> > >> I think that should be the decision of the owner. If you're creating a > >> userns to de-privilege the next user, likely you don't want this, but > >> if you're creating a userns to enhance it, then you do. > > > > Maybe the simplest is to put a 'stop here' flag on a user_namespace. Then > > when we look to see if a key is in the caller's namespace, we go up the tree > > until we hit the flag. If you don't find the key's ns within the caller's > > permitted subtree, you don't get to see the key. > > Let me just say we already have all of that (in a much nicer format) by > limiting the set of keys we can access to the set of users visible in > the user namespace. Except that we don't. Users outside of a namespace can see any key that grants permissions to 'other' - though this is something we don't do by default. David -- To unsubscribe from this list: send the line "unsubscribe linux-security-module" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
I think it might help if I lay out the current state of play: Keys and keyrings can be accessed through a number of mechanisms: (1) Direct key lookup by serial number. The serial number set is global, so any key can potentially be looked up directly with the condition: (A) The key must grant the appropriate permission to the caller for the operation being proposed. (2) Direct keyring lookup by description (name). The keyring name set is global, so potentially any keyring could be set to the caller's session keyring. However, there are two noteworthy conditions: (A) The keyring's uid must have a mapping in the current user namespace. (B) The keyring must grant SEARCH permission to the caller. (3) Process-subscribed keyrings. A process has a number of keyring subscriptions (the session, process and thread keyrings). These are accessed by special negative serial numbers, macroised as KEY_SPEC_{THREAD,PROCESS,SESSION}_KEYRING. If a key can be reached via one of these keyrings then it is classed as being 'possessed' and the possession permissions grant is added to the user/group/other grant, with the following conditions: (A) The keyring being accessed must grant SEARCH to look inside it or some other appropriate permission to deal with the keyring itself. (4) User-subscribed keyrings. A user ID also has a couple of subscriptions (user and user-session keyrings). These are accessed by KEY_SPEC_{USER,USER_SESSION}_KEYRING. (5) Group-subscribed keyrings. Linus wanted this for completeness, but it was never implemented. KEY_SPEC_GROUP_KEYRING is defined for this. (6) Search in directly-looked up keyring. A keyring can be looked up directly as per (1) and then searched inside for a key by description or some other identifier. The conditions are: (A) The keyring must grant SEARCH permission, as must any subordinate keyrings that need to be searched. (B) The target key must grant SEARCH permission. Note that each key considered will *only* be considered possessed if the starting keyring is determined to be possessed. Whilst this will give some false negatives, it massively reduces computation time. (7) Search in session-, process- and thread-keyrings. As (4), but the starting keyring is definitely possessed. Further, the user-session keyring is used in place of the session keyring if the caller doesn't have a session keyring, and the keys in there are counted as possessed. (8) Search in the user-keyring and user-session keyring. As (4). These two keyrings are treated no differently from any other keyring. pam_keyinit, however, places a link to the user-keyring in any session-keyring it creates, thereby making it possessed. [I wish I hadn't made user- and user-session keyrings as they're kind of difficult to get right. It would've been better not to automatically fall back to the user-session keyring, I think.] (9) Lookup in the persistent keyring register. The persistent register is per-user_namespace. You can ask for the persistent keyring for any UID, but: (A) That UID must be within your user_namespace (B) The UID must match your process UID or EUID - or you must have CAP_SETUID. Further, no matter which one you ask for: (C) The persistent keyring must grant you LINK permission. Note that persistent keyrings only grant VIEW and READ to user and grant everything bar SETATTR to possessed. (10) /proc/keys. This shows all keys to which you have at least one permit, with the condition: (A) The keyring's uid must have a mapping in the current user namespace. (11) Request-key upcall specials. If a process was created by request_key() as an upcall, then three additional keys become available: (i) A new session keyring concocted by request_key() holding the authorisation key that allows the protection mechanisms to be bypassed by keyctl_instantiate() and friends for the key being constructed. (ii) The currently active authorisation key. If one is set by keyctl_assume_authority(), then this permits the upcall to search the subscribed keyrings of the original calling process as if they were its own. This would allow it, say, to retrieve the caller's Kerberos TGT and to request a ticket with which to instantiate a key. (iii) The keyring passed to request_key_and_link() as the destination. This has no special permissions. It's treated like any other non-possessed keyring unless the upcall program links it to such as its session keyring or it is accessible through the authorisation key. Note that it's only available if an authorisation key is in use. So: (*) Key serial numbers are global, though randomly generated and hopefully unpredictable. (*) Normal keys and keyrings are created by default with zero OTHER and GROUP permits, minimal USER permits and full POSSESSOR permits. Special keys may get a different balance of USER and POSSESSOR permits, but OTHER and GROUP are always initially zero. (*) You cannot access an ordinary key outside of your namespace unless: (A) it belongs to the founding user, you are that user within the namespace and it grants USER permits, or (B) it grants OTHER permits, or (C) it's attached to a subscribed keyring and it grants POSSESSOR permits, including SEARCH, or (D) you're trying to unlink it from a keyring you have WRITE access to (and then all you can do is unlink it). (*) You cannot find a session keyring that's owned by a UID that's outside of your namespace in order to join it. Trying to do so gives a new session keyring of the same name. (*) Session-, process- and thread-keyrings are carried across clone() (as appropriate) or unshare(), even if CLONE_NEWUSER is set. (*) You cannot see in /proc/keys any keys owned by the founding user of a new namespace, though you can create and manipulate keys accessible through the carried over keyrings or accessible by ID. This is may need fixing - you should be able to see any key that you can access. (*) The installation of user and user-session keyrings in a user_struct waives the SEARCH permit that's a requirement for when keyctl_join_session_keyring() does a by-name search. Thoughts: (*) request_key() upcalls are not namespaced. It's not clear how this should be done, since the upcall might depend on network namespace or filesystem namespace, for example. (*) Possibly user and user-session keyrings could be kept in the persistent keyring rather than the user_struct, though there's a problem with the persistent keyring potentially timing out. David -- To unsubscribe from this list: send the line "unsubscribe linux-security-module" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Andy Lutomirski <luto@amacapital.net> wrote: > > | Not just questionable, completely wrong. The gist is that there is a > > | *global* name -> key mapping for accessing keys by name, and user > > | keyrings are stored in there under the name "_uid.%u", where %u > > | refers to the *namespaced* UID. (See install_user_keyrings().) > > | The result is that, if e.g. the user with UID 1000 has no running > > | processes, a local attacker can enter a new user namespace, map UID > > | 1000 in the namespace to some KUID he controls, do > > | setresuid(1000, 1000, 1000), and now he owns user 1000's keyring. Hmmm... Having checked over the code, it might not be that simple. Thanks to something Eric Biederman added into find_keyring_by_name(), unless a keyring's UID is a member of the current user namespace, you can't find that keyring by name. However, I'm not sure that that stops someone trying to find it by colliding kuids since keys can't pin the user_struct without causing a loop. Further, if the user_struct referring to a user (or user-session) keyring gets deallocated, it drops its link to that keyring - though that's no guarantee that the user keyrings will be deallocated themselves (there may be links from elsewhere). I'm wondering if I should move the user keyrings out of the user_struct so that keys can pin the user_struct (and I could then merge the key_user struct into it). The user-keyring and user-session keyring could be kept in the user's persistent keyring or in their own tree in the user namespace. The former would allow their easy reuse within a certain amount of time and the latter would allow them to be easily revoked when the user_namespace struct is destroyed. I think Eric's patch to move the keyring list into the user_namespace struct rather than having it be global is also a good step. David -- To unsubscribe from this list: send the line "unsubscribe linux-security-module" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, Oct 26, 2016 at 03:34:50PM +0100, David Howells wrote: > Andy Lutomirski <luto@amacapital.net> wrote: > > > > | Not just questionable, completely wrong. The gist is that there is a > > > | *global* name -> key mapping for accessing keys by name, and user > > > | keyrings are stored in there under the name "_uid.%u", where %u > > > | refers to the *namespaced* UID. (See install_user_keyrings().) > > > | The result is that, if e.g. the user with UID 1000 has no running > > > | processes, a local attacker can enter a new user namespace, map UID > > > | 1000 in the namespace to some KUID he controls, do > > > | setresuid(1000, 1000, 1000), and now he owns user 1000's keyring. > > Hmmm... Having checked over the code, it might not be that simple. Thanks to > something Eric Biederman added into find_keyring_by_name(), unless a keyring's > UID is a member of the current user namespace, you can't find that keyring by > name. find_keyring_by_name() checks that the UID of the keyring's owner is mapped into the current user namespace. But that doesn't catch the scenario I described: The keyring is created in an attacker-created namespace and looked up from the init namespace, into which all kuids are mapped.
Jann Horn <jann@thejh.net> wrote: > find_keyring_by_name() checks that the UID of the keyring's owner is mapped into > the current user namespace. But that doesn't catch the scenario I described: > The keyring is created in an attacker-created namespace and looked up from the > init namespace, into which all kuids are mapped. Ah - gotcha. David -- To unsubscribe from this list: send the line "unsubscribe linux-security-module" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
David Howells <dhowells@redhat.com> writes: > Jann Horn <jann@thejh.net> wrote: > >> find_keyring_by_name() checks that the UID of the keyring's owner is mapped into >> the current user namespace. But that doesn't catch the scenario I described: >> The keyring is created in an attacker-created namespace and looked up from the >> init namespace, into which all kuids are mapped. > > Ah - gotcha. Unless I am misreading something it actually gets worse. You don't even need a user namespace. You can just call keyctl_join_session_keyring and the named keyring of your choice will be created. Plus there are various really weird things in their where the keyring names of _tid, _pid, _ses, get reused over and over again. So it looks like there are some significant things to fix. Eric -- To unsubscribe from this list: send the line "unsubscribe linux-security-module" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Eric W. Biederman <ebiederm@xmission.com> wrote: > > Jann Horn <jann@thejh.net> wrote: > > > >> find_keyring_by_name() checks that the UID of the keyring's owner is > >> mapped into the current user namespace. But that doesn't catch the > >> scenario I described: The keyring is created in an attacker-created > >> namespace and looked up from the init namespace, into which all kuids are > >> mapped. > > > > Ah - gotcha. > > Unless I am misreading something it actually gets worse. You don't even > need a user namespace. You can just call keyctl_join_session_keyring > and the named keyring of your choice will be created. It's meant to do that - assuming a keyring of that name doesn't yet exist or isn't accessible by you. I should perhaps prohibit the creation of a _uid.N keyring. However, I think that conceptually, it should be possible to join the user-session keyring as your session - or I should make it so that the user-session keyring doesn't exist (but I'm pretty certain that will break userspace). > Plus there are various really weird things in their where the keyring > names of _tid, _pid, _ses, get reused over and over again. True, however per-thread (_tid) and per-process(_pid) keyrings are always allocated by key_alloc() and never looked up by name when being created. Anonymous session (_ses) keyrings are also created by key_alloc() and not looked up when created. It's only when a named session keyring is requested that a look up by name is done. I could make the per-thread, per-process and anon-session keyrings nameless by default, or prefix them with '.' and not permit joining of a keyring whose name begins with a '.' (you aren't allowed to use add_key() to create a such keyrings, so that really ought to be extended to here too). > So it looks like there are some significant things to fix. So: (1) Make the keyring name list per user_namespace. (2) Don't let _uid.N, _tid, _pid and _ses (anonymous session) keyrings be joined. (3) Don't let keyrings whose name begins '.' be joined. (4) Don't keep the user and user-session keyrings in the user_struct, but rather make them subordinates of the user namespace. for starters. David -- To unsubscribe from this list: send the line "unsubscribe linux-security-module" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
David Howells <dhowells@redhat.com> wrote: > > Plus there are various really weird things in their where the keyring > > names of _tid, _pid, _ses, get reused over and over again. > > True, however per-thread (_tid) and per-process(_pid) keyrings are always > allocated by key_alloc() and never looked up by name when being created. > > Anonymous session (_ses) keyrings are also created by key_alloc() and not > looked up when created. It's only when a named session keyring is requested > that a look up by name is done. > > I could make the per-thread, per-process and anon-session keyrings nameless by > default, or prefix them with '.' and not permit joining of a keyring whose > name begins with a '.' (you aren't allowed to use add_key() to create a such > keyrings, so that really ought to be extended to here too). Note that the per-thread, per-process and anon-session keyrings are not joinable by default as they don't come with SEARCH permission for u/g/o. David -- To unsubscribe from this list: send the line "unsubscribe linux-security-module" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/include/linux/key.h b/include/linux/key.h index 722914798f37..8d785279f7b4 100644 --- a/include/linux/key.h +++ b/include/linux/key.h @@ -142,6 +142,7 @@ struct key { struct rb_node serial_node; }; struct rw_semaphore sem; /* change vs change sem */ + struct user_namespace *ns; /* Owning namespace */ struct key_user *user; /* owner of this key */ void *security; /* security data for this key */ union { diff --git a/kernel/user_namespace.c b/kernel/user_namespace.c index 68f594212759..43f5c47de3a3 100644 --- a/kernel/user_namespace.c +++ b/kernel/user_namespace.c @@ -42,6 +42,12 @@ static void set_cred_user_ns(struct cred *cred, struct user_namespace *user_ns) cred->cap_ambient = CAP_EMPTY_SET; cred->cap_bset = CAP_FULL_SET; #ifdef CONFIG_KEYS + key_put(cred->session_keyring); + cred->session_keyring = NULL; + key_put(cred->process_keyring); + cred->process_keyring = NULL; + key_put(cred->thread_keyring); + cred->thread_keyring = NULL; key_put(cred->request_key_auth); cred->request_key_auth = NULL; #endif diff --git a/security/keys/gc.c b/security/keys/gc.c index addf060399e0..0f0c589bf49d 100644 --- a/security/keys/gc.c +++ b/security/keys/gc.c @@ -12,6 +12,7 @@ #include <linux/module.h> #include <linux/slab.h> #include <linux/security.h> +#include <linux/user_namespace.h> #include <keys/keyring-type.h> #include "internal.h" @@ -155,6 +156,7 @@ static noinline void key_gc_unused_keys(struct list_head *keys) atomic_dec(&key->user->nikeys); key_user_put(key->user); + put_user_ns(key->ns); kfree(key->description); diff --git a/security/keys/key.c b/security/keys/key.c index 346fbf201c22..09df168907fd 100644 --- a/security/keys/key.c +++ b/security/keys/key.c @@ -15,6 +15,7 @@ #include <linux/sched.h> #include <linux/slab.h> #include <linux/security.h> +#include <linux/user_namespace.h> #include <linux/workqueue.h> #include <linux/random.h> #include <linux/err.h> @@ -289,6 +290,7 @@ struct key *key_alloc(struct key_type *type, const char *desc, init_rwsem(&key->sem); lockdep_set_class(&key->sem, &type->lock_class); key->index_key.type = type; + key->ns = get_user_ns(current_user_ns()); key->user = user; key->quotalen = quotalen; key->datalen = type->def_datalen; diff --git a/security/keys/keyring.c b/security/keys/keyring.c index c91e4e0cea08..70a399bd572c 100644 --- a/security/keys/keyring.c +++ b/security/keys/keyring.c @@ -1016,6 +1016,9 @@ struct key *find_keyring_by_name(const char *name, bool skip_perm_check) &keyring_name_hash[bucket], name_link ) { + if (keyring->ns != current_user_ns()) + continue; + if (!kuid_has_mapping(current_user_ns(), keyring->user->uid)) continue; diff --git a/security/keys/permission.c b/security/keys/permission.c index 732cc0beffdf..3c1bd572d9da 100644 --- a/security/keys/permission.c +++ b/security/keys/permission.c @@ -24,8 +24,9 @@ * * The caller must hold either a ref on cred or must hold the RCU readlock. * - * Returns 0 if successful, -EACCES if access is denied based on the - * permissions bits or the LSM check. + * Returns 0 if successful, -ENOKEY if the key is outside of the caller's user + * namespace, -EACCES if access is denied based on the permissions bits or the + * LSM check. */ int key_task_permission(const key_ref_t key_ref, const struct cred *cred, unsigned perm) @@ -36,6 +37,9 @@ int key_task_permission(const key_ref_t key_ref, const struct cred *cred, key = key_ref_to_ptr(key_ref); + if (key->ns != current_user_ns()) + return -ENOKEY; + /* use the second 8-bits of permissions for keys the caller owns */ if (uid_eq(key->uid, cred->fsuid)) { kperm = key->perm >> 16; diff --git a/security/keys/process_keys.c b/security/keys/process_keys.c index 40a885239782..52866e90d51a 100644 --- a/security/keys/process_keys.c +++ b/security/keys/process_keys.c @@ -691,6 +694,11 @@ try_again: break; } + /* We don't see keys that are outside the caller's user namespace */ + ret = -ENOKEY; + if (key->ns != current_user_ns()) + goto invalid_key; + /* unlink does not use the nominated key in any way, so can skip all * the permission checks as it is only concerned with the keyring */ if (lflags & KEY_LOOKUP_FOR_UNLINK) {