Message ID | 155966609977.17449.5624614375035334363.stgit@warthog.procyon.org.uk (mailing list archive) |
---|---|
Series | Mount, FS, Block and Keyrings notifications [ver #2] |
On Tue, Jun 4, 2019 at 9:35 AM David Howells <dhowells@redhat.com> wrote: > > > Hi Al, > > Here's a set of patches to add a general variable-length notification queue > concept and to add sources of events for: I asked before and didn't see a response, so I'll ask again. Why are you paying any attention at all to the creds that generate an event? It seems like the resulting security model will be very hard to understand and probably buggy. Can't you define a sensible model in which only the listener creds matter? > LSM support is included: > > (1) The creds of the process that did the fput() that reduced the refcount > to zero are cached in the file struct. > > (2) __fput() overrides the current creds with the creds from (1) whilst > doing the cleanup, thereby making sure that the creds seen by the > destruction notification generated by mntput() appears to come from > the last fputter. That looks like duct tape that is, at best, likely to be very buggy. > > (3) security_post_notification() is called for each queue that we might > want to post a notification into, thereby allowing the LSM to prevent > covert communications. This seems like the wrong approach. If an LSM wants to prevent covert communication from, say, mount actions, then it shouldn't allow the watch to be set up in the first place.
On 6/4/2019 10:43 AM, Andy Lutomirski wrote: > On Tue, Jun 4, 2019 at 9:35 AM David Howells <dhowells@redhat.com> wrote: >> >> Hi Al, >> >> Here's a set of patches to add a general variable-length notification queue >> concept and to add sources of events for: > I asked before and didn't see a response, so I'll ask again. Why are > you paying any attention at all to the creds that generate an event? > It seems like the resulting security model will be very hard to > understand and probably buggy. Can't you define a sensible model in > which only the listener creds matter? We've spent the last 18 months reeling from the implications of what can happen when one process has the ability to snoop on another. Introducing yet another mechanism that is trivial to exploit is a very bad idea. I will try to explain the problem once again. If process A sends a signal (writes information) to process B the kernel checks that either process A has the same UID as process B or that process A has privilege to override that policy. Process B is passive in this access control decision, while process A is active. In the event delivery case, process A does something (e.g. modifies a keyring) that generates an event, which is then sent to process B's event buffer. Again, A is active and B is passive. Process A must have write access (defined by some policy) to process B's event buffer. To implement such a policy requires A's credential, and some information about the object (passive entity) to which the event is being delivered. You can't just use the credential from Process B because it is not the active entity, it is the passive entity. > >> LSM support is included: >> >> (1) The creds of the process that did the fput() that reduced the refcount >> to zero are cached in the file struct.
>> >> (2) __fput() overrides the current creds with the creds from (1) whilst >> doing the cleanup, thereby making sure that the creds seen by the >> destruction notification generated by mntput() appears to come from >> the last fputter. > That looks like duct tape that is, at best, likely to be very buggy. > >> (3) security_post_notification() is called for each queue that we might >> want to post a notification into, thereby allowing the LSM to prevent >> covert communications. > This seems like the wrong approach. If an LSM wants to prevent covert > communication from, say, mount actions, then it shouldn't allow the > watch to be set up in the first place.
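The two positions argued so far can be contrasted in a small model. This is a minimal Python sketch, not kernel code: `Cred`, `may_signal`, `deliver_sender_checked`, and `deliver_listener_only` are hypothetical names, and the "same UID or privileged" rule is a simplification of the signal check described above.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Cred:
    uid: int
    privileged: bool = False  # stand-in for "has privilege to override the policy"


def may_signal(sender: Cred, receiver: Cred) -> bool:
    """Signal-style check: the active party (sender) is the subject."""
    return sender.privileged or sender.uid == receiver.uid


def deliver_sender_checked(trigger: Cred, watchers: List[Cred]) -> List[Cred]:
    """Sender-checked model: post the event only to watchers that the
    triggering process would be allowed to 'write' to."""
    return [w for w in watchers if may_signal(trigger, w)]


def deliver_listener_only(trigger: Cred, watchers: List[Cred]) -> List[Cred]:
    """Listener-only model: watchers were authorised when the watch was
    set up, so every established watcher sees every event."""
    return list(watchers)
```

In the first model the set of events a watcher sees depends on who happened to trigger them; in the second it depends only on what the watcher was allowed to watch.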
Andy Lutomirski <luto@kernel.org> wrote: > > Here's a set of patches to add a general variable-length notification queue > > concept and to add sources of events for: > > I asked before and didn't see a response, so I'll ask again. Why are you > paying any attention at all to the creds that generate an event? Casey responded to you. It's one of his requirements. I'm not sure of the need, and I particularly don't like trying to make indirect destruction events (mount destruction keyed on fput, for instance) carry the creds of the triggerer. Indeed, the trigger can come from all sorts of places - including af_unix queue destruction, someone poking around in procfs, a variety of processes fputting simultaneously. Only one of them can win, and the LSM needs to handle *all* the possibilities. However, the LSMs (or at least SELinux) ignore f_cred and use current_cred() when checking permissions. See selinux_revalidate_file_permission() for example - it uses current_cred() not file->f_cred to re-evaluate the perms, and the fd might be shared between a number of processes with different creds. > This seems like the wrong approach. If an LSM wants to prevent covert > communication from, say, mount actions, then it shouldn't allow the > watch to be set up in the first place. Yeah, I can agree to that. Casey? David
On Tue, Jun 4, 2019 at 1:39 PM David Howells <dhowells@redhat.com> wrote: > > Andy Lutomirski <luto@kernel.org> wrote: > > > > Here's a set of patches to add a general variable-length notification queue > > > concept and to add sources of events for: > > > > I asked before and didn't see a response, so I'll ask again. Why are you > > paying any attention at all to the creds that generate an event? > > Casey responded to you. It's one of his requirements. > It being a "requirement" doesn't make it okay. > However, the LSMs (or at least SELinux) ignore f_cred and use current_cred() > when checking permissions. See selinux_revalidate_file_permission() for > example - it uses current_cred() not file->f_cred to re-evaluate the perms, > and the fd might be shared between a number of processes with different creds. That's a bug. It's arguably a rather severe bug. If I ever get around to writing the patch I keep thinking of that will warn if we use creds from invalid contexts, it will warn. Let's please not repeat this.
On Tue, Jun 4, 2019 at 1:31 PM Casey Schaufler <casey@schaufler-ca.com> wrote: > > On 6/4/2019 10:43 AM, Andy Lutomirski wrote: > > On Tue, Jun 4, 2019 at 9:35 AM David Howells <dhowells@redhat.com> wrote: > >> > >> Hi Al, > >> > >> Here's a set of patches to add a general variable-length notification queue > >> concept and to add sources of events for: > > I asked before and didn't see a response, so I'll ask again. Why are > > you paying any attention at all to the creds that generate an event? > > It seems like the resulting security model will be very hard to > > understand and probably buggy. Can't you define a sensible model in > > which only the listener creds matter? > > We've spent the last 18 months reeling from the implications > of what can happen when one process has the ability to snoop > on another. Introducing yet another mechanism that is trivial > to exploit is a very bad idea. If you're talking about Spectre, etc, this is IMO entirely irrelevant. Among other things, setting these watches can and should require some degree of privilege. > > I will try to explain the problem once again. If process A > sends a signal (writes information) to process B the kernel > checks that either process A has the same UID as process B > or that process A has privilege to override that policy. > Process B is passive in this access control decision, while > process A is active. Are you stating what you see to be a requirement? > Process A must have write access > (defined by some policy) to process B's event buffer. No, stop right here. Process B is monitoring some aspect of the system. Process A is doing something. Process B should need permission to monitor whatever it's monitoring, and process A should have permission to do whatever it's doing. I don't think it makes sense to try to ascribe an identity to the actor doing some action to decide to omit it from the watch -- this has all kinds of correctness issues.
If you're writing a policy and you don't like letting process B spy on processes doing various things, then disallow that type of spying. > To > implement such a policy requires A's credential, You may not design a new mechanism that looks at the credential in a context where looking at a credential is invalid unless you have some very strong justification for why all of the known reasons that it's a bad idea don't apply to what you're doing. So, without a much stronger justification, NAK.
On 6/4/2019 1:39 PM, David Howells wrote: > Andy Lutomirski <luto@kernel.org> wrote: > >>> Here's a set of patches to add a general variable-length notification queue >>> concept and to add sources of events for: >> I asked before and didn't see a response, so I'll ask again. Why are you >> paying any attention at all to the creds that generate an event? > Casey responded to you. It's one of his requirements. Process A takes an action. As a result of that action, an event is written to Process B's event buffer. This isn't a covert channel, it's a direct access, just like sending a signal. Process A is the subject and the event buffer, which is part of Process B, is the object. > I'm not sure of the need, and I particularly don't like trying to make > indirect destruction events (mount destruction keyed on fput, for instance) > carry the creds of the triggerer. Indeed, the trigger can come from all sorts > of places - including af_unix queue destruction, someone poking around in > procfs, a variety of processes fputting simultaneously. Only one of them can > win, and the LSM needs to handle *all* the possibilities. Yes, it's a hairy problem. It was a significant factor in the demise of kdbus. > However, the LSMs (or at least SELinux) ignore f_cred and use current_cred() > when checking permissions. See selinux_revalidate_file_permission() for > example - it uses current_cred() not file->f_cred to re-evaluate the perms, > and the fd might be shared between a number of processes with different creds. > >> This seems like the wrong approach. If an LSM wants to prevent covert >> communication from, say, mount actions, then it shouldn't allow the >> watch to be set up in the first place. > Yeah, I can agree to that. Casey? Back to your earlier point, you don't know where the event is coming from when you create the event watch. If you enforce a watch time, what are you going to check? Isn't this going to be considered too restrictive?
On 6/4/2019 2:05 PM, Andy Lutomirski wrote: > On Tue, Jun 4, 2019 at 1:31 PM Casey Schaufler <casey@schaufler-ca.com> wrote: >> On 6/4/2019 10:43 AM, Andy Lutomirski wrote: >>> On Tue, Jun 4, 2019 at 9:35 AM David Howells <dhowells@redhat.com> wrote: >>>> Hi Al, >>>> >>>> Here's a set of patches to add a general variable-length notification queue >>>> concept and to add sources of events for: >>> I asked before and didn't see a response, so I'll ask again. Why are >>> you paying any attention at all to the creds that generate an event? >>> It seems like the resulting security model will be very hard to >>> understand and probably buggy. Can't you define a sensible model in >>> which only the listener creds matter? >> We've spent the last 18 months reeling from the implications >> of what can happen when one process has the ability to snoop >> on another. Introducing yet another mechanism that is trivial >> to exploit is a very bad idea. > If you're talking about Spectre, etc, this is IMO entirely irrelevant. We're seeing significant interest in using obscure mechanisms in system exploits. Mechanisms will be exploited. > Among other things, setting these watches can and should require some > degree of privilege. Requiring privilege would address the concerns for most situations, although I don't see that it would help for SELinux. SELinux does not generally put much credence in what others consider "privilege". Extreme care would probably be required for namespaces, too. > >> I will try to explain the problem once again. If process A >> sends a signal (writes information) to process B the kernel >> checks that either process A has the same UID as process B >> or that process A has privilege to override that policy. >> Process B is passive in this access control decision, while >> process A is active. > Are you stating what you see to be a requirement? Basic subject/object access control is the core of the Linux security model.
Yes, there are exceptions, but mostly they're historical in origin. >> Process A must have write access >> (defined by some policy) to process B's event buffer. > No, stop right here. Listening ... > Process B is monitoring some aspect of the > system. Process B is not "monitoring". At some point in the past it has registered a request for information should an event occur. It is currently passive. > Process A is doing something. Yes. It is active. > Process B should need > permission to monitor whatever it's monitoring, OK, I'm good with that. But the only time you can tell that is when the event is registered, and at that time you can't tell who might be causing the event. (Or can you?) > and process A should > have permission to do whatever it's doing. So there needs to be some connection between what B can request events for and what events A can cause. Then you can deny B's requests because of A. > I don't think it makes > sense to try to ascribe an identity to the actor doing some action to > decide to omit it from the watch -- this has all kinds of correctness > issues. It works for signals and UDP, but in general I get the concern. > If you're writing a policy and you don't like letting process B spy on > processes doing various things, then disallow that type of spying. That gets you into a situation where you can't do the legitimate monitoring you want to do just because there's the off chance you might see something you shouldn't. "I hate security! It's confusing, and always gets in the way!" >> To >> implement such a policy requires A's credential, > You may not design a new mechanism that looks at the credential in a > context where looking at a credential is invalid unless you have some > very strong justification for why all of the known reasons that it's a > bad idea don't apply to what you're doing. Point.
But you also don't get to ignore basic security policy just because someone's spiffy lazy memory free cache hashing tree (or similar mechanism) throws away references to important information while it's still needed. > So, without a much stronger justification, NAK. I try to be reasonable. Really. All I want is something with a security model that can be explained coherently within the context of the basic Linux security model. There are enough variations as it is.
On Tue, Jun 4, 2019 at 6:18 PM Stephen Smalley <stephen.smalley@gmail.com> wrote: > > On Tue, Jun 4, 2019 at 4:58 PM Andy Lutomirski <luto@kernel.org> wrote: >> >> On Tue, Jun 4, 2019 at 1:39 PM David Howells <dhowells@redhat.com> wrote: >> > >> > Andy Lutomirski <luto@kernel.org> wrote: >> > >> > > > Here's a set of patches to add a general variable-length notification queue >> > > > concept and to add sources of events for: >> > > >> > > I asked before and didn't see a response, so I'll ask again. Why are you >> > > paying any attention at all to the creds that generate an event? >> > >> > Casey responded to you. It's one of his requirements. >> > >> >> It being a "requirement" doesn't make it okay. >> >> > However, the LSMs (or at least SELinux) ignore f_cred and use current_cred() >> > when checking permissions. See selinux_revalidate_file_permission() for >> > example - it uses current_cred() not file->f_cred to re-evaluate the perms, >> > and the fd might be shared between a number of processes with different creds. >> >> That's a bug. It's arguably a rather severe bug. If I ever get >> around to writing the patch I keep thinking of that will warn if we >> use creds from invalid contexts, it will warn. > > > No, not a bug. Working as designed. Initial validation on open, but revalidation upon read/write if something has changed since open (process SID differs from opener, inode SID has changed, policy has changed). Current subject SID should be used for the revalidation. It's a MAC vs DAC difference. > Can you explain how the design is valid, then? Consider nasty cases like this: $ sudo -u lotsofgarbage 2>/dev/whatever It is certainly the case that drivers, fs code, and other core code MUST NOT look at current_cred() in the context of syscalls like open(). Jann, I, and others have found quite a few rootable bugs of this sort. What makes MAC special here? 
I would believe there are cases where auditing write() callers makes some sense, but anyone reading those logs needs to understand that the creds are dubious at best.
Casey Schaufler <casey@schaufler-ca.com> wrote: > I will try to explain the problem once again. If process A > sends a signal (writes information) to process B the kernel > checks that either process A has the same UID as process B > or that process A has privilege to override that policy. > Process B is passive in this access control decision, while > process A is active. In the event delivery case, process A > does something (e.g. modifies a keyring) that generates an > event, which is then sent to process B's event buffer. I think this might be the core sticking point here. It looks like two different situations: (1) A explicitly sends event to B (eg. signalling, sendmsg, etc.) (2) A implicitly and unknowingly sends event to B as a side effect of some other action (eg. B has a watch for the event A did). The LSM treats them as the same: that is B must have MAC authorisation to send a message to A. But there are problems with not sending the event: (1) B's internal state is then corrupt (or, at least, unknowingly invalid). (2) B can potentially figure out that the event happened by other means. I've implemented four event sources so far: (1) Keys/keyrings. You can only get events on a key you have View permission on and the other process has to have write access to it, so I think this is good enough. (2) Block layer. Currently this will only get you hardware error events, which is probably safe. I'm not sure you can manipulate those without permission to directly access the device files. (3) Superblock. This is trickier since it can see events that can be manufactured (R/W <-> R/O remounting, EDQUOT) as well as events that can't without hardware control (EIO, network link loss, RF kill). (4) Mount topology. This is the trickiest since it allows you to see events beyond the point at which you placed your watch (in essence, you place a subtree watch). The question is what permission checking should I do? 
Ideally, I'd emulate a pathwalk between the watchpoint and the eventing object to see if the owner of the watchpoint could reach it. I'd need to do a reverse walk, calling inode_permission(MAY_NOT_BLOCK) for each directory between the eventing object and the watchpoint to see if one rejects it - but some filesystems have a permission check that can't be called in this state. It would also be necessary to do this separately for each watchpoint in the parental chain. Further, each permissions check would generate an audit event and could generate FAN_ACCESS and/or FAN_ACCESS_PERM fanotify events - which could be a problem if fanotify is also trying to post those events to the same watch queue. David
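The reverse walk described for mount-topology watches can be sketched as follows. This is a toy Python model under stated assumptions, not the patch's implementation: `watcher_can_reach` is a hypothetical helper, the directory tree is a plain parent map, and `may_search` stands in for a per-directory call such as inode_permission() made on behalf of the watch owner.

```python
from typing import Callable, Dict, Optional


def watcher_can_reach(
    event_dir: str,
    watch_dir: str,
    parent: Dict[str, Optional[str]],
    may_search: Callable[[str], bool],
) -> bool:
    """Walk upward from the eventing object towards the watchpoint.

    Returns False as soon as one directory on the way rejects the watch
    owner, or if the walk reaches the root without meeting the watchpoint
    (the event did not happen inside the watched subtree).
    """
    node: Optional[str] = event_dir
    while node is not None:
        if not may_search(node):
            return False          # one rejecting directory hides the event
        if node == watch_dir:
            return True           # reached the watchpoint with no rejection
        node = parent.get(node)   # step to the parent directory
    return False
```

The sketch also shows why the cost concern is real: the walk has to be repeated for every watchpoint above the eventing object, and each `may_search` call corresponds to a permission check that may itself audit or generate fanotify events.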
On 6/5/19 12:19 AM, Andy Lutomirski wrote: > On Tue, Jun 4, 2019 at 6:18 PM Stephen Smalley > <stephen.smalley@gmail.com> wrote: >> >> On Tue, Jun 4, 2019 at 4:58 PM Andy Lutomirski <luto@kernel.org> wrote: >>> >>> On Tue, Jun 4, 2019 at 1:39 PM David Howells <dhowells@redhat.com> wrote: >>>> >>>> Andy Lutomirski <luto@kernel.org> wrote: >>>> >>>>>> Here's a set of patches to add a general variable-length notification queue >>>>>> concept and to add sources of events for: >>>>> >>>>> I asked before and didn't see a response, so I'll ask again. Why are you >>>>> paying any attention at all to the creds that generate an event? >>>> >>>> Casey responded to you. It's one of his requirements. >>>> >>> >>> It being a "requirement" doesn't make it okay. >>> >>>> However, the LSMs (or at least SELinux) ignore f_cred and use current_cred() >>>> when checking permissions. See selinux_revalidate_file_permission() for >>>> example - it uses current_cred() not file->f_cred to re-evaluate the perms, >>>> and the fd might be shared between a number of processes with different creds. >>> >>> That's a bug. It's arguably a rather severe bug. If I ever get >>> around to writing the patch I keep thinking of that will warn if we >>> use creds from invalid contexts, it will warn. >> >> >> No, not a bug. Working as designed. Initial validation on open, but revalidation upon read/write if something has changed since open (process SID differs from opener, inode SID has changed, policy has changed). Current subject SID should be used for the revalidation. It's a MAC vs DAC difference. >> > > Can you explain how the design is valid, then? Consider nasty cases like this: > > $ sudo -u lotsofgarbage 2>/dev/whatever (sorry for the previous html email; gmail or my inability to properly use it strikes again!) 
Here we have four (or more) opportunities to say no: 1) Upon selinux_inode_permission(), when checking write access to /dev/whatever in the context of the shell process, 2) Upon selinux_file_open(), when checking and caching the open and write access for shell to /dev/whatever in the file security struct, 3) Upon selinux_bprm_committing_creds() -> flush_unauthorized_files(), when revalidating write access to /dev/whatever in the context of sudo, 4) Upon selinux_file_permission() -> selinux_revalidate_file_permission(), when revalidating write access to /dev/whatever in the context of sudo. If any of those fail, then access is denied, so unless both the shell and sudo are authorized to write to /dev/whatever, it is a no-go. NB Only the shell context requires open permission here; the sudo context only needs write. > It is certainly the case that drivers, fs code, and other core code > MUST NOT look at current_cred() in the context of syscalls like > open(). Jann, I, and others have found quite a few rootable bugs of > this sort. What makes MAC special here? Do you mean syscalls like write(), not open()? I think your concern is that they apply some check only during write() and not open() and therefore are susceptible to confused deputy scenario above. In contrast we are validating access at open, transfer/inherit, and use. If we use file->f_cred instead of current_cred() in selinux_revalidate_file_permission() and the current process SID differs from that of the opener, we'll never apply a check for the actual security context performing the write(), so information can flow in violation of the MAC policy. > I would believe there are cases where auditing write() callers makes > some sense, but anyone reading those logs needs to understand that the > creds are dubious at best.
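The difference between revalidating with the current subject and with the opener's cached creds can be illustrated with a toy policy. This is a hedged Python sketch, not SELinux code: `do_open`, `do_write`, and the `(subject, object, permission)` policy set are assumptions made for illustration, and the inode_permission and exec-time checks from the list above are left out.

```python
from dataclasses import dataclass
from typing import Set, Tuple

# Toy policy: a set of (subject SID, object label, permission) triples.
Policy = Set[Tuple[str, str, str]]


@dataclass
class OpenFile:
    label: str
    opener_sid: str  # cached at open time, analogous to file->f_cred


def do_open(policy: Policy, sid: str, label: str) -> OpenFile:
    """Open-time check: the opener needs both 'open' and 'write'."""
    for perm in ("open", "write"):
        if (sid, label, perm) not in policy:
            raise PermissionError(f"{sid} lacks {perm} on {label}")
    return OpenFile(label, sid)


def do_write(policy: Policy, f: OpenFile, current_sid: str,
             use_current: bool = True) -> bool:
    """Use-time check. With use_current=True the writer's own SID is
    checked; with use_current=False only the opener's cached SID is."""
    subject = current_sid if use_current else f.opener_sid
    return (subject, f.label, "write") in policy
```

With a policy that authorises only the shell, a write by a differently labelled process is refused when the current subject is checked, and allowed when only the opener's cached identity is consulted, which is the information flow the reply above says would violate the MAC policy.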
On 6/5/2019 1:41 AM, David Howells wrote: > Casey Schaufler <casey@schaufler-ca.com> wrote: > >> I will try to explain the problem once again. If process A >> sends a signal (writes information) to process B the kernel >> checks that either process A has the same UID as process B >> or that process A has privilege to override that policy. >> Process B is passive in this access control decision, while >> process A is active. In the event delivery case, process A >> does something (e.g. modifies a keyring) that generates an >> event, which is then sent to process B's event buffer. > I think this might be the core sticking point here. It looks like two > different situations: > > (1) A explicitly sends event to B (eg. signalling, sendmsg, etc.) > > (2) A implicitly and unknowingly sends event to B as a side effect of some > other action (eg. B has a watch for the event A did). > > The LSM treats them as the same: that is B must have MAC authorisation to send > a message to A. YES! Threat is about what you can do, not what you intend to do. And it would be really great if you put some thought into what a rational model would be for UID based controls, too. > But there are problems with not sending the event: > > (1) B's internal state is then corrupt (or, at least, unknowingly invalid). Then B is a badly written program. > (2) B can potentially figure out that the event happened by other means. Then why does it need the event mechanism in the first place? > I've implemented four event sources so far: > > (1) Keys/keyrings. You can only get events on a key you have View permission > on and the other process has to have write access to it, so I think this > is good enough. Sounds fine. > (2) Block layer. Currently this will only get you hardware error events, > which is probably safe. I'm not sure you can manipulate those without > permission to directly access the device files. 
There's an argument to be made that this should require CAP_SYS_ADMIN, or that an LSM like SELinux might include hardware error events in policy, but generally I agree that system generated events like this are both harmless and pointless for the general public to watch. > (3) Superblock. This is trickier since it can see events that can be > manufactured (R/W <-> R/O remounting, EDQUOT) as well as events that > can't without hardware control (EIO, network link loss, RF kill). The events generated by processes (the 1st set) need controls like keys. The events generated by the system (the 2nd set) may need controls like the block layer. > (4) Mount topology. This is the trickiest since it allows you to see events > beyond the point at which you placed your watch (in essence, you place a > subtree watch). Like keys. > The question is what permission checking should I do? Ideally, I'd > emulate a pathwalk between the watchpoint and the eventing object to see > if the owner of the watchpoint could reach it. That will depend, as I've been saying, on what causes the event to be generated. If it's from a process, the question is "can the active process, the one that generated the event, write to the passive, watching process?" If it's the system on a hardware event, you may want the watcher to have CAP_SYS_ADMIN. > I'd need to do a reverse walk, calling inode_permission(MAY_NOT_BLOCK) > for each directory between the eventing object and the watchpoint to see > if one rejects it - but some filesystems have a permission check that > can't be called in this state. This is for setting the watch, right? > It would also be necessary to do this separately for each watchpoint in > the parental chain. > > Further, each permissions check would generate an audit event and could > generate FAN_ACCESS and/or FAN_ACCESS_PERM fanotify events - which could > be a problem if fanotify is also trying to post those events to the same > watch queue. 
If you required that the watching process open(dir) what you want to watch you'd get this for free. Or did I miss something obvious? > David
On Wed, Jun 5, 2019 at 7:51 AM Casey Schaufler <casey@schaufler-ca.com> wrote: > > On 6/5/2019 1:41 AM, David Howells wrote: > > Casey Schaufler <casey@schaufler-ca.com> wrote: > > > >> I will try to explain the problem once again. If process A > >> sends a signal (writes information) to process B the kernel > >> checks that either process A has the same UID as process B > >> or that process A has privilege to override that policy. > >> Process B is passive in this access control decision, while > >> process A is active. In the event delivery case, process A > >> does something (e.g. modifies a keyring) that generates an > >> event, which is then sent to process B's event buffer. > > I think this might be the core sticking point here. It looks like two > > different situations: > > > > (1) A explicitly sends event to B (eg. signalling, sendmsg, etc.) > > > > (2) A implicitly and unknowingly sends event to B as a side effect of some > > other action (eg. B has a watch for the event A did). > > > > The LSM treats them as the same: that is B must have MAC authorisation to send > > a message to A. > > YES! > > Threat is about what you can do, not what you intend to do. > > And it would be really great if you put some thought into what > a rational model would be for UID based controls, too. > > > But there are problems with not sending the event: > > > > (1) B's internal state is then corrupt (or, at least, unknowingly invalid). > > Then B is a badly written program. Either I'm misunderstanding you or I strongly disagree. If B has authority to detect a certain action, and A has authority to perform that action, then refusing to notify B because B is somehow missing some special authorization to be notified by A is nuts. This is just introducing incorrectness into the design in support of a not-actually-helpful security idea. If I can read /proc/self/mounts, I can detect changes to my mount namespace. 
Giving me a faster and nicer way to do this is fine, AS LONG AS IT ACTUALLY WORKS. "Works" means it needs to detect all changes.
Casey Schaufler <casey@schaufler-ca.com> wrote: > YES! I'm trying to decide if that's fervour or irritation at this point ;-) > And it would be really great if you put some thought into what > a rational model would be for UID based controls, too. I have put some thought into it, but I don't see a single rational model. It depends very much on the situation. In any case, that's what I was referring to when I said I might need to call inode_permission(). But UIDs don't exist for all filesystems, for example, and there are no UIDs on superblocks, mount objects or hardware events. Now, I could see that you ignore UIDs on things like keys and hardware-triggered events, but how does this interact with things like mount watches that see directories that have UIDs? Are you advocating making it such that process B can only see events triggered by process A if they have the same UID, for example? David
On 6/5/2019 9:04 AM, Andy Lutomirski wrote: > On Wed, Jun 5, 2019 at 7:51 AM Casey Schaufler <casey@schaufler-ca.com> wrote: >> On 6/5/2019 1:41 AM, David Howells wrote: >>> Casey Schaufler <casey@schaufler-ca.com> wrote: >>> >>>> I will try to explain the problem once again. If process A >>>> sends a signal (writes information) to process B the kernel >>>> checks that either process A has the same UID as process B >>>> or that process A has privilege to override that policy. >>>> Process B is passive in this access control decision, while >>>> process A is active. In the event delivery case, process A >>>> does something (e.g. modifies a keyring) that generates an >>>> event, which is then sent to process B's event buffer. >>> I think this might be the core sticking point here. It looks like two >>> different situations: >>> >>> (1) A explicitly sends event to B (eg. signalling, sendmsg, etc.) >>> >>> (2) A implicitly and unknowingly sends event to B as a side effect of some >>> other action (eg. B has a watch for the event A did). >>> >>> The LSM treats them as the same: that is B must have MAC authorisation to send >>> a message to A. >> YES! >> >> Threat is about what you can do, not what you intend to do. >> >> And it would be really great if you put some thought into what >> a rational model would be for UID based controls, too. >> >>> But there are problems with not sending the event: >>> >>> (1) B's internal state is then corrupt (or, at least, unknowingly invalid). >> Then B is a badly written program. > Either I'm misunderstanding you or I strongly disagree. A program needs to be aware of the conditions under which it gets event, *including the possibility that it may not get an event that it's not allowed*. Do you regularly write programs that go into corrupt states if an open() fails? Or where read() returns less than the amount of data you ask for? 
> If B has
> authority to detect a certain action, and A has authority to perform
> that action, then refusing to notify B because B is somehow missing
> some special authorization to be notified by A is nuts.

You are hand-waving the notion of authority. You are assuming
that if A can read X and B can read X that A can write B.

> This is just
> introducing incorrectness into the design in support of a
> not-actually-helpful security idea.

Where is the incorrectness? Are you seriously saying that you expect
all events to be generated exactly as you think they should? Have you
ever even used systemd?

> If I can read /proc/self/mounts, I can detect changes to my mount
> namespace.

Then read /proc/self/mounts! Can't you poll on an fd open on
/proc/self/mounts?

> Giving me a faster and nicer way to do this is fine, AS
> LONG AS IT ACTUALLY WORKS. "Works" means it needs to detect all
> changes.

So long as "WORKS" includes maintaining the system security policy,
I agree. No, I don't. We already have too many bizarre and unnatural
mechanisms to address whimsical special cases. If speed is such an
issue you could look at making /proc better.
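For reference, the polling approach raised above can be sketched as follows. On Linux, proc(5) documents that a change to the mount table marks a descriptor open on /proc/self/mounts with a priority event (POLLPRI) for poll(2). The helper name `mounts_changed` and its return convention are assumptions made for this illustration; they are not part of the patch set under discussion.

```c
#include <poll.h>

/*
 * Hypothetical helper: wait up to timeout_ms for the mount table behind
 * fd (a descriptor opened on /proc/self/mounts) to change.
 *
 * Returns 1 if a change was signalled, 0 on timeout, -1 on error or if
 * fd is not a usable descriptor.
 */
int mounts_changed(int fd, int timeout_ms)
{
	struct pollfd pfd = { .fd = fd, .events = POLLPRI };
	int ret;

	if (fd < 0)
		return -1;

	ret = poll(&pfd, 1, timeout_ms);
	if (ret < 0)
		return -1;
	if (ret == 0)
		return 0;
	if (pfd.revents & POLLNVAL)
		return -1;

	/* A mount or unmount shows up as POLLPRI (and POLLERR). */
	return (pfd.revents & (POLLPRI | POLLERR)) ? 1 : 0;
}
```

A caller would open /proc/self/mounts read-only, call this in a loop, and re-read the file from offset zero after each wakeup. The wakeup only says that something changed; working out what changed still means re-reading and comparing the whole table.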
Casey Schaufler <casey@schaufler-ca.com> wrote:

> > But there are problems with not sending the event:
> >
> >  (1) B's internal state is then corrupt (or, at least, unknowingly invalid).
>
> Then B is a badly written program.

No.  It may have the expectation that it will get events but then it is denied
those events and doesn't even know they've happened.

> >  (2) B can potentially figure out that the event happened by other means.
>
> Then why does it need the event mechanism in the first place?

Why does a CPU have interrupt lines?  It can always continuously poll the
hardware.  Why do poll() and select() exist?

> > I've implemented four event sources so far:
> >
> >  (1) Keys/keyrings.  You can only get events on a key you have View permission
> >      on and the other process has to have write access to it, so I think this
> >      is good enough.
>
> Sounds fine.
>
> >  (2) Block layer.  Currently this will only get you hardware error events,
> >      which is probably safe.  I'm not sure you can manipulate those without
> >      permission to directly access the device files.
>
> There's an argument to be made that this should require CAP_SYS_ADMIN,
> or that an LSM like SELinux might include hardware error events in
> policy, but generally I agree that system generated events like this
> are both harmless and pointless for the general public to watch.

CAP_SYS_ADMIN is probably too broad a hammer - this is something you might
want to let a file manager or desktop environment use.  I wonder if we could
add a CAP_SYS_NOTIFY - or is it too late for adding new caps?

> >  (3) Superblock.  This is trickier since it can see events that can be
> >      manufactured (R/W <-> R/O remounting, EDQUOT) as well as events that
> >      can't without hardware control (EIO, network link loss, RF kill).
>
> The events generated by processes (the 1st set) need controls
> like keys. The events generated by the system (the 2nd set) may
> need controls like the block layer.
>
> >  (4) Mount topology.  This is the trickiest since it allows you to see
> >      events beyond the point at which you placed your watch (in essence,
> >      you place a subtree watch).
>
> Like keys.
>
> >      The question is what permission checking should I do?  Ideally, I'd
> >      emulate a pathwalk between the watchpoint and the eventing object to
> >      see if the owner of the watchpoint could reach it.
>
> That will depend, as I've been saying, on what causes
> the event to be generated. If it's from a process, the
> question is "can the active process, the one that generated
> the event, write to the passive, watching process?"
> If it's the system on a hardware event, you may want the watcher
> to have CAP_SYS_ADMIN.
>
> >      I'd need to do a reverse walk, calling
> >      inode_permission(MAY_NOT_BLOCK) for each directory between the
> >      eventing object and the watchpoint to see if one rejects it - but
> >      some filesystems have a permission check that can't be called in this
> >      state.
>
> This is for setting the watch, right?

No.  Setting the watch requires execute permission on the directory on which
you're setting the watch, but there's no way to know what permissions will be
required for an event at that point.

I'm talking about when an event is generated (hence "eventing object").
Imagine you have a subpath:

	dirA/dirB/dirC/dirD/dirE

where dir* are directories.  If you place a watch on dirA and then an event
occurs on dirE (such as someone mounting on it), I do a walk back up the
parental tree, in the order:

	dirE, dirD, dirC, dirB, dirA

If I need to check permissions on all the directories, I would find the
watchpoint on dirA, then I would have to repeat the walk to find out whether
the owner of the watchpoint can access all of those directories (perhaps
skipping dirA since I had permission to place a watchpoint thereon).

Note that this is subject to going awry if there's a race versus rename().

> >      It would also be necessary to do this separately for each watchpoint in
> >      the parental chain.
> >
> >      Further, each permissions check would generate an audit event and
> >      could generate FAN_ACCESS and/or FAN_ACCESS_PERM fanotify events -
> >      which could be a problem if fanotify is also trying to post those
> >      events to the same watch queue.
>
> If you required that the watching process open(dir) what
> you want to watch you'd get this for free. Or did I miss
> something obvious?

A subtree watch, such as the mount topology watch, watches not only the
directory and mount object you pointed directly at, but the subtree rooted
thereon.

Take the sample program in the last patch.  It places a watch on "/" with no
filter against WATCH_INFO_RECURSIVE, so it sees all mount topology events that
happen under the VFS path subtree rooted at "/" - whether or not it can
actually pathwalk to those mounts.

David
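The reverse walk described in the message above can be modelled in userspace as a toy sketch. Everything here is hypothetical: `struct toy_dir`, its `watcher_may_search` flag (a stand-in for the result of a per-directory permission check made on behalf of the watch owner) and `toy_event_visible()` are invented for illustration and are not kernel interfaces. The sketch ignores the rename() race and the audit and fanotify side effects mentioned in the message.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one directory on the path dirA/.../dirE. */
struct toy_dir {
	const char *name;
	struct toy_dir *parent;		/* NULL at the top of the toy tree */
	bool watcher_may_search;	/* would the watch owner pass the check here? */
};

/*
 * Decide whether an event raised on 'event' should be visible to a
 * watch placed on 'watchpoint'.
 *
 * Pass 1 walks up from the eventing directory to confirm that the
 * watchpoint is on the parental chain.  Pass 2 repeats the walk and
 * checks every directory below the watchpoint; the watchpoint itself
 * is skipped because permission was needed to place the watch there.
 */
bool toy_event_visible(const struct toy_dir *event,
		       const struct toy_dir *watchpoint)
{
	const struct toy_dir *d;

	for (d = event; d && d != watchpoint; d = d->parent)
		;
	if (!d)
		return false;	/* watchpoint is not an ancestor of the event */

	for (d = event; d != watchpoint; d = d->parent)
		if (!d->watcher_may_search)
			return false;

	return true;
}
```

With the five-directory chain from the message, a watch on dirA sees an event on dirE only while dirB to dirE all pass the check. With several watchpoints on the chain, the second pass would have to be repeated once per watchpoint.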
On 6/5/2019 9:56 AM, David Howells wrote:
> Casey Schaufler <casey@schaufler-ca.com> wrote:
>
>> YES!
> I'm trying to decide if that's fervour or irritation at this point ;-)

I think I finally got the point that the underlying mechanism,
direct or indirect, isn't the issue. It's the end result that
matters. That makes me happier.

>> And it would be really great if you put some thought into what
>> a rational model would be for UID based controls, too.
> I have put some thought into it, but I don't see a single rational model.  It
> depends very much on the situation.

Right. You're mixing the kind of things that can generate events,
and that makes having a single policy difficult.

> In any case, that's what I was referring to when I said I might need to call
> inode_permission().  But UIDs don't exist for all filesystems, for example,
> and there are no UIDs on superblocks, mount objects or hardware events.

If you open() or stat() a file on those filesystems the UID
used in the access control comes from somewhere. Setting a watch
on things with UIDs should use the access mode on the file,
just like any other filesystem operation.

Things like superblocks are stickier because we don't generally
think of them as objects. If you can do statfs(), you should be
able to set a watch on the filesystem metadata.

How would you specify a watch for a hardware event? If you say
you have to open /dev/mumble to set a watch for mumbles, you're
good there, too.

> Now, I could see that you ignore UIDs on things like keys and
> hardware-triggered events, but how does this interact with things like mount
> watches that see directories that have UIDs?
>
> Are you advocating making it such that process B can only see events triggered
> by process A if they have the same UID, for example?

It's always seemed arbitrary to me that you can't open
your process up to get signals from other users. What about
putting mode bits on your ring buffer? By default you could
only accept your own events, but you could do a rb_chmod(0222)
and let all events through. Subject to additional LSM restrictions,
of course. That would require the cred of the process that
triggered the event or a system cred for "hardware" events.
If you don't like mode bits you could use an ACL for fine
granularity or a single "let'em all in" bit for coarse.

I'm not against access, I'm against uncontrolled access
in conflict with basic system policy.

> David
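The mode-bits idea floated above can be illustrated with a small userspace model. No `rb_chmod()` exists; it is only a name used in the message. The types and functions below (`toy_queue`, `toy_cred`, `toy_rb_chmod()`, `toy_may_post()`) are hypothetical, and two simplifications are assumptions of this sketch rather than anything proposed in the thread: group bits are ignored, and events carrying a "system" cred are always delivered.

```c
#include <stdbool.h>

/* Hypothetical credential of whatever triggered an event. */
struct toy_cred {
	unsigned int uid;
	bool is_system;		/* e.g. a hardware-triggered event */
};

/* Hypothetical notification queue with owner and write-mode bits. */
struct toy_queue {
	unsigned int owner;
	unsigned int mode;	/* 0200 = owner may post, 0002 = others may post */
};

/* Only the write bits are meaningful in this model. */
void toy_rb_chmod(struct toy_queue *q, unsigned int mode)
{
	q->mode = mode & 0222;
}

/* May an event triggered under 'trigger' be written into 'q'? */
bool toy_may_post(const struct toy_queue *q, const struct toy_cred *trigger)
{
	if (trigger->is_system)
		return true;	/* assumption of this sketch only */
	if (trigger->uid == q->owner)
		return (q->mode & 0200) != 0;
	return (q->mode & 0002) != 0;
}
```

A queue created with mode 0200 would accept only events triggered under its owner's UID; after `toy_rb_chmod(&q, 0222)` events triggered by any UID get through. An LSM check would still apply on top of this.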
> On Jun 5, 2019, at 10:01 AM, Casey Schaufler <casey@schaufler-ca.com> wrote: > >> On 6/5/2019 9:04 AM, Andy Lutomirski wrote: >>> On Wed, Jun 5, 2019 at 7:51 AM Casey Schaufler <casey@schaufler-ca.com> wrote: >>>> On 6/5/2019 1:41 AM, David Howells wrote: >>>> Casey Schaufler <casey@schaufler-ca.com> wrote: >>>> >>>>> I will try to explain the problem once again. If process A >>>>> sends a signal (writes information) to process B the kernel >>>>> checks that either process A has the same UID as process B >>>>> or that process A has privilege to override that policy. >>>>> Process B is passive in this access control decision, while >>>>> process A is active. In the event delivery case, process A >>>>> does something (e.g. modifies a keyring) that generates an >>>>> event, which is then sent to process B's event buffer. >>>> I think this might be the core sticking point here. It looks like two >>>> different situations: >>>> >>>> (1) A explicitly sends event to B (eg. signalling, sendmsg, etc.) >>>> >>>> (2) A implicitly and unknowingly sends event to B as a side effect of some >>>> other action (eg. B has a watch for the event A did). >>>> >>>> The LSM treats them as the same: that is B must have MAC authorisation to send >>>> a message to A. >>> YES! >>> >>> Threat is about what you can do, not what you intend to do. >>> >>> And it would be really great if you put some thought into what >>> a rational model would be for UID based controls, too. >>> >>>> But there are problems with not sending the event: >>>> >>>> (1) B's internal state is then corrupt (or, at least, unknowingly invalid). >>> Then B is a badly written program. >> Either I'm misunderstanding you or I strongly disagree. > > A program needs to be aware of the conditions under > which it gets event, *including the possibility that > it may not get an event that it's not allowed*. Do you > regularly write programs that go into corrupt states > if an open() fails? 
Or where read() returns less than > the amount of data you ask for? I do not regularly write programs that handle read() omitting data in the middle of a TCP stream. I also don’t write programs that wait for processes to die and need to handle the case where a child is dead, waitid() can see it, but SIGCHLD wasn’t sent because “security”. > >> If B has >> authority to detect a certain action, and A has authority to perform >> that action, then refusing to notify B because B is somehow missing >> some special authorization to be notified by A is nuts. > > You are hand-waving the notion of authority. You are assuming > that if A can read X and B can read X that A can write B. No, read it again please. I’m assuming that if A can *write* X and B can read X then A can send information to B.
On 6/5/2019 10:47 AM, Andy Lutomirski wrote: >> On Jun 5, 2019, at 10:01 AM, Casey Schaufler <casey@schaufler-ca.com> wrote: >> >>> On 6/5/2019 9:04 AM, Andy Lutomirski wrote: >>>> On Wed, Jun 5, 2019 at 7:51 AM Casey Schaufler <casey@schaufler-ca.com> wrote: >>>>> On 6/5/2019 1:41 AM, David Howells wrote: >>>>> Casey Schaufler <casey@schaufler-ca.com> wrote: >>>>> >>>>>> I will try to explain the problem once again. If process A >>>>>> sends a signal (writes information) to process B the kernel >>>>>> checks that either process A has the same UID as process B >>>>>> or that process A has privilege to override that policy. >>>>>> Process B is passive in this access control decision, while >>>>>> process A is active. In the event delivery case, process A >>>>>> does something (e.g. modifies a keyring) that generates an >>>>>> event, which is then sent to process B's event buffer. >>>>> I think this might be the core sticking point here. It looks like two >>>>> different situations: >>>>> >>>>> (1) A explicitly sends event to B (eg. signalling, sendmsg, etc.) >>>>> >>>>> (2) A implicitly and unknowingly sends event to B as a side effect of some >>>>> other action (eg. B has a watch for the event A did). >>>>> >>>>> The LSM treats them as the same: that is B must have MAC authorisation to send >>>>> a message to A. >>>> YES! >>>> >>>> Threat is about what you can do, not what you intend to do. >>>> >>>> And it would be really great if you put some thought into what >>>> a rational model would be for UID based controls, too. >>>> >>>>> But there are problems with not sending the event: >>>>> >>>>> (1) B's internal state is then corrupt (or, at least, unknowingly invalid). >>>> Then B is a badly written program. >>> Either I'm misunderstanding you or I strongly disagree. >> A program needs to be aware of the conditions under >> which it gets event, *including the possibility that >> it may not get an event that it's not allowed*. 
>> Do you
>> regularly write programs that go into corrupt states
>> if an open() fails? Or where read() returns less than
>> the amount of data you ask for?
> I do not regularly write programs that handle read() omitting data in the middle of a TCP stream.  I also don’t write programs that wait for processes to die and need to handle the case where a child is dead, waitid() can see it, but SIGCHLD wasn’t sent because “security”.
>
>>> If B has
>>> authority to detect a certain action, and A has authority to perform
>>> that action, then refusing to notify B because B is somehow missing
>>> some special authorization to be notified by A is nuts.
>> You are hand-waving the notion of authority. You are assuming
>> that if A can read X and B can read X that A can write B.
> No, read it again please. I’m assuming that if A can *write* X and B can read X then A can send information to B.

That is *not* a valid assumption:

A can write to /dev/null.
B can read from /dev/null.
Does not imply B can read what A wrote.
Does not imply A can send a signal to B.

A can send a UDP datagram to port 3343.
B is bound to port 3343.
Does not imply the packet will be delivered.
On 6/5/19 1:47 PM, Andy Lutomirski wrote: > >> On Jun 5, 2019, at 10:01 AM, Casey Schaufler <casey@schaufler-ca.com> wrote: >> >>> On 6/5/2019 9:04 AM, Andy Lutomirski wrote: >>>> On Wed, Jun 5, 2019 at 7:51 AM Casey Schaufler <casey@schaufler-ca.com> wrote: >>>>> On 6/5/2019 1:41 AM, David Howells wrote: >>>>> Casey Schaufler <casey@schaufler-ca.com> wrote: >>>>> >>>>>> I will try to explain the problem once again. If process A >>>>>> sends a signal (writes information) to process B the kernel >>>>>> checks that either process A has the same UID as process B >>>>>> or that process A has privilege to override that policy. >>>>>> Process B is passive in this access control decision, while >>>>>> process A is active. In the event delivery case, process A >>>>>> does something (e.g. modifies a keyring) that generates an >>>>>> event, which is then sent to process B's event buffer. >>>>> I think this might be the core sticking point here. It looks like two >>>>> different situations: >>>>> >>>>> (1) A explicitly sends event to B (eg. signalling, sendmsg, etc.) >>>>> >>>>> (2) A implicitly and unknowingly sends event to B as a side effect of some >>>>> other action (eg. B has a watch for the event A did). >>>>> >>>>> The LSM treats them as the same: that is B must have MAC authorisation to send >>>>> a message to A. >>>> YES! >>>> >>>> Threat is about what you can do, not what you intend to do. >>>> >>>> And it would be really great if you put some thought into what >>>> a rational model would be for UID based controls, too. >>>> >>>>> But there are problems with not sending the event: >>>>> >>>>> (1) B's internal state is then corrupt (or, at least, unknowingly invalid). >>>> Then B is a badly written program. >>> Either I'm misunderstanding you or I strongly disagree. >> >> A program needs to be aware of the conditions under >> which it gets event, *including the possibility that >> it may not get an event that it's not allowed*. 
>> Do you
>> regularly write programs that go into corrupt states
>> if an open() fails? Or where read() returns less than
>> the amount of data you ask for?
>
> I do not regularly write programs that handle read() omitting data in the middle of a TCP stream.  I also don’t write programs that wait for processes to die and need to handle the case where a child is dead, waitid() can see it, but SIGCHLD wasn’t sent because “security”.
>
>>
>>> If B has
>>> authority to detect a certain action, and A has authority to perform
>>> that action, then refusing to notify B because B is somehow missing
>>> some special authorization to be notified by A is nuts.
>>
>> You are hand-waving the notion of authority. You are assuming
>> that if A can read X and B can read X that A can write B.
>
> No, read it again please. I’m assuming that if A can *write* X and B can read X then A can send information to B.

I guess the questions here are:

1) How do we handle recursive notification support, since we can't check
that B can read everything below a given directory easily?  Perhaps we
can argue that if I have watch permission to / then that implies
visibility to everything below it but that is rather broad.

2) Is there always a corresponding labeled object in view for each of
these notifications to which we can check access when the watch is set?

3) Are notifications only generated for write events or can they be
generated by processes that only have read access to the object?
On Wed, Jun 05, 2019 at 02:25:33PM -0400, Stephen Smalley wrote: > On 6/5/19 1:47 PM, Andy Lutomirski wrote: > > > > > On Jun 5, 2019, at 10:01 AM, Casey Schaufler <casey@schaufler-ca.com> wrote: > > > > > > > On 6/5/2019 9:04 AM, Andy Lutomirski wrote: > > > > > On Wed, Jun 5, 2019 at 7:51 AM Casey Schaufler <casey@schaufler-ca.com> wrote: > > > > > > On 6/5/2019 1:41 AM, David Howells wrote: > > > > > > Casey Schaufler <casey@schaufler-ca.com> wrote: > > > > > > > > > > > > > I will try to explain the problem once again. If process A > > > > > > > sends a signal (writes information) to process B the kernel > > > > > > > checks that either process A has the same UID as process B > > > > > > > or that process A has privilege to override that policy. > > > > > > > Process B is passive in this access control decision, while > > > > > > > process A is active. In the event delivery case, process A > > > > > > > does something (e.g. modifies a keyring) that generates an > > > > > > > event, which is then sent to process B's event buffer. > > > > > > I think this might be the core sticking point here. It looks like two > > > > > > different situations: > > > > > > > > > > > > (1) A explicitly sends event to B (eg. signalling, sendmsg, etc.) > > > > > > > > > > > > (2) A implicitly and unknowingly sends event to B as a side effect of some > > > > > > other action (eg. B has a watch for the event A did). > > > > > > > > > > > > The LSM treats them as the same: that is B must have MAC authorisation to send > > > > > > a message to A. > > > > > YES! > > > > > > > > > > Threat is about what you can do, not what you intend to do. > > > > > > > > > > And it would be really great if you put some thought into what > > > > > a rational model would be for UID based controls, too. > > > > > > > > > > > But there are problems with not sending the event: > > > > > > > > > > > > (1) B's internal state is then corrupt (or, at least, unknowingly invalid). 
> > > > > Then B is a badly written program. > > > > Either I'm misunderstanding you or I strongly disagree. > > > > > > A program needs to be aware of the conditions under > > > which it gets event, *including the possibility that > > > it may not get an event that it's not allowed*. Do you > > > regularly write programs that go into corrupt states > > > if an open() fails? Or where read() returns less than > > > the amount of data you ask for? > > > > I do not regularly write programs that handle read() omitting data in the middle of a TCP stream. I also don’t write programs that wait for processes to die and need to handle the case where a child is dead, waitid() can see it, but SIGCHLD wasn’t sent because “security”. > > > > > > > > > If B has > > > > authority to detect a certain action, and A has authority to perform > > > > that action, then refusing to notify B because B is somehow missing > > > > some special authorization to be notified by A is nuts. > > > > > > You are hand-waving the notion of authority. You are assuming > > > that if A can read X and B can read X that A can write B. > > > > No, read it again please. I’m assuming that if A can *write* X and B can read X then A can send information to B. > > I guess the questions here are: > > 1) How do we handle recursive notification support, since we can't check > that B can read everything below a given directory easily? Perhaps we can > argue that if I have watch permission to / then that implies visibility to > everything below it but that is rather broad. How do you handle fanotify today which I think can do this? thanks, greg k-h
On 6/5/19 3:28 PM, Greg KH wrote: > On Wed, Jun 05, 2019 at 02:25:33PM -0400, Stephen Smalley wrote: >> On 6/5/19 1:47 PM, Andy Lutomirski wrote: >>> >>>> On Jun 5, 2019, at 10:01 AM, Casey Schaufler <casey@schaufler-ca.com> wrote: >>>> >>>>> On 6/5/2019 9:04 AM, Andy Lutomirski wrote: >>>>>> On Wed, Jun 5, 2019 at 7:51 AM Casey Schaufler <casey@schaufler-ca.com> wrote: >>>>>>> On 6/5/2019 1:41 AM, David Howells wrote: >>>>>>> Casey Schaufler <casey@schaufler-ca.com> wrote: >>>>>>> >>>>>>>> I will try to explain the problem once again. If process A >>>>>>>> sends a signal (writes information) to process B the kernel >>>>>>>> checks that either process A has the same UID as process B >>>>>>>> or that process A has privilege to override that policy. >>>>>>>> Process B is passive in this access control decision, while >>>>>>>> process A is active. In the event delivery case, process A >>>>>>>> does something (e.g. modifies a keyring) that generates an >>>>>>>> event, which is then sent to process B's event buffer. >>>>>>> I think this might be the core sticking point here. It looks like two >>>>>>> different situations: >>>>>>> >>>>>>> (1) A explicitly sends event to B (eg. signalling, sendmsg, etc.) >>>>>>> >>>>>>> (2) A implicitly and unknowingly sends event to B as a side effect of some >>>>>>> other action (eg. B has a watch for the event A did). >>>>>>> >>>>>>> The LSM treats them as the same: that is B must have MAC authorisation to send >>>>>>> a message to A. >>>>>> YES! >>>>>> >>>>>> Threat is about what you can do, not what you intend to do. >>>>>> >>>>>> And it would be really great if you put some thought into what >>>>>> a rational model would be for UID based controls, too. >>>>>> >>>>>>> But there are problems with not sending the event: >>>>>>> >>>>>>> (1) B's internal state is then corrupt (or, at least, unknowingly invalid). >>>>>> Then B is a badly written program. >>>>> Either I'm misunderstanding you or I strongly disagree. 
>>>> >>>> A program needs to be aware of the conditions under >>>> which it gets event, *including the possibility that >>>> it may not get an event that it's not allowed*. Do you >>>> regularly write programs that go into corrupt states >>>> if an open() fails? Or where read() returns less than >>>> the amount of data you ask for? >>> >>> I do not regularly write programs that handle read() omitting data in the middle of a TCP stream. I also don’t write programs that wait for processes to die and need to handle the case where a child is dead, waitid() can see it, but SIGCHLD wasn’t sent because “security”. >>> >>>> >>>>> If B has >>>>> authority to detect a certain action, and A has authority to perform >>>>> that action, then refusing to notify B because B is somehow missing >>>>> some special authorization to be notified by A is nuts. >>>> >>>> You are hand-waving the notion of authority. You are assuming >>>> that if A can read X and B can read X that A can write B. >>> >>> No, read it again please. I’m assuming that if A can *write* X and B can read X then A can send information to B. >> >> I guess the questions here are: >> >> 1) How do we handle recursive notification support, since we can't check >> that B can read everything below a given directory easily? Perhaps we can >> argue that if I have watch permission to / then that implies visibility to >> everything below it but that is rather broad. > > How do you handle fanotify today which I think can do this? Doesn't appear to have been given much thought; looks like fanotify_init() checks capable(CAP_SYS_ADMIN) and fanotify_mark() checks inode_permission(MAY_READ) on the mount/directory/file. File descriptors for monitored files returned upon events at least get vetted through security_file_open() so that can prevent the monitoring process from receiving arbitrary descriptors. 
Would be preferable if fanotify_mark() did some kind of security_path_watch() or similar check, and distinguished mounts versus directories since monitoring of directories is not recursive.
Casey Schaufler <casey@schaufler-ca.com> wrote:

> Right. You're mixing the kind of things that can generate events,
> and that makes having a single policy difficult.

Whilst that's true, the notifications are clearly marked as to type, so it
should be possible to select different policies for different notification
types.

Question for you: what does the LSM *actually* need?  There are a bunch of
things available, some of which may be the same thing:

 (1) The creds of the process that created a watch_queue (ie. opened
     /dev/watch_queue).

 (2) The creds of the process that set a watch (ie. called sb_notify,
     KEYCTL_NOTIFY, ...).

 (3) The creds of the process that tripped the event (which might be the
     system).

 (4) The security attributes of the object on which the watch was set (uid,
     gid, mode, labels).

 (5) The security attributes of the object on which the event was tripped.

 (6) The security attributes of all the objects between the object in (5) and
     the object in (4), assuming we work from (5) towards (4) if the two
     aren't coincident (WATCH_INFO_RECURSIVE).

At the moment, when post_one_notification() wants to write a notification into
a queue, it calls security_post_notification() to ask if it should be allowed
to do so.  This is passed (1) and (3) above plus the notification record.

The only problem I really have is that for a destruction message you want to
get the creds of who did the last put on an object and caused it to be
destroyed - I think everything else probably gets the right creds, even if
they aren't even in the same namespaces (mount propagation, yuck).

However, that one is a biggie because close()/exit() must propagate it to
deferred-fput, which must propagate it to af_unix-cleanup, and thence back to
deferred-fput and thence to implicit unmount (dissolve_on_fput()[*]).

[*] Though it should be noted that if this happens, the subtree cannot be
    attached to the root of a namespace.

> > In any case, that's what I was referring to when I said I might need to call
> > inode_permission().  But UIDs don't exist for all filesystems, for example,
> > and there are no UIDs on superblocks, mount objects or hardware events.
>
> If you open() or stat() a file on those filesystems the UID
> used in the access control comes from somewhere. Setting a watch
> on things with UIDs should use the access mode on the file,
> just like any other filesystem operation.

Another question for you: Do I need to let the LSM pass judgement on a watch
that a process is trying to set?  I think I probably do.  This would require
separate hooks for different object types:

	int security_watch_key(struct watch *watch, struct key *key);
	int security_watch_sb(struct watch *watch, struct path *path);
	int security_watch_mount(struct watch *watch, struct path *path);
	int security_watch_devices(struct watch *watch);

so that the LSM can see the object the watch is being placed on (the last has
a global queue, so there is no object).

Further, do I need to put a "void *security" pointer in struct watch and
indicate to the LSM the object being watched?  The watch could then be passed
to security_post_notification() instead of the watch queue creds (which I
could then dispense with).

	security_post_notification(const struct watch *watch,
				   const struct cred *trigger_cred,
				   struct watch_notification *n);

Also, should I let the LSM audit/edit the filter set by
IOC_WATCH_QUEUE_SET_FILTER?  Userspace can't retrieve the filter, so the LSM
could edit it to exclude certain things.  That might be a bit too complicated,
though.

> Things like superblocks are stickier because we don't generally
> think of them as objects. If you can do statfs(), you should be
> able to set a watch on the filesystem metadata.
>
> How would you specify a watch for a hardware event? If you say
> you have to open /dev/mumble to set a watch for mumbles, you're
> good there, too.

That's not how that works at the moment.  There's a global watch list for
device events.  I've repurposed it to carry any device's events - so it will
carry blockdev events (I/O errors only at the moment) and usb events
(add/remove device, add/remove bus, reset device at the moment).

> > Now, I could see that you ignore UIDs on things like keys and
> > hardware-triggered events, but how does this interact with things like mount
> > watches that see directories that have UIDs?
> >
> > Are you advocating making it such that process B can only see events
> > triggered by process A if they have the same UID, for example?
>
> It's always seemed arbitrary to me that you can't open your process up to
> get signals from other users. What about putting mode bits on your ring
> buffer? By default you could only accept your own events, but you could do a
> rb_chmod(0222) and let all events through.

Ummm...  This mechanism is pretty much about events generated by others.
Depending on what you mean by 'you' and 'your own events', it might be
considered that you would know what events you were directly causing and
wouldn't need a notification system for it.

> Subject to additional LSM restrictions, of course. That would require the cred
> of the process that triggered the event or a system cred for "hardware"
> events. If you don't like mode bits you could use an ACL for fine
> granularity or a single "let'em all in" bit for coarse.

I'm not entirely sure how an ACL would help.  If someone creates a watch
queue, sets an ACL with only a "let everything in" ACE, we're back to the
situation we're in now.

As I understand it, the issue you have is stopping them getting events that
they're willing to accept that you think they shouldn't be allowed.

> I'm not against access, I'm against uncontrolled access in conflict with
> basic system policy.

David
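The suggestion above that policy could be selected per notification type, with the watch rather than the queue creds passed to the post-time hook, might look roughly like the following userspace model. All names are hypothetical (`toy_watch`, `toy_cred`, `toy_notification`, `toy_post_notification()`), and the policy chosen for each type is an arbitrary placeholder for illustration, not a recommendation and not what the patches implement.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical notification types, one per event source in the thread. */
enum toy_watch_type {
	TOY_WATCH_KEY,
	TOY_WATCH_BLOCK,
	TOY_WATCH_SB,
	TOY_WATCH_MOUNT,
	TOY_WATCH_NR
};

struct toy_cred {
	unsigned int uid;
	bool is_system;			/* event tripped by the system */
};

struct toy_watch {
	enum toy_watch_type type;
	unsigned int watcher_uid;	/* who set the watch */
	void *security;			/* per-watch LSM blob, unused here */
};

struct toy_notification {
	enum toy_watch_type type;
};

typedef bool (*toy_policy_fn)(const struct toy_watch *,
			      const struct toy_cred *);

/* Placeholder policy: rely on checks made when the watch was set. */
static bool toy_allow(const struct toy_watch *w, const struct toy_cred *t)
{
	(void)w;
	(void)t;
	return true;
}

/* Placeholder policy: system events, or trigger UID equals watcher UID. */
static bool toy_same_uid_or_system(const struct toy_watch *w,
				   const struct toy_cred *t)
{
	return t->is_system || t->uid == w->watcher_uid;
}

static const toy_policy_fn toy_policy[TOY_WATCH_NR] = {
	[TOY_WATCH_KEY]   = toy_allow,
	[TOY_WATCH_BLOCK] = toy_allow,
	[TOY_WATCH_SB]    = toy_same_uid_or_system,
	[TOY_WATCH_MOUNT] = toy_same_uid_or_system,
};

/* Model of a post-time hook that takes the watch and the trigger cred. */
bool toy_post_notification(const struct toy_watch *watch,
			   const struct toy_cred *trigger,
			   const struct toy_notification *n)
{
	if ((unsigned int)n->type >= TOY_WATCH_NR || n->type != watch->type)
		return false;
	return toy_policy[n->type](watch, trigger);
}
```

The point of the table is only that the type tag on each notification gives a natural place to hang different rules for keys, block devices, superblocks and mounts; what those rules should be is exactly the open question in the thread.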