Message ID | 20191121145245.8637-2-sds@tycho.nsa.gov (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
Series | [RFC,1/2] selinux: revert "stop passing MAY_NOT_BLOCK to the AVC upon follow_link" | expand |
On Thu, Nov 21, 2019 at 9:52 AM Stephen Smalley <sds@tycho.nsa.gov> wrote: > commit bda0be7ad994 ("security: make inode_follow_link RCU-walk aware") > passed down the rcu flag to the SELinux AVC, but failed to adjust the > test in slow_avc_audit() to also return -ECHILD on LSM_AUDIT_DATA_DENTRY. > Previously, we only returned -ECHILD if generating an audit record with > LSM_AUDIT_DATA_INODE since this was only relevant from inode_permission. > Return -ECHILD on either LSM_AUDIT_DATA_INODE or LSM_AUDIT_DATA_DENTRY. > LSM_AUDIT_DATA_INODE only requires this handling due to the fact > that dump_common_audit_data() calls d_find_alias() and collects the > dname from the result if any. > Other cases that might require similar treatment in the future are > LSM_AUDIT_DATA_PATH and LSM_AUDIT_DATA_FILE if any hook that takes > a path or file is called under RCU-walk. > > Fixes: bda0be7ad994 ("security: make inode_follow_link RCU-walk aware") > Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov> > --- > security/selinux/avc.c | 3 ++- > 1 file changed, 2 insertions(+), 1 deletion(-) > > diff --git a/security/selinux/avc.c b/security/selinux/avc.c > index 74c43ebe34bb..f1fa1072230c 100644 > --- a/security/selinux/avc.c > +++ b/security/selinux/avc.c > @@ -779,7 +779,8 @@ noinline int slow_avc_audit(struct selinux_state *state, > * during retry. However this is logically just as if the operation > * happened a little later. > */ > - if ((a->type == LSM_AUDIT_DATA_INODE) && > + if ((a->type == LSM_AUDIT_DATA_INODE || > + a->type == LSM_AUDIT_DATA_DENTRY) && > (flags & MAY_NOT_BLOCK)) > return -ECHILD; Added the LSM list as I'm beginning to wonder if we should push this logic down into common_lsm_audit(), this problem around blocking shouldn't be SELinux specific. For the LSM folks just joining, the full patchset can be found here: * https://lore.kernel.org/selinux/20191121145245.8637-1-sds@tycho.nsa.gov/T/#t
On Thu, Nov 21, 2019 at 7:12 PM Paul Moore <paul@paul-moore.com> wrote: > On Thu, Nov 21, 2019 at 9:52 AM Stephen Smalley <sds@tycho.nsa.gov> wrote: > > commit bda0be7ad994 ("security: make inode_follow_link RCU-walk aware") > > passed down the rcu flag to the SELinux AVC, but failed to adjust the > > test in slow_avc_audit() to also return -ECHILD on LSM_AUDIT_DATA_DENTRY. > > Previously, we only returned -ECHILD if generating an audit record with > > LSM_AUDIT_DATA_INODE since this was only relevant from inode_permission. > > Return -ECHILD on either LSM_AUDIT_DATA_INODE or LSM_AUDIT_DATA_DENTRY. > > LSM_AUDIT_DATA_INODE only requires this handling due to the fact > > that dump_common_audit_data() calls d_find_alias() and collects the > > dname from the result if any. > > Other cases that might require similar treatment in the future are > > LSM_AUDIT_DATA_PATH and LSM_AUDIT_DATA_FILE if any hook that takes > > a path or file is called under RCU-walk. > > > > Fixes: bda0be7ad994 ("security: make inode_follow_link RCU-walk aware") > > Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov> > > --- > > security/selinux/avc.c | 3 ++- > > 1 file changed, 2 insertions(+), 1 deletion(-) > > > > diff --git a/security/selinux/avc.c b/security/selinux/avc.c > > index 74c43ebe34bb..f1fa1072230c 100644 > > --- a/security/selinux/avc.c > > +++ b/security/selinux/avc.c > > @@ -779,7 +779,8 @@ noinline int slow_avc_audit(struct selinux_state *state, > > * during retry. However this is logically just as if the operation > > * happened a little later. > > */ > > - if ((a->type == LSM_AUDIT_DATA_INODE) && > > + if ((a->type == LSM_AUDIT_DATA_INODE || > > + a->type == LSM_AUDIT_DATA_DENTRY) && > > (flags & MAY_NOT_BLOCK)) > > return -ECHILD; With LSM_AUDIT_DATA_INODE we eventually end up calling d_find_alias() in dump_common_audit_data() which could block, which is bad, that I understand. However, looking at LSM_AUDIT_DATA_DENTRY I'm less clear on why that is bad? It makes a few audit_log*() calls and one call to d_backing_inode() which is non-blocking and trivial. What am I missing? > Added the LSM list as I'm beginning to wonder if we should push this > logic down into common_lsm_audit(), this problem around blocking > shouldn't be SELinux specific. > > For the LSM folks just joining, the full patchset can be found here: > * https://lore.kernel.org/selinux/20191121145245.8637-1-sds@tycho.nsa.gov/T/#t -- paul moore www.paul-moore.com
On 11/21/19 7:30 PM, Paul Moore wrote: > On Thu, Nov 21, 2019 at 7:12 PM Paul Moore <paul@paul-moore.com> wrote: >> On Thu, Nov 21, 2019 at 9:52 AM Stephen Smalley <sds@tycho.nsa.gov> wrote: >>> commit bda0be7ad994 ("security: make inode_follow_link RCU-walk aware") >>> passed down the rcu flag to the SELinux AVC, but failed to adjust the >>> test in slow_avc_audit() to also return -ECHILD on LSM_AUDIT_DATA_DENTRY. >>> Previously, we only returned -ECHILD if generating an audit record with >>> LSM_AUDIT_DATA_INODE since this was only relevant from inode_permission. >>> Return -ECHILD on either LSM_AUDIT_DATA_INODE or LSM_AUDIT_DATA_DENTRY. >>> LSM_AUDIT_DATA_INODE only requires this handling due to the fact >>> that dump_common_audit_data() calls d_find_alias() and collects the >>> dname from the result if any. >>> Other cases that might require similar treatment in the future are >>> LSM_AUDIT_DATA_PATH and LSM_AUDIT_DATA_FILE if any hook that takes >>> a path or file is called under RCU-walk. >>> >>> Fixes: bda0be7ad994 ("security: make inode_follow_link RCU-walk aware") >>> Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov> >>> --- >>> security/selinux/avc.c | 3 ++- >>> 1 file changed, 2 insertions(+), 1 deletion(-) >>> >>> diff --git a/security/selinux/avc.c b/security/selinux/avc.c >>> index 74c43ebe34bb..f1fa1072230c 100644 >>> --- a/security/selinux/avc.c >>> +++ b/security/selinux/avc.c >>> @@ -779,7 +779,8 @@ noinline int slow_avc_audit(struct selinux_state *state, >>> * during retry. However this is logically just as if the operation >>> * happened a little later. >>> */ >>> - if ((a->type == LSM_AUDIT_DATA_INODE) && >>> + if ((a->type == LSM_AUDIT_DATA_INODE || >>> + a->type == LSM_AUDIT_DATA_DENTRY) && >>> (flags & MAY_NOT_BLOCK)) >>> return -ECHILD; > > With LSM_AUDIT_DATA_INODE we eventually end up calling d_find_alias() > in dump_common_audit_data() which could block, which is bad, that I > understand. However, looking at LSM_AUDIT_DATA_DENTRY I'm less clear > on why that is bad? It makes a few audit_log*() calls and one call to > d_backing_inode() which is non-blocking and trivial. > > What am I missing? For those who haven't, you may wish to also read the earlier thread here that led to this one: https://lore.kernel.org/selinux/20191119184057.14961-1-will@kernel.org/T/#t AFAIK, neither the LSM_AUDIT_DATA_INODE nor the LSM_AUDIT_DATA_DENTRY case truly block (d_find_alias does not block AFAICT, nor should audit_log* as long as we use audit_log_start with GFP_ATOMIC or GFP_NOWAIT). My impression from the comment in slow_avc_audit() is that the issue is not really about blocking but rather about the inability to safely dereference the dentry->d_name during RCU walk, which is something that can occur under LSM_AUDIT_DATA_INODE or _DENTRY (or _PATH or _FILE, but neither of the latter two are currently used from the two hooks that are called during RCU walk, inode_permission and inode_follow_link). Originally _PATH, _DENTRY, and _INODE were all under a single _FS type and the original test in slow_avc_audit() was against LSM_AUDIT_DATA_FS before the split. > >> Added the LSM list as I'm beginning to wonder if we should push this >> logic down into common_lsm_audit(), this problem around blocking >> shouldn't be SELinux specific. That would require passing down the MAY_NOT_BLOCK flag or a rcu bool to common_lsm_audit() just so that it could immediately return if that is set and a->type is _INODE or _DENTRY (or _PATH or _FILE). And the individual security module still needs to have its own handling of MAY_NOT_BLOCK/rcu for its own processing, so it won't free the security module authors from thinking about it. This is only relevant for modules implementing the inode_permission and/or inode_follow_link hooks, so it only currently affects SELinux and Smack, and Smack only presently implements inode_permission and always returns -ECHILD if MAY_NOT_BLOCK (aside from a couple trivial cases), so it will never reach common_lsm_audit() in that case. >> >> For the LSM folks just joining, the full patchset can be found here: >> * https://lore.kernel.org/selinux/20191121145245.8637-1-sds@tycho.nsa.gov/T/#t
On 11/22/19 8:37 AM, Stephen Smalley wrote: > On 11/21/19 7:30 PM, Paul Moore wrote: >> On Thu, Nov 21, 2019 at 7:12 PM Paul Moore <paul@paul-moore.com> wrote: >>> On Thu, Nov 21, 2019 at 9:52 AM Stephen Smalley <sds@tycho.nsa.gov> >>> wrote: >>>> commit bda0be7ad994 ("security: make inode_follow_link RCU-walk aware") >>>> passed down the rcu flag to the SELinux AVC, but failed to adjust the >>>> test in slow_avc_audit() to also return -ECHILD on >>>> LSM_AUDIT_DATA_DENTRY. >>>> Previously, we only returned -ECHILD if generating an audit record with >>>> LSM_AUDIT_DATA_INODE since this was only relevant from >>>> inode_permission. >>>> Return -ECHILD on either LSM_AUDIT_DATA_INODE or LSM_AUDIT_DATA_DENTRY. >>>> LSM_AUDIT_DATA_INODE only requires this handling due to the fact >>>> that dump_common_audit_data() calls d_find_alias() and collects the >>>> dname from the result if any. >>>> Other cases that might require similar treatment in the future are >>>> LSM_AUDIT_DATA_PATH and LSM_AUDIT_DATA_FILE if any hook that takes >>>> a path or file is called under RCU-walk. >>>> >>>> Fixes: bda0be7ad994 ("security: make inode_follow_link RCU-walk aware") >>>> Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov> >>>> --- >>>> security/selinux/avc.c | 3 ++- >>>> 1 file changed, 2 insertions(+), 1 deletion(-) >>>> >>>> diff --git a/security/selinux/avc.c b/security/selinux/avc.c >>>> index 74c43ebe34bb..f1fa1072230c 100644 >>>> --- a/security/selinux/avc.c >>>> +++ b/security/selinux/avc.c >>>> @@ -779,7 +779,8 @@ noinline int slow_avc_audit(struct selinux_state >>>> *state, >>>> * during retry. However this is logically just as if the >>>> operation >>>> * happened a little later. >>>> */ >>>> - if ((a->type == LSM_AUDIT_DATA_INODE) && >>>> + if ((a->type == LSM_AUDIT_DATA_INODE || >>>> + a->type == LSM_AUDIT_DATA_DENTRY) && >>>> (flags & MAY_NOT_BLOCK)) >>>> return -ECHILD; >> >> With LSM_AUDIT_DATA_INODE we eventually end up calling d_find_alias() >> in dump_common_audit_data() which could block, which is bad, that I >> understand. However, looking at LSM_AUDIT_DATA_DENTRY I'm less clear >> on why that is bad? It makes a few audit_log*() calls and one call to >> d_backing_inode() which is non-blocking and trivial. >> >> What am I missing? > > For those who haven't, you may wish to also read the earlier thread here > that led to this one: > https://lore.kernel.org/selinux/20191119184057.14961-1-will@kernel.org/T/#t > > AFAIK, neither the LSM_AUDIT_DATA_INODE nor the LSM_AUDIT_DATA_DENTRY > case truly block (d_find_alias does not block AFAICT, nor should > audit_log* as long as we use audit_log_start with GFP_ATOMIC or > GFP_NOWAIT). My impression from the comment in slow_avc_audit() is that > the issue is not really about blocking but rather about the inability to > safely dereference the dentry->d_name during RCU walk, which is > something that can occur under LSM_AUDIT_DATA_INODE or _DENTRY (or _PATH > or _FILE, but neither of the latter two are currently used from the two > hooks that are called during RCU walk, inode_permission and > inode_follow_link). Originally _PATH, _DENTRY, and _INODE were all > under a single _FS type and the original test in slow_avc_audit() was > against LSM_AUDIT_DATA_FS before the split. > >> >>> Added the LSM list as I'm beginning to wonder if we should push this >>> logic down into common_lsm_audit(), this problem around blocking >>> shouldn't be SELinux specific. > > That would require passing down the MAY_NOT_BLOCK flag or a rcu bool to > common_lsm_audit() just so that it could immediately return if that is > set and a->type is _INODE or _DENTRY (or _PATH or _FILE). And the > individual security module still needs to have its own handling of > MAY_NOT_BLOCK/rcu for its own processing, so it won't free the security > module authors from thinking about it. This is only relevant for > modules implementing the inode_permission and/or inode_follow_link > hooks, so it only currently affects SELinux and Smack, and Smack only > presently implements inode_permission and always returns -ECHILD if > MAY_NOT_BLOCK (aside from a couple trivial cases), so it will never > reach common_lsm_audit() in that case. This would also require changing common_lsm_audit() to be able to return errors so that it can return -ECHILD and updating all callers to handle that. > > >>> >>> For the LSM folks just joining, the full patchset can be found here: >>> * >>> https://lore.kernel.org/selinux/20191121145245.8637-1-sds@tycho.nsa.gov/T/#t >>>
On Fri, Nov 22, 2019 at 8:37 AM Stephen Smalley <sds@tycho.nsa.gov> wrote: > On 11/21/19 7:30 PM, Paul Moore wrote: > > On Thu, Nov 21, 2019 at 7:12 PM Paul Moore <paul@paul-moore.com> wrote: > >> On Thu, Nov 21, 2019 at 9:52 AM Stephen Smalley <sds@tycho.nsa.gov> wrote: > >>> commit bda0be7ad994 ("security: make inode_follow_link RCU-walk aware") > >>> passed down the rcu flag to the SELinux AVC, but failed to adjust the > >>> test in slow_avc_audit() to also return -ECHILD on LSM_AUDIT_DATA_DENTRY. > >>> Previously, we only returned -ECHILD if generating an audit record with > >>> LSM_AUDIT_DATA_INODE since this was only relevant from inode_permission. > >>> Return -ECHILD on either LSM_AUDIT_DATA_INODE or LSM_AUDIT_DATA_DENTRY. > >>> LSM_AUDIT_DATA_INODE only requires this handling due to the fact > >>> that dump_common_audit_data() calls d_find_alias() and collects the > >>> dname from the result if any. > >>> Other cases that might require similar treatment in the future are > >>> LSM_AUDIT_DATA_PATH and LSM_AUDIT_DATA_FILE if any hook that takes > >>> a path or file is called under RCU-walk. > >>> > >>> Fixes: bda0be7ad994 ("security: make inode_follow_link RCU-walk aware") > >>> Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov> > >>> --- > >>> security/selinux/avc.c | 3 ++- > >>> 1 file changed, 2 insertions(+), 1 deletion(-) > >>> > >>> diff --git a/security/selinux/avc.c b/security/selinux/avc.c > >>> index 74c43ebe34bb..f1fa1072230c 100644 > >>> --- a/security/selinux/avc.c > >>> +++ b/security/selinux/avc.c > >>> @@ -779,7 +779,8 @@ noinline int slow_avc_audit(struct selinux_state *state, > >>> * during retry. However this is logically just as if the operation > >>> * happened a little later. > >>> */ > >>> - if ((a->type == LSM_AUDIT_DATA_INODE) && > >>> + if ((a->type == LSM_AUDIT_DATA_INODE || > >>> + a->type == LSM_AUDIT_DATA_DENTRY) && > >>> (flags & MAY_NOT_BLOCK)) > >>> return -ECHILD; > > > > With LSM_AUDIT_DATA_INODE we eventually end up calling d_find_alias() > > in dump_common_audit_data() which could block, which is bad, that I > > understand. However, looking at LSM_AUDIT_DATA_DENTRY I'm less clear > > on why that is bad? It makes a few audit_log*() calls and one call to > > d_backing_inode() which is non-blocking and trivial. > > > > What am I missing? > > For those who haven't, you may wish to also read the earlier thread here > that led to this one: > https://lore.kernel.org/selinux/20191119184057.14961-1-will@kernel.org/T/#t > > AFAIK, neither the LSM_AUDIT_DATA_INODE nor the LSM_AUDIT_DATA_DENTRY > case truly block (d_find_alias does not block AFAICT, nor should > audit_log* as long as we use audit_log_start with GFP_ATOMIC or > GFP_NOWAIT). Yes, the audit_log*() functions should be safe, if not I would consider that a bug; I thought d_find_alias() might block, but it's very likely I'm wrong in that regard. > My impression from the comment in slow_avc_audit() is that > the issue is not really about blocking but rather about the inability to > safely dereference the dentry->d_name during RCU walk, which is > something that can occur under LSM_AUDIT_DATA_INODE or _DENTRY (or _PATH > or _FILE, but neither of the latter two are currently used from the two > hooks that are called during RCU walk, inode_permission and > inode_follow_link). Originally _PATH, _DENTRY, and _INODE were all > under a single _FS type and the original test in slow_avc_audit() was > against LSM_AUDIT_DATA_FS before the split. Thanks, I think that is the part I was missing. I was focused too much on the VFS stuff that I didn't pay enough attention to slow_avc_audit(). If that is the case, the comment and code in dentry_cmp() would seem to indicate that it would be safe to fetch the dentry name string as long as we use READ_ONCE(). The length field in the qstr might be off, but the audit_log_untrustedstring() function doesn't use the qstr's length information. I suppose if we don't mind the extra spinlock we could use take_dentry_name_snapshot(); that should be safe and we are already in the "slow" path. I didn't check the _PATH or _FILE cases. Once again, let me know if I'm missing something. As an aside, if we somehow can guarantee (e.g. via a name_snapshot) that qstr length information is valid, we might want to consider moving from audit_log_unstrustedstring() to audit_log_n_untrustedstring() to save us a call to strlen(). > >> Added the LSM list as I'm beginning to wonder if we should push this > >> logic down into common_lsm_audit(), this problem around blocking > >> shouldn't be SELinux specific. > > That would require passing down the MAY_NOT_BLOCK flag or a rcu bool to > common_lsm_audit() just so that it could immediately return if that is > set and a->type is _INODE or _DENTRY (or _PATH or _FILE). And the > individual security module still needs to have its own handling of > MAY_NOT_BLOCK/rcu for its own processing, so it won't free the security > module authors from thinking about it. Looking at the current SELinux code, all we do is bail out early with -ECHILD. If we didn't have that check it looks like the only impact would be some extra assignments into a struct living on the stack and a call into common_lsm_audit(). That doesn't seem terrible for a slow path, especially if it pushes this code into a LSM common function. > >> For the LSM folks just joining, the full patchset can be found here: > >> * https://lore.kernel.org/selinux/20191121145245.8637-1-sds@tycho.nsa.gov/T/#t
On 11/22/19 9:49 AM, Paul Moore wrote: > On Fri, Nov 22, 2019 at 8:37 AM Stephen Smalley <sds@tycho.nsa.gov> wrote: >> On 11/21/19 7:30 PM, Paul Moore wrote: >>> On Thu, Nov 21, 2019 at 7:12 PM Paul Moore <paul@paul-moore.com> wrote: >>>> On Thu, Nov 21, 2019 at 9:52 AM Stephen Smalley <sds@tycho.nsa.gov> wrote: >>>>> commit bda0be7ad994 ("security: make inode_follow_link RCU-walk aware") >>>>> passed down the rcu flag to the SELinux AVC, but failed to adjust the >>>>> test in slow_avc_audit() to also return -ECHILD on LSM_AUDIT_DATA_DENTRY. >>>>> Previously, we only returned -ECHILD if generating an audit record with >>>>> LSM_AUDIT_DATA_INODE since this was only relevant from inode_permission. >>>>> Return -ECHILD on either LSM_AUDIT_DATA_INODE or LSM_AUDIT_DATA_DENTRY. >>>>> LSM_AUDIT_DATA_INODE only requires this handling due to the fact >>>>> that dump_common_audit_data() calls d_find_alias() and collects the >>>>> dname from the result if any. >>>>> Other cases that might require similar treatment in the future are >>>>> LSM_AUDIT_DATA_PATH and LSM_AUDIT_DATA_FILE if any hook that takes >>>>> a path or file is called under RCU-walk. >>>>> >>>>> Fixes: bda0be7ad994 ("security: make inode_follow_link RCU-walk aware") >>>>> Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov> >>>>> --- >>>>> security/selinux/avc.c | 3 ++- >>>>> 1 file changed, 2 insertions(+), 1 deletion(-) >>>>> >>>>> diff --git a/security/selinux/avc.c b/security/selinux/avc.c >>>>> index 74c43ebe34bb..f1fa1072230c 100644 >>>>> --- a/security/selinux/avc.c >>>>> +++ b/security/selinux/avc.c >>>>> @@ -779,7 +779,8 @@ noinline int slow_avc_audit(struct selinux_state *state, >>>>> * during retry. However this is logically just as if the operation >>>>> * happened a little later. >>>>> */ >>>>> - if ((a->type == LSM_AUDIT_DATA_INODE) && >>>>> + if ((a->type == LSM_AUDIT_DATA_INODE || >>>>> + a->type == LSM_AUDIT_DATA_DENTRY) && >>>>> (flags & MAY_NOT_BLOCK)) >>>>> return -ECHILD; >>> >>> With LSM_AUDIT_DATA_INODE we eventually end up calling d_find_alias() >>> in dump_common_audit_data() which could block, which is bad, that I >>> understand. However, looking at LSM_AUDIT_DATA_DENTRY I'm less clear >>> on why that is bad? It makes a few audit_log*() calls and one call to >>> d_backing_inode() which is non-blocking and trivial. >>> >>> What am I missing? >> >> For those who haven't, you may wish to also read the earlier thread here >> that led to this one: >> https://lore.kernel.org/selinux/20191119184057.14961-1-will@kernel.org/T/#t >> >> AFAIK, neither the LSM_AUDIT_DATA_INODE nor the LSM_AUDIT_DATA_DENTRY >> case truly block (d_find_alias does not block AFAICT, nor should >> audit_log* as long as we use audit_log_start with GFP_ATOMIC or >> GFP_NOWAIT). > > Yes, the audit_log*() functions should be safe, if not I would > consider that a bug; I thought d_find_alias() might block, but it's > very likely I'm wrong in that regard. No, it doesn't appear to block. However, it does take d_lock and increment d_lockref.count, which IIUC aren't permitted during rcu-walk. >> My impression from the comment in slow_avc_audit() is that >> the issue is not really about blocking but rather about the inability to >> safely dereference the dentry->d_name during RCU walk, which is >> something that can occur under LSM_AUDIT_DATA_INODE or _DENTRY (or _PATH >> or _FILE, but neither of the latter two are currently used from the two >> hooks that are called during RCU walk, inode_permission and >> inode_follow_link). Originally _PATH, _DENTRY, and _INODE were all >> under a single _FS type and the original test in slow_avc_audit() was >> against LSM_AUDIT_DATA_FS before the split. > > Thanks, I think that is the part I was missing. I was focused too > much on the VFS stuff that I didn't pay enough attention to > slow_avc_audit(). > > If that is the case, the comment and code in dentry_cmp() would seem > to indicate that it would be safe to fetch the dentry name string as > long as we use READ_ONCE(). The length field in the qstr might be > off, but the audit_log_untrustedstring() function doesn't use the > qstr's length information. I suppose if we don't mind the extra > spinlock we could use take_dentry_name_snapshot(); that should be safe > and we are already in the "slow" path. I didn't check the _PATH or > _FILE cases. > > Once again, let me know if I'm missing something. We can't take any spinlocks on the dentry during rcu-walk IIUC; that would defeat the purpose. In looking for a parallel with filesystem implementations, I noted that fs/namei.c:get_link() doesn't even pass the dentry to the filesystem get_link() method in the rcu-walk case, only doing so under ref-walk. So they won't permit the filesystem implementations to ever dereference the dentry for get_link() under rcu-walk. Not sure why it gets passed to security_inode_follow_link() then, or if it is ever safe for a security module to dereference its fields. I was hoping to get fsdevel folks to comment since I feel like we're guessing about exactly what guarantees we have in this area. > > As an aside, if we somehow can guarantee (e.g. via a name_snapshot) > that qstr length information is valid, we might want to consider > moving from audit_log_unstrustedstring() to > audit_log_n_untrustedstring() to save us a call to strlen(). > >>>> Added the LSM list as I'm beginning to wonder if we should push this >>>> logic down into common_lsm_audit(), this problem around blocking >>>> shouldn't be SELinux specific. >> >> That would require passing down the MAY_NOT_BLOCK flag or a rcu bool to >> common_lsm_audit() just so that it could immediately return if that is >> set and a->type is _INODE or _DENTRY (or _PATH or _FILE). And the >> individual security module still needs to have its own handling of >> MAY_NOT_BLOCK/rcu for its own processing, so it won't free the security >> module authors from thinking about it. > > Looking at the current SELinux code, all we do is bail out early with > -ECHILD. If we didn't have that check it looks like the only impact > would be some extra assignments into a struct living on the stack and > a call into common_lsm_audit(). That doesn't seem terrible for a slow > path, especially if it pushes this code into a LSM common function. Not terrible, just not sure if it ends up being a worthwhile change. If the LSM maintainers would like it that way, I can do that. > >>>> For the LSM folks just joining, the full patchset can be found here: >>>> * https://lore.kernel.org/selinux/20191121145245.8637-1-sds@tycho.nsa.gov/T/#t >
On Thu, Nov 21, 2019 at 09:52:45AM -0500, Stephen Smalley wrote: > commit bda0be7ad994 ("security: make inode_follow_link RCU-walk aware") > passed down the rcu flag to the SELinux AVC, but failed to adjust the > test in slow_avc_audit() to also return -ECHILD on LSM_AUDIT_DATA_DENTRY. > Previously, we only returned -ECHILD if generating an audit record with > LSM_AUDIT_DATA_INODE since this was only relevant from inode_permission. > Return -ECHILD on either LSM_AUDIT_DATA_INODE or LSM_AUDIT_DATA_DENTRY. > LSM_AUDIT_DATA_INODE only requires this handling due to the fact > that dump_common_audit_data() calls d_find_alias() and collects the > dname from the result if any. > Other cases that might require similar treatment in the future are > LSM_AUDIT_DATA_PATH and LSM_AUDIT_DATA_FILE if any hook that takes > a path or file is called under RCU-walk. > > Fixes: bda0be7ad994 ("security: make inode_follow_link RCU-walk aware") > Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov> > --- > security/selinux/avc.c | 3 ++- > 1 file changed, 2 insertions(+), 1 deletion(-) > > diff --git a/security/selinux/avc.c b/security/selinux/avc.c > index 74c43ebe34bb..f1fa1072230c 100644 > --- a/security/selinux/avc.c > +++ b/security/selinux/avc.c > @@ -779,7 +779,8 @@ noinline int slow_avc_audit(struct selinux_state *state, > * during retry. However this is logically just as if the operation > * happened a little later. > */ > - if ((a->type == LSM_AUDIT_DATA_INODE) && > + if ((a->type == LSM_AUDIT_DATA_INODE || > + a->type == LSM_AUDIT_DATA_DENTRY) && > (flags & MAY_NOT_BLOCK)) IDGI, to be honest. Why do we bother with slow path if MAY_NOT_BLOCK has been given? If we'd run into "there's something to report" case, we are not on the fastpath anymore. IOW, why not have audited = avc_audit_required(requested, avd, result, 0, &denied); if (likely(!audited)) return 0; if (flags & MAY_NOT_BLOCK) return -ECHILD; return slow_avc_audit(state, ssid, tsid, tclass, requested, audited, denied, result, a, flags); in avc_audit() and be done with that? It's not just whether we *can* collect whatever audit might want; do we want to try and make an audit-spewing syscall marginally faster? And "marginally" is all you'll get there, really... We could do error = security_inode_follow_link(dentry, inode, nd->flags & LOOKUP_RCU); if (unlikely(error)) { if (error == -ECHILD && !unlazy_walk(nd)) error = security_inode_follow_link(dentry, inode, 0); if (error) return ERR_PTR(error); } in fs/namei.c:get_link() to slightly reduce the costs; that might or might not be useful - I'd like to see profiling results first. But trying to push the actual "spew to audit" into RCU case? What for?
On 11/22/19 11:11 AM, Al Viro wrote: > On Thu, Nov 21, 2019 at 09:52:45AM -0500, Stephen Smalley wrote: >> commit bda0be7ad994 ("security: make inode_follow_link RCU-walk aware") >> passed down the rcu flag to the SELinux AVC, but failed to adjust the >> test in slow_avc_audit() to also return -ECHILD on LSM_AUDIT_DATA_DENTRY. >> Previously, we only returned -ECHILD if generating an audit record with >> LSM_AUDIT_DATA_INODE since this was only relevant from inode_permission. >> Return -ECHILD on either LSM_AUDIT_DATA_INODE or LSM_AUDIT_DATA_DENTRY. >> LSM_AUDIT_DATA_INODE only requires this handling due to the fact >> that dump_common_audit_data() calls d_find_alias() and collects the >> dname from the result if any. >> Other cases that might require similar treatment in the future are >> LSM_AUDIT_DATA_PATH and LSM_AUDIT_DATA_FILE if any hook that takes >> a path or file is called under RCU-walk. >> >> Fixes: bda0be7ad994 ("security: make inode_follow_link RCU-walk aware") >> Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov> >> --- >> security/selinux/avc.c | 3 ++- >> 1 file changed, 2 insertions(+), 1 deletion(-) >> >> diff --git a/security/selinux/avc.c b/security/selinux/avc.c >> index 74c43ebe34bb..f1fa1072230c 100644 >> --- a/security/selinux/avc.c >> +++ b/security/selinux/avc.c >> @@ -779,7 +779,8 @@ noinline int slow_avc_audit(struct selinux_state *state, >> * during retry. However this is logically just as if the operation >> * happened a little later. >> */ >> - if ((a->type == LSM_AUDIT_DATA_INODE) && >> + if ((a->type == LSM_AUDIT_DATA_INODE || >> + a->type == LSM_AUDIT_DATA_DENTRY) && >> (flags & MAY_NOT_BLOCK)) > > IDGI, to be honest. Why do we bother with slow path if MAY_NOT_BLOCK has > been given? If we'd run into "there's something to report" case, we > are not on the fastpath anymore. IOW, why not have > audited = avc_audit_required(requested, avd, result, 0, &denied); > if (likely(!audited)) > return 0; > if (flags & MAY_NOT_BLOCK) > return -ECHILD; > return slow_avc_audit(state, ssid, tsid, tclass, > requested, audited, denied, result, > a, flags); > in avc_audit() and be done with that? That works for me; we would also need to do the same in selinux_inode_permission(). We can then stop passing flags down to slow_avc_audit() entirely. > It's not just whether we *can* collect whatever audit might want; do > we want to try and make an audit-spewing syscall marginally faster? > And "marginally" is all you'll get there, really... > > We could do > error = security_inode_follow_link(dentry, inode, > nd->flags & LOOKUP_RCU); > if (unlikely(error)) { > if (error == -ECHILD && !unlazy_walk(nd)) > error = security_inode_follow_link(dentry, inode, 0); > if (error) > return ERR_PTR(error); > } > in fs/namei.c:get_link() to slightly reduce the costs; that might or > might not be useful - I'd like to see profiling results first. But > trying to push the actual "spew to audit" into RCU case? What for? >
On 11/22/19 10:09 AM, Stephen Smalley wrote: > On 11/22/19 9:49 AM, Paul Moore wrote: >> On Fri, Nov 22, 2019 at 8:37 AM Stephen Smalley <sds@tycho.nsa.gov> >> wrote: >>> On 11/21/19 7:30 PM, Paul Moore wrote: >>>> On Thu, Nov 21, 2019 at 7:12 PM Paul Moore <paul@paul-moore.com> wrote: >>>>> On Thu, Nov 21, 2019 at 9:52 AM Stephen Smalley <sds@tycho.nsa.gov> >>>>> wrote: >>>>>> commit bda0be7ad994 ("security: make inode_follow_link RCU-walk >>>>>> aware") >>>>>> passed down the rcu flag to the SELinux AVC, but failed to adjust the >>>>>> test in slow_avc_audit() to also return -ECHILD on >>>>>> LSM_AUDIT_DATA_DENTRY. >>>>>> Previously, we only returned -ECHILD if generating an audit record >>>>>> with >>>>>> LSM_AUDIT_DATA_INODE since this was only relevant from >>>>>> inode_permission. >>>>>> Return -ECHILD on either LSM_AUDIT_DATA_INODE or >>>>>> LSM_AUDIT_DATA_DENTRY. >>>>>> LSM_AUDIT_DATA_INODE only requires this handling due to the fact >>>>>> that dump_common_audit_data() calls d_find_alias() and collects the >>>>>> dname from the result if any. >>>>>> Other cases that might require similar treatment in the future are >>>>>> LSM_AUDIT_DATA_PATH and LSM_AUDIT_DATA_FILE if any hook that takes >>>>>> a path or file is called under RCU-walk. >>>>>> >>>>>> Fixes: bda0be7ad994 ("security: make inode_follow_link RCU-walk >>>>>> aware") >>>>>> Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov> >>>>>> --- >>>>>> security/selinux/avc.c | 3 ++- >>>>>> 1 file changed, 2 insertions(+), 1 deletion(-) >>>>>> >>>>>> diff --git a/security/selinux/avc.c b/security/selinux/avc.c >>>>>> index 74c43ebe34bb..f1fa1072230c 100644 >>>>>> --- a/security/selinux/avc.c >>>>>> +++ b/security/selinux/avc.c >>>>>> @@ -779,7 +779,8 @@ noinline int slow_avc_audit(struct >>>>>> selinux_state *state, >>>>>> * during retry. However this is logically just as if >>>>>> the operation >>>>>> * happened a little later. >>>>>> */ >>>>>> - if ((a->type == LSM_AUDIT_DATA_INODE) && >>>>>> + if ((a->type == LSM_AUDIT_DATA_INODE || >>>>>> + a->type == LSM_AUDIT_DATA_DENTRY) && >>>>>> (flags & MAY_NOT_BLOCK)) >>>>>> return -ECHILD; >>>> >>>> With LSM_AUDIT_DATA_INODE we eventually end up calling d_find_alias() >>>> in dump_common_audit_data() which could block, which is bad, that I >>>> understand. However, looking at LSM_AUDIT_DATA_DENTRY I'm less clear >>>> on why that is bad? It makes a few audit_log*() calls and one call to >>>> d_backing_inode() which is non-blocking and trivial. >>>> >>>> What am I missing? >>> >>> For those who haven't, you may wish to also read the earlier thread here >>> that led to this one: >>> https://lore.kernel.org/selinux/20191119184057.14961-1-will@kernel.org/T/#t >>> >>> >>> AFAIK, neither the LSM_AUDIT_DATA_INODE nor the LSM_AUDIT_DATA_DENTRY >>> case truly block (d_find_alias does not block AFAICT, nor should >>> audit_log* as long as we use audit_log_start with GFP_ATOMIC or >>> GFP_NOWAIT). >> >> Yes, the audit_log*() functions should be safe, if not I would >> consider that a bug; I thought d_find_alias() might block, but it's >> very likely I'm wrong in that regard. > > No, it doesn't appear to block. However, it does take d_lock and > increment d_lockref.count, which IIUC aren't permitted during rcu-walk. > >>> My impression from the comment in slow_avc_audit() is that >>> the issue is not really about blocking but rather about the inability to >>> safely dereference the dentry->d_name during RCU walk, which is >>> something that can occur under LSM_AUDIT_DATA_INODE or _DENTRY (or _PATH >>> or _FILE, but neither of the latter two are currently used from the two >>> hooks that are called during RCU walk, inode_permission and >>> inode_follow_link). Originally _PATH, _DENTRY, and _INODE were all >>> under a single _FS type and the original test in slow_avc_audit() was >>> against LSM_AUDIT_DATA_FS before the split. >> >> Thanks, I think that is the part I was missing. I was focused too >> much on the VFS stuff that I didn't pay enough attention to >> slow_avc_audit(). >> >> If that is the case, the comment and code in dentry_cmp() would seem >> to indicate that it would be safe to fetch the dentry name string as >> long as we use READ_ONCE(). The length field in the qstr might be >> off, but the audit_log_untrustedstring() function doesn't use the >> qstr's length information. I suppose if we don't mind the extra >> spinlock we could use take_dentry_name_snapshot(); that should be safe >> and we are already in the "slow" path. I didn't check the _PATH or >> _FILE cases. >> >> Once again, let me know if I'm missing something. > > We can't take any spinlocks on the dentry during rcu-walk IIUC; that > would defeat the purpose. In looking for a parallel with filesystem > implementations, I noted that fs/namei.c:get_link() doesn't even pass > the dentry to the filesystem get_link() method in the rcu-walk case, > only doing so under ref-walk. So they won't permit the filesystem > implementations to ever dereference the dentry for get_link() under > rcu-walk. Not sure why it gets passed to security_inode_follow_link() > then, or if it is ever safe for a security module to dereference its > fields. > > I was hoping to get fsdevel folks to comment since I feel like we're > guessing about exactly what guarantees we have in this area. > >> >> As an aside, if we somehow can guarantee (e.g. via a name_snapshot) >> that qstr length information is valid, we might want to consider >> moving from audit_log_unstrustedstring() to >> audit_log_n_untrustedstring() to save us a call to strlen(). >> >>>>> Added the LSM list as I'm beginning to wonder if we should push this >>>>> logic down into common_lsm_audit(), this problem around blocking >>>>> shouldn't be SELinux specific. >>> >>> That would require passing down the MAY_NOT_BLOCK flag or a rcu bool to >>> common_lsm_audit() just so that it could immediately return if that is >>> set and a->type is _INODE or _DENTRY (or _PATH or _FILE). And the >>> individual security module still needs to have its own handling of >>> MAY_NOT_BLOCK/rcu for its own processing, so it won't free the security >>> module authors from thinking about it. >> >> Looking at the current SELinux code, all we do is bail out early with >> -ECHILD. If we didn't have that check it looks like the only impact >> would be some extra assignments into a struct living on the stack and >> a call into common_lsm_audit(). That doesn't seem terrible for a slow >> path, especially if it pushes this code into a LSM common function. > > Not terrible, just not sure if it ends up being a worthwhile change. If > the LSM maintainers would like it that way, I can do that. I think this rendered moot by viro's suggestion, since we are taking the handling of MAY_NOT_BLOCK up even earlier in the processing and the flags don't need to be passed down to slow_avc_audit() anymore. Sure, we could still pass them down and defer the handling to common_lsm_audit(), but that's just extra wasted work before we bail out, and we are no longer even testing the a->type field with the new logic so there is no longer anything related to the lsm_audit implementation. > >> >>>>> For the LSM folks just joining, the full patchset can be found here: >>>>> * >>>>> https://lore.kernel.org/selinux/20191121145245.8637-1-sds@tycho.nsa.gov/T/#t >>>>> >> >
On Fri, Nov 22, 2019 at 11:27:37AM -0500, Stephen Smalley wrote: > On 11/22/19 11:11 AM, Al Viro wrote: > > On Thu, Nov 21, 2019 at 09:52:45AM -0500, Stephen Smalley wrote: > > > commit bda0be7ad994 ("security: make inode_follow_link RCU-walk aware") > > > passed down the rcu flag to the SELinux AVC, but failed to adjust the > > > test in slow_avc_audit() to also return -ECHILD on LSM_AUDIT_DATA_DENTRY. > > > Previously, we only returned -ECHILD if generating an audit record with > > > LSM_AUDIT_DATA_INODE since this was only relevant from inode_permission. > > > Return -ECHILD on either LSM_AUDIT_DATA_INODE or LSM_AUDIT_DATA_DENTRY. > > > LSM_AUDIT_DATA_INODE only requires this handling due to the fact > > > that dump_common_audit_data() calls d_find_alias() and collects the > > > dname from the result if any. > > > Other cases that might require similar treatment in the future are > > > LSM_AUDIT_DATA_PATH and LSM_AUDIT_DATA_FILE if any hook that takes > > > a path or file is called under RCU-walk. > > > > > > Fixes: bda0be7ad994 ("security: make inode_follow_link RCU-walk aware") > > > Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov> > > > --- > > > security/selinux/avc.c | 3 ++- > > > 1 file changed, 2 insertions(+), 1 deletion(-) > > > > > > diff --git a/security/selinux/avc.c b/security/selinux/avc.c > > > index 74c43ebe34bb..f1fa1072230c 100644 > > > --- a/security/selinux/avc.c > > > +++ b/security/selinux/avc.c > > > @@ -779,7 +779,8 @@ noinline int slow_avc_audit(struct selinux_state *state, > > > * during retry. However this is logically just as if the operation > > > * happened a little later. > > > */ > > > - if ((a->type == LSM_AUDIT_DATA_INODE) && > > > + if ((a->type == LSM_AUDIT_DATA_INODE || > > > + a->type == LSM_AUDIT_DATA_DENTRY) && > > > (flags & MAY_NOT_BLOCK)) > > > > IDGI, to be honest. Why do we bother with slow path if MAY_NOT_BLOCK has > > been given? If we'd run into "there's something to report" case, we > > are not on the fastpath anymore. IOW, why not have > > audited = avc_audit_required(requested, avd, result, 0, &denied); > > if (likely(!audited)) > > return 0; > > if (flags & MAY_NOT_BLOCK) > > return -ECHILD; > > return slow_avc_audit(state, ssid, tsid, tclass, > > requested, audited, denied, result, > > a, flags); > > in avc_audit() and be done with that? > > That works for me; we would also need to do the same in > selinux_inode_permission(). We can then stop passing flags down to > slow_avc_audit() entirely. I'm new to looking at this code, but that would certainly have helped me to understand it when I was reading it a couple of weeks back! So, for what it's worth, you can count me in favour. Cheers, Will
diff --git a/security/selinux/avc.c b/security/selinux/avc.c index 74c43ebe34bb..f1fa1072230c 100644 --- a/security/selinux/avc.c +++ b/security/selinux/avc.c @@ -779,7 +779,8 @@ noinline int slow_avc_audit(struct selinux_state *state, * during retry. However this is logically just as if the operation * happened a little later. */ - if ((a->type == LSM_AUDIT_DATA_INODE) && + if ((a->type == LSM_AUDIT_DATA_INODE || + a->type == LSM_AUDIT_DATA_DENTRY) && (flags & MAY_NOT_BLOCK)) return -ECHILD;
commit bda0be7ad994 ("security: make inode_follow_link RCU-walk aware") passed down the rcu flag to the SELinux AVC, but failed to adjust the test in slow_avc_audit() to also return -ECHILD on LSM_AUDIT_DATA_DENTRY. Previously, we only returned -ECHILD if generating an audit record with LSM_AUDIT_DATA_INODE since this was only relevant from inode_permission. Return -ECHILD on either LSM_AUDIT_DATA_INODE or LSM_AUDIT_DATA_DENTRY. LSM_AUDIT_DATA_INODE only requires this handling due to the fact that dump_common_audit_data() calls d_find_alias() and collects the dname from the result if any. Other cases that might require similar treatment in the future are LSM_AUDIT_DATA_PATH and LSM_AUDIT_DATA_FILE if any hook that takes a path or file is called under RCU-walk. Fixes: bda0be7ad994 ("security: make inode_follow_link RCU-walk aware") Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov> --- security/selinux/avc.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)