Message ID | 20220420140633.753772-2-stefanb@linux.ibm.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | ima: Namespace IMA with audit support in IMA-ns | expand |
On Wed, Apr 20, 2022 at 10:06:08AM -0400, Stefan Berger wrote: > From: Christian Brauner <brauner@kernel.org> > > When securityfs creates a new file or directory via > securityfs_create_dentry() it will take an additional reference on the > newly created dentry after it has attached the new inode to the new > dentry and added it to the hashqueues. > If we contrast this with debugfs which has the same underlying logic as > securityfs. It uses a similar pairing as securityfs. Where securityfs > has the securityfs_create_dentry() and securityfs_remove() pairing, > debugfs has the __debugfs_create_file() and debugfs_remove() pairing. > > In contrast to securityfs, debugfs doesn't take an additional reference > on the newly created dentry in __debugfs_create_file() which would need > to be put in debugfs_remove(). > > The additional dget() isn't a problem per se. In the current > implementation of securityfs each created dentry pins the filesystem via Is 'via' an extra word here or is there a missing word? I'll delay the rest of my response as the missing word may answer my remaining question :) > until it is removed. Since it is virtually guaranteed that there is at > least one user of securityfs that has created dentries the initial > securityfs mount cannot go away until all dentries have been removed. > > Since most of the users of the initial securityfs mount don't go away > until the system is shutdown the initial securityfs won't go away when > unmounted. Instead a mount will usually surface the same superblock as > before. The additional dget() doesn't matter in this scenario since it > is required that all dentries have been cleaned up by the respective > users before the superblock can be destroyed, i.e. superblock shutdown > is tied to the lifetime of the associated dentries. > > However, in order to support ima namespaces we need to extend securityfs > to support being mounted outside of the initial user namespace. For > namespaced users the pinning logic doesn't make sense. Whereas in the > initial namespace the securityfs instance and the associated data > structures of its users can't go away for reason explained earlier users > of non-initial securityfs instances do go away when the last users of > the namespace are gone. > > So for those users we neither want to duplicate the pinning logic nor > make the global securityfs instance display different information based > on the namespace. Both options would be really messy and hacky. > > Instead we will simply give each namespace its own securityfs instance > similar to how each ipc namespace has its own mqueue instance and all > entries in there are cleaned up on umount or when the last user of the > associated namespace is gone. > > This means that the superblock's lifetime isn't tied to the dentries. > Instead the last umount, without any fds kept open, will trigger a clean > shutdown. But now the additional dget() gets in the way. Instead of > being able to rely on the generic superblock shutdown logic we would > need to drop the additional dentry reference during superblock shutdown > for all associated users. That would force the use of a generic > coordination mechanism for current and future users of securityfs which > is unnecessary. Simply remove the additional dget() in > securityfs_dentry_create(). > > In securityfs_remove() we will call dget() to take an additional > reference on the dentry about to be removed. After simple_unlink() or > simple_rmdir() have dropped the dentry refcount we can call d_delete() > which will either turn the dentry into negative dentry if our earlier > dget() is the only reference to the dentry, i.e. it has no other users, > or remove it from the hashqueues in case there are additional users. > > All of these changes should not have any effect on the userspace > semantics of the initial securityfs mount. > > Signed-off-by: Christian Brauner <brauner@kernel.org> > Cc: John Johansen <john.johansen@canonical.com> > Cc: Matthew Garrett <mjg59@srcf.ucam.org> > Cc: Micah Morton <mortonm@chromium.org> > Cc: Kentaro Takeda <takedakn@nttdata.co.jp> > Cc: James Morris <jmorris@namei.org> > Cc: Jarkko Sakkinen <jarkko@kernel.org> > Signed-off-by: Stefan Berger <stefanb@linux.ibm.com> > Reviewed-by: Mimi Zohar <zohar@linux.ibm.com> > --- > security/inode.c | 3 ++- > 1 file changed, 2 insertions(+), 1 deletion(-) > > diff --git a/security/inode.c b/security/inode.c > index 6c326939750d..13e6780c4444 100644 > --- a/security/inode.c > +++ b/security/inode.c > @@ -159,7 +159,6 @@ static struct dentry *securityfs_create_dentry(const char *name, umode_t mode, > inode->i_fop = fops; > } > d_instantiate(dentry, inode); > - dget(dentry); > inode_unlock(dir); > return dentry; > > @@ -302,10 +301,12 @@ void securityfs_remove(struct dentry *dentry) > dir = d_inode(dentry->d_parent); > inode_lock(dir); > if (simple_positive(dentry)) { > + dget(dentry); > if (d_is_dir(dentry)) > simple_rmdir(dir, dentry); > else > simple_unlink(dir, dentry); > + d_delete(dentry); > dput(dentry); > } > inode_unlock(dir); > -- > 2.34.1
On Mon, May 09, 2022 at 02:54:14PM -0500, Serge E. Hallyn wrote: > On Wed, Apr 20, 2022 at 10:06:08AM -0400, Stefan Berger wrote: > > From: Christian Brauner <brauner@kernel.org> > > > > When securityfs creates a new file or directory via > > securityfs_create_dentry() it will take an additional reference on the > > newly created dentry after it has attached the new inode to the new > > dentry and added it to the hashqueues. > > If we contrast this with debugfs which has the same underlying logic as > > securityfs. It uses a similar pairing as securityfs. Where securityfs > > has the securityfs_create_dentry() and securityfs_remove() pairing, > > debugfs has the __debugfs_create_file() and debugfs_remove() pairing. > > > > In contrast to securityfs, debugfs doesn't take an additional reference > > on the newly created dentry in __debugfs_create_file() which would need > > to be put in debugfs_remove(). > > > > The additional dget() isn't a problem per se. In the current > > implementation of securityfs each created dentry pins the filesystem via > > Is 'via' an extra word here or is there a missing word? > > I'll delay the rest of my response as the missing word may answer my > remaining question :) > > > until it is removed. Since it is virtually guaranteed that there is at > > least one user of securityfs that has created dentries the initial > > securityfs mount cannot go away until all dentries have been removed. > > > > Since most of the users of the initial securityfs mount don't go away > > until the system is shutdown the initial securityfs won't go away when > > unmounted. Instead a mount will usually surface the same superblock as > > before. The additional dget() doesn't matter in this scenario since it > > is required that all dentries have been cleaned up by the respective > > users before the superblock can be destroyed, i.e. superblock shutdown > > is tied to the lifetime of the associated dentries. > > > > However, in order to support ima namespaces we need to extend securityfs > > to support being mounted outside of the initial user namespace. For > > namespaced users the pinning logic doesn't make sense. Whereas in the > > initial namespace the securityfs instance and the associated data > > structures of its users can't go away for reason explained earlier users > > of non-initial securityfs instances do go away when the last users of > > the namespace are gone. > > > > So for those users we neither want to duplicate the pinning logic nor > > make the global securityfs instance display different information based > > on the namespace. Both options would be really messy and hacky. > > > > Instead we will simply give each namespace its own securityfs instance > > similar to how each ipc namespace has its own mqueue instance and all > > entries in there are cleaned up on umount or when the last user of the > > associated namespace is gone. > > > > This means that the superblock's lifetime isn't tied to the dentries. > > Instead the last umount, without any fds kept open, will trigger a clean > > shutdown. But now the additional dget() gets in the way. Instead of > > being able to rely on the generic superblock shutdown logic we would > > need to drop the additional dentry reference during superblock shutdown > > for all associated users. That would force the use of a generic > > coordination mechanism for current and future users of securityfs which > > is unnecessary. Simply remove the additional dget() in > > securityfs_dentry_create(). > > > > In securityfs_remove() we will call dget() to take an additional > > reference on the dentry about to be removed. After simple_unlink() or > > simple_rmdir() have dropped the dentry refcount we can call d_delete() > > which will either turn the dentry into negative dentry if our earlier > > dget() is the only reference to the dentry, i.e. it has no other users, > > or remove it from the hashqueues in case there are additional users. > > > > All of these changes should not have any effect on the userspace > > semantics of the initial securityfs mount. > > > > Signed-off-by: Christian Brauner <brauner@kernel.org> > > Cc: John Johansen <john.johansen@canonical.com> > > Cc: Matthew Garrett <mjg59@srcf.ucam.org> > > Cc: Micah Morton <mortonm@chromium.org> > > Cc: Kentaro Takeda <takedakn@nttdata.co.jp> > > Cc: James Morris <jmorris@namei.org> > > Cc: Jarkko Sakkinen <jarkko@kernel.org> > > Signed-off-by: Stefan Berger <stefanb@linux.ibm.com> > > Reviewed-by: Mimi Zohar <zohar@linux.ibm.com> > > --- > > security/inode.c | 3 ++- > > 1 file changed, 2 insertions(+), 1 deletion(-) > > > > diff --git a/security/inode.c b/security/inode.c > > index 6c326939750d..13e6780c4444 100644 > > --- a/security/inode.c > > +++ b/security/inode.c > > @@ -159,7 +159,6 @@ static struct dentry *securityfs_create_dentry(const char *name, umode_t mode, > > inode->i_fop = fops; > > } > > d_instantiate(dentry, inode); > > - dget(dentry); > > inode_unlock(dir); > > return dentry; > > > > @@ -302,10 +301,12 @@ void securityfs_remove(struct dentry *dentry) > > dir = d_inode(dentry->d_parent); > > inode_lock(dir); > > if (simple_positive(dentry)) { > > + dget(dentry); > > if (d_is_dir(dentry)) > > simple_rmdir(dir, dentry); Hm, so I realize your patch isn't introducing this, but is the fact that we ignore the possible -ENOTEMPTY return value of simple_rmdir() not a problem? > > else > > simple_unlink(dir, dentry); > > + d_delete(dentry); I'm mostly trying to convince myself that the fact that there is not a matching dput being removed (to match the dget being removed above) is ok. I do think it is, but that belief seems to dictate that right now dentries must never be being released. Otherwise, it seems like there must be cases where the next dput could be called on a dentry that has been freed. > > dput(dentry); > > } > > inode_unlock(dir); > > -- > > 2.34.1
On Mon, May 9, 2022 at 11:36 PM Serge E. Hallyn <serge@hallyn.com> wrote: > > On Mon, May 09, 2022 at 02:54:14PM -0500, Serge E. Hallyn wrote: > > On Wed, Apr 20, 2022 at 10:06:08AM -0400, Stefan Berger wrote: > > > From: Christian Brauner <brauner@kernel.org> > > > > > > When securityfs creates a new file or directory via > > > securityfs_create_dentry() it will take an additional reference on the > > > newly created dentry after it has attached the new inode to the new > > > dentry and added it to the hashqueues. > > > If we contrast this with debugfs which has the same underlying logic as > > > securityfs. It uses a similar pairing as securityfs. Where securityfs > > > has the securityfs_create_dentry() and securityfs_remove() pairing, > > > debugfs has the __debugfs_create_file() and debugfs_remove() pairing. > > > > > > In contrast to securityfs, debugfs doesn't take an additional reference > > > on the newly created dentry in __debugfs_create_file() which would need > > > to be put in debugfs_remove(). > > > > > > The additional dget() isn't a problem per se. In the current > > > implementation of securityfs each created dentry pins the filesystem via > > > > Is 'via' an extra word here or is there a missing word? > > > > I'll delay the rest of my response as the missing word may answer my > > remaining question :) > > > > > until it is removed. Since it is virtually guaranteed that there is at > > > least one user of securityfs that has created dentries the initial > > > securityfs mount cannot go away until all dentries have been removed. > > > > > > Since most of the users of the initial securityfs mount don't go away > > > until the system is shutdown the initial securityfs won't go away when > > > unmounted. Instead a mount will usually surface the same superblock as > > > before. The additional dget() doesn't matter in this scenario since it > > > is required that all dentries have been cleaned up by the respective > > > users before the superblock can be destroyed, i.e. superblock shutdown > > > is tied to the lifetime of the associated dentries. > > > > > > However, in order to support ima namespaces we need to extend securityfs > > > to support being mounted outside of the initial user namespace. For > > > namespaced users the pinning logic doesn't make sense. Whereas in the > > > initial namespace the securityfs instance and the associated data > > > structures of its users can't go away for reason explained earlier users > > > of non-initial securityfs instances do go away when the last users of > > > the namespace are gone. > > > > > > So for those users we neither want to duplicate the pinning logic nor > > > make the global securityfs instance display different information based > > > on the namespace. Both options would be really messy and hacky. > > > > > > Instead we will simply give each namespace its own securityfs instance > > > similar to how each ipc namespace has its own mqueue instance and all > > > entries in there are cleaned up on umount or when the last user of the > > > associated namespace is gone. > > > > > > This means that the superblock's lifetime isn't tied to the dentries. > > > Instead the last umount, without any fds kept open, will trigger a clean > > > shutdown. But now the additional dget() gets in the way. Instead of > > > being able to rely on the generic superblock shutdown logic we would > > > need to drop the additional dentry reference during superblock shutdown > > > for all associated users. That would force the use of a generic > > > coordination mechanism for current and future users of securityfs which > > > is unnecessary. Simply remove the additional dget() in > > > securityfs_dentry_create(). > > > > > > In securityfs_remove() we will call dget() to take an additional > > > reference on the dentry about to be removed. After simple_unlink() or > > > simple_rmdir() have dropped the dentry refcount we can call d_delete() > > > which will either turn the dentry into negative dentry if our earlier > > > dget() is the only reference to the dentry, i.e. it has no other users, > > > or remove it from the hashqueues in case there are additional users. > > > The first case (turn negative) cannot happen because the function is entered with at least 1 refcount and increments it by 1. So you can follow commit 46c46f8df9aa ("devpts_pty_kill(): don't bother with d_delete()") and use d_drop() instead. > > > All of these changes should not have any effect on the userspace > > > semantics of the initial securityfs mount. > > > > > > Signed-off-by: Christian Brauner <brauner@kernel.org> > > > Cc: John Johansen <john.johansen@canonical.com> > > > Cc: Matthew Garrett <mjg59@srcf.ucam.org> > > > Cc: Micah Morton <mortonm@chromium.org> > > > Cc: Kentaro Takeda <takedakn@nttdata.co.jp> > > > Cc: James Morris <jmorris@namei.org> > > > Cc: Jarkko Sakkinen <jarkko@kernel.org> > > > Signed-off-by: Stefan Berger <stefanb@linux.ibm.com> > > > Reviewed-by: Mimi Zohar <zohar@linux.ibm.com> > > > --- > > > security/inode.c | 3 ++- > > > 1 file changed, 2 insertions(+), 1 deletion(-) > > > > > > diff --git a/security/inode.c b/security/inode.c > > > index 6c326939750d..13e6780c4444 100644 > > > --- a/security/inode.c > > > +++ b/security/inode.c > > > @@ -159,7 +159,6 @@ static struct dentry *securityfs_create_dentry(const char *name, umode_t mode, > > > inode->i_fop = fops; > > > } > > > d_instantiate(dentry, inode); > > > - dget(dentry); > > > inode_unlock(dir); > > > return dentry; > > > > > > @@ -302,10 +301,12 @@ void securityfs_remove(struct dentry *dentry) > > > dir = d_inode(dentry->d_parent); > > > inode_lock(dir); > > > if (simple_positive(dentry)) { > > > + dget(dentry); > > > if (d_is_dir(dentry)) > > > simple_rmdir(dir, dentry); > > Hm, so I realize your patch isn't introducing this, but is the > fact that we ignore the possible -ENOTEMPTY return value of > simple_rmdir() not a problem? As long as we are using debugfs as a reference code, wouldn't securityfs need to use simple_recursive_removal()? Can we guaranty that modules always cleanup all entries in correct order? > > > > else > > > simple_unlink(dir, dentry); > > > + d_delete(dentry); > d_drop() (see comment above) > I'm mostly trying to convince myself that the fact that there is not > a matching dput being removed (to match the dget being removed above) > is ok. I do think it is, but that belief seems to dictate that right > now dentries must never be being released. > > Otherwise, it seems like there must be cases where the next dput could > be called on a dentry that has been freed. > > > > dput(dentry); Huh? There must be a ref to dentry when entering this function and there is dget() added above so balance is not lost. Thanks, Amir.
On Mon, May 09, 2022 at 02:54:14PM -0500, Serge Hallyn wrote: > On Wed, Apr 20, 2022 at 10:06:08AM -0400, Stefan Berger wrote: > > From: Christian Brauner <brauner@kernel.org> > > > > When securityfs creates a new file or directory via > > securityfs_create_dentry() it will take an additional reference on the > > newly created dentry after it has attached the new inode to the new > > dentry and added it to the hashqueues. > > If we contrast this with debugfs which has the same underlying logic as > > securityfs. It uses a similar pairing as securityfs. Where securityfs > > has the securityfs_create_dentry() and securityfs_remove() pairing, > > debugfs has the __debugfs_create_file() and debugfs_remove() pairing. > > > > In contrast to securityfs, debugfs doesn't take an additional reference > > on the newly created dentry in __debugfs_create_file() which would need > > to be put in debugfs_remove(). > > > > The additional dget() isn't a problem per se. In the current > > implementation of securityfs each created dentry pins the filesystem via > > Is 'via' an extra word here or is there a missing word? > > I'll delay the rest of my response as the missing word may answer my > remaining question :) It can be both. It should either be removed or it should be followed by "securityfs_create_dentry()". securityfs_create_dentry() takes two references one in lookup_one_len() and another one explicitly via dget(). The latter one isn't needed. Some of that has been covered in an earlier thread: https://lore.kernel.org/lkml/20220105101815.ldsm4s5yx7pmuiil@wittgenstein
On Mon, May 09, 2022 at 03:36:18PM -0500, Serge Hallyn wrote: > On Mon, May 09, 2022 at 02:54:14PM -0500, Serge E. Hallyn wrote: > > On Wed, Apr 20, 2022 at 10:06:08AM -0400, Stefan Berger wrote: > > > From: Christian Brauner <brauner@kernel.org> > > > > > > When securityfs creates a new file or directory via > > > securityfs_create_dentry() it will take an additional reference on the > > > newly created dentry after it has attached the new inode to the new > > > dentry and added it to the hashqueues. > > > If we contrast this with debugfs which has the same underlying logic as > > > securityfs. It uses a similar pairing as securityfs. Where securityfs > > > has the securityfs_create_dentry() and securityfs_remove() pairing, > > > debugfs has the __debugfs_create_file() and debugfs_remove() pairing. > > > > > > In contrast to securityfs, debugfs doesn't take an additional reference > > > on the newly created dentry in __debugfs_create_file() which would need > > > to be put in debugfs_remove(). > > > > > > The additional dget() isn't a problem per se. In the current > > > implementation of securityfs each created dentry pins the filesystem via > > > > Is 'via' an extra word here or is there a missing word? > > > > I'll delay the rest of my response as the missing word may answer my > > remaining question :) > > > > > until it is removed. Since it is virtually guaranteed that there is at > > > least one user of securityfs that has created dentries the initial > > > securityfs mount cannot go away until all dentries have been removed. > > > > > > Since most of the users of the initial securityfs mount don't go away > > > until the system is shutdown the initial securityfs won't go away when > > > unmounted. Instead a mount will usually surface the same superblock as > > > before. The additional dget() doesn't matter in this scenario since it > > > is required that all dentries have been cleaned up by the respective > > > users before the superblock can be destroyed, i.e. superblock shutdown > > > is tied to the lifetime of the associated dentries. > > > > > > However, in order to support ima namespaces we need to extend securityfs > > > to support being mounted outside of the initial user namespace. For > > > namespaced users the pinning logic doesn't make sense. Whereas in the > > > initial namespace the securityfs instance and the associated data > > > structures of its users can't go away for reason explained earlier users > > > of non-initial securityfs instances do go away when the last users of > > > the namespace are gone. > > > > > > So for those users we neither want to duplicate the pinning logic nor > > > make the global securityfs instance display different information based > > > on the namespace. Both options would be really messy and hacky. > > > > > > Instead we will simply give each namespace its own securityfs instance > > > similar to how each ipc namespace has its own mqueue instance and all > > > entries in there are cleaned up on umount or when the last user of the > > > associated namespace is gone. > > > > > > This means that the superblock's lifetime isn't tied to the dentries. > > > Instead the last umount, without any fds kept open, will trigger a clean > > > shutdown. But now the additional dget() gets in the way. Instead of > > > being able to rely on the generic superblock shutdown logic we would > > > need to drop the additional dentry reference during superblock shutdown > > > for all associated users. That would force the use of a generic > > > coordination mechanism for current and future users of securityfs which > > > is unnecessary. Simply remove the additional dget() in > > > securityfs_dentry_create(). > > > > > > In securityfs_remove() we will call dget() to take an additional > > > reference on the dentry about to be removed. After simple_unlink() or > > > simple_rmdir() have dropped the dentry refcount we can call d_delete() > > > which will either turn the dentry into negative dentry if our earlier > > > dget() is the only reference to the dentry, i.e. it has no other users, > > > or remove it from the hashqueues in case there are additional users. > > > > > > All of these changes should not have any effect on the userspace > > > semantics of the initial securityfs mount. > > > > > > Signed-off-by: Christian Brauner <brauner@kernel.org> > > > Cc: John Johansen <john.johansen@canonical.com> > > > Cc: Matthew Garrett <mjg59@srcf.ucam.org> > > > Cc: Micah Morton <mortonm@chromium.org> > > > Cc: Kentaro Takeda <takedakn@nttdata.co.jp> > > > Cc: James Morris <jmorris@namei.org> > > > Cc: Jarkko Sakkinen <jarkko@kernel.org> > > > Signed-off-by: Stefan Berger <stefanb@linux.ibm.com> > > > Reviewed-by: Mimi Zohar <zohar@linux.ibm.com> > > > --- > > > security/inode.c | 3 ++- > > > 1 file changed, 2 insertions(+), 1 deletion(-) > > > > > > diff --git a/security/inode.c b/security/inode.c > > > index 6c326939750d..13e6780c4444 100644 > > > --- a/security/inode.c > > > +++ b/security/inode.c > > > @@ -159,7 +159,6 @@ static struct dentry *securityfs_create_dentry(const char *name, umode_t mode, > > > inode->i_fop = fops; > > > } > > > d_instantiate(dentry, inode); > > > - dget(dentry); > > > inode_unlock(dir); > > > return dentry; > > > > > > @@ -302,10 +301,12 @@ void securityfs_remove(struct dentry *dentry) > > > dir = d_inode(dentry->d_parent); > > > inode_lock(dir); > > > if (simple_positive(dentry)) { > > > + dget(dentry); > > > if (d_is_dir(dentry)) > > > simple_rmdir(dir, dentry); > > Hm, so I realize your patch isn't introducing this, but is the > fact that we ignore the possible -ENOTEMPTY return value of > simple_rmdir() not a problem? > > > > else > > > simple_unlink(dir, dentry); > > > + d_delete(dentry); > > I'm mostly trying to convince myself that the fact that there is not > a matching dput being removed (to match the dget being removed above) > is ok. I do think it is, but that belief seems to dictate that right > now dentries must never be being released. > > Otherwise, it seems like there must be cases where the next dput could > be called on a dentry that has been freed. I think that's answered by Amir in the next mail already. So I'm skipping to that part of the thread.
On Tue, May 10, 2022 at 11:43:13AM +0300, Amir Goldstein wrote: > On Mon, May 9, 2022 at 11:36 PM Serge E. Hallyn <serge@hallyn.com> wrote: > > > > On Mon, May 09, 2022 at 02:54:14PM -0500, Serge E. Hallyn wrote: > > > On Wed, Apr 20, 2022 at 10:06:08AM -0400, Stefan Berger wrote: > > > > From: Christian Brauner <brauner@kernel.org> > > > > > > > > When securityfs creates a new file or directory via > > > > securityfs_create_dentry() it will take an additional reference on the > > > > newly created dentry after it has attached the new inode to the new > > > > dentry and added it to the hashqueues. > > > > If we contrast this with debugfs which has the same underlying logic as > > > > securityfs. It uses a similar pairing as securityfs. Where securityfs > > > > has the securityfs_create_dentry() and securityfs_remove() pairing, > > > > debugfs has the __debugfs_create_file() and debugfs_remove() pairing. > > > > > > > > In contrast to securityfs, debugfs doesn't take an additional reference > > > > on the newly created dentry in __debugfs_create_file() which would need > > > > to be put in debugfs_remove(). > > > > > > > > The additional dget() isn't a problem per se. In the current > > > > implementation of securityfs each created dentry pins the filesystem via > > > > > > Is 'via' an extra word here or is there a missing word? > > > > > > I'll delay the rest of my response as the missing word may answer my > > > remaining question :) > > > > > > > until it is removed. Since it is virtually guaranteed that there is at > > > > least one user of securityfs that has created dentries the initial > > > > securityfs mount cannot go away until all dentries have been removed. > > > > > > > > Since most of the users of the initial securityfs mount don't go away > > > > until the system is shutdown the initial securityfs won't go away when > > > > unmounted. Instead a mount will usually surface the same superblock as > > > > before. The additional dget() doesn't matter in this scenario since it > > > > is required that all dentries have been cleaned up by the respective > > > > users before the superblock can be destroyed, i.e. superblock shutdown > > > > is tied to the lifetime of the associated dentries. > > > > > > > > However, in order to support ima namespaces we need to extend securityfs > > > > to support being mounted outside of the initial user namespace. For > > > > namespaced users the pinning logic doesn't make sense. Whereas in the > > > > initial namespace the securityfs instance and the associated data > > > > structures of its users can't go away for reason explained earlier users > > > > of non-initial securityfs instances do go away when the last users of > > > > the namespace are gone. > > > > > > > > So for those users we neither want to duplicate the pinning logic nor > > > > make the global securityfs instance display different information based > > > > on the namespace. Both options would be really messy and hacky. > > > > > > > > Instead we will simply give each namespace its own securityfs instance > > > > similar to how each ipc namespace has its own mqueue instance and all > > > > entries in there are cleaned up on umount or when the last user of the > > > > associated namespace is gone. > > > > > > > > This means that the superblock's lifetime isn't tied to the dentries. > > > > Instead the last umount, without any fds kept open, will trigger a clean > > > > shutdown. But now the additional dget() gets in the way. Instead of > > > > being able to rely on the generic superblock shutdown logic we would > > > > need to drop the additional dentry reference during superblock shutdown > > > > for all associated users. That would force the use of a generic > > > > coordination mechanism for current and future users of securityfs which > > > > is unnecessary. Simply remove the additional dget() in > > > > securityfs_dentry_create(). > > > > > > > > In securityfs_remove() we will call dget() to take an additional > > > > reference on the dentry about to be removed. After simple_unlink() or > > > > simple_rmdir() have dropped the dentry refcount we can call d_delete() > > > > which will either turn the dentry into negative dentry if our earlier > > > > dget() is the only reference to the dentry, i.e. it has no other users, > > > > or remove it from the hashqueues in case there are additional users. > > > > > > The first case (turn negative) cannot happen because the function is > entered with at least 1 refcount and increments it by 1. > So you can follow commit 46c46f8df9aa ("devpts_pty_kill(): don't bother > with d_delete()") and use d_drop() instead. > > > > > All of these changes should not have any effect on the userspace > > > > semantics of the initial securityfs mount. > > > > > > > > Signed-off-by: Christian Brauner <brauner@kernel.org> > > > > Cc: John Johansen <john.johansen@canonical.com> > > > > Cc: Matthew Garrett <mjg59@srcf.ucam.org> > > > > Cc: Micah Morton <mortonm@chromium.org> > > > > Cc: Kentaro Takeda <takedakn@nttdata.co.jp> > > > > Cc: James Morris <jmorris@namei.org> > > > > Cc: Jarkko Sakkinen <jarkko@kernel.org> > > > > Signed-off-by: Stefan Berger <stefanb@linux.ibm.com> > > > > Reviewed-by: Mimi Zohar <zohar@linux.ibm.com> > > > > --- > > > > security/inode.c | 3 ++- > > > > 1 file changed, 2 insertions(+), 1 deletion(-) > > > > > > > > diff --git a/security/inode.c b/security/inode.c > > > > index 6c326939750d..13e6780c4444 100644 > > > > --- a/security/inode.c > > > > +++ b/security/inode.c > > > > @@ -159,7 +159,6 @@ static struct dentry *securityfs_create_dentry(const char *name, umode_t mode, > > > > inode->i_fop = fops; > > > > } > > > > d_instantiate(dentry, inode); > > > > - dget(dentry); > > > > inode_unlock(dir); > > > > return dentry; > > > > > > > > @@ -302,10 +301,12 @@ void securityfs_remove(struct dentry *dentry) > > > > dir = d_inode(dentry->d_parent); > > > > inode_lock(dir); > > > > if (simple_positive(dentry)) { > > > > + dget(dentry); > > > > if (d_is_dir(dentry)) > > > > simple_rmdir(dir, dentry); > > > > Hm, so I realize your patch isn't introducing this, but is the > > fact that we ignore the possible -ENOTEMPTY return value of > > simple_rmdir() not a problem? > > As long as we are using debugfs as a reference code, wouldn't > securityfs need to use simple_recursive_removal()? > Can we guaranty that modules always cleanup all entries in > correct order? We could but that seems like a separate cleanup patch. This patch became part of the series because we want non-initial ima namespaces to guarantee cleanup on securityfs umount. That's different for the initial securityfs mount which is alwasy going to be around. The patch is intended to this a little cleaner to implement.
On Tue, May 10, 2022 at 12:25:25PM +0200, Christian Brauner wrote: > On Mon, May 09, 2022 at 02:54:14PM -0500, Serge Hallyn wrote: > > On Wed, Apr 20, 2022 at 10:06:08AM -0400, Stefan Berger wrote: > > > From: Christian Brauner <brauner@kernel.org> > > > > > > When securityfs creates a new file or directory via > > > securityfs_create_dentry() it will take an additional reference on the > > > newly created dentry after it has attached the new inode to the new > > > dentry and added it to the hashqueues. > > > If we contrast this with debugfs which has the same underlying logic as > > > securityfs. It uses a similar pairing as securityfs. Where securityfs > > > has the securityfs_create_dentry() and securityfs_remove() pairing, > > > debugfs has the __debugfs_create_file() and debugfs_remove() pairing. > > > > > > In contrast to securityfs, debugfs doesn't take an additional reference > > > on the newly created dentry in __debugfs_create_file() which would need > > > to be put in debugfs_remove(). > > > > > > The additional dget() isn't a problem per se. In the current > > > implementation of securityfs each created dentry pins the filesystem via > > > > Is 'via' an extra word here or is there a missing word? > > > > I'll delay the rest of my response as the missing word may answer my > > remaining question :) > > It can be both. It should either be removed or it should be followed by > "securityfs_create_dentry()". securityfs_create_dentry() takes two > references one in lookup_one_len() and another one explicitly via > dget(). The latter one isn't needed. Some of that has been covered in an > earlier thread: > https://lore.kernel.org/lkml/20220105101815.ldsm4s5yx7pmuiil@wittgenstein Yes, I saw that two references were being taken. And near as I can tell, the second one was never being dropped. So if you tell me that before this patch the dentries are never freed, then I'm happy. Otherwise, I'm bothered the fact that no matching dput is being deleted in the code (to match the extra dget being removed). So where is the code where the final dput was happening, and is it the d_delete() you're adding which is making it so that that dput won't be called now?
On Tue, May 10, 2022 at 12:38:17PM +0200, Christian Brauner wrote: > On Tue, May 10, 2022 at 11:43:13AM +0300, Amir Goldstein wrote: > > On Mon, May 9, 2022 at 11:36 PM Serge E. Hallyn <serge@hallyn.com> wrote: > > > > > > On Mon, May 09, 2022 at 02:54:14PM -0500, Serge E. Hallyn wrote: > > > > On Wed, Apr 20, 2022 at 10:06:08AM -0400, Stefan Berger wrote: > > > > > From: Christian Brauner <brauner@kernel.org> > > > > > > > > > > When securityfs creates a new file or directory via > > > > > securityfs_create_dentry() it will take an additional reference on the > > > > > newly created dentry after it has attached the new inode to the new > > > > > dentry and added it to the hashqueues. > > > > > If we contrast this with debugfs which has the same underlying logic as > > > > > securityfs. It uses a similar pairing as securityfs. Where securityfs > > > > > has the securityfs_create_dentry() and securityfs_remove() pairing, > > > > > debugfs has the __debugfs_create_file() and debugfs_remove() pairing. > > > > > > > > > > In contrast to securityfs, debugfs doesn't take an additional reference > > > > > on the newly created dentry in __debugfs_create_file() which would need > > > > > to be put in debugfs_remove(). > > > > > > > > > > The additional dget() isn't a problem per se. In the current > > > > > implementation of securityfs each created dentry pins the filesystem via > > > > > > > > Is 'via' an extra word here or is there a missing word? > > > > > > > > I'll delay the rest of my response as the missing word may answer my > > > > remaining question :) > > > > > > > > > until it is removed. Since it is virtually guaranteed that there is at > > > > > least one user of securityfs that has created dentries the initial > > > > > securityfs mount cannot go away until all dentries have been removed. > > > > > > > > > > Since most of the users of the initial securityfs mount don't go away > > > > > until the system is shutdown the initial securityfs won't go away when > > > > > unmounted. Instead a mount will usually surface the same superblock as > > > > > before. The additional dget() doesn't matter in this scenario since it > > > > > is required that all dentries have been cleaned up by the respective > > > > > users before the superblock can be destroyed, i.e. superblock shutdown > > > > > is tied to the lifetime of the associated dentries. > > > > > > > > > > However, in order to support ima namespaces we need to extend securityfs > > > > > to support being mounted outside of the initial user namespace. For > > > > > namespaced users the pinning logic doesn't make sense. Whereas in the > > > > > initial namespace the securityfs instance and the associated data > > > > > structures of its users can't go away for reason explained earlier users > > > > > of non-initial securityfs instances do go away when the last users of > > > > > the namespace are gone. > > > > > > > > > > So for those users we neither want to duplicate the pinning logic nor > > > > > make the global securityfs instance display different information based > > > > > on the namespace. Both options would be really messy and hacky. > > > > > > > > > > Instead we will simply give each namespace its own securityfs instance > > > > > similar to how each ipc namespace has its own mqueue instance and all > > > > > entries in there are cleaned up on umount or when the last user of the > > > > > associated namespace is gone. > > > > > > > > > > This means that the superblock's lifetime isn't tied to the dentries. > > > > > Instead the last umount, without any fds kept open, will trigger a clean > > > > > shutdown. But now the additional dget() gets in the way. Instead of > > > > > being able to rely on the generic superblock shutdown logic we would > > > > > need to drop the additional dentry reference during superblock shutdown > > > > > for all associated users. That would force the use of a generic > > > > > coordination mechanism for current and future users of securityfs which > > > > > is unnecessary. Simply remove the additional dget() in > > > > > securityfs_dentry_create(). > > > > > > > > > > In securityfs_remove() we will call dget() to take an additional > > > > > reference on the dentry about to be removed. After simple_unlink() or > > > > > simple_rmdir() have dropped the dentry refcount we can call d_delete() > > > > > which will either turn the dentry into negative dentry if our earlier > > > > > dget() is the only reference to the dentry, i.e. it has no other users, > > > > > or remove it from the hashqueues in case there are additional users. > > > > > > > > > The first case (turn negative) cannot happen because the function is > > entered with at least 1 refcount and increments it by 1. > > So you can follow commit 46c46f8df9aa ("devpts_pty_kill(): don't bother > > with d_delete()") and use d_drop() instead. > > > > > > > All of these changes should not have any effect on the userspace > > > > > semantics of the initial securityfs mount. > > > > > > > > > > Signed-off-by: Christian Brauner <brauner@kernel.org> > > > > > Cc: John Johansen <john.johansen@canonical.com> > > > > > Cc: Matthew Garrett <mjg59@srcf.ucam.org> > > > > > Cc: Micah Morton <mortonm@chromium.org> > > > > > Cc: Kentaro Takeda <takedakn@nttdata.co.jp> > > > > > Cc: James Morris <jmorris@namei.org> > > > > > Cc: Jarkko Sakkinen <jarkko@kernel.org> > > > > > Signed-off-by: Stefan Berger <stefanb@linux.ibm.com> > > > > > Reviewed-by: Mimi Zohar <zohar@linux.ibm.com> > > > > > --- > > > > > security/inode.c | 3 ++- > > > > > 1 file changed, 2 insertions(+), 1 deletion(-) > > > > > > > > > > diff --git a/security/inode.c b/security/inode.c > > > > > index 6c326939750d..13e6780c4444 100644 > > > > > --- a/security/inode.c > > > > > +++ b/security/inode.c > > > > > @@ -159,7 +159,6 @@ static struct dentry *securityfs_create_dentry(const char *name, umode_t mode, > > > > > inode->i_fop = fops; > > > > > } > > > > > d_instantiate(dentry, inode); > > > > > - dget(dentry); > > > > > inode_unlock(dir); > > > > > return dentry; > > > > > > > > > > @@ -302,10 +301,12 @@ void securityfs_remove(struct dentry *dentry) > > > > > dir = d_inode(dentry->d_parent); > > > > > inode_lock(dir); > > > > > if (simple_positive(dentry)) { > > > > > + dget(dentry); > > > > > if (d_is_dir(dentry)) > > > > > simple_rmdir(dir, dentry); > > > > > > Hm, so I realize your patch isn't introducing this, but is the > > > fact that we ignore the possible -ENOTEMPTY return value of > > > simple_rmdir() not a problem? > > > > As long as we are using debugfs as a reference code, wouldn't > > securityfs need to use simple_recursive_removal()? > > Can we guaranty that modules always cleanup all entries in > > correct order? > > We could but that seems like a separate cleanup patch. Yes, I'm not saying this set should fix it, just something that caught my eye. Thanks. > This patch became part of the series because we want non-initial ima > namespaces to guarantee cleanup on securityfs umount. That's different > for the initial securityfs mount which is alwasy going to be around. The > patch is intended to this a little cleaner to implement.
On Tue, May 10, 2022 at 11:43:13AM +0300, Amir Goldstein wrote: > On Mon, May 9, 2022 at 11:36 PM Serge E. Hallyn <serge@hallyn.com> wrote: > > > > On Mon, May 09, 2022 at 02:54:14PM -0500, Serge E. Hallyn wrote: > > > On Wed, Apr 20, 2022 at 10:06:08AM -0400, Stefan Berger wrote: > > > > From: Christian Brauner <brauner@kernel.org> > > > > > > > > When securityfs creates a new file or directory via > > > > securityfs_create_dentry() it will take an additional reference on the > > > > newly created dentry after it has attached the new inode to the new > > > > dentry and added it to the hashqueues. > > > > If we contrast this with debugfs which has the same underlying logic as > > > > securityfs. It uses a similar pairing as securityfs. Where securityfs > > > > has the securityfs_create_dentry() and securityfs_remove() pairing, > > > > debugfs has the __debugfs_create_file() and debugfs_remove() pairing. > > > > > > > > In contrast to securityfs, debugfs doesn't take an additional reference > > > > on the newly created dentry in __debugfs_create_file() which would need > > > > to be put in debugfs_remove(). > > > > > > > > The additional dget() isn't a problem per se. In the current > > > > implementation of securityfs each created dentry pins the filesystem via > > > > > > Is 'via' an extra word here or is there a missing word? > > > > > > I'll delay the rest of my response as the missing word may answer my > > > remaining question :) > > > > > > > until it is removed. Since it is virtually guaranteed that there is at > > > > least one user of securityfs that has created dentries the initial > > > > securityfs mount cannot go away until all dentries have been removed. > > > > > > > > Since most of the users of the initial securityfs mount don't go away > > > > until the system is shutdown the initial securityfs won't go away when > > > > unmounted. Instead a mount will usually surface the same superblock as > > > > before. The additional dget() doesn't matter in this scenario since it > > > > is required that all dentries have been cleaned up by the respective > > > > users before the superblock can be destroyed, i.e. superblock shutdown > > > > is tied to the lifetime of the associated dentries. > > > > > > > > However, in order to support ima namespaces we need to extend securityfs > > > > to support being mounted outside of the initial user namespace. For > > > > namespaced users the pinning logic doesn't make sense. Whereas in the > > > > initial namespace the securityfs instance and the associated data > > > > structures of its users can't go away for reason explained earlier users > > > > of non-initial securityfs instances do go away when the last users of > > > > the namespace are gone. > > > > > > > > So for those users we neither want to duplicate the pinning logic nor > > > > make the global securityfs instance display different information based > > > > on the namespace. Both options would be really messy and hacky. > > > > > > > > Instead we will simply give each namespace its own securityfs instance > > > > similar to how each ipc namespace has its own mqueue instance and all > > > > entries in there are cleaned up on umount or when the last user of the > > > > associated namespace is gone. > > > > > > > > This means that the superblock's lifetime isn't tied to the dentries. > > > > Instead the last umount, without any fds kept open, will trigger a clean > > > > shutdown. But now the additional dget() gets in the way. Instead of > > > > being able to rely on the generic superblock shutdown logic we would > > > > need to drop the additional dentry reference during superblock shutdown > > > > for all associated users. That would force the use of a generic > > > > coordination mechanism for current and future users of securityfs which > > > > is unnecessary. Simply remove the additional dget() in > > > > securityfs_dentry_create(). > > > > > > > > In securityfs_remove() we will call dget() to take an additional > > > > reference on the dentry about to be removed. After simple_unlink() or > > > > simple_rmdir() have dropped the dentry refcount we can call d_delete() > > > > which will either turn the dentry into negative dentry if our earlier > > > > dget() is the only reference to the dentry, i.e. it has no other users, > > > > or remove it from the hashqueues in case there are additional users. > > > > > > The first case (turn negative) cannot happen because the function is > entered with at least 1 refcount and increments it by 1. > So you can follow commit 46c46f8df9aa ("devpts_pty_kill(): don't bother > with d_delete()") and use d_drop() instead. > > > > > All of these changes should not have any effect on the userspace > > > > semantics of the initial securityfs mount. > > > > > > > > Signed-off-by: Christian Brauner <brauner@kernel.org> > > > > Cc: John Johansen <john.johansen@canonical.com> > > > > Cc: Matthew Garrett <mjg59@srcf.ucam.org> > > > > Cc: Micah Morton <mortonm@chromium.org> > > > > Cc: Kentaro Takeda <takedakn@nttdata.co.jp> > > > > Cc: James Morris <jmorris@namei.org> > > > > Cc: Jarkko Sakkinen <jarkko@kernel.org> > > > > Signed-off-by: Stefan Berger <stefanb@linux.ibm.com> > > > > Reviewed-by: Mimi Zohar <zohar@linux.ibm.com> > > > > --- > > > > security/inode.c | 3 ++- > > > > 1 file changed, 2 insertions(+), 1 deletion(-) > > > > > > > > diff --git a/security/inode.c b/security/inode.c > > > > index 6c326939750d..13e6780c4444 100644 > > > > --- a/security/inode.c > > > > +++ b/security/inode.c > > > > @@ -159,7 +159,6 @@ static struct dentry *securityfs_create_dentry(const char *name, umode_t mode, > > > > inode->i_fop = fops; > > > > } > > > > d_instantiate(dentry, inode); > > > > - dget(dentry); > > > > inode_unlock(dir); > > > > return dentry; > > > > > > > > @@ -302,10 +301,12 @@ void securityfs_remove(struct dentry *dentry) > > > > dir = d_inode(dentry->d_parent); > > > > inode_lock(dir); > > > > if (simple_positive(dentry)) { > > > > + dget(dentry); > > > > if (d_is_dir(dentry)) > > > > simple_rmdir(dir, dentry); > > > > Hm, so I realize your patch isn't introducing this, but is the > > fact that we ignore the possible -ENOTEMPTY return value of > > simple_rmdir() not a problem? > > As long as we are using debugfs as a reference code, wouldn't > securityfs need to use simple_recursive_removal()? > Can we guaranty that modules always cleanup all entries in > correct order? > > > > > > > else > > > > simple_unlink(dir, dentry); > > > > + d_delete(dentry); > > > > d_drop() (see comment above) > > > I'm mostly trying to convince myself that the fact that there is not > > a matching dput being removed (to match the dget being removed above) > > is ok. I do think it is, but that belief seems to dictate that right > > now dentries must never be being released. > > > > Otherwise, it seems like there must be cases where the next dput could > > be called on a dentry that has been freed. > > > > > > dput(dentry); > > Huh? There must be a ref to dentry when entering this function > and there is dget() added above so balance is not lost. Like, I said, i think the answer is that before this patch there the dentry counts never reach 0. But we are removing a dget and not removing any matching dput, so if they reached 0 before, then my question is where was that happening and is that code still safe after this patch.
On Tue, May 10, 2022 at 09:10:25AM -0500, Serge Hallyn wrote: > On Tue, May 10, 2022 at 12:25:25PM +0200, Christian Brauner wrote: > > On Mon, May 09, 2022 at 02:54:14PM -0500, Serge Hallyn wrote: > > > On Wed, Apr 20, 2022 at 10:06:08AM -0400, Stefan Berger wrote: > > > > From: Christian Brauner <brauner@kernel.org> > > > > > > > > When securityfs creates a new file or directory via > > > > securityfs_create_dentry() it will take an additional reference on the > > > > newly created dentry after it has attached the new inode to the new > > > > dentry and added it to the hashqueues. > > > > If we contrast this with debugfs which has the same underlying logic as > > > > securityfs. It uses a similar pairing as securityfs. Where securityfs > > > > has the securityfs_create_dentry() and securityfs_remove() pairing, > > > > debugfs has the __debugfs_create_file() and debugfs_remove() pairing. > > > > > > > > In contrast to securityfs, debugfs doesn't take an additional reference > > > > on the newly created dentry in __debugfs_create_file() which would need > > > > to be put in debugfs_remove(). > > > > > > > > The additional dget() isn't a problem per se. In the current > > > > implementation of securityfs each created dentry pins the filesystem via > > > > > > Is 'via' an extra word here or is there a missing word? > > > > > > I'll delay the rest of my response as the missing word may answer my > > > remaining question :) > > > > It can be both. It should either be removed or it should be followed by > > "securityfs_create_dentry()". securityfs_create_dentry() takes two > > references one in lookup_one_len() and another one explicitly via > > dget(). The latter one isn't needed. Some of that has been covered in an > > earlier thread: > > https://lore.kernel.org/lkml/20220105101815.ldsm4s5yx7pmuiil@wittgenstein > > Yes, I saw that two references were being taken. And near as I can tell, > the second one was never being dropped. So if you tell me that before this > patch the dentries are never freed, then I'm happy. Otherwise, I'm > bothered the fact that no matching dput is being deleted in the code (to > match the extra dget being removed). So where is the code where the final > dput was happening, and is it the d_delete() you're adding which is making > it so that that dput won't be called now? * So consider mounting securityfs _without this patch applied_: mount -t securityfs /sfs and assume we only have a single user that creates a file "foo" via securityfs_create_file() { lookup_one_len(); // first dget() dget(); // second dget() } now assume that user at some point calls void securityfs_remove() { if (d_is_dir(dentry)) simple_rmdir(dir, dentry); // first dput() else simple_unlink(dir, dentry); // first dput() dput(dentry); // second dput() } * Now consider mounting securityfs _with this patch applied_: securityfs_create_file() { lookup_one_len(); // first dget() } void securityfs_remove() { dget(); // second dget() if (d_is_dir(dentry)) simple_rmdir(dir, dentry); // first dput() else simple_unlink(dir, dentry); // first dput() dput(dentry); // second dput() }
On 5/10/22 06:25, Christian Brauner wrote: > On Mon, May 09, 2022 at 02:54:14PM -0500, Serge Hallyn wrote: >> On Wed, Apr 20, 2022 at 10:06:08AM -0400, Stefan Berger wrote: >>> From: Christian Brauner <brauner@kernel.org> >>> >>> When securityfs creates a new file or directory via >>> securityfs_create_dentry() it will take an additional reference on the >>> newly created dentry after it has attached the new inode to the new >>> dentry and added it to the hashqueues. >>> If we contrast this with debugfs which has the same underlying logic as >>> securityfs. It uses a similar pairing as securityfs. Where securityfs >>> has the securityfs_create_dentry() and securityfs_remove() pairing, >>> debugfs has the __debugfs_create_file() and debugfs_remove() pairing. >>> >>> In contrast to securityfs, debugfs doesn't take an additional reference >>> on the newly created dentry in __debugfs_create_file() which would need >>> to be put in debugfs_remove(). >>> >>> The additional dget() isn't a problem per se. In the current >>> implementation of securityfs each created dentry pins the filesystem via >> >> Is 'via' an extra word here or is there a missing word? >> >> I'll delay the rest of my response as the missing word may answer my >> remaining question :) > > It can be both. It should either be removed or it should be followed by > "securityfs_create_dentry()". securityfs_create_dentry() takes two I am adding "securityfs_create_dentry()" to the text.
On Tue, May 10, 2022 at 05:51:07PM +0200, Christian Brauner wrote: > On Tue, May 10, 2022 at 09:10:25AM -0500, Serge Hallyn wrote: > > On Tue, May 10, 2022 at 12:25:25PM +0200, Christian Brauner wrote: > > > On Mon, May 09, 2022 at 02:54:14PM -0500, Serge Hallyn wrote: > > > > On Wed, Apr 20, 2022 at 10:06:08AM -0400, Stefan Berger wrote: > > > > > From: Christian Brauner <brauner@kernel.org> > > > > > > > > > > When securityfs creates a new file or directory via > > > > > securityfs_create_dentry() it will take an additional reference on the > > > > > newly created dentry after it has attached the new inode to the new > > > > > dentry and added it to the hashqueues. > > > > > If we contrast this with debugfs which has the same underlying logic as > > > > > securityfs. It uses a similar pairing as securityfs. Where securityfs > > > > > has the securityfs_create_dentry() and securityfs_remove() pairing, > > > > > debugfs has the __debugfs_create_file() and debugfs_remove() pairing. > > > > > > > > > > In contrast to securityfs, debugfs doesn't take an additional reference > > > > > on the newly created dentry in __debugfs_create_file() which would need > > > > > to be put in debugfs_remove(). > > > > > > > > > > The additional dget() isn't a problem per se. In the current > > > > > implementation of securityfs each created dentry pins the filesystem via > > > > > > > > Is 'via' an extra word here or is there a missing word? > > > > > > > > I'll delay the rest of my response as the missing word may answer my > > > > remaining question :) > > > > > > It can be both. It should either be removed or it should be followed by > > > "securityfs_create_dentry()". securityfs_create_dentry() takes two > > > references one in lookup_one_len() and another one explicitly via > > > dget(). The latter one isn't needed. Some of that has been covered in an > > > earlier thread: > > > https://lore.kernel.org/lkml/20220105101815.ldsm4s5yx7pmuiil@wittgenstein > > > > Yes, I saw that two references were being taken. And near as I can tell, > > the second one was never being dropped. So if you tell me that before this > > patch the dentries are never freed, then I'm happy. Otherwise, I'm > > bothered the fact that no matching dput is being deleted in the code (to > > match the extra dget being removed). So where is the code where the final > > dput was happening, and is it the d_delete() you're adding which is making > > it so that that dput won't be called now? > > * So consider mounting securityfs _without this patch applied_: > > mount -t securityfs /sfs > > and assume we only have a single user that creates a file "foo" via > > securityfs_create_file() > { > lookup_one_len(); // first dget() > dget(); // second dget() > } > > now assume that user at some point calls > > void securityfs_remove() > { > if (d_is_dir(dentry)) > simple_rmdir(dir, dentry); // first dput() > else > simple_unlink(dir, dentry); // first dput() > dput(dentry); // second dput() > } > > * Now consider mounting securityfs _with this patch applied_: > > securityfs_create_file() > { > lookup_one_len(); // first dget() > } > > void securityfs_remove() > { > dget(); // second dget() > if (d_is_dir(dentry)) > simple_rmdir(dir, dentry); // first dput() > else > simple_unlink(dir, dentry); // first dput() > dput(dentry); // second dput() > } Oh - I was thinking about the new d_delete, but I guess that doesn't matter. thanks, -serge
On Tue, May 10, 2022 at 05:51:07PM +0200, Christian Brauner wrote: > On Tue, May 10, 2022 at 09:10:25AM -0500, Serge Hallyn wrote: > > On Tue, May 10, 2022 at 12:25:25PM +0200, Christian Brauner wrote: > > > On Mon, May 09, 2022 at 02:54:14PM -0500, Serge Hallyn wrote: > > > > On Wed, Apr 20, 2022 at 10:06:08AM -0400, Stefan Berger wrote: > > > > > From: Christian Brauner <brauner@kernel.org> > > > > > > > > > > When securityfs creates a new file or directory via > > > > > securityfs_create_dentry() it will take an additional reference on the > > > > > newly created dentry after it has attached the new inode to the new > > > > > dentry and added it to the hashqueues. > > > > > If we contrast this with debugfs which has the same underlying logic as > > > > > securityfs. It uses a similar pairing as securityfs. Where securityfs > > > > > has the securityfs_create_dentry() and securityfs_remove() pairing, > > > > > debugfs has the __debugfs_create_file() and debugfs_remove() pairing. > > > > > > > > > > In contrast to securityfs, debugfs doesn't take an additional reference > > > > > on the newly created dentry in __debugfs_create_file() which would need > > > > > to be put in debugfs_remove(). > > > > > > > > > > The additional dget() isn't a problem per se. In the current > > > > > implementation of securityfs each created dentry pins the filesystem via > > > > > > > > Is 'via' an extra word here or is there a missing word? > > > > > > > > I'll delay the rest of my response as the missing word may answer my > > > > remaining question :) > > > > > > It can be both. It should either be removed or it should be followed by > > > "securityfs_create_dentry()". securityfs_create_dentry() takes two > > > references one in lookup_one_len() and another one explicitly via > > > dget(). The latter one isn't needed. Some of that has been covered in an > > > earlier thread: > > > https://lore.kernel.org/lkml/20220105101815.ldsm4s5yx7pmuiil@wittgenstein > > > > Yes, I saw that two references were being taken. And near as I can tell, > > the second one was never being dropped. So if you tell me that before this > > patch the dentries are never freed, then I'm happy. Otherwise, I'm > > bothered the fact that no matching dput is being deleted in the code (to > > match the extra dget being removed). So where is the code where the final > > dput was happening, and is it the d_delete() you're adding which is making > > it so that that dput won't be called now? > > * So consider mounting securityfs _without this patch applied_: > > mount -t securityfs /sfs > > and assume we only have a single user that creates a file "foo" via > > securityfs_create_file() > { > lookup_one_len(); // first dget() > dget(); // second dget() > } > > now assume that user at some point calls > > void securityfs_remove() > { > if (d_is_dir(dentry)) > simple_rmdir(dir, dentry); // first dput() > else > simple_unlink(dir, dentry); // first dput() > dput(dentry); // second dput() > } > > * Now consider mounting securityfs _with this patch applied_: > > securityfs_create_file() > { > lookup_one_len(); // first dget() > } > > void securityfs_remove() > { > dget(); // second dget() > if (d_is_dir(dentry)) > simple_rmdir(dir, dentry); // first dput() > else > simple_unlink(dir, dentry); // first dput() > dput(dentry); // second dput() > } Thanks, I get it now Reviewed-by: Serge Hallyn <serge@hallyn.com>
Hi James,
On Tue, 2022-05-10 at 15:41 -0500, Serge E. Hallyn wrote:
<snip>
> Reviewed-by: Serge Hallyn <serge@hallyn.com>
b4 properly applies Serge's tag. Any chance you could take this patch
and, probably, 2/26 via the security tree?
thanks,
Mimi
diff --git a/security/inode.c b/security/inode.c index 6c326939750d..13e6780c4444 100644 --- a/security/inode.c +++ b/security/inode.c @@ -159,7 +159,6 @@ static struct dentry *securityfs_create_dentry(const char *name, umode_t mode, inode->i_fop = fops; } d_instantiate(dentry, inode); - dget(dentry); inode_unlock(dir); return dentry; @@ -302,10 +301,12 @@ void securityfs_remove(struct dentry *dentry) dir = d_inode(dentry->d_parent); inode_lock(dir); if (simple_positive(dentry)) { + dget(dentry); if (d_is_dir(dentry)) simple_rmdir(dir, dentry); else simple_unlink(dir, dentry); + d_delete(dentry); dput(dentry); } inode_unlock(dir);