Message ID | 20180822085522.GA14354@veci.piliscsaba.redhat.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | ovl: set I_CREATING on inode being created | expand |
On Wed, Aug 22, 2018 at 1:55 AM Miklos Szeredi <miklos@szeredi.hu> wrote: > > + spin_lock(&inode->i_lock); > + inode->i_state |= I_CREATING; > + spin_unlock(&inode->i_lock); > + Why is that spinlock protection there? Isn't this a new inode that cannot possibly be reached any other way yet? NOTE! This is a question. Maybe there is something I missed, and there *are* other ways to reach that inode. But if that's true, isn't it already too late to set I_CREATING? So I'd like some clarification on this point before applying it. It's possible that the spinlock is required, I just want to understand why. Linus
On Wed, Aug 22, 2018 at 4:53 PM, Linus Torvalds <torvalds@linux-foundation.org> wrote: > On Wed, Aug 22, 2018 at 1:55 AM Miklos Szeredi <miklos@szeredi.hu> wrote: >> >> + spin_lock(&inode->i_lock); >> + inode->i_state |= I_CREATING; >> + spin_unlock(&inode->i_lock); >> + > > Why is that spinlock protection there? > > Isn't this a new inode that cannot possibly be reached any other way yet? new_inode() puts it on sb->s_inodes list, so it *is* reachable. Following operate on s_inodes: - evict_inodes() called from generic_shutdown_super(): a) we shouldn't get here while in creation, b) it's careful to not touch inodes with non-zero refcount - invalidate_inodes(), called from block devices, so it doesn't apply to overlayfs, also it skips inodes with non-zero refcount - iterate_bdevs(), operates on blockdev_superblock - fsnotify_unmount_inodes() called from generic_shutdown_super(): we shouldn't get here while in creation, - add_dquot_ref(), remove_dquot_ref(): not quite sure what these do, but quotas are not (yet) supported on overlayfs So looks like we are safe without a spinlock. And there's another, more fundamental reason: if anything starts messing with i_state of an inode that is not yet even had its state changed to I_NEW, then lots of filesystems are in trouble. > NOTE! This is a question. Maybe there is something I missed, and there > *are* other ways to reach that inode. But if that's true, isn't it > already too late to set I_CREATING? No, it's not too late, I_CREATING can be set anytime up to inode_insert5(), which is the first one to actually look at that flag. > So I'd like some clarification on this point before applying it. It's > possible that the spinlock is required, I just want to understand why. I added the spinlock, because it's cheap (new_inode() already pulls it into L1 cache) and because it's much harder to prove that lockless one is correct than just adding that locking. Thanks, Miklos
On Wed, Aug 22, 2018 at 12:58 PM Miklos Szeredi <miklos@szeredi.hu> wrote: > > > So I'd like some clarification on this point before applying it. It's > > possible that the spinlock is required, I just want to understand why. > > I added the spinlock, because it's cheap (new_inode() already pulls it > into L1 cache) and because it's much harder to prove that lockless one > is correct than just adding that locking. Ok, thanks, looks good to me. And looking around, I think it matches most of the other cases of us setting those I_NEW and I_CREATING flags, so I guess it's good from a consistency standpoint too. I just wanted that clarified, but I'll just apply the patch directly. Thanks, Linus
--- a/fs/overlayfs/dir.c +++ b/fs/overlayfs/dir.c @@ -603,6 +603,10 @@ static int ovl_create_object(struct dent if (!inode) goto out_drop_write; + spin_lock(&inode->i_lock); + inode->i_state |= I_CREATING; + spin_unlock(&inode->i_lock); + inode_init_owner(inode, dentry->d_parent->d_inode, mode); attr.mode = inode->i_mode;
...otherwise there will be list corruption due to inode_sb_list_add() being called for inode already on the sb list. Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Fixes: e950564b97fd ("vfs: don't evict uninitialized inode") --- This missed the 4.19 overlay pull request, because it fixes a bug introduced by patch not in said pull (buggy patch is also mine, incidentally). fs/overlayfs/dir.c | 4 ++++ 1 file changed, 4 insertions(+)