diff mbox series

ovl: set I_CREATING on inode being created

Message ID 20180822085522.GA14354@veci.piliscsaba.redhat.com (mailing list archive)
State New, archived
Headers show
Series ovl: set I_CREATING on inode being created | expand

Commit Message

Miklos Szeredi Aug. 22, 2018, 8:55 a.m. UTC
...otherwise there will be list corruption due to inode_sb_list_add() being
called for inode already on the sb list.

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Fixes: e950564b97fd ("vfs: don't evict uninitialized inode")
---
This missed the 4.19 overlay pull request, because it fixes a bug
introduced by patch not in said pull (buggy patch is also mine,
incidentally).

 fs/overlayfs/dir.c |    4 ++++
 1 file changed, 4 insertions(+)

Comments

Linus Torvalds Aug. 22, 2018, 2:53 p.m. UTC | #1
On Wed, Aug 22, 2018 at 1:55 AM Miklos Szeredi <miklos@szeredi.hu> wrote:
>
> +       spin_lock(&inode->i_lock);
> +       inode->i_state |= I_CREATING;
> +       spin_unlock(&inode->i_lock);
> +

Why is that spinlock protection there?

Isn't this a new inode that cannot possibly be reached any other way yet?

NOTE! This is a question. Maybe there is something I missed, and there
*are* other ways to reach that inode. But if that's true, isn't it
already too late to set I_CREATING?

So I'd like some clarification on this point before applying it. It's
possible that the spinlock is required, I just want to understand why.

          Linus
Miklos Szeredi Aug. 22, 2018, 7:58 p.m. UTC | #2
On Wed, Aug 22, 2018 at 4:53 PM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
> On Wed, Aug 22, 2018 at 1:55 AM Miklos Szeredi <miklos@szeredi.hu> wrote:
>>
>> +       spin_lock(&inode->i_lock);
>> +       inode->i_state |= I_CREATING;
>> +       spin_unlock(&inode->i_lock);
>> +
>
> Why is that spinlock protection there?
>
> Isn't this a new inode that cannot possibly be reached any other way yet?

new_inode() puts it on sb->s_inodes list, so it *is* reachable.
Following operate on s_inodes:

- evict_inodes() called from generic_shutdown_super(): a) we shouldn't
get here while in creation, b) it's careful to not touch inodes with
non-zero refcount

- invalidate_inodes(), called from block devices, so it doesn't apply
to overlayfs, also it skips inodes with non-zero refcount

- iterate_bdevs(), operates on blockdev_superblock

- fsnotify_unmount_inodes() called from generic_shutdown_super(): we
shouldn't get here while in creation,

- add_dquot_ref(), remove_dquot_ref(): not quite sure what these do,
but quotas are not (yet) supported on overlayfs

So looks like we are safe without a spinlock.

And there's another, more fundamental reason: if anything starts
messing with i_state of an inode that is not yet even had its state
changed to I_NEW, then lots of filesystems are in trouble.

> NOTE! This is a question. Maybe there is something I missed, and there
> *are* other ways to reach that inode. But if that's true, isn't it
> already too late to set I_CREATING?

No, it's not too late, I_CREATING can be set anytime up to
inode_insert5(), which is the first one to actually look at that flag.

> So I'd like some clarification on this point before applying it. It's
> possible that the spinlock is required, I just want to understand why.

I added the spinlock, because it's cheap (new_inode() already pulls it
into L1 cache) and because it's much harder to prove that lockless one
is correct than just adding that locking.

Thanks,
Miklos
Linus Torvalds Aug. 22, 2018, 8:14 p.m. UTC | #3
On Wed, Aug 22, 2018 at 12:58 PM Miklos Szeredi <miklos@szeredi.hu> wrote:
>
> > So I'd like some clarification on this point before applying it. It's
> > possible that the spinlock is required, I just want to understand why.
>
> I added the spinlock, because it's cheap (new_inode() already pulls it
> into L1 cache) and because it's much harder to prove that lockless one
> is correct than just adding that locking.

Ok, thanks, looks good to me. And looking around, I think it matches
most of the other cases of us setting those I_NEW and I_CREATING
flags, so I guess it's good from a consistency standpoint too.

I just wanted that clarified, but I'll just apply the patch directly.

Thanks,

             Linus
diff mbox series

Patch

--- a/fs/overlayfs/dir.c
+++ b/fs/overlayfs/dir.c
@@ -603,6 +603,10 @@  static int ovl_create_object(struct dent
 	if (!inode)
 		goto out_drop_write;
 
+	spin_lock(&inode->i_lock);
+	inode->i_state |= I_CREATING;
+	spin_unlock(&inode->i_lock);
+
 	inode_init_owner(inode, dentry->d_parent->d_inode, mode);
 	attr.mode = inode->i_mode;