Message ID | 20220511165339.85614-1-jlayton@kernel.org (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | [v2] fs: change test in inode_insert5 for adding to the sb list | expand |
Looks good:
Reviewed-by: Christoph Hellwig <hch@lst.de>
On Wed, May 11, 2022 at 12:53:39PM -0400, Jeff Layton wrote: > The inode_insert5 currently looks at I_CREATING to decide whether to > insert the inode into the sb list. This test is a bit ambiguous though > as I_CREATING state is not directly related to that list. > > This test is also problematic for some upcoming ceph changes to add > fscrypt support. We need to be able to allocate an inode using new_inode > and insert it into the hash later if we end up using it, and doing that > now means that we double add it and corrupt the list. > > What we really want to know in this test is whether the inode is already > in its superblock list, and then add it if it isn't. Have it test for > list_empty instead and ensure that we always initialize the list by > doing it in inode_init_once. It's only ever removed from the list with > list_del_init, so that should be sufficient. > > There doesn't seem to be any need to hold the inode_hash_lock for this > operation either, so drop that before adding to to the list. > > Suggested-by: Al Viro <viro@zeniv.linux.org.uk> > Cc: Dave Chinner <dchinner@redhat.com> > Signed-off-by: Jeff Layton <jlayton@kernel.org> > Signed-off-by: Ilya Dryomov <idryomov@gmail.com> > --- > fs/inode.c | 13 +++++++++---- > 1 file changed, 9 insertions(+), 4 deletions(-) > > A small revision to the patch that I sent as part of the ceph+fscrypt > series. I didn't see any need to hold the inode_hash_lock when adding > the inode to the sb list, so do that outside the lock. I also revised > the comment to be more clear. I'm not sure that's valid. The moment the hash lock is dropped, other lookups can find this inode in the cache. Hence there's no guarantee that someone else isn't already accessing the inode once the hash lock is dropped. Hence it's not clear to me that this is a safe modification to make. Given that we already do the list add under the hash lock, I don't see any real gain to removing it from that context and it isn't necessary to address the problem. If you are concerned about reducing inode_has_lock contention, then the answer to that is to convert the global lock to bit list locks as the dentry cache uses. I wrote a patch a while back to do this: https://lore.kernel.org/linux-fsdevel/20210406123343.1739669-1-david@fromorbit.com/ Hence at this stage, I prefer the original version that doesn't change locking because there's much less risk associated with it. Cheers, DAve.
On Fri, 2022-05-13 at 00:21 +1000, Dave Chinner wrote: > On Wed, May 11, 2022 at 12:53:39PM -0400, Jeff Layton wrote: > > The inode_insert5 currently looks at I_CREATING to decide whether to > > insert the inode into the sb list. This test is a bit ambiguous though > > as I_CREATING state is not directly related to that list. > > > > This test is also problematic for some upcoming ceph changes to add > > fscrypt support. We need to be able to allocate an inode using new_inode > > and insert it into the hash later if we end up using it, and doing that > > now means that we double add it and corrupt the list. > > > > What we really want to know in this test is whether the inode is already > > in its superblock list, and then add it if it isn't. Have it test for > > list_empty instead and ensure that we always initialize the list by > > doing it in inode_init_once. It's only ever removed from the list with > > list_del_init, so that should be sufficient. > > > > There doesn't seem to be any need to hold the inode_hash_lock for this > > operation either, so drop that before adding to to the list. > > > > Suggested-by: Al Viro <viro@zeniv.linux.org.uk> > > Cc: Dave Chinner <dchinner@redhat.com> > > Signed-off-by: Jeff Layton <jlayton@kernel.org> > > Signed-off-by: Ilya Dryomov <idryomov@gmail.com> > > --- > > fs/inode.c | 13 +++++++++---- > > 1 file changed, 9 insertions(+), 4 deletions(-) > > > > A small revision to the patch that I sent as part of the ceph+fscrypt > > series. I didn't see any need to hold the inode_hash_lock when adding > > the inode to the sb list, so do that outside the lock. I also revised > > the comment to be more clear. > > I'm not sure that's valid. The moment the hash lock is dropped, > other lookups can find this inode in the cache. Hence there's no > guarantee that someone else isn't already accessing the inode > once the hash lock is dropped. Hence it's not clear to me that this > is a safe modification to make. > > Given that we already do the list add under the hash lock, I don't > see any real gain to removing it from that context and it isn't > necessary to address the problem. > > If you are concerned about reducing inode_has_lock contention, then > the answer to that is to convert the global lock to bit list locks > as the dentry cache uses. I wrote a patch a while back to do this: > > https://lore.kernel.org/linux-fsdevel/20210406123343.1739669-1-david@fromorbit.com/ > > Hence at this stage, I prefer the original version that doesn't > change locking because there's much less risk associated with it. > I didn't see anything that cares about the i_sb_list while simultaneously holding the inode_hash_lock. It gets removed from the list in evict, but we hold a reference to it at this point, so nothingThe thing also has I_NEW set on it so there had better not be anyone. That said, iget_locked also holds it while adding it to the list and you're correct that it's not required to fix the problem. I'll plan to revert it to the v1 version. We can always experiment with moving it outside the lock later if we think that's useful. Christoph, are you OK with the original version of that patch as well? I can resend it again if you missed it the first time. Thanks,
On Wed, 2022-05-11 at 12:53 -0400, Jeff Layton wrote: > The inode_insert5 currently looks at I_CREATING to decide whether to > insert the inode into the sb list. This test is a bit ambiguous though > as I_CREATING state is not directly related to that list. > > This test is also problematic for some upcoming ceph changes to add > fscrypt support. We need to be able to allocate an inode using new_inode > and insert it into the hash later if we end up using it, and doing that > now means that we double add it and corrupt the list. > > What we really want to know in this test is whether the inode is already > in its superblock list, and then add it if it isn't. Have it test for > list_empty instead and ensure that we always initialize the list by > doing it in inode_init_once. It's only ever removed from the list with > list_del_init, so that should be sufficient. > > There doesn't seem to be any need to hold the inode_hash_lock for this > operation either, so drop that before adding to to the list. > > Suggested-by: Al Viro <viro@zeniv.linux.org.uk> > Cc: Dave Chinner <dchinner@redhat.com> > Signed-off-by: Jeff Layton <jlayton@kernel.org> > Signed-off-by: Ilya Dryomov <idryomov@gmail.com> > --- > fs/inode.c | 13 +++++++++---- > 1 file changed, 9 insertions(+), 4 deletions(-) > > A small revision to the patch that I sent as part of the ceph+fscrypt > series. I didn't see any need to hold the inode_hash_lock when adding > the inode to the sb list, so do that outside the lock. I also revised > the comment to be more clear. > > Al, I'm planning to merge this via the ceph tree since I have other > patches that depend on it. Let me know if you'd rather take this via > your tree instead. > > diff --git a/fs/inode.c b/fs/inode.c > index 9d9b422504d1..9d429247a4f0 100644 > --- a/fs/inode.c > +++ b/fs/inode.c > @@ -422,6 +422,7 @@ void inode_init_once(struct inode *inode) > INIT_LIST_HEAD(&inode->i_io_list); > INIT_LIST_HEAD(&inode->i_wb_list); > INIT_LIST_HEAD(&inode->i_lru); > + INIT_LIST_HEAD(&inode->i_sb_list); > __address_space_init_once(&inode->i_data); > i_size_ordered_init(inode); > } > @@ -1021,7 +1022,6 @@ struct inode *new_inode_pseudo(struct super_block *sb) > spin_lock(&inode->i_lock); > inode->i_state = 0; > spin_unlock(&inode->i_lock); > - INIT_LIST_HEAD(&inode->i_sb_list); > } > return inode; > } > @@ -1165,7 +1165,6 @@ struct inode *inode_insert5(struct inode *inode, unsigned long hashval, > { > struct hlist_head *head = inode_hashtable + hash(inode->i_sb, hashval); > struct inode *old; > - bool creating = inode->i_state & I_CREATING; > > again: > spin_lock(&inode_hash_lock); > @@ -1199,11 +1198,17 @@ struct inode *inode_insert5(struct inode *inode, unsigned long hashval, > inode->i_state |= I_NEW; > hlist_add_head_rcu(&inode->i_hash, head); > spin_unlock(&inode->i_lock); > - if (!creating) > - inode_sb_list_add(inode); > unlock: > spin_unlock(&inode_hash_lock); > > + /* > + * Add it to the sb list if it's not already. If there is an inode, > + * then it has I_NEW at this point, so it should be safe to test > + * i_sb_list locklessly. > + */ > + if (inode && list_empty(&inode->i_sb_list)) > + inode_sb_list_add(inode); > + > return inode; > } > EXPORT_SYMBOL(inode_insert5); Hi Al, Could I get your Acked-by or R-b on this patch? I'd like to take this in via the ceph tree, and I'd prefer an explicit ack on this if possible. Thanks!
diff --git a/fs/inode.c b/fs/inode.c index 9d9b422504d1..9d429247a4f0 100644 --- a/fs/inode.c +++ b/fs/inode.c @@ -422,6 +422,7 @@ void inode_init_once(struct inode *inode) INIT_LIST_HEAD(&inode->i_io_list); INIT_LIST_HEAD(&inode->i_wb_list); INIT_LIST_HEAD(&inode->i_lru); + INIT_LIST_HEAD(&inode->i_sb_list); __address_space_init_once(&inode->i_data); i_size_ordered_init(inode); } @@ -1021,7 +1022,6 @@ struct inode *new_inode_pseudo(struct super_block *sb) spin_lock(&inode->i_lock); inode->i_state = 0; spin_unlock(&inode->i_lock); - INIT_LIST_HEAD(&inode->i_sb_list); } return inode; } @@ -1165,7 +1165,6 @@ struct inode *inode_insert5(struct inode *inode, unsigned long hashval, { struct hlist_head *head = inode_hashtable + hash(inode->i_sb, hashval); struct inode *old; - bool creating = inode->i_state & I_CREATING; again: spin_lock(&inode_hash_lock); @@ -1199,11 +1198,17 @@ struct inode *inode_insert5(struct inode *inode, unsigned long hashval, inode->i_state |= I_NEW; hlist_add_head_rcu(&inode->i_hash, head); spin_unlock(&inode->i_lock); - if (!creating) - inode_sb_list_add(inode); unlock: spin_unlock(&inode_hash_lock); + /* + * Add it to the sb list if it's not already. If there is an inode, + * then it has I_NEW at this point, so it should be safe to test + * i_sb_list locklessly. + */ + if (inode && list_empty(&inode->i_sb_list)) + inode_sb_list_add(inode); + return inode; } EXPORT_SYMBOL(inode_insert5);