diff mbox

[2/9] inode: add IOP_NOTHASHED to avoid inode hash lock in evict

Message ID 1426016724-23912-3-git-send-email-jbacik@fb.com (mailing list archive)
State New, archived
Headers show

Commit Message

Josef Bacik March 10, 2015, 7:45 p.m. UTC
From: Dave Chinner <dchinner@redhat.com>

Some filesystems don't use the VFS inode hash and fake the fact they
are hashed so that all the writeback code works correctly. However,
this means the evict() path still tries to remove the inode from the
hash, meaning that the inode_hash_lock() needs to be taken
unnecessarily. Hence under certain workloads the inode_hash_lock can
be contended even if the inode is never actually hashed.

To avoid this, add an inode opflag to allow inode_hash_remove() to
avoid taking the hash lock on inodes have never actually been
hashed.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Jan Kara <jack@suse.cz>
---
 fs/xfs/xfs_iops.c  | 2 ++
 include/linux/fs.h | 3 ++-
 2 files changed, 4 insertions(+), 1 deletion(-)

Comments

Al Viro March 12, 2015, 9:52 a.m. UTC | #1
On Tue, Mar 10, 2015 at 03:45:17PM -0400, Josef Bacik wrote:
> From: Dave Chinner <dchinner@redhat.com>
> 
> Some filesystems don't use the VFS inode hash and fake the fact they
> are hashed so that all the writeback code works correctly. However,
> this means the evict() path still tries to remove the inode from the
> hash, meaning that the inode_hash_lock() needs to be taken
> unnecessarily. Hence under certain workloads the inode_hash_lock can
> be contended even if the inode is never actually hashed.
> 
> To avoid this, add an inode opflag to allow inode_hash_remove() to
> avoid taking the hash lock on inodes have never actually been
> hashed.

Why bother with flags, etc. when we could just do

static inline bool hlist_fake(struct hlist_node *h) 
{
	return h->pprev == &h->next;
}

> -	if (!inode_unhashed(inode))
> +	if (!((inode->i_opflags & IOP_NOTHASHED) || inode_unhashed(inode)))

and turn this check into
	if (!inode_unhashed(inode) && !hlist_fake(&inode->i_hash))

instead?
--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Josef Bacik March 12, 2015, 12:24 p.m. UTC | #2
On 03/12/2015 05:52 AM, Al Viro wrote:
> On Tue, Mar 10, 2015 at 03:45:17PM -0400, Josef Bacik wrote:
>> From: Dave Chinner <dchinner@redhat.com>
>>
>> Some filesystems don't use the VFS inode hash and fake the fact they
>> are hashed so that all the writeback code works correctly. However,
>> this means the evict() path still tries to remove the inode from the
>> hash, meaning that the inode_hash_lock() needs to be taken
>> unnecessarily. Hence under certain workloads the inode_hash_lock can
>> be contended even if the inode is never actually hashed.
>>
>> To avoid this, add an inode opflag to allow inode_hash_remove() to
>> avoid taking the hash lock on inodes have never actually been
>> hashed.
>
> Why bother with flags, etc. when we could just do
>
> static inline bool hlist_fake(struct hlist_node *h)
> {
> 	return h->pprev == &h->next;
> }
>
>> -	if (!inode_unhashed(inode))
>> +	if (!((inode->i_opflags & IOP_NOTHASHED) || inode_unhashed(inode)))
>
> and turn this check into
> 	if (!inode_unhashed(inode) && !hlist_fake(&inode->i_hash))
>
> instead?
>

I made the change and sent the new patch, hopefully it worked, sometimes 
my laptop screws up git-send-email.  If everything looks good to you Al 
you can pull from my tree if you don't want to scrape from the list

git://git.kernel.org/pub/scm/linux/kernel/git/josef/btrfs-next.git 
superblock-scaling

Thanks,

Josef
--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/fs/xfs/xfs_iops.c b/fs/xfs/xfs_iops.c
index e53a903..5068629 100644
--- a/fs/xfs/xfs_iops.c
+++ b/fs/xfs/xfs_iops.c
@@ -1250,8 +1250,10 @@  xfs_setup_inode(
 	inode->i_state = I_NEW;
 
 	inode_sb_list_add(inode);
+
 	/* make the inode look hashed for the writeback code */
 	hlist_add_fake(&inode->i_hash);
+	inode->i_opflags |= IOP_NOTHASHED;
 
 	inode->i_mode	= ip->i_d.di_mode;
 	set_nlink(inode, ip->i_d.di_nlink);
diff --git a/include/linux/fs.h b/include/linux/fs.h
index b4d71b5..1045132 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -546,6 +546,7 @@  struct posix_acl;
 #define IOP_FASTPERM	0x0001
 #define IOP_LOOKUP	0x0002
 #define IOP_NOFOLLOW	0x0004
+#define IOP_NOTHASHED	0x0008	/* inode never hashed, avoid unhashing */
 
 /*
  * Keep mostly read-only and often accessed (especially for
@@ -2528,7 +2529,7 @@  static inline void insert_inode_hash(struct inode *inode)
 extern void __remove_inode_hash(struct inode *);
 static inline void remove_inode_hash(struct inode *inode)
 {
-	if (!inode_unhashed(inode))
+	if (!((inode->i_opflags & IOP_NOTHASHED) || inode_unhashed(inode)))
 		__remove_inode_hash(inode);
 }