From patchwork Tue Oct 25 22:44:42 2016 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Omar Sandoval X-Patchwork-Id: 9395667 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork.web.codeaurora.org (Postfix) with ESMTP id ED9E260234 for ; Tue, 25 Oct 2016 22:45:32 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id C02832953E for ; Tue, 25 Oct 2016 22:45:32 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id B439E2954E; Tue, 25 Oct 2016 22:45:32 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-6.3 required=2.0 tests=BAYES_00,DKIM_SIGNED, RCVD_IN_DNSWL_HI, RCVD_IN_SORBS_SPAM, T_DKIM_INVALID autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 23AED2953C for ; Tue, 25 Oct 2016 22:45:31 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1754410AbcJYWp1 (ORCPT ); Tue, 25 Oct 2016 18:45:27 -0400 Received: from mail-pf0-f171.google.com ([209.85.192.171]:35990 "EHLO mail-pf0-f171.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752987AbcJYWpX (ORCPT ); Tue, 25 Oct 2016 18:45:23 -0400 Received: by mail-pf0-f171.google.com with SMTP id e6so126532638pfk.3 for ; Tue, 25 Oct 2016 15:44:45 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=osandov-com.20150623.gappssmtp.com; s=20150623; h=date:from:to:cc:subject:message-id:references:mime-version :content-disposition:in-reply-to:user-agent; bh=RpjnCaDxH6K9NlEG4G3DklqvCtqO2ZDEfppJHh8zIQ8=; b=RJT5p7Pys8qsCOHb2Bf0sogAENPX2RGOjK/GqLwp+z+ufXWLdH0aOpNaqMNnHgjB1B KoepyVGTCs2dOvVE5jEwH2v1WW1xTQkSHYlYVM1dA9J3KGKbDj7fNHU3xhIYipkJmZna vHJ+q1T6lluQLEW9DTPNxdKtncwj9ZTHzpmgtoer+BNyx6Y069Diz5viwTlj7WsAq2sj 5f1zZqx7ZDMmZoCdOh6VuXZVFXjRZXWM0RqtgnqXxiaT6OsMhFdPHW9Rwsl9GL5i8YFt XCOHutT/9xYrZUIrisfO2dA66DVeBxXVd83MW5h7w4JpNHZTi6zCvFSyr+WxUcN89Q44 kkcQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:date:from:to:cc:subject:message-id:references :mime-version:content-disposition:in-reply-to:user-agent; bh=RpjnCaDxH6K9NlEG4G3DklqvCtqO2ZDEfppJHh8zIQ8=; b=Kl7BXANsueBcGMQBTpc1Nf9Lexzk16QNDSjeWqsAY00+asNToutx96SeW2LWzV/q8j iJEAaXFYLHhahzExuBRJhKv8dl/fazaL9WcycwgrAxGA73Oz6a3bb3BX5jVuGnpVRB9d 5cEjpkYAtDoEjCxWy1SThcQarLfolp1DzlYFzc7kZJWlr1wdR6065S78sSWwGG7epKXP 734r9HKYJWJC8bWHkjygg30bfZOREmTykc8ql1bTUs4Td//HHppfbgwee8s+rrOfWTun b+2ZpXagMk6WoNGrVirxF6pXx429PIrMRX2Q7Qub5xb4bEPKE6M5mOtoy/6MGUzKUorD B96w== X-Gm-Message-State: ABUngvd8UJ90eF452OZ5C62+BkpuGw+niM29iMwObA+KOUp/jRCDMkxkFP9D5nrfrwEIH6u9 X-Received: by 10.98.130.11 with SMTP id w11mr44597843pfd.101.1477435484278; Tue, 25 Oct 2016 15:44:44 -0700 (PDT) Received: from vader.DHCP.thefacebook.com ([2620:10d:c090:200::c:8ab1]) by smtp.gmail.com with ESMTPSA id j17sm35724952pfe.79.2016.10.25.15.44.43 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Tue, 25 Oct 2016 15:44:43 -0700 (PDT) Date: Tue, 25 Oct 2016 15:44:42 -0700 From: Omar Sandoval To: Josef Bacik Cc: linux-btrfs@vger.kernel.org, kernel-team@fb.com, david@fromorbit.com, jack@suse.cz, linux-fsdevel@vger.kernel.org, viro@zeniv.linux.org.uk, hch@infradead.org, jweiner@fb.com Subject: Re: [PATCH 5/5] fs: don't set *REFERENCED unless we are on the lru list Message-ID: <20161025224442.GA18879@vader.DHCP.thefacebook.com> References: <1477420904-1399-1-git-send-email-jbacik@fb.com> <1477420904-1399-6-git-send-email-jbacik@fb.com> MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: <1477420904-1399-6-git-send-email-jbacik@fb.com> User-Agent: Mutt/1.7.1 (2016-10-04) Sender: linux-fsdevel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP On Tue, Oct 25, 2016 at 02:41:44PM -0400, Josef Bacik wrote: > With anything that populates the inode/dentry cache with a lot of one time use > inodes we can really put a lot of pressure on the system for things we don't > need to keep in cache. It takes two runs through the LRU to evict these one use > entries, and if you have a lot of memory you can end up with 10's of millions of > entries in the dcache or icache that have never actually been touched since they > were first instantiated, and it will take a lot of CPU and a lot of pressure to > evict all of them. > > So instead do what we do with pagecache, only set the *REFERENCED flags if we > are being used after we've been put onto the LRU. This makes a significant > difference in the system's ability to evict these useless cache entries. With a > fs_mark workload that creates 40 million files we get the following results (all > in files/sec) > > Btrfs Patched Unpatched > Average Files/sec: 72209.3 63254.2 > p50 Files/sec: 70850 57560 > p90 Files/sec: 68757 53085 > p99 Files/sec: 68757 53085 > > XFS Patched Unpatched > Average Files/sec: 61025.5 60719.5 > p50 Files/sec: 60107 59465 > p90 Files/sec: 59300 57966 > p99 Files/sec: 59227 57528 > > Ext4 Patched Unpatched > Average Files/sec: 121785.4 119248.0 > p50 Files/sec: 120852 119274 > p90 Files/sec: 116703 112783 > p99 Files/sec: 114393 104934 > > The reason Btrfs has a much larger improvement is because it holds a lot more > things in memory so benefits more from faster slab reclaim, but across the board > is an improvement for each of the file systems. > > Signed-off-by: Josef Bacik > --- > fs/dcache.c | 15 ++++++++++----- > fs/inode.c | 5 ++++- > 2 files changed, 14 insertions(+), 6 deletions(-) > > diff --git a/fs/dcache.c b/fs/dcache.c > index 5c7cc95..a558075 100644 > --- a/fs/dcache.c > +++ b/fs/dcache.c > @@ -779,8 +779,6 @@ repeat: > goto kill_it; > } > > - if (!(dentry->d_flags & DCACHE_REFERENCED)) > - dentry->d_flags |= DCACHE_REFERENCED; > dentry_lru_add(dentry); > > dentry->d_lockref.count--; > @@ -803,6 +801,13 @@ static inline void __dget_dlock(struct dentry *dentry) > dentry->d_lockref.count++; > } > > +static inline void __dget_dlock_reference(struct dentry *dentry) > +{ > + if (dentry->d_flags & DCACHE_LRU_LIST) > + dentry->d_flags |= DCACHE_REFERENCED; > + dentry->d_lockref.count++; > +} > + > static inline void __dget(struct dentry *dentry) > { > lockref_get(&dentry->d_lockref); > @@ -875,7 +880,7 @@ again: > (alias->d_flags & DCACHE_DISCONNECTED)) { > discon_alias = alias; > } else { > - __dget_dlock(alias); > + __dget_dlock_reference(alias); > spin_unlock(&alias->d_lock); > return alias; > } > @@ -886,7 +891,7 @@ again: > alias = discon_alias; > spin_lock(&alias->d_lock); > if (S_ISDIR(inode->i_mode) || !d_unhashed(alias)) { > - __dget_dlock(alias); > + __dget_dlock_reference(alias); > spin_unlock(&alias->d_lock); > return alias; > } > @@ -2250,7 +2255,7 @@ struct dentry *__d_lookup(const struct dentry *parent, const struct qstr *name) > if (!d_same_name(dentry, parent, name)) > goto next; > > - dentry->d_lockref.count++; > + __dget_dlock_reference(dentry); This misses dentries that we get through __d_lookup_rcu(), so I think your change made it so most things aren't getting DCACHE_REFERENCED set at all. Maybe something like this instead? diff --git a/fs/dcache.c b/fs/dcache.c index 5c7cc95..d7a56a8 100644 --- a/fs/dcache.c +++ b/fs/dcache.c @@ -412,15 +412,6 @@ static void d_lru_shrink_move(struct list_lru_one *lru, struct dentry *dentry, list_lru_isolate_move(lru, &dentry->d_lru, list); } -/* - * dentry_lru_(add|del)_list) must be called with d_lock held. - */ -static void dentry_lru_add(struct dentry *dentry) -{ - if (unlikely(!(dentry->d_flags & DCACHE_LRU_LIST))) - d_lru_add(dentry); -} - /** * d_drop - drop a dentry * @dentry: dentry to drop @@ -779,9 +770,12 @@ void dput(struct dentry *dentry) goto kill_it; } - if (!(dentry->d_flags & DCACHE_REFERENCED)) - dentry->d_flags |= DCACHE_REFERENCED; - dentry_lru_add(dentry); + if (likely(dentry->d_flags & DCACHE_LRU_LIST)) { + if (!(dentry->d_flags & DCACHE_REFERENCED)) + dentry->d_flags |= DCACHE_REFERENCED; + } else { + d_lru_add(dentry); + } dentry->d_lockref.count--; spin_unlock(&dentry->d_lock); diff --git a/fs/inode.c b/fs/inode.c index 88110fd..16faca3 100644 --- a/fs/inode.c +++ b/fs/inode.c @@ -798,6 +798,8 @@ static struct inode *find_inode(struct super_block *sb, __wait_on_freeing_inode(inode); goto repeat; } + if (!list_empty(&inode->i_lru)) + inode->i_state |= I_REFERENCED; __iget(inode); spin_unlock(&inode->i_lock); return inode; @@ -825,6 +827,8 @@ static struct inode *find_inode_fast(struct super_block *sb, __wait_on_freeing_inode(inode); goto repeat; } + if (!list_empty(&inode->i_lru)) + inode->i_state |= I_REFERENCED; __iget(inode); spin_unlock(&inode->i_lock); return inode; @@ -1492,7 +1496,6 @@ static void iput_final(struct inode *inode) drop = generic_drop_inode(inode); if (!drop && (sb->s_flags & MS_ACTIVE)) { - inode->i_state |= I_REFERENCED; inode_add_lru(inode); spin_unlock(&inode->i_lock); return;