From patchwork Sat Apr 16 00:55:24 2016 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Al Viro X-Patchwork-Id: 8860611 Return-Path: X-Original-To: patchwork-linux-fsdevel@patchwork.kernel.org Delivered-To: patchwork-parsemail@patchwork2.web.kernel.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.136]) by patchwork2.web.kernel.org (Postfix) with ESMTP id 1C46FBF440 for ; Sat, 16 Apr 2016 00:57:30 +0000 (UTC) Received: from mail.kernel.org (localhost [127.0.0.1]) by mail.kernel.org (Postfix) with ESMTP id 1D9E320259 for ; Sat, 16 Apr 2016 00:57:29 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 1BDF22026F for ; Sat, 16 Apr 2016 00:57:28 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1753078AbcDPA40 (ORCPT ); Fri, 15 Apr 2016 20:56:26 -0400 Received: from zeniv.linux.org.uk ([195.92.253.2]:39589 "EHLO ZenIV.linux.org.uk" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752714AbcDPAza (ORCPT ); Fri, 15 Apr 2016 20:55:30 -0400 Received: from viro by ZenIV.linux.org.uk with local (Exim 4.76 #1 (Red Hat Linux)) id 1arEWG-0008IW-RA; Sat, 16 Apr 2016 00:55:28 +0000 From: Al Viro To: Linus Torvalds Cc: linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org Subject: [PATCH 12/15] parallel lookups machinery, part 2 Date: Sat, 16 Apr 2016 01:55:24 +0100 Message-Id: <1460768127-31822-12-git-send-email-viro@ZenIV.linux.org.uk> X-Mailer: git-send-email 2.7.3 In-Reply-To: <20160416005232.GV25498@ZenIV.linux.org.uk> References: <20160416005232.GV25498@ZenIV.linux.org.uk> Sender: linux-fsdevel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org X-Spam-Status: No, score=-7.9 required=5.0 tests=BAYES_00, RCVD_IN_DNSWL_HI, RP_MATCHES_RCVD, UNPARSEABLE_RELAY autolearn=unavailable version=3.3.1 X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on mail.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP From: Al Viro We'll need to verify that there's neither a hashed nor in-lookup dentry with desired parent/name before adding to in-lookup set. One possible solution would be to hold the parent's ->d_lock through both checks, but while the in-lookup set is relatively small at any time, dcache is not. And holding the parent's ->d_lock through something like __d_lookup_rcu() would suck too badly. So we leave the parent's ->d_lock alone, which means that we watch out for the following scenario: * we verify that there's no hashed match * existing in-lookup match gets hashed by another process * we verify that there's no in-lookup matches and decide that everything's fine. Solution: per-directory kinda-sorta seqlock, bumped around the times we hash something that used to be in-lookup or move (and hash) something in place of in-lookup. Then the above would turn into * read the counter * do dcache lookup * if no matches found, check for in-lookup matches * if there had been none of those either, check if the counter has changed; repeat if it has. The "kinda-sorta" part is due to the fact that we don't have much spare space in inode. There is a spare word (shared with i_bdev/i_cdev/i_pipe), so the counter part is not a problem, but spinlock is a different story. We could use the parent's ->d_lock, and it would be less painful in terms of contention, for __d_add() it would be rather inconvenient to grab; we could do that (using lock_parent()), but... Fortunately, we can get serialization on the counter itself, and it might be a good idea in general; we can use cmpxchg() in a loop to get from even to odd and smp_store_release() from odd to even. This commit adds the counter and updating logics; the readers will be added in the next commit. Signed-off-by: Al Viro --- fs/dcache.c | 34 ++++++++++++++++++++++++++++++++-- fs/inode.c | 1 + include/linux/fs.h | 1 + 3 files changed, 34 insertions(+), 2 deletions(-) diff --git a/fs/dcache.c b/fs/dcache.c index 5cea3cb..3959f18 100644 --- a/fs/dcache.c +++ b/fs/dcache.c @@ -2361,6 +2361,22 @@ void d_rehash(struct dentry * entry) } EXPORT_SYMBOL(d_rehash); +static inline unsigned start_dir_add(struct inode *dir) +{ + + for (;;) { + unsigned n = dir->i_dir_seq; + if (!(n & 1) && cmpxchg(&dir->i_dir_seq, n, n + 1) == n) + return n; + cpu_relax(); + } +} + +static inline void end_dir_add(struct inode *dir, unsigned n) +{ + smp_store_release(&dir->i_dir_seq, n + 2); +} + void __d_not_in_lookup(struct dentry *dentry) { dentry->d_flags &= ~DCACHE_PAR_LOOKUP; @@ -2371,9 +2387,14 @@ void __d_not_in_lookup(struct dentry *dentry) static inline void __d_add(struct dentry *dentry, struct inode *inode) { + struct inode *dir = NULL; + unsigned n; spin_lock(&dentry->d_lock); - if (unlikely(dentry->d_flags & DCACHE_PAR_LOOKUP)) + if (unlikely(dentry->d_flags & DCACHE_PAR_LOOKUP)) { + dir = dentry->d_parent->d_inode; + n = start_dir_add(dir); __d_not_in_lookup(dentry); + } if (inode) { unsigned add_flags = d_flags_for_inode(inode); hlist_add_head(&dentry->d_u.d_alias, &inode->i_dentry); @@ -2383,6 +2404,8 @@ static inline void __d_add(struct dentry *dentry, struct inode *inode) __fsnotify_d_instantiate(dentry); } _d_rehash(dentry); + if (dir) + end_dir_add(dir, n); spin_unlock(&dentry->d_lock); if (inode) spin_unlock(&inode->i_lock); @@ -2612,6 +2635,8 @@ static void dentry_unlock_for_move(struct dentry *dentry, struct dentry *target) static void __d_move(struct dentry *dentry, struct dentry *target, bool exchange) { + struct inode *dir = NULL; + unsigned n; if (!dentry->d_inode) printk(KERN_WARNING "VFS: moving negative dcache entry\n"); @@ -2619,8 +2644,11 @@ static void __d_move(struct dentry *dentry, struct dentry *target, BUG_ON(d_ancestor(target, dentry)); dentry_lock_for_move(dentry, target); - if (unlikely(target->d_flags & DCACHE_PAR_LOOKUP)) + if (unlikely(target->d_flags & DCACHE_PAR_LOOKUP)) { + dir = target->d_parent->d_inode; + n = start_dir_add(dir); __d_not_in_lookup(target); + } write_seqcount_begin(&dentry->d_seq); write_seqcount_begin_nested(&target->d_seq, DENTRY_D_LOCK_NESTED); @@ -2670,6 +2698,8 @@ static void __d_move(struct dentry *dentry, struct dentry *target, write_seqcount_end(&target->d_seq); write_seqcount_end(&dentry->d_seq); + if (dir) + end_dir_add(dir, n); dentry_unlock_for_move(dentry, target); } diff --git a/fs/inode.c b/fs/inode.c index 4202aac..4b884f7 100644 --- a/fs/inode.c +++ b/fs/inode.c @@ -151,6 +151,7 @@ int inode_init_always(struct super_block *sb, struct inode *inode) inode->i_bdev = NULL; inode->i_cdev = NULL; inode->i_link = NULL; + inode->i_dir_seq = 0; inode->i_rdev = 0; inode->dirtied_when = 0; diff --git a/include/linux/fs.h b/include/linux/fs.h index 1b5fcae..0a32045 100644 --- a/include/linux/fs.h +++ b/include/linux/fs.h @@ -684,6 +684,7 @@ struct inode { struct block_device *i_bdev; struct cdev *i_cdev; char *i_link; + unsigned i_dir_seq; }; __u32 i_generation;