From patchwork Tue Jul 5 20:21:42 2016 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Oleg Drokin X-Patchwork-Id: 9215179 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork.web.codeaurora.org (Postfix) with ESMTP id 25FF96048B for ; Tue, 5 Jul 2016 20:21:54 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 16E162840B for ; Tue, 5 Jul 2016 20:21:54 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 0BAE62840D; Tue, 5 Jul 2016 20:21:54 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-6.9 required=2.0 tests=BAYES_00,RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id E3B9E2840B for ; Tue, 5 Jul 2016 20:21:52 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1751880AbcGEUVs (ORCPT ); Tue, 5 Jul 2016 16:21:48 -0400 Received: from linuxhacker.ru ([217.76.32.60]:58060 "EHLO fiona.linuxhacker.ru" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751257AbcGEUVr convert rfc822-to-8bit (ORCPT ); Tue, 5 Jul 2016 16:21:47 -0400 Received: from [192.168.1.140] (c-73-190-129-164.hsd1.tn.comcast.net [73.190.129.164]) (authenticated bits=0) by fiona.linuxhacker.ru (8.15.2/8.14.7) with ESMTPSA id u65KLhg23180188 (version=TLSv1 cipher=AES128-SHA bits=128 verify=NO); Tue, 5 Jul 2016 23:21:44 +0300 Subject: Re: More parallel atomic_open/d_splice_alias fun with NFS and possibly more FSes. Mime-Version: 1.0 (Apple Message framework v1283) From: Oleg Drokin In-Reply-To: <20160705200826.GP14480@ZenIV.linux.org.uk> Date: Tue, 5 Jul 2016 16:21:42 -0400 Cc: Mailing List , "" Message-Id: <1587788B-C7F6-4D19-BDAD-29846231CD6C@linuxhacker.ru> References: <20160617042914.GD14480@ZenIV.linux.org.uk> <20160703062917.GG14480@ZenIV.linux.org.uk> <94F1587A-7AFC-4B48-A0FC-F4CE152F18CC@linuxhacker.ru> <20160705123110.GL14480@ZenIV.linux.org.uk> <20160705135149.GM14480@ZenIV.linux.org.uk> <6D633478-6B94-465E-84D7-C0BA59C5E5F5@linuxhacker.ru> <20160705180841.GO14480@ZenIV.linux.org.uk> <1C63FC85-848B-486F-8223-F5CEB5C39848@linuxhacker.ru> <20160705200826.GP14480@ZenIV.linux.org.uk> To: Al Viro X-Mailer: Apple Mail (2.1283) Sender: linux-fsdevel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP On Jul 5, 2016, at 4:08 PM, Al Viro wrote: > On Tue, Jul 05, 2016 at 03:12:44PM -0400, Oleg Drokin wrote: >> >> When d_in_lookup was introduced, it was described as: >> New primitives: d_in_lookup() (a predicate checking if dentry is in >> the in-lookup state) and d_lookup_done() (tells the system that >> we are done with lookup and if it's still marked as in-lookup, it >> should cease to be such). >> >> I don't see where it mentions anything about exclusive vs parallel lookup >> that probably led to some confusion. > > In the same commit: > > #define DCACHE_PAR_LOOKUP 0x10000000 /* being looked up (with parent locked shared) */ Well, I did not really check the commit, just the log message, since there's no other documentation about it apparently ;) >> So with Lustre the dentry can be in three states, really: >> >> 1. hashed dentry that's all A-ok to reuse. >> 2. hashed dentry that's NOT valid (dlm lock went away) - this is distinguished in d_compare by looking at a bit in the fs_data >> 3. unhashed dentry ( I guess could be both valid and invalid lustre-wise). >> >> So the logic in ll_lookup_it_finish() (part of regular lookup) is this: >> >> If the dentry we have is not hashed - this is a new lookup, so we need to >> call into ll_splice_alias() to see if there's a better dentry we need to >> reuse that was already rejected by VFS before since we did not have necessary locks, >> but we do have them now. >> The comment at the top of ll_dcompare() explains why we don't just unhash the >> dentry on lock-loss - that apparently leads to a loop around real_lookup for >> real-contended dentries. >> This is also why we cannot use d_splice_alias here - such cases are possible >> for regular files and directories. >> >> Anyway, I guess additional point of confusion here is then why does >> ll_lookup_it_finish() need to check for hashedness of the dentry since it's in >> lookup, so we should be unhashed here. >> I checked the commit history and this check was added along with atomic_open >> support, so I imagine we can just move it up into ll_atomic_open and then your >> change starts to make sense along with a few other things. > > So basically this > } else if (!it_disposition(it, DISP_LOOKUP_NEG) && > !it_disposition(it, DISP_OPEN_CREATE)) { > /* With DISP_OPEN_CREATE dentry will be > * instantiated in ll_create_it. > */ > LASSERT(!d_inode(*de)); > d_instantiate(*de, inode); > } > is something we should do only for negative hashed fed to it by > ->atomic_open(), and that - only if we have no O_CREAT in flags? Yes, and in fact we can totally avoid it I think. > Then, since 3/3 eliminates that case completely, we could just rip that > else-if, along with the check for d_unhashed itself, making the call of > ll_splice_alias() unconditional there. Or am I misreading what you said? > Do you see any problems with what's in #for-linus now (head at 11f0083)? Yes, we can make it unconditional I think we can simplify it even more and since unlike NFS our negative dentries are a lot less speculative, we can just return ENOENT if this is not a create request and unhash otherwise. Still needs to go through the whole test suite to make sure it does not break anything unexpected. At least this is the patch I am playing with right now: --- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html --- a/drivers/staging/lustre/lustre/llite/namei.c +++ b/drivers/staging/lustre/lustre/llite/namei.c @@ -391,6 +391,7 @@ static int ll_lookup_it_finish(struct ptlrpc_request *request, struct inode *inode = NULL; __u64 bits = 0; int rc = 0; + struct dentry *alias; /* NB 1 request reference will be taken away by ll_intent_lock() * when I return @@ -415,26 +416,12 @@ static int ll_lookup_it_finish(struct ptlrpc_request *request, */ } - /* Only hash *de if it is unhashed (new dentry). - * Atoimc_open may passing hashed dentries for open. - */ - if (d_unhashed(*de)) { - struct dentry *alias; - - alias = ll_splice_alias(inode, *de); - if (IS_ERR(alias)) { - rc = PTR_ERR(alias); - goto out; - } - *de = alias; - } else if (!it_disposition(it, DISP_LOOKUP_NEG) && - !it_disposition(it, DISP_OPEN_CREATE)) { - /* With DISP_OPEN_CREATE dentry will be - * instantiated in ll_create_it. - */ - LASSERT(!d_inode(*de)); - d_instantiate(*de, inode); + alias = ll_splice_alias(inode, *de); + if (IS_ERR(alias)) { + rc = PTR_ERR(alias); + goto out; } + *de = alias; if (!it_disposition(it, DISP_LOOKUP_NEG)) { /* we have lookup look - unhide dentry */ @@ -590,6 +577,24 @@ static int ll_atomic_open(struct inode *dir, struct dentry *dentry, dentry, PFID(ll_inode2fid(dir)), dir, file, open_flags, mode, *opened); + /* Only negative dentries enter here */ + LASSERT(!d_inode(dentry)); + + if (!(open_flags & O_CREAT) && !d_in_lookup(dentry)) { + /* A valid negative dentry that just passed revalidation, + * there's little point to try and open it server-side, + * even though there's a minuscle chance it might succeed. + * Either way it's a valid race to just return -ENOENT here. + */ + if (!(open_flags & O_CREAT)) + return -ENOENT; + + /* Otherwise we just unhash it to be rehashed afresh via + * lookup if necessary + */ + d_drop(dentry); + } + it = kzalloc(sizeof(*it), GFP_NOFS); if (!it) return -ENOMEM;