From patchwork Tue Apr 9 19:42:28 2019
X-Patchwork-Submitter: Jeff Layton
X-Patchwork-Id: 10892205
From: Jeff Layton
To: ceph-devel@vger.kernel.org
Subject: [RFC PATCH 10/11] ceph: perform asynchronous unlink if we have sufficient caps
Date: Tue, 9 Apr 2019 15:42:28 -0400
Message-Id: <20190409194229.8247-11-jlayton@kernel.org>
X-Mailer: git-send-email 2.20.1
In-Reply-To: <20190409194229.8247-1-jlayton@kernel.org>
References: <20190409194229.8247-1-jlayton@kernel.org>
X-Mailing-List: ceph-devel@vger.kernel.org

When performing an unlink, if we have Fx on the parent directory and
Lx on the inode of the dentry being unlinked then we don't need to
wait on the reply before returning.

In that situation, just fix up the dcache and link count and return
immediately after issuing the call to the MDS. This does mean that we
need to hold an extra reference to the inode being unlinked, and extra
references to the caps to avoid races. Put those references in the
r_callback routine.

If the operation ends up failing, then set a writeback error on the
directory inode that can be fetched later by an fsync on the dir.

Signed-off-by: Jeff Layton
---
 fs/ceph/caps.c  | 22 ++++++++++++++++++++++
 fs/ceph/dir.c   | 38 +++++++++++++++++++++++++++++++++++---
 fs/ceph/super.h |  1 +
 3 files changed, 58 insertions(+), 3 deletions(-)

diff --git a/fs/ceph/caps.c b/fs/ceph/caps.c
index d022e15c8378..8fbb09c761a7 100644
--- a/fs/ceph/caps.c
+++ b/fs/ceph/caps.c
@@ -2682,6 +2682,28 @@ static int try_get_cap_refs(struct ceph_inode_info *ci, int need, int want,
 	return ret;
 }
 
+bool ceph_get_caps_for_unlink(struct inode *dir, struct dentry *dentry)
+{
+	int err, got;
+	struct ceph_inode_info *ci = ceph_inode(d_inode(dentry));
+
+	/* Ensure we have Lx on the inode being unlinked */
+	err = try_get_cap_refs(ci, 0, CEPH_CAP_LINK_EXCL, 0, true, &got);
+	dout("Lx on %p err=%d got=%d\n", dentry, err, got);
+	if (err != 1 || !(got & CEPH_CAP_LINK_EXCL))
+		return false;
+
+	/* Do we have Fx on the dir ? */
+	err = try_get_cap_refs(ceph_inode(dir), 0, CEPH_CAP_FILE_EXCL, 0,
+				true, &got);
+	dout("Fx on %p err=%d got=%d\n", dir, err, got);
+	if (err != 1 || !(got & CEPH_CAP_FILE_EXCL)) {
+		ceph_put_cap_refs(ci, CEPH_CAP_LINK_EXCL);
+		return false;
+	}
+	return true;
+}
+
 /*
  * Check the offset we are writing up to against our current
  * max_size.  If necessary, tell the MDS we want to write to
diff --git a/fs/ceph/dir.c b/fs/ceph/dir.c
index 3eb9bc226b77..386c9439a020 100644
--- a/fs/ceph/dir.c
+++ b/fs/ceph/dir.c
@@ -1029,6 +1029,18 @@ static int ceph_link(struct dentry *old_dentry, struct inode *dir,
 	return err;
 }
 
+static void ceph_async_unlink_cb(struct ceph_mds_client *mdsc,
+				 struct ceph_mds_request *req)
+{
+	struct ceph_inode_info *ci = ceph_inode(req->r_old_inode);
+
+	/* If op failed, set error on parent directory */
+	mapping_set_error(req->r_parent->i_mapping, req->r_err);
+	ceph_put_cap_refs(ci, CEPH_CAP_LINK_EXCL);
+	ceph_put_cap_refs(ceph_inode(req->r_parent), CEPH_CAP_FILE_EXCL);
+	iput(req->r_old_inode);
+}
+
 /*
  * rmdir and unlink are differ only by the metadata op code
  */
@@ -1065,9 +1077,29 @@ static int ceph_unlink(struct inode *dir, struct dentry *dentry)
 	req->r_dentry_drop = CEPH_CAP_FILE_SHARED;
 	req->r_dentry_unless = CEPH_CAP_FILE_EXCL;
 	req->r_inode_drop = ceph_drop_caps_for_unlink(inode);
-	err = ceph_mdsc_do_request(mdsc, dir, req);
-	if (!err && !req->r_reply_info.head->is_dentry)
-		d_delete(dentry);
+
+	if (op == CEPH_MDS_OP_UNLINK &&
+	    ceph_get_caps_for_unlink(dir, dentry)) {
+		/* Keep LINK caps to ensure continuity over async call */
+		req->r_inode_drop &= ~(CEPH_CAP_LINK_SHARED|CEPH_CAP_LINK_EXCL);
+		req->r_callback = ceph_async_unlink_cb;
+		req->r_old_inode = d_inode(dentry);
+		ihold(req->r_old_inode);
+		err = ceph_mdsc_submit_request(mdsc, dir, req);
+		if (!err) {
+			/*
+			 * We have enough caps, so we assume that the unlink
+			 * will succeed. Fix up the target inode and dcache.
+			 */
+			drop_nlink(inode);
+			d_delete(dentry);
+		}
+	} else {
+		err = ceph_mdsc_do_request(mdsc, dir, req);
+		if (!err && !req->r_reply_info.head->is_dentry)
+			d_delete(dentry);
+	}
+
 	ceph_mdsc_put_request(req);
 out:
 	return err;
diff --git a/fs/ceph/super.h b/fs/ceph/super.h
index 3c5608f2108a..5c361dc1f47f 100644
--- a/fs/ceph/super.h
+++ b/fs/ceph/super.h
@@ -1033,6 +1033,7 @@ extern int ceph_get_caps(struct ceph_inode_info *ci, int need, int want,
 			 loff_t endoff, int *got, struct page **pinned_page);
 extern int ceph_try_get_caps(struct ceph_inode_info *ci,
 			     int need, int want, bool nonblock, int *got);
+extern bool ceph_get_caps_for_unlink(struct inode *dir, struct dentry *dentry);
 
 /* for counting open files by mode */
 extern void __ceph_get_fmode(struct ceph_inode_info *ci, int mode);
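
The control flow described in the commit message (take Lx on the target, then Fx on the directory, fix up local state optimistically, and report a late failure through the directory) can be modelled in plain userspace C. What follows is only a toy sketch under stated assumptions, not kernel code: every name in it (toy_inode, toy_try_get_cap, toy_unlink, the CAP_* values) is hypothetical, and the inode reference, the dcache update and the actual MDS request from the patch are left out.

```c
#include <stdbool.h>

/* Hypothetical capability bits; these are not the real CEPH_CAP_* values. */
enum { CAP_LINK_EXCL = 1 << 0, CAP_FILE_EXCL = 1 << 1 };

struct toy_inode {
	int issued;	/* caps currently held by this client */
	int cap_refs;	/* outstanding references on those caps */
	int nlink;	/* local link count */
	int wb_err;	/* deferred error, as a later fsync would report it */
};

/* Take a reference on 'cap' only if it is already issued; never block. */
bool toy_try_get_cap(struct toy_inode *in, int cap)
{
	if (!(in->issued & cap))
		return false;
	in->cap_refs++;
	return true;
}

void toy_put_cap(struct toy_inode *in)
{
	in->cap_refs--;
}

/* Same order as the patch: Lx on the target first, then Fx on the dir. */
bool toy_get_caps_for_unlink(struct toy_inode *dir, struct toy_inode *target)
{
	if (!toy_try_get_cap(target, CAP_LINK_EXCL))
		return false;
	if (!toy_try_get_cap(dir, CAP_FILE_EXCL)) {
		toy_put_cap(target);	/* undo the first reference */
		return false;
	}
	return true;
}

/*
 * Returns true if the unlink took the asynchronous path. On false the
 * caller would fall back to a synchronous request and wait for the reply.
 */
bool toy_unlink(struct toy_inode *dir, struct toy_inode *target)
{
	if (!toy_get_caps_for_unlink(dir, target))
		return false;
	target->nlink--;	/* optimistic fixup before any reply arrives */
	return true;
}

/* Stand-in for the reply callback: record any error, drop both references. */
void toy_unlink_done(struct toy_inode *dir, struct toy_inode *target, int err)
{
	if (err)
		dir->wb_err = err;
	toy_put_cap(target);
	toy_put_cap(dir);
}
```

In this model, a directory holding only CAP_FILE_EXCL and a file holding CAP_LINK_EXCL take the asynchronous path: toy_unlink() returns true with the link count already decremented, and a later toy_unlink_done() with a nonzero error leaves that error on the directory while both cap references return to zero. If either cap is missing, toy_unlink() returns false and no reference stays held.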