From patchwork Fri Mar 1 17:57:51 2019
X-Patchwork-Submitter: Luis Henriques
X-Patchwork-Id: 10835805
From: Luis Henriques
To: "Yan, Zheng", Sage Weil, Ilya Dryomov
Cc: ceph-devel@vger.kernel.org, linux-kernel@vger.kernel.org, Luis Henriques
Subject: [RFC PATCH 1/2] ceph: factor out ceph_lookup_inode()
Date: Fri, 1 Mar 2019 17:57:51 +0000
Message-Id: <20190301175752.17808-2-lhenriques@suse.com>
In-Reply-To: <20190301175752.17808-1-lhenriques@suse.com>
References: <20190301175752.17808-1-lhenriques@suse.com>
X-Mailing-List: ceph-devel@vger.kernel.org

This function will be used by __fh_to_dentry and by the quotas code, to
find quota realm inodes that are not visible in the mountpoint.

The only functional change is that an error is also returned if
ceph_mdsc_do_request() fails.
Signed-off-by: Luis Henriques
---
 fs/ceph/export.c | 14 +++++++++++++-
 fs/ceph/super.h  |  1 +
 2 files changed, 14 insertions(+), 1 deletion(-)

diff --git a/fs/ceph/export.c b/fs/ceph/export.c
index 3c59ad180ef0..0d8ead82c816 100644
--- a/fs/ceph/export.c
+++ b/fs/ceph/export.c
@@ -59,7 +59,7 @@ static int ceph_encode_fh(struct inode *inode, u32 *rawfh, int *max_len,
 	return type;
 }
 
-static struct dentry *__fh_to_dentry(struct super_block *sb, u64 ino)
+struct inode *ceph_lookup_inode(struct super_block *sb, u64 ino)
 {
 	struct ceph_mds_client *mdsc = ceph_sb_to_client(sb)->mdsc;
 	struct inode *inode;
@@ -92,12 +92,24 @@ static struct dentry *__fh_to_dentry(struct super_block *sb, u64 ino)
 		ceph_mdsc_put_request(req);
 		if (!inode)
 			return ERR_PTR(-ESTALE);
+		if (err)
+			return ERR_PTR(err);
 		if (inode->i_nlink == 0) {
 			iput(inode);
 			return ERR_PTR(-ESTALE);
 		}
 	}
 
+	return inode;
+}
+
+static struct dentry *__fh_to_dentry(struct super_block *sb, u64 ino)
+{
+	struct inode *inode = ceph_lookup_inode(sb, ino);
+
+	if (IS_ERR(inode))
+		return ERR_CAST(inode);
+
 	return d_obtain_alias(inode);
 }
 
diff --git a/fs/ceph/super.h b/fs/ceph/super.h
index dfb64a5211b6..ce51e98b08ec 100644
--- a/fs/ceph/super.h
+++ b/fs/ceph/super.h
@@ -1061,6 +1061,7 @@ extern long ceph_ioctl(struct file *file, unsigned int cmd, unsigned long arg);
 
 /* export.c */
 extern const struct export_operations ceph_export_ops;
+struct inode *ceph_lookup_inode(struct super_block *sb, u64 ino);
 
 /* locks.c */
 extern __init void ceph_flock_init(void);
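[Editor's note] The sketch below only illustrates the calling contract implied by
the hunks above: on success ceph_lookup_inode() returns an inode with a reference
held (which the caller must eventually drop), and on failure it returns an ERR_PTR
value. The function name is hypothetical and is not part of the patch;
__fh_to_dentry() above is the real in-tree user.

/*
 * Hypothetical example caller, for illustration only: look up an inode
 * by number and drop it again.  Mirrors the error handling used by
 * __fh_to_dentry() in the patch above.
 */
static int example_probe_ino(struct super_block *sb, u64 ino)
{
	struct inode *inode;

	inode = ceph_lookup_inode(sb, ino);
	if (IS_ERR(inode))
		return PTR_ERR(inode);	/* -ESTALE or an MDS request error */

	/* ... inspect the inode; a reference is held for us ... */

	iput(inode);	/* release the reference taken by the lookup */
	return 0;
}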
From patchwork Fri Mar 1 17:57:52 2019
X-Patchwork-Submitter: Luis Henriques
X-Patchwork-Id: 10835803
From: Luis Henriques
To: "Yan, Zheng", Sage Weil, Ilya Dryomov
Cc: ceph-devel@vger.kernel.org, linux-kernel@vger.kernel.org, Luis Henriques, Hendrik Peyerl
Subject: [RFC PATCH 2/2] ceph: quota: fix quota subdir mounts
Date: Fri, 1 Mar 2019 17:57:52 +0000
Message-Id: <20190301175752.17808-3-lhenriques@suse.com>
In-Reply-To: <20190301175752.17808-1-lhenriques@suse.com>
References: <20190301175752.17808-1-lhenriques@suse.com>
X-Mailing-List: ceph-devel@vger.kernel.org

The CephFS kernel client doesn't enforce quotas set in a directory that
isn't visible from the mount point.  For example, given the path
'/dir1/dir2', if quotas are set in 'dir1' and the mount is done with

  mount -t ceph ::/dir1/ /mnt

then the client can't access the 'dir1' inode from the quota realm that
'dir2' belongs to.

This patch fixes the issue by doing an MDS LOOKUPINO operation on the
realm's inode and grabbing a reference to it (so that it doesn't
disappear again).  This also requires an extra field in ceph_snap_realm
so that we know we have to release that reference when destroying the
realm.

Links: https://tracker.ceph.com/issues/3848
Reported-by: Hendrik Peyerl
Signed-off-by: Luis Henriques
---
 fs/ceph/caps.c  |  2 +-
 fs/ceph/quota.c | 30 +++++++++++++++++++++++++++---
 fs/ceph/snap.c  |  3 +++
 fs/ceph/super.h |  2 ++
 4 files changed, 33 insertions(+), 4 deletions(-)

diff --git a/fs/ceph/caps.c b/fs/ceph/caps.c
index bba28a5034ba..e79994ff53d6 100644
--- a/fs/ceph/caps.c
+++ b/fs/ceph/caps.c
@@ -1035,7 +1035,7 @@ static void drop_inode_snap_realm(struct ceph_inode_info *ci)
 	list_del_init(&ci->i_snap_realm_item);
 	ci->i_snap_realm_counter++;
 	ci->i_snap_realm = NULL;
-	if (realm->ino == ci->i_vino.ino)
+	if ((realm->ino == ci->i_vino.ino) && !realm->own_inode)
 		realm->inode = NULL;
 	spin_unlock(&realm->inodes_with_caps_lock);
 	ceph_put_snap_realm(ceph_sb_to_client(ci->vfs_inode.i_sb)->mdsc,
diff --git a/fs/ceph/quota.c b/fs/ceph/quota.c
index 9455d3aef0c3..f6b972d222e4 100644
--- a/fs/ceph/quota.c
+++ b/fs/ceph/quota.c
@@ -22,7 +22,16 @@ void ceph_adjust_quota_realms_count(struct inode *inode, bool inc)
 static inline bool ceph_has_realms_with_quotas(struct inode *inode)
 {
 	struct ceph_mds_client *mdsc = ceph_inode_to_client(inode)->mdsc;
-	return atomic64_read(&mdsc->quotarealms_count) > 0;
+	struct super_block *sb = mdsc->fsc->sb;
+
+	if (atomic64_read(&mdsc->quotarealms_count) > 0)
+		return true;
+	/* if root is the real CephFS root, we don't have quota realms */
+	if (sb->s_root->d_inode &&
+	    (sb->s_root->d_inode->i_ino == CEPH_INO_ROOT))
+		return false;
+	/* otherwise, we can't know for sure */
+	return true;
 }
 
 void ceph_handle_quota(struct ceph_mds_client *mdsc,
@@ -166,6 +175,7 @@ static bool check_quota_exceeded(struct inode *inode, enum quota_check_op op,
 		return false;
 
 	down_read(&mdsc->snap_rwsem);
+restart:
 	realm = ceph_inode(inode)->i_snap_realm;
 	if (realm)
 		ceph_get_snap_realm(mdsc, realm);
@@ -176,8 +186,22 @@ static bool check_quota_exceeded(struct inode *inode, enum quota_check_op op,
 		spin_lock(&realm->inodes_with_caps_lock);
 		in = realm->inode ? igrab(realm->inode) : NULL;
 		spin_unlock(&realm->inodes_with_caps_lock);
-		if (!in)
-			break;
+		if (!in) {
+			up_read(&mdsc->snap_rwsem);
+			in = ceph_lookup_inode(inode->i_sb, realm->ino);
+			down_read(&mdsc->snap_rwsem);
+			if (IS_ERR(in)) {
+				pr_warn("Can't lookup inode %llx (err: %ld)\n",
+					realm->ino, PTR_ERR(in));
+				break;
+			}
+			spin_lock(&realm->inodes_with_caps_lock);
+			realm->inode = in;
+			realm->own_inode = true;
+			spin_unlock(&realm->inodes_with_caps_lock);
+			ceph_put_snap_realm(mdsc, realm);
+			goto restart;
+		}
 
 		ci = ceph_inode(in);
 		spin_lock(&ci->i_ceph_lock);
diff --git a/fs/ceph/snap.c b/fs/ceph/snap.c
index f74193da0e09..c84ed8e8526a 100644
--- a/fs/ceph/snap.c
+++ b/fs/ceph/snap.c
@@ -117,6 +117,7 @@ static struct ceph_snap_realm *ceph_create_snap_realm(
 
 	atomic_set(&realm->nref, 1);    /* for caller */
 	realm->ino = ino;
+	realm->own_inode = false;
 	INIT_LIST_HEAD(&realm->children);
 	INIT_LIST_HEAD(&realm->child_item);
 	INIT_LIST_HEAD(&realm->empty_item);
@@ -184,6 +185,8 @@ static void __destroy_snap_realm(struct ceph_mds_client *mdsc,
 	kfree(realm->prior_parent_snaps);
 	kfree(realm->snaps);
 	ceph_put_snap_context(realm->cached_context);
+	if (realm->own_inode && realm->inode)
+		iput(realm->inode);
 	kfree(realm);
 }
 
diff --git a/fs/ceph/super.h b/fs/ceph/super.h
index ce51e98b08ec..3f0d74d2150f 100644
--- a/fs/ceph/super.h
+++ b/fs/ceph/super.h
@@ -764,6 +764,8 @@ struct ceph_snap_realm {
 	atomic_t nref;
 	struct rb_node node;
 
+	bool own_inode;    /* true if we hold a ref to the inode */
+
 	u64 created, seq;
 	u64 parent_ino;
 	u64 parent_since;   /* snapid when our current parent became so */
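[Editor's note] A condensed sketch of the lookup-and-pin pattern added to
check_quota_exceeded(), paired with the matching release in
__destroy_snap_realm(), may help summarize the mechanism. The helper name below
is made up and locking is reduced to comments; the reasons given for dropping
snap_rwsem (the lookup issues a blocking MDS request) and for restarting the
realm walk (the hierarchy may change while the lock is dropped) are a reading of
the code, not statements from the patch.

/*
 * Hypothetical condensed helper showing what the quota.c hunk does when
 * the realm's inode is not in the inode cache.
 */
static struct inode *pin_realm_inode(struct ceph_mds_client *mdsc,
				     struct super_block *sb,
				     struct ceph_snap_realm *realm)
{
	struct inode *in;

	/* drop snap_rwsem: ceph_lookup_inode() may block on an MDS request */
	up_read(&mdsc->snap_rwsem);
	in = ceph_lookup_inode(sb, realm->ino);
	down_read(&mdsc->snap_rwsem);
	if (IS_ERR(in))
		return in;

	spin_lock(&realm->inodes_with_caps_lock);
	realm->inode = in;
	/* mark the realm so __destroy_snap_realm() knows to iput() it */
	realm->own_inode = true;
	spin_unlock(&realm->inodes_with_caps_lock);

	/*
	 * The real code then drops its realm reference and restarts the
	 * walk ("goto restart"), since the realm hierarchy may have
	 * changed while snap_rwsem was released.
	 */
	return in;
}

The own_inode flag is needed because realm->inode is normally just a cached
pointer owned elsewhere; only when the realm itself took the reference, as
above, must __destroy_snap_realm() drop it, and drop_inode_snap_realm() must
then leave realm->inode alone (the caps.c hunk).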