From patchwork Wed Jan 17 11:18:17 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Amir Goldstein X-Patchwork-Id: 10169093 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork.web.codeaurora.org (Postfix) with ESMTP id E7239601D3 for ; Wed, 17 Jan 2018 11:18:27 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id D501828546 for ; Wed, 17 Jan 2018 11:18:27 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id C7F6028558; Wed, 17 Jan 2018 11:18:27 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.0 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, FREEMAIL_FROM, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id C948328546 for ; Wed, 17 Jan 2018 11:18:25 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752578AbeAQLSW (ORCPT ); Wed, 17 Jan 2018 06:18:22 -0500 Received: from mail-yb0-f194.google.com ([209.85.213.194]:34742 "EHLO mail-yb0-f194.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752822AbeAQLST (ORCPT ); Wed, 17 Jan 2018 06:18:19 -0500 Received: by mail-yb0-f194.google.com with SMTP id u35so2400601ybi.1; Wed, 17 Jan 2018 03:18:18 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=mime-version:in-reply-to:references:from:date:message-id:subject:to :cc; bh=bdLnIY59PtcnlO4V5i9w7Mnxx1cVut3PEAC350icqa8=; b=E9tjl2lJfDBWVvfq3H3/xvO0MajcSEcSCvjMew77fNrPBqRjE6lWRYdnVHhE8PR1Rq wN8JcLNLxzlGFiIV1uhLY3N/NvlgraVjgly/c7bYxskpeuBqtai66QptUz4TrQtuwhN6 I/yGj3tOUUj6JFQJhB+O71vFrrkPU5BCvkYF6dr+EouAdjyeoW9Y3QTwClfpJNMj5iMB 362U2V+bVJUEP00f5TaMC2x/qtzWjqBuFc2qYN+YRv2XgLt+JUoIkAMKBGN3nHkOPmrw 91zM+Ad++aw7Kr1GBn00R4OKhz4DWf1uf+PcOJoipT/mskuXHuhd+G+5fUyJ3eD2wnW3 ijkA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:in-reply-to:references:from:date :message-id:subject:to:cc; bh=bdLnIY59PtcnlO4V5i9w7Mnxx1cVut3PEAC350icqa8=; b=djVMRXuXy63p9j3KOzWH805pghlox1BJYzsrD0uHOhbQEZnb/UBHBqKMpsHlrbyACd 8YJpUhsNg/3deHB7APHB21+b/fNkgV0JszRHDFNJ5tD3F/BzvXO6OMbtBrjnIbcw91Yk ZJ6Vo4j8PJ4NVaaw0jdzQNb+xj3RDy30WY9F0nTRQfh1sHzuYQGz+QASy9k+cokxZCwI hehPcvqYArJbMxhvSsc/NRLV5uBpml8gCJvTZJhjD9YL9fozHeqqQFzfNihGm+136wdG HZc4HGBA+aPG43q9kLKockVMF5wBYt4D5hTm33OVq+IjCA9fh18lpQxnaG6Q45kAwktn FENw== X-Gm-Message-State: AKwxytdL7REl1cJNzoA2DeaAE6ok/QlIXCVQ17UKAeCe8TeW0XPnt96E omO7ue2op33S/uCkNZZ9wuP9g/fSKAfSZNjRmEc= X-Google-Smtp-Source: ACJfBouolhGyQrGcQCjcH5z0aWLJJdpK7yVbEN9NdOsgRnTMrodKX1mIa5eBC4tU77d28qJKyTInZINOKCC85RNa5s8= X-Received: by 10.37.210.6 with SMTP id j6mr15799147ybg.468.1516187898117; Wed, 17 Jan 2018 03:18:18 -0800 (PST) MIME-Version: 1.0 Received: by 10.13.225.193 with HTTP; Wed, 17 Jan 2018 03:18:17 -0800 (PST) In-Reply-To: References: <1515086449-26563-1-git-send-email-amir73il@gmail.com> <1515086449-26563-5-git-send-email-amir73il@gmail.com> From: Amir Goldstein Date: Wed, 17 Jan 2018 13:18:17 +0200 Message-ID: Subject: Re: [PATCH v2 04/17] ovl: decode connected upper dir file handles To: Miklos Szeredi Cc: Jeff Layton , "J . Bruce Fields" , overlayfs , linux-fsdevel Sender: linux-fsdevel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP On Mon, Jan 15, 2018 at 4:56 PM, Miklos Szeredi wrote: > [...] > >> > >> So, a working algorithm would be going up to the first connected > >> parent or root, lock parent, lookup name and restart. Not guaranteed > >> to finish, since not protected against always racing with renames. > >> Can we take s_vfs_rename_sem on ovl to prevent that? > >> > > > > Sounds like a simple and good enough solution. > > Do we really need the locking of parent and restart connect if > > we take s_vfs_rename_sem around ovl_lookup_real()? > > No, but s_vfs_rename_sem is a really heavyweight solution, we should > do better than that for decoding a file handle. > > And we probably don't need anything else, since rename on ancestor > means renamed dir is connected, and hopefully not evicted from the > cache until we repeat the walk up. > > So need to lock parent, lookup ovl dentry, verify we got the same > upper, if not retry icache lookup. > > Not sure we need to worry about that "hopefully". Hopefully not. > Something like this?? This is just the raw fix to patch 4/17 without the icache lookup that is added by later patches. I added rename_lock seqlock around backwalk to connected ancestor and take_dentry_name_snapshot() for the stability of real name during overlay lookup. I considered also storing OVL_I(d_inode(connected))->version inside seqlock and comparing it to version in case lookup of child failed. This could help us distinguish between overlay rename and underlying rename (overlay dir version did not change) and return ESTALE instead of restarting lookup in the latter case. Wasn't sure if that was a good idea and what we loose if we leave it out. I tested this code, but only with upper file handles of course (xfstest generic/467). Please let me know what you think. Thanks, Amir. + static struct dentry *ovl_upper_fh_to_d(struct super_block *sb, struct ovl_fh *fh) { @@ -144,7 +326,7 @@ static struct dentry *ovl_upper_fh_to_d(struct super_block *sb, if (IS_ERR_OR_NULL(upper)) return upper; - dentry = ovl_obtain_alias(sb, upper, NULL); + dentry = ovl_get_dentry(sb, upper, NULL); dput(upper); return dentry; @@ -183,7 +365,38 @@ static struct dentry *ovl_fh_to_dentry(struct super_block *sb, struct fid *fid, return ERR_PTR(err); } +static struct dentry *ovl_fh_to_parent(struct super_block *sb, struct fid *fid, + int fh_len, int fh_type) +{ + pr_warn_ratelimited("overlayfs: connectable file handles not supported; use 'no_subtree_check' exportfs option.\n"); + return ERR_PTR(-EACCES); +} + +static int ovl_get_name(struct dentry *parent, char *name, + struct dentry *child) +{ + /* + * ovl_fh_to_dentry() returns connected dir overlay dentries and + * ovl_fh_to_parent() is not implemented, so we should not get here. + */ + WARN_ON_ONCE(1); + return -EIO; +} + +static struct dentry *ovl_get_parent(struct dentry *dentry) +{ + /* + * ovl_fh_to_dentry() returns connected dir overlay dentries, so we + * should not get here. + */ + WARN_ON_ONCE(1); + return ERR_PTR(-EIO); +} + const struct export_operations ovl_export_operations = { .encode_fh = ovl_encode_inode_fh, .fh_to_dentry = ovl_fh_to_dentry, + .fh_to_parent = ovl_fh_to_parent, + .get_name = ovl_get_name, + .get_parent = ovl_get_parent, }; ================================================================ From 337543c3fcdf9323d3720d17ab6fc13e287bbec1 Mon Sep 17 00:00:00 2001 From: Amir Goldstein Date: Thu, 28 Dec 2017 18:36:16 +0200 Subject: [PATCH v3 4/17] ovl: decode connected upper dir file handles Until this change, we decoded upper file handles by instantiating an overlay dentry from the real upper dentry. This is sufficient to handle pure upper files, but insufficient to handle merge/impure dirs. To that end, if decoded real upper dir is connected and hashed, we lookup an overlay dentry with the same path as the real upper dir. If decoded real upper is non-dir, we instantiate a disconnected overlay dentry as before this change. Because ovl_fh_to_dentry() returns a connected overlay dir dentry, exportfs never needs to call get_parent() and get_name() to reconnect an upper overlay dir. Because connectable non-dir file handles are not supported, exportfs will not be able to use fh_to_parent() and get_name() methods to reconnect a disconnected non-dir to its parent. Therefore, the methods get_parent() and get_name() are implemented just to print out a sanity warning and the method fh_to_parent() is implemented to warn the user that using the 'subtree_check' exportfs option is not supported. An alternative approach could have been to implement instantiating of an overlay directory inode from origin/index and implement get_parent() and get_name() by calling into underlying fs operations and them instantiating the overlay parent dir. The reasons for not choosing the get_parent() approach were: - Obtaining a disconnected overlay dir dentry would requires a delicate re-factoring of ovl_lookup() to get a dentry with overlay parent info. It was preferred to avoid doing that re-factoring unless it was proven worthy. - Going down the path of disconnected dir would mean that the (non trivial) code path of d_splice_alias() could be traveled and that meant writing more tests and introduces race cases that are very hard to hit on purpose. Taking the path of connecting overlay dentry by forward lookup is therefore the safe and boring way to avoid surprises. The culprit of the chosen "connected overlay dentry" approach: - We need to take special care to rename of ancestors while connecting the overlay dentry by real dentry path. These subtleties are usually handled by generic exportfs and VFS code. Signed-off-by: Amir Goldstein --- fs/overlayfs/export.c | 215 +++++++++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 214 insertions(+), 1 deletion(-) diff --git a/fs/overlayfs/export.c b/fs/overlayfs/export.c index 557c29928e98..35f37a72d55e 100644 --- a/fs/overlayfs/export.c +++ b/fs/overlayfs/export.c @@ -130,6 +130,188 @@ static struct dentry *ovl_obtain_alias(struct super_block *sb, return dentry; } +/* + * Lookup a child overlay dentry whose real dentry is @real. + * If @real is on upper layer, we lookup a child overlay dentry with the same + * name as the real dentry. Otherwise, we need to consult index for lookup. + */ +static struct dentry *ovl_lookup_real_one(struct dentry *parent, + struct dentry *real, + struct ovl_layer *layer) +{ + struct dentry *this; + struct name_snapshot name; + int err; + + /* TODO: use index when looking up by lower real dentry */ + if (layer->idx) + return ERR_PTR(-EACCES); + + /* + * Lookup overlay dentry by real name. The parent mutex protects us + * from racing with overlay rename. If the overlay dentry that is + * above real has already been moved to a different parent, then this + * lookup will fail to find a child dentry whose real dentry is @real + * and we will have to restart the lookup of real path from the top. + * + * We also need to take a snapshot of real dentry name to protect us + * from racing with underlying layer rename. In this case, we don't + * care about returning ESTALE, only from referencing a free name + * pointer. + * + * TODO: try to lookup the renamed overlay dentry in inode cache by + * real inode. + */ + inode_lock_nested(d_inode(parent), I_MUTEX_PARENT); + take_dentry_name_snapshot(&name, real); + this = lookup_one_len(name.name, parent, strlen(name.name)); + err = PTR_ERR(this); + if (IS_ERR(this)) { + goto fail; + } else if (!this || !this->d_inode) { + dput(this); + err = -ENOENT; + goto fail; + } else if (ovl_dentry_upper(this) != real) { + dput(this); + err = -ESTALE; + goto fail; + } + +out: + release_dentry_name_snapshot(&name); + inode_unlock(d_inode(parent)); + return this; + +fail: + pr_warn_ratelimited("overlayfs: failed to lookup one by real (%pd2, layer=%d, parent=%pd2, err=%i)\n", + real, layer->idx, parent, err); + this = ERR_PTR(err); + goto out; +} + +/* + * Lookup an overlay dentry whose real dentry is @real. + * If @real is on upper layer, we lookup a child overlay dentry with the same + * path the real dentry. Otherwise, we need to consult index for lookup. + */ +static struct dentry *ovl_lookup_real(struct super_block *sb, + struct dentry *real, + struct ovl_layer *layer) +{ + struct dentry *connected; + unsigned int seq; + int err = 0; + + /* TODO: use index when looking up by lower real dentry */ + if (layer->idx) + return ERR_PTR(-EACCES); + + connected = dget(sb->s_root); + while (!err) { + struct dentry *next, *this; + struct dentry *parent = NULL; + struct dentry *real_connected = layer->mnt->mnt_root; + + if (real_connected == real) + break; + + /* + * Find the topmost dentry not yet connected. Taking rename_lock + * so at least we don't race with rename when walking back to + * 'real_connected'. + */ + seq = read_seqbegin(&rename_lock); + next = dget(real); + for (;;) { + parent = dget_parent(next); + + if (real_connected == parent) + break; + + /* + * If real file has been moved out of the layer root + * directory, we will eventully hit the real fs root. + */ + if (parent == next) { + err = -EXDEV; + break; + } + + dput(next); + next = parent; + } + + if (!read_seqretry(&rename_lock, seq) && !err) { + this = ovl_lookup_real_one(connected, next, layer); + /* + * Lookup of child in overlay can fail when racing with + * overlay rename of child away from 'connected' parent. + * In this case, we need to restart the lookup from the + * top, because we cannot trust that 'real_connected' is + * still an ancestor of 'real'. + */ + if (IS_ERR(this)) { + err = PTR_ERR(this); + if (err == -ENOENT || err == -ESTALE) { + this = dget(sb->s_root); + err = 0; + } + } + if (!err) { + dput(connected); + connected = this; + } + } + + dput(parent); + dput(next); + } + + if (err) + goto fail; + + return connected; + +fail: + pr_warn_ratelimited("overlayfs: failed to lookup by real (%pd2, layer=%d, connected=%pd2, err=%i)\n", + real, layer->idx, connected, err); + dput(connected); + return ERR_PTR(err); +} + + +/* + * Get an overlay dentry from upper/lower real dentries. + */ +static struct dentry *ovl_get_dentry(struct super_block *sb, + struct dentry *upper, + struct ovl_path *lowerpath) +{ + struct ovl_fs *ofs = sb->s_fs_info; + struct ovl_layer layer = { .mnt = ofs->upper_mnt }; + + /* TODO: get non-upper dentry */ + if (!upper) + return ERR_PTR(-EACCES); + + /* + * Obtain a disconnected overlay dentry from a non-dir real upper + * dentry. + */ + if (!d_is_dir(upper)) + return ovl_obtain_alias(sb, upper, NULL); + + /* Removed empty directory? */ + if ((upper->d_flags & DCACHE_DISCONNECTED) || d_unhashed(upper)) + return ERR_PTR(-ENOENT); + + /* + * If real upper dentry is connected and hashed, get a connected + * overlay dentry with the same path as the real upper dentry. + */ + return ovl_lookup_real(sb, upper, &layer); +}