From patchwork Mon Jul 1 13:20:30 2013
X-Patchwork-Submitter: Jeff Layton
X-Patchwork-Id: 2807601
From: Jeff Layton
To: Al Viro
Cc: Trond Myklebust, NeilBrown, linux-nfs@vger.kernel.org, linux-fsdevel@vger.kernel.org
Subject: [PATCH v1] vfs: allow umount to handle mountpoints without revalidating them
Date: Mon, 1 Jul 2013 09:20:30 -0400
Message-Id: <1372684830-13272-1-git-send-email-jlayton@redhat.com>

Christopher reported a regression where he was unable to unmount an NFS
filesystem where the root had gone stale. The problem is that
d_revalidate handles the root of the filesystem differently from other
dentries, but d_weak_revalidate does not. We could simply fix this by
making d_weak_revalidate return success on IS_ROOT dentries, but there
are cases where we do want to revalidate the root of the fs.

A umount is really a special case. We generally aren't interested in
anything but the dentry and vfsmount that's attached at that point. If
the inode turns out to be stale we just don't care since the intent is
to stop using it anyway.

Try to handle this situation better by treating umount as a special
case in the lookup code. Have it resolve the parent using normal means,
and then do a lookup of the final dentry without revalidating it.
In most cases, the final lookup will come out of the dcache, but the
case where there's a trailing symlink or !LAST_NORM entry on the end
complicates things a bit.

Reported-by: Christopher T Vogan
Signed-off-by: Jeff Layton
---
 fs/namei.c            | 182 ++++++++++++++++++++++++++++++++++++++++++++++++++
 fs/namespace.c        |   2 +-
 include/linux/namei.h |   1 +
 3 files changed, 184 insertions(+), 1 deletion(-)

diff --git a/fs/namei.c b/fs/namei.c
index 9ed9361..47a7d69 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -2185,6 +2185,188 @@ user_path_parent(int dfd, const char __user *path, struct nameidata *nd,
 	return s;
 }
 
+/**
+ * umount_lookup_last - look up last component for umount
+ * @nd:   pathwalk nameidata - currently pointing at parent directory of "last"
+ * @path: pointer to container for result
+ *
+ * This is a special lookup_last function just for umount. In this case, we
+ * need to resolve the path without doing any revalidation.
+ *
+ * The nameidata should be the result of doing a LOOKUP_PARENT pathwalk. Since
+ * mountpoints are always pinned in the dcache, their ancestors are too. Thus,
+ * in almost all cases, this lookup will be served out of the dcache. The only
+ * cases where it won't are if nd->last refers to a symlink or the path is
+ * bogus and it doesn't exist.
+ *
+ * Returns:
+ * -error: if there was an error during lookup. This includes -ENOENT if the
+ *         lookup found a negative dentry. The nd->path reference will also be
+ *         put in this case.
+ *
+ * 0:      if we successfully resolved nd->path and found it not to be a
+ *         symlink that needs to be followed. "path" will also be populated.
+ *         The nd->path reference will also be put.
+ *
+ * 1:      if we successfully resolved nd->last and found it to be a symlink
+ *         that needs to be followed. "path" will be populated with the path
+ *         to the link, and nd->path will *not* be put.
+ */
+static int
+umount_lookup_last(struct nameidata *nd, struct path *path)
+{
+	int error = 0;
+	struct dentry *dentry;
+	struct dentry *dir = nd->path.dentry;
+
+	if (unlikely(nd->flags & LOOKUP_RCU)) {
+		WARN_ON_ONCE(1);
+		error = -ECHILD;
+		goto error_check;
+	}
+
+	nd->flags &= ~LOOKUP_PARENT;
+
+	if (unlikely(nd->last_type != LAST_NORM)) {
+		error = handle_dots(nd, nd->last_type);
+		if (!error)
+			dentry = dget(nd->path.dentry);
+		goto error_check;
+	}
+
+	mutex_lock(&dir->d_inode->i_mutex);
+	dentry = d_lookup(dir, &nd->last);
+	if (!dentry) {
+		/*
+		 * No cached dentry. Mounted dentries are pinned in the cache,
+		 * so that means that this dentry is probably a symlink or the
+		 * path doesn't actually point to a mounted dentry.
+		 */
+		dentry = d_alloc(dir, &nd->last);
+		if (!dentry) {
+			error = -ENOMEM;
+		} else {
+			dentry = lookup_real(dir->d_inode, dentry, nd->flags);
+			if (IS_ERR(dentry))
+				error = PTR_ERR(dentry);
+		}
+	}
+	mutex_unlock(&dir->d_inode->i_mutex);
+
+error_check:
+	if (!error) {
+		if (!dentry->d_inode) {
+			error = -ENOENT;
+			dput(dentry);
+		} else {
+			path->dentry = dentry;
+			path->mnt = mntget(nd->path.mnt);
+			if (should_follow_link(dentry->d_inode,
+						nd->flags & LOOKUP_FOLLOW))
+				return 1;
+			follow_mount(path);
+		}
+	}
+	terminate_walk(nd);
+	return error;
+}
+
+/**
+ * path_umountat - look up a path to be umounted
+ * @dfd:	directory file descriptor to start walk from
+ * @name:	full pathname to walk
+ * @path:	pointer to container for result
+ * @flags:	lookup flags
+ *
+ * Look up the given name, but don't attempt to revalidate the last component.
+ * Returns 0 and "path" will be valid on success; returns error otherwise.
+ */
+static int
+path_umountat(int dfd, const char *name, struct path *path, unsigned int flags)
+{
+	struct file *base = NULL;
+	struct nameidata nd;
+	int err;
+
+	err = path_init(dfd, name, flags | LOOKUP_PARENT, &nd, &base);
+	if (unlikely(err))
+		return err;
+
+	current->total_link_count = 0;
+	err = link_path_walk(name, &nd);
+	if (err)
+		goto out;
+
+	/* If we're in rcuwalk, drop out of it to handle last component */
+	if (nd.flags & LOOKUP_RCU) {
+		err = unlazy_walk(&nd, NULL);
+		if (err) {
+			terminate_walk(&nd);
+			goto out;
+		}
+	}
+
+	err = umount_lookup_last(&nd, path);
+	while (err > 0) {
+		void *cookie;
+		struct path link = *path;
+		err = may_follow_link(&link, &nd);
+		if (unlikely(err))
+			break;
+		nd.flags |= LOOKUP_PARENT;
+		err = follow_link(&link, &nd, &cookie);
+		if (err)
+			break;
+		err = umount_lookup_last(&nd, path);
+		put_link(&nd, &link, cookie);
+	}
+out:
+	if (base)
+		fput(base);
+
+	if (nd.root.mnt && !(nd.flags & LOOKUP_ROOT))
+		path_put(&nd.root);
+
+	return err;
+}
+
+/**
+ * user_path_umountat - lookup a path from userland in order to umount it
+ * @dfd:	directory file descriptor
+ * @name:	pathname from userland
+ * @flags:	lookup flags
+ * @path:	pointer to container to hold result
+ *
+ * A umount is a special case for path walking. We're not actually interested
+ * in the inode in this situation, and ESTALE errors can be a problem. We
+ * simply want to track down the dentry and vfsmount attached at the mountpoint
+ * and avoid revalidating the last component.
+ *
+ * Returns 0 and populates "path" on success.
+ */
+int
+user_path_umountat(int dfd, const char __user *name, unsigned int flags,
+			struct path *path)
+{
+	struct filename *s = getname(name);
+	int error;
+
+	if (IS_ERR(s))
+		return PTR_ERR(s);
+
+	error = path_umountat(dfd, s->name, path, flags | LOOKUP_RCU);
+	if (unlikely(error == -ECHILD))
+		error = path_umountat(dfd, s->name, path, flags);
+	if (unlikely(error == -ESTALE))
+		error = path_umountat(dfd, s->name, path, flags | LOOKUP_REVAL);
+
+	if (likely(!error))
+		audit_inode(s, path->dentry, 0);
+
+	putname(s);
+	return error;
+}
+
 /*
  * It's inline, so penalty for filesystems that don't use sticky bit is
  * minimal.
diff --git a/fs/namespace.c b/fs/namespace.c
index 7b1ca9b..5d2676a 100644
--- a/fs/namespace.c
+++ b/fs/namespace.c
@@ -1318,7 +1318,7 @@ SYSCALL_DEFINE2(umount, char __user *, name, int, flags)
 	if (!(flags & UMOUNT_NOFOLLOW))
 		lookup_flags |= LOOKUP_FOLLOW;
 
-	retval = user_path_at(AT_FDCWD, name, lookup_flags, &path);
+	retval = user_path_umountat(AT_FDCWD, name, lookup_flags, &path);
 	if (retval)
 		goto out;
 	mnt = real_mount(path.mnt);
diff --git a/include/linux/namei.h b/include/linux/namei.h
index 5a5ff57..cd09751 100644
--- a/include/linux/namei.h
+++ b/include/linux/namei.h
@@ -58,6 +58,7 @@ enum {LAST_NORM, LAST_ROOT, LAST_DOT, LAST_DOTDOT, LAST_BIND};
 
 extern int user_path_at(int, const char __user *, unsigned, struct path *);
 extern int user_path_at_empty(int, const char __user *, unsigned, struct path *, int *empty);
+extern int user_path_umountat(int, const char __user *, unsigned int, struct path *);
 
 #define user_path(name, path) user_path_at(AT_FDCWD, name, LOOKUP_FOLLOW, path)
 #define user_lpath(name, path) user_path_at(AT_FDCWD, name, 0, path)
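
For readers who want to reason about the retry sequence in user_path_umountat() outside the kernel, here is a minimal userspace sketch of that ladder. It is not kernel code and not part of the patch. The names model_user_path_umountat, walk_fn, struct script and scripted_walk are hypothetical, the MODEL_LOOKUP_* values are illustrative rather than the real definitions from include/linux/namei.h, and the walker callback merely stands in for path_umountat().

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Illustrative flag bits only; NOT the kernel's LOOKUP_* values. */
#define MODEL_LOOKUP_REVAL 0x1u
#define MODEL_LOOKUP_RCU   0x2u

/* Hypothetical stand-in for path_umountat(): returns 0 or a negative errno. */
typedef int (*walk_fn)(const char *name, unsigned int flags, void *ctx);

/*
 * Mirrors the three-step ladder in the patch:
 *   1. try the walk in RCU mode,
 *   2. on -ECHILD, retry in ref-walk mode,
 *   3. on -ESTALE, retry once more with revalidation forced.
 */
static int model_user_path_umountat(const char *name, unsigned int flags,
				    walk_fn walk, void *ctx)
{
	int error;

	assert(walk != NULL);

	error = walk(name, flags | MODEL_LOOKUP_RCU, ctx);
	if (error == -ECHILD)
		error = walk(name, flags, ctx);
	if (error == -ESTALE)
		error = walk(name, flags | MODEL_LOOKUP_REVAL, ctx);
	return error;
}

/* Scripted walker: replays canned results and records the flags it saw. */
struct script {
	int results[3];
	unsigned int seen_flags[3];
	int calls;
};

static int scripted_walk(const char *name, unsigned int flags, void *ctx)
{
	struct script *s = ctx;

	(void)name;
	assert(s->calls < 3);
	s->seen_flags[s->calls] = flags;
	return s->results[s->calls++];
}
```

Feeding the model a script of { -ECHILD, -ESTALE, 0 } makes it call the walker three times, with the RCU bit, then no extra bits, then the REVAL bit. Note that an -ESTALE from the first (RCU) attempt skips straight to the REVAL attempt, which matches the order of the checks in the patch.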