[2/4] nfs: check for NULL vfsmount in nfs_getattr

Message ID	20180731211045.5671-3-mfasheh@suse.de (mailing list archive)
State	New, archived
Headers	show Return-Path: <linux-fsdevel-owner@kernel.org> From: Mark Fasheh <mfasheh@suse.de> To: Al Viro <viro@ZenIV.linux.org.uk>, linux-fsdevel@vger.kernel.org Cc: linux-btrfs@vger.kernel.org, Andrew Morton <akpm@linux-foundation.org>, Amir Goldstein <amir73il@gmail.com>, linux-unionfs@vger.kernel.org, Jeff Mahoney <jeffm@suse.com>, linux-nfs@vger.kernel.org, Mark Fasheh <mfasheh@suse.de> Subject: [PATCH 2/4] nfs: check for NULL vfsmount in nfs_getattr Date: Tue, 31 Jul 2018 14:10:43 -0700 Message-Id: <20180731211045.5671-3-mfasheh@suse.de> In-Reply-To: <20180731211045.5671-1-mfasheh@suse.de> References: <20180731211045.5671-1-mfasheh@suse.de> Sender: linux-fsdevel-owner@vger.kernel.org Precedence: bulk
Series	vfs: map unique ino/dev pairs for user space \| expand [RFC,0/4] vfs: map unique ino/dev pairs for user space [1/4] vfs: introduce function to map unique ino/dev pairs [2/4] nfs: check for NULL vfsmount in nfs_getattr [3/4] proc: use vfs helper to get ino/dev pairs for maps file [4/4] locks: map correct ino/dev pairs when exporting to userspace

Message ID

20180731211045.5671-3-mfasheh@suse.de (mailing list archive)

State

New, archived

Headers

From: Mark Fasheh <mfasheh@suse.de>
To: Al Viro <viro@ZenIV.linux.org.uk>, linux-fsdevel@vger.kernel.org
Cc: linux-btrfs@vger.kernel.org,
        Andrew Morton <akpm@linux-foundation.org>,
        Amir Goldstein <amir73il@gmail.com>,
        linux-unionfs@vger.kernel.org, Jeff Mahoney <jeffm@suse.com>,
        linux-nfs@vger.kernel.org, Mark Fasheh <mfasheh@suse.de>
Subject: [PATCH 2/4] nfs: check for NULL vfsmount in nfs_getattr
Date: Tue, 31 Jul 2018 14:10:43 -0700
Message-Id: <20180731211045.5671-3-mfasheh@suse.de>
In-Reply-To: <20180731211045.5671-1-mfasheh@suse.de>
References: <20180731211045.5671-1-mfasheh@suse.de>
Sender: linux-fsdevel-owner@vger.kernel.org
Precedence: bulk

Series

vfs: map unique ino/dev pairs for user space | expand

->getattr from inside the kernel won't always have a vfsmount, check for this. Signed-off-by: Mark Fasheh <mfasheh@suse.de> --- fs/nfs/inode.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-)

Comments

Al Viro July 31, 2018, 10:16 p.m. UTC | #1

On Tue, Jul 31, 2018 at 02:10:43PM -0700, Mark Fasheh wrote:
> ->getattr from inside the kernel won't always have a vfsmount, check for
> this.

Really?  Where would that happen?

Mark Fasheh July 31, 2018, 10:51 p.m. UTC | #2

Hi Al,

On Tue, Jul 31, 2018 at 11:16:05PM +0100, Al Viro wrote:
> On Tue, Jul 31, 2018 at 02:10:43PM -0700, Mark Fasheh wrote:
> > ->getattr from inside the kernel won't always have a vfsmount, check for
> > this.
> 
> Really?  Where would that happen?

It happens in my first patch. FWIW, I'm not tied to that particular bit of
code, or even this (latest) approach. I would actually very much appreciate
your input into how we might fix the problem we have where we are sometimes
not exporting a correct ino/dev pair to user space.

I have a good break-down of the problem and possible solutions here:

https://www.spinics.net/lists/linux-fsdevel/msg128003.html

Thanks,
	--Mark

Al Viro Aug. 2, 2018, 12:43 a.m. UTC | #3

On Tue, Jul 31, 2018 at 10:51:57PM +0000, Mark Fasheh wrote:
> Hi Al,
> 
> On Tue, Jul 31, 2018 at 11:16:05PM +0100, Al Viro wrote:
> > On Tue, Jul 31, 2018 at 02:10:43PM -0700, Mark Fasheh wrote:
> > > ->getattr from inside the kernel won't always have a vfsmount, check for
> > > this.
> > 
> > Really?  Where would that happen?
> 
> It happens in my first patch. FWIW, I'm not tied to that particular bit of
> code, or even this (latest) approach. I would actually very much appreciate
> your input into how we might fix the problem we have where we are sometimes
> not exporting a correct ino/dev pair to user space.

Which callers would those be?  I don't see anything in your series that
wouldn't have vfsmount at hand...

> I have a good break-down of the problem and possible solutions here:
> 
> https://www.spinics.net/lists/linux-fsdevel/msg128003.html

btrfs pulling odd crap with device numbers is a problem, but this is far from
the most obvious unpleasantness caused by that - e.g. userland code expecting
that unequal st_dev for directory and subdirectory means that we have
a mountpoint there.  Or assuming it can compare that to st_dev of mountpoints
(or, indeed, the values in /proc/self/mountinfo) to find which fs it's on...

/proc/*/maps is unfortunate, but it does contain the pathnames, doesn't it?
What _real_ problems are there - the ones where we don't have a saner solution?
Note that /proc/locks is (a) FPOS interface and (b) unsuitable for your
approach anyway.

Mark Fasheh Aug. 3, 2018, 2:04 a.m. UTC | #4

Hi Al,

On Thu, Aug 02, 2018 at 01:43:48AM +0100, Al Viro wrote:
> On Tue, Jul 31, 2018 at 10:51:57PM +0000, Mark Fasheh wrote:
> > Hi Al,
> > 
> > On Tue, Jul 31, 2018 at 11:16:05PM +0100, Al Viro wrote:
> > > On Tue, Jul 31, 2018 at 02:10:43PM -0700, Mark Fasheh wrote:
> > > > ->getattr from inside the kernel won't always have a vfsmount, check for
> > > > this.
> > > 
> > > Really?  Where would that happen?
> > 
> > It happens in my first patch. FWIW, I'm not tied to that particular bit of
> > code, or even this (latest) approach. I would actually very much appreciate
> > your input into how we might fix the problem we have where we are sometimes
> > not exporting a correct ino/dev pair to user space.
> 
> Which callers would those be?  I don't see anything in your series that
> wouldn't have vfsmount at hand...

You are correct - there's no callers in my patch series that don't have a
vfsmount available. I can remove patch #2 and pass the vfsmount through.

At some point I was trying to plug this callback into audit_copy_inode()
which AFAICT doesn't have a vfsmount to pass in when we're coming from
vfs_rename (vfs_rename() -> fsnotify_move()). I will eventually have to go
back and handle this for audit though so it seems worth mentioning here.

> > I have a good break-down of the problem and possible solutions here:
> > 
> > https://www.spinics.net/lists/linux-fsdevel/msg128003.html
> 
> btrfs pulling odd crap with device numbers is a problem, but this is far from
> the most obvious unpleasantness caused by that - e.g. userland code expecting
> that unequal st_dev for directory and subdirectory means that we have
> a mountpoint there.  Or assuming it can compare that to st_dev of mountpoints
> (or, indeed, the values in /proc/self/mountinfo) to find which fs it's on...

Indeed, I agree that all of these are pretty gross and things I'd like to
see fixed - I have to deal with subvolume boundaries in some of my own
software. I know Jeff is working on some code to give subvolumes their own
vfsmount but that wouldn't help with our device reporting unless we did
something like move s_dev from the super_block to the vfsmount (also an idea
that's been thrown around).

> /proc/*/maps is unfortunate, but it does contain the pathnames, doesn't it?
> What _real_ problems are there - the ones where we don't have a saner solution?
> Note that /proc/locks is (a) FPOS interface and (b) unsuitable for your
> approach anyway.

Yeah I see now that /proc/locks isn't going to work with something calling
into ->getattr().

Primarily my concerns are /proc/*/maps and audit (which is recording ino/dev
pairs). Even though /proc/*/maps prints a path it is still a problem as
there is no way for userspace to verify that the inode it stats using that
path is the correct one.

I don't mean to overstate the case but the /proc/*/maps issue is real to us
(SUSE) in that it's produced bug reports. For example, lsof produces
confusing output when we run with a btrfs subvolume as root.

As far as problems that can't be solved by this approach - there are
tracepoints in events/lock.h and events/writeback.h which ought to be
reporting the correct ino/dev pairs. There's of course /proc/locks too. I
acknowledge that whether those are severe enough to warrant a different
approach might be up for debate.

Thanks,
	--Mark

diff --git a/fs/nfs/inode.c b/fs/nfs/inode.c
index b65aee481d13..7a3d90a7cc92 100644
--- a/fs/nfs/inode.c
+++ b/fs/nfs/inode.c
@@ -795,7 +795,9 @@  int nfs_getattr(const struct path *path, struct kstat *stat,
 	}
 
 	/*
-	 * We may force a getattr if the user cares about atime.
+	 * We may force a getattr if the user cares about atime. A
+	 * NULL vfsmount means we are called from inside the kernel,
+	 * where no atime is required.
 	 *
 	 * Note that we only have to check the vfsmount flags here:
 	 *  - NFS always sets S_NOATIME by so checking it would give a
@@ -803,7 +805,7 @@  int nfs_getattr(const struct path *path, struct kstat *stat,
 	 *  - NFS never sets SB_NOATIME or SB_NODIRATIME so there is
 	 *    no point in checking those.
 	 */
-	if ((path->mnt->mnt_flags & MNT_NOATIME) ||
+	if (!path->mnt || (path->mnt->mnt_flags & MNT_NOATIME) ||
 	    ((path->mnt->mnt_flags & MNT_NODIRATIME) && S_ISDIR(inode->i_mode)))
 		request_mask &= ~STATX_ATIME;

[2/4] nfs: check for NULL vfsmount in nfs_getattr

Commit Message

Comments

Patch