diff mbox series

[2/4] nfs: check for NULL vfsmount in nfs_getattr

Message ID 20180731211045.5671-3-mfasheh@suse.de (mailing list archive)
State New, archived
Headers show
Series vfs: map unique ino/dev pairs for user space | expand

Commit Message

Mark Fasheh July 31, 2018, 9:10 p.m. UTC
->getattr from inside the kernel won't always have a vfsmount, check for
this.

Signed-off-by: Mark Fasheh <mfasheh@suse.de>
---
 fs/nfs/inode.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

Comments

Al Viro July 31, 2018, 10:16 p.m. UTC | #1
On Tue, Jul 31, 2018 at 02:10:43PM -0700, Mark Fasheh wrote:
> ->getattr from inside the kernel won't always have a vfsmount, check for
> this.

Really?  Where would that happen?
Mark Fasheh July 31, 2018, 10:51 p.m. UTC | #2
Hi Al,

On Tue, Jul 31, 2018 at 11:16:05PM +0100, Al Viro wrote:
> On Tue, Jul 31, 2018 at 02:10:43PM -0700, Mark Fasheh wrote:
> > ->getattr from inside the kernel won't always have a vfsmount, check for
> > this.
> 
> Really?  Where would that happen?

It happens in my first patch. FWIW, I'm not tied to that particular bit of
code, or even this (latest) approach. I would actually very much appreciate
your input into how we might fix the problem we have where we are sometimes
not exporting a correct ino/dev pair to user space.

I have a good break-down of the problem and possible solutions here:

https://www.spinics.net/lists/linux-fsdevel/msg128003.html

Thanks,
	--Mark
Al Viro Aug. 2, 2018, 12:43 a.m. UTC | #3
On Tue, Jul 31, 2018 at 10:51:57PM +0000, Mark Fasheh wrote:
> Hi Al,
> 
> On Tue, Jul 31, 2018 at 11:16:05PM +0100, Al Viro wrote:
> > On Tue, Jul 31, 2018 at 02:10:43PM -0700, Mark Fasheh wrote:
> > > ->getattr from inside the kernel won't always have a vfsmount, check for
> > > this.
> > 
> > Really?  Where would that happen?
> 
> It happens in my first patch. FWIW, I'm not tied to that particular bit of
> code, or even this (latest) approach. I would actually very much appreciate
> your input into how we might fix the problem we have where we are sometimes
> not exporting a correct ino/dev pair to user space.

Which callers would those be?  I don't see anything in your series that
wouldn't have vfsmount at hand...

> I have a good break-down of the problem and possible solutions here:
> 
> https://www.spinics.net/lists/linux-fsdevel/msg128003.html

btrfs pulling odd crap with device numbers is a problem, but this is far from
the most obvious unpleasantness caused by that - e.g. userland code expecting
that unequal st_dev for directory and subdirectory means that we have
a mountpoint there.  Or assuming it can compare that to st_dev of mountpoints
(or, indeed, the values in /proc/self/mountinfo) to find which fs it's on...

/proc/*/maps is unfortunate, but it does contain the pathnames, doesn't it?
What _real_ problems are there - the ones where we don't have a saner solution?
Note that /proc/locks is (a) FPOS interface and (b) unsuitable for your
approach anyway.
Mark Fasheh Aug. 3, 2018, 2:04 a.m. UTC | #4
Hi Al,

On Thu, Aug 02, 2018 at 01:43:48AM +0100, Al Viro wrote:
> On Tue, Jul 31, 2018 at 10:51:57PM +0000, Mark Fasheh wrote:
> > Hi Al,
> > 
> > On Tue, Jul 31, 2018 at 11:16:05PM +0100, Al Viro wrote:
> > > On Tue, Jul 31, 2018 at 02:10:43PM -0700, Mark Fasheh wrote:
> > > > ->getattr from inside the kernel won't always have a vfsmount, check for
> > > > this.
> > > 
> > > Really?  Where would that happen?
> > 
> > It happens in my first patch. FWIW, I'm not tied to that particular bit of
> > code, or even this (latest) approach. I would actually very much appreciate
> > your input into how we might fix the problem we have where we are sometimes
> > not exporting a correct ino/dev pair to user space.
> 
> Which callers would those be?  I don't see anything in your series that
> wouldn't have vfsmount at hand...

You are correct - there's no callers in my patch series that don't have a
vfsmount available. I can remove patch #2 and pass the vfsmount through.

At some point I was trying to plug this callback into audit_copy_inode()
which AFAICT doesn't have a vfsmount to pass in when we're coming from
vfs_rename (vfs_rename() -> fsnotify_move()). I will eventually have to go
back and handle this for audit though so it seems worth mentioning here.


> > I have a good break-down of the problem and possible solutions here:
> > 
> > https://www.spinics.net/lists/linux-fsdevel/msg128003.html
> 
> btrfs pulling odd crap with device numbers is a problem, but this is far from
> the most obvious unpleasantness caused by that - e.g. userland code expecting
> that unequal st_dev for directory and subdirectory means that we have
> a mountpoint there.  Or assuming it can compare that to st_dev of mountpoints
> (or, indeed, the values in /proc/self/mountinfo) to find which fs it's on...

Indeed, I agree that all of these are pretty gross and things I'd like to
see fixed - I have to deal with subvolume boundaries in some of my own
software. I know Jeff is working on some code to give subvolumes their own
vfsmount but that wouldn't help with our device reporting unless we did
something like move s_dev from the super_block to the vfsmount (also an idea
that's been thrown around).


> /proc/*/maps is unfortunate, but it does contain the pathnames, doesn't it?
> What _real_ problems are there - the ones where we don't have a saner solution?
> Note that /proc/locks is (a) FPOS interface and (b) unsuitable for your
> approach anyway.

Yeah I see now that /proc/locks isn't going to work with something calling
into ->getattr().

Primarily my concerns are /proc/*/maps and audit (which is recording ino/dev
pairs). Even though /proc/*/maps prints a path it is still a problem as
there is no way for userspace to verify that the inode it stats using that
path is the correct one.

I don't mean to overstate the case but the /proc/*/maps issue is real to us
(SUSE) in that it's produced bug reports. For example, lsof produces
confusing output when we run with a btrfs subvolume as root.

As far as problems that can't be solved by this approach - there are
tracepoints in events/lock.h and events/writeback.h which ought to be
reporting the correct ino/dev pairs. There's of course /proc/locks too. I
acknowledge that whether those are severe enough to warrant a different
approach might be up for debate.

Thanks,
	--Mark
diff mbox series

Patch

diff --git a/fs/nfs/inode.c b/fs/nfs/inode.c
index b65aee481d13..7a3d90a7cc92 100644
--- a/fs/nfs/inode.c
+++ b/fs/nfs/inode.c
@@ -795,7 +795,9 @@  int nfs_getattr(const struct path *path, struct kstat *stat,
 	}
 
 	/*
-	 * We may force a getattr if the user cares about atime.
+	 * We may force a getattr if the user cares about atime. A
+	 * NULL vfsmount means we are called from inside the kernel,
+	 * where no atime is required.
 	 *
 	 * Note that we only have to check the vfsmount flags here:
 	 *  - NFS always sets S_NOATIME by so checking it would give a
@@ -803,7 +805,7 @@  int nfs_getattr(const struct path *path, struct kstat *stat,
 	 *  - NFS never sets SB_NOATIME or SB_NODIRATIME so there is
 	 *    no point in checking those.
 	 */
-	if ((path->mnt->mnt_flags & MNT_NOATIME) ||
+	if (!path->mnt || (path->mnt->mnt_flags & MNT_NOATIME) ||
 	    ((path->mnt->mnt_flags & MNT_NODIRATIME) && S_ISDIR(inode->i_mode)))
 		request_mask &= ~STATX_ATIME;