diff mbox

[1/3] autofs - fix AT_NO_AUTOMOUNT not being honored

Message ID 150216641255.11652.4204561328197919771.stgit@pluto.themaw.net (mailing list archive)
State New, archived
Headers show

Commit Message

Ian Kent Aug. 8, 2017, 4:26 a.m. UTC
The fstatat(2) and statx() calls can pass the flag AT_NO_AUTOMOUNT
which is meant to clear the LOOKUP_AUTOMOUNT flag and prevent triggering
of an automount by the call. But this flag is unconditionally cleared
for all stat family system calls except statx().

stat family system calls have always triggered mount requests for the
negative dentry case in follow_automount() which is intended but prevents
the fstatat(2) and statx() AT_NO_AUTOMOUNT case from being handled.

In order to handle the AT_NO_AUTOMOUNT for both system calls the
negative dentry case in follow_automount() needs to be changed to
return ENOENT when the LOOKUP_AUTOMOUNT flag is clear (and the other
required flags are clear).

AFAICT this change doesn't have any noticable side effects and may,
in some use cases (although I didn't see it in testing) prevent
unnecessary callbacks to the automount daemon.

It's also possible that a stat family call has been made with a
path that is in the process of being mounted by some other process.
But stat family calls should return the automount state of the path
as it is "now" so it shouldn't wait for mount completion.

This is the same semantic as the positive dentry case already
handled.

Signed-off-by: Ian Kent <raven@themaw.net>
Cc: David Howells <dhowells@redhat.com>
Cc: Colin Walters <walters@redhat.com>
Cc: Ondrej Holy <oholy@redhat.com>
---
 fs/namei.c         |   15 ++++++++++++---
 include/linux/fs.h |    3 +--
 2 files changed, 13 insertions(+), 5 deletions(-)

Comments

Colin Walters Aug. 8, 2017, 1:11 p.m. UTC | #1
On Tue, Aug 8, 2017, at 12:26 AM, Ian Kent wrote:

> --- a/include/linux/fs.h
> +++ b/include/linux/fs.h
> @@ -3022,8 +3022,7 @@ static inline int vfs_lstat(const char __user *name, struct kstat *stat)
>  static inline int vfs_fstatat(int dfd, const char __user *filename,
>  			      struct kstat *stat, int flags)
>  {
> -	return vfs_statx(dfd, filename, flags | AT_NO_AUTOMOUNT,
> -			 stat, STATX_BASIC_STATS);
> +	return vfs_statx(dfd, filename, flags, stat, STATX_BASIC_STATS);
>  }
>  static inline int vfs_fstat(int fd, struct kstat *stat)
>  {

This is reverting the fstatat() prat of
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=deccf497d804a4c5fca2dbfad2f104675a6f9102
Which itself seems weird to me - it looks like we were unconditionally
forcing on AT_NO_AUTOMOUNT regardless of what userspace passed?
So perhaps a
Fixes: deccf497d804a4c5fca2dbfad2f104675a6f9102
is appropriate here?

I understand that for stat()/lstat() we didn't expose the option to userspace,
so the behavior was...ah, there's this note in man-pages (man-pages-4.09-3.fc26.noarch):

> On Linux, lstat() will generally not trigger automounter action, whereas stat() will (but see fstatat(2)).

I have no idea of the history here, but maybe it makes sense to drop
the AT_NO_AUTOMOUNT from the vfs_stat() too?
Ian Kent Aug. 9, 2017, 12:45 a.m. UTC | #2
On 08/08/17 21:11, Colin Walters wrote:
> On Tue, Aug 8, 2017, at 12:26 AM, Ian Kent wrote:
> 
>> --- a/include/linux/fs.h
>> +++ b/include/linux/fs.h
>> @@ -3022,8 +3022,7 @@ static inline int vfs_lstat(const char __user *name, struct kstat *stat)
>>  static inline int vfs_fstatat(int dfd, const char __user *filename,
>>  			      struct kstat *stat, int flags)
>>  {
>> -	return vfs_statx(dfd, filename, flags | AT_NO_AUTOMOUNT,
>> -			 stat, STATX_BASIC_STATS);
>> +	return vfs_statx(dfd, filename, flags, stat, STATX_BASIC_STATS);
>>  }
>>  static inline int vfs_fstat(int fd, struct kstat *stat)
>>  {
> 
> This is reverting the fstatat() prat of
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=deccf497d804a4c5fca2dbfad2f104675a6f9102
> Which itself seems weird to me - it looks like we were unconditionally
> forcing on AT_NO_AUTOMOUNT regardless of what userspace passed?
> So perhaps a
> Fixes: deccf497d804a4c5fca2dbfad2f104675a6f9102
> is appropriate here?

David posted this at my request.

I asked him to do it because, when I saw this, I thought restoring the semantics
to what they were before they were changed needed to be done as quickly as possible.

That was so that I could then work on fixing the AT_NO_AUTOMOUNT not being honored
with fstatat(2).

> 
> I understand that for stat()/lstat() we didn't expose the option to userspace,
> so the behavior was...ah, there's this note in man-pages (man-pages-4.09-3.fc26.noarch):
> 
>> On Linux, lstat() will generally not trigger automounter action, whereas stat() will (but see fstatat(2)).
> 
> I have no idea of the history here, but maybe it makes sense to drop
> the AT_NO_AUTOMOUNT from the vfs_stat() too?
> 

I thought I had talked about the history in the patch description but I guess
it's not clear and isn't detailed enough for people that haven't been close to
the development over time.

Historically stat family calls were not supposed to trigger automounts because
that can easily lead to mount storms that are really bad for large autofs mount
maps. But the mount storm problem was mostly only evident for autofs maps that
used the "browse" option, the non-negative dentry case. The negative dentry
case always triggered an automount regardless of the system call.

Because of the move in user space to mostly always use proc filesystem mount
tables where there can be many more mount entries than were present in the
text based mount tables it's critical to not perform mount callbacks that
aren't absolutely essentially.

At this point that means to me going over the stat(2) system call behavior
and making sure it is only calling back where necessary. That's because that
is where it's expected automounts won't be triggered and so should have least
impact.

So it's the negative dentry handling in follow_automount() you should be
thinking about in terms of impact rather than the actual fstatat(2) change.
If man pages need to change then they need to change.

AFAICT, as I said in the patch description, this should not cause regressions
but I can't be certain. In any case it is in keeping with the historical "stat
family system calls shouldn't trigger automounts" mantra needed from the beginning.

Also notice that the negative dentry handling change should only affect autofs
as other kernel uses will be triggering automounts for positive dentrys.

Hopefully this doesn't sound to aggressive, I don't mean it to sound that way.

Ian
David Howells Aug. 9, 2017, 8:39 a.m. UTC | #3
Ian Kent <raven@themaw.net> wrote:

> In order to handle the AT_NO_AUTOMOUNT for both system calls the
> negative dentry case in follow_automount() needs to be changed to
> return ENOENT when the LOOKUP_AUTOMOUNT flag is clear (and the other
> required flags are clear).

Should the be EREMOTE instead of ENOENT?

David
Ian Kent Aug. 9, 2017, 9:51 a.m. UTC | #4
On 09/08/17 16:39, David Howells wrote:
> Ian Kent <raven@themaw.net> wrote:
> 
>> In order to handle the AT_NO_AUTOMOUNT for both system calls the
>> negative dentry case in follow_automount() needs to be changed to
>> return ENOENT when the LOOKUP_AUTOMOUNT flag is clear (and the other
>> required flags are clear).
> 
> Should the be EREMOTE instead of ENOENT?

I thought about that and ended up thinking ENOENT was more sensible
but I'll look at it again.

Thanks
Ian
Ian Kent Aug. 10, 2017, 2:16 a.m. UTC | #5
On 09/08/17 17:51, Ian Kent wrote:
> On 09/08/17 16:39, David Howells wrote:
>> Ian Kent <raven@themaw.net> wrote:
>>
>>> In order to handle the AT_NO_AUTOMOUNT for both system calls the
>>> negative dentry case in follow_automount() needs to be changed to
>>> return ENOENT when the LOOKUP_AUTOMOUNT flag is clear (and the other
>>> required flags are clear).
>>
>> Should the be EREMOTE instead of ENOENT?
> 
> I thought about that and ended up thinking ENOENT was more sensible
> but I'll look at it again.

I think EREMOTE and ENOENT both are inaccurate.

There's no way to know if the negative dentry corresponds to a valid map
key, and we've seen increasing lookups from userspace applications for
invalid directories, so I'm not sure.

I went with ENOENT but I guess we could use EREMOTE, what's your thinking
on why EREMOTE might be better than ENOENT?

Ian
diff mbox

Patch

diff --git a/fs/namei.c b/fs/namei.c
index ddb6a7c2b3d4..1180f9c58093 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -1129,9 +1129,18 @@  static int follow_automount(struct path *path, struct nameidata *nd,
 	 * of the daemon to instantiate them before they can be used.
 	 */
 	if (!(nd->flags & (LOOKUP_PARENT | LOOKUP_DIRECTORY |
-			   LOOKUP_OPEN | LOOKUP_CREATE | LOOKUP_AUTOMOUNT)) &&
-	    path->dentry->d_inode)
-		return -EISDIR;
+			   LOOKUP_OPEN | LOOKUP_CREATE |
+			   LOOKUP_AUTOMOUNT))) {
+		/* Positive dentry that isn't meant to trigger an
+		 * automount, EISDIR will allow it to be used,
+		 * otherwise there's no mount here "now" so return
+		 * ENOENT.
+		 */
+		if (path->dentry->d_inode)
+			return -EISDIR;
+		else
+			return -ENOENT;
+	}
 
 	if (path->dentry->d_sb->s_user_ns != &init_user_ns)
 		return -EACCES;
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 6e1fd5d21248..37c96f52e48e 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -3022,8 +3022,7 @@  static inline int vfs_lstat(const char __user *name, struct kstat *stat)
 static inline int vfs_fstatat(int dfd, const char __user *filename,
 			      struct kstat *stat, int flags)
 {
-	return vfs_statx(dfd, filename, flags | AT_NO_AUTOMOUNT,
-			 stat, STATX_BASIC_STATS);
+	return vfs_statx(dfd, filename, flags, stat, STATX_BASIC_STATS);
 }
 static inline int vfs_fstat(int fd, struct kstat *stat)
 {