diff mbox

[3/3] autofs - fix AT_NO_AUTOMOUNT not being honored

Message ID 8760a1254e.fsf@notabene.neil.brown.name
State New, archived
Headers show

Commit Message

NeilBrown Nov. 23, 2017, 12:47 a.m. UTC
On Wed, Nov 22 2017, Ian Kent wrote:

> On 21/11/17 09:53, NeilBrown wrote:
>> On Wed, May 10 2017, Ian Kent wrote:
>> 
>>> The fstatat(2) and statx() calls can pass the flag AT_NO_AUTOMOUNT
>>> which is meant to clear the LOOKUP_AUTOMOUNT flag and prevent triggering
>>> of an automount by the call. But this flag is unconditionally cleared
>>> for all stat family system calls except statx().
>>>
>>> stat family system calls have always triggered mount requests for the
>>> negative dentry case in follow_automount() which is intended but prevents
>>> the fstatat(2) and statx() AT_NO_AUTOMOUNT case from being handled.
>>>
>>> In order to handle the AT_NO_AUTOMOUNT for both system calls the
>>> negative dentry case in follow_automount() needs to be changed to
>>> return ENOENT when the LOOKUP_AUTOMOUNT flag is clear (and the other
>>> required flags are clear).
>>>
>>> AFAICT this change doesn't have any noticable side effects and may,
>>> in some use cases (although I didn't see it in testing) prevent
>>> unnecessary callbacks to the automount daemon.
>> 
>> Actually, this patch does have a noticeable side effect.
>
> That's right.
>
>> 
>> Previously, if /home were an indirect mount point so that /home/neilb
>> would mount my home directory, "ls -l /home/neilb" would always work.
>> Now it doesn't.
>
> An this is correct too.
>
> The previous policy was that a ->lookup() would always cause a mount to
> occur. It's that policy that prevents the AT_NO_AUTOMOUNT flag from being
> able to work consistently.

Just to be clear, the previous policy was:
 kernel - a lookup would cause a message to be sent to the automount daemon
 daemon - the receipt of a message would cause a mount to occur.

Either part of this policy could be changed to allow AT_NO_AUTOMOUNT to
work reliably.  You chose to change the first.  At the time I thought
this was a good idea.  I no longer think so.

Specifically, I think the second part of the policy should be revised a little.

>
> If you set the indirect mount "browse" option that will cause the mount
> point directories for each of the map keys to be pre-created. So a
> directory listing will show the mount points and an attempt to access
> anything within the mount point directory will cause it to be mounted.
>
> There is still the problem of not knowing map keys when the wildcard
> map entry is used.
>
> Still, stat family systems calls have always had similar problems that
> prevent consistent behavior.

Yes, I understand all this.  stat family sys-call have some odd
behaviours at times like "stat(); open(); fstat()" will result in
different sets of status info.  This is known.
The point is that these odd behaviours have changed in a noticeably way
(contrary to the change log comment) and it isn't clear these changes
are good.

>
>> 
>> This happens because "ls" calls 'lstat' on the path and when that fails
>> with "ENOENT", it doesn't bother trying to open.
>
> I know, I believe the ENOENT is appropriate because there is no mount
> and no directory at the time this happens.

Two distinct statements here.  "no mount" and "no directory".
If AT_NO_AUTOMOUNT is given (or implicit) the "no mount" is significant
and shouldn't be changed.   It isn't clear that "no directory" is
significant.
If you think of the list of names stored in the autofs filesystem as a
cache of recently used names from the map, then the directory *does*
exist (in the map), it is just that the (in-kernel) cache hasn't been
populated yet.
Should "AT_NO_AUTOMOUNT" prevent the cache from being populated?

>
>> 
>> An alternate approach to the problem that this patch addresses is to
>> *not* change follow_automount() but instead change the automount daemon
>> and possibly autofs.
>
> Perhaps, but the daemon probably doesn't have enough information to
> make decisions about this so there would need to be something coming
> from the kernel too.

I don't think so.
The daemon already has a promise that upcalls for a given name are
serialized, and it has the ability to test if a name is already in the
cache.  This is enough.
I applied the following patch:


to automount and now it behaves (for NFS mounts) how I would like (though
there is room for improvement).

>
>> 
>> When a notification of access for an indirect mount point is received,
>> it would firstly just create the directory - not triggering a mount.
>> If another notification is then received (after the directory has been
>> created), it then triggers the mount.
>
> Not sure, perhaps.
>
> But I don't think that will work, I've had many problems with this type
> behavior due to bugs and I think the the same sort of problems would be
> encountered.
>
> The indirect mount "browse" option which behaves very much like what your
> suggesting is the internal program default, and has been since the initial
> v5 release. Historically it is disabled on install to maintain backward
> compatibility with the original Linux autofs (which is also the reason for
> always triggering an automount on ->lookup()).
>
> Perhaps now is the time for that to change.

Enabling browse mode does resolve this problem when the map is
enumerable (as you say, wildcards can't be supported).
But browse mode isn't always wanted.  If you have a very large map, then
caching all 10,000 entries in the kernel may be a pointless waste of
time and space.

>
>> 
>> I suspect this might need changes to autofs to avoid races when two
>> threads call lstat() on the same path at the same time.  We would need
>> to ensure that automount didn't see this as two requests.... though
>> maybe it already has enough locking.
>> 
>> Does that seem reasonable?  Should we revert this patch (as a
>> regression) and do something in automount instead?
>
> Can you check out the "browse" option behavior before we talk further
> about this please.

Done that.

>
> The problem in user space now is that there is no way to control
> whether an indirect mount that uses the "nobrowse" option is triggered
> or not (using fstatat() or statx()) regardless of the flags used and it's
> essential the AT_NO_AUTOMOUNT flag work as expected to have user space
> take more care in not triggering automounts.

I think that user-space has all the control that it needs.
There are two cases:
1/ daemon receives "missing indirect" message and the directory
   doesn't currently exist (mkdir() by the daemon succeeds).
   This could be an AT_NO_AUTOMOUNT lookup, or it could be a full
   open() or similar.  In either case the directory gets created if
   the map suggests that the name should exist.  The daemon needs to
   be careful not to block for long if it goes off-host to check for
   validity of the name.

2/ daemon received "missing indirect" message and the directory
   *does* exist (mkdir() by daemon fails with -EEXIST).
   This cannot be an AT_NO_AUTOMOUNT lookup, it must be a lookup
   intended to trigger automounts.  In this case, we trigger the
   mount.

This does lead to an odd behaviour with wildcards.
If /misc is a map with a wildcard entry, then
  ls -d /misc/IDoNotExist
will create IDoNotExist and report its existence.
  ls -l /misc/IDoNotExist
will report that it doesn't exist, but it will still have
been created.

Removing the directory after a mount-attempt fails is probably only
a few more lines of code to my patch, and would fix the worst of this.
Have a very short timeout on directories that don't immediately get
mounted on (e.g. 2 seconds??) would further improve the situation.
That timeout would have to be completely managed in the daemon
as the kernel doesn't expire empty directories.

Thanks,
NeilBrown

Comments

Ian Kent Nov. 23, 2017, 1:43 a.m. UTC | #1
On 23/11/17 08:47, NeilBrown wrote:
> On Wed, Nov 22 2017, Ian Kent wrote:
> 
>> On 21/11/17 09:53, NeilBrown wrote:
>>> On Wed, May 10 2017, Ian Kent wrote:
>>>
>>>> The fstatat(2) and statx() calls can pass the flag AT_NO_AUTOMOUNT
>>>> which is meant to clear the LOOKUP_AUTOMOUNT flag and prevent triggering
>>>> of an automount by the call. But this flag is unconditionally cleared
>>>> for all stat family system calls except statx().
>>>>
>>>> stat family system calls have always triggered mount requests for the
>>>> negative dentry case in follow_automount() which is intended but prevents
>>>> the fstatat(2) and statx() AT_NO_AUTOMOUNT case from being handled.
>>>>
>>>> In order to handle the AT_NO_AUTOMOUNT for both system calls the
>>>> negative dentry case in follow_automount() needs to be changed to
>>>> return ENOENT when the LOOKUP_AUTOMOUNT flag is clear (and the other
>>>> required flags are clear).
>>>>
>>>> AFAICT this change doesn't have any noticable side effects and may,
>>>> in some use cases (although I didn't see it in testing) prevent
>>>> unnecessary callbacks to the automount daemon.
>>>
>>> Actually, this patch does have a noticeable side effect.
>>
>> That's right.
>>
>>>
>>> Previously, if /home were an indirect mount point so that /home/neilb
>>> would mount my home directory, "ls -l /home/neilb" would always work.
>>> Now it doesn't.
>>
>> An this is correct too.
>>
>> The previous policy was that a ->lookup() would always cause a mount to
>> occur. It's that policy that prevents the AT_NO_AUTOMOUNT flag from being
>> able to work consistently.
> 
> Just to be clear, the previous policy was:
>  kernel - a lookup would cause a message to be sent to the automount daemon
>  daemon - the receipt of a message would cause a mount to occur.
> 
> Either part of this policy could be changed to allow AT_NO_AUTOMOUNT to
> work reliably.  You chose to change the first.  At the time I thought
> this was a good idea.  I no longer think so.
> 
> Specifically, I think the second part of the policy should be revised a little.
> 
>>
>> If you set the indirect mount "browse" option that will cause the mount
>> point directories for each of the map keys to be pre-created. So a
>> directory listing will show the mount points and an attempt to access
>> anything within the mount point directory will cause it to be mounted.
>>
>> There is still the problem of not knowing map keys when the wildcard
>> map entry is used.
>>
>> Still, stat family systems calls have always had similar problems that
>> prevent consistent behavior.
> 
> Yes, I understand all this.  stat family sys-call have some odd
> behaviours at times like "stat(); open(); fstat()" will result in
> different sets of status info.  This is known.
> The point is that these odd behaviours have changed in a noticeably way
> (contrary to the change log comment) and it isn't clear these changes
> are good.
> 
>>
>>>
>>> This happens because "ls" calls 'lstat' on the path and when that fails
>>> with "ENOENT", it doesn't bother trying to open.
>>
>> I know, I believe the ENOENT is appropriate because there is no mount
>> and no directory at the time this happens.
> 
> Two distinct statements here.  "no mount" and "no directory".
> If AT_NO_AUTOMOUNT is given (or implicit) the "no mount" is significant
> and shouldn't be changed.   It isn't clear that "no directory" is
> significant.
> If you think of the list of names stored in the autofs filesystem as a
> cache of recently used names from the map, then the directory *does*
> exist (in the map), it is just that the (in-kernel) cache hasn't been
> populated yet.
> Should "AT_NO_AUTOMOUNT" prevent the cache from being populated?
> 
>>
>>>
>>> An alternate approach to the problem that this patch addresses is to
>>> *not* change follow_automount() but instead change the automount daemon
>>> and possibly autofs.
>>
>> Perhaps, but the daemon probably doesn't have enough information to
>> make decisions about this so there would need to be something coming
>> from the kernel too.
> 
> I don't think so.
> The daemon already has a promise that upcalls for a given name are
> serialized, and it has the ability to test if a name is already in the
> cache.  This is enough.
> I applied the following patch:
> 
> diff --git a/modules/mount_nfs.c b/modules/mount_nfs.c
> index c558a7381054..7ddfe9c61038 100644
> --- a/modules/mount_nfs.c
> +++ b/modules/mount_nfs.c
> @@ -269,6 +269,11 @@ dont_probe:
>  		free_host_list(&hosts);
>  		return 1;
>  	}
> +	if (!status) {
> +		debug(ap->logopt, MODPREFIX "created dir, that'll do for now");
> +		free_host_list(&hosts);
> +		return 0;
> +	}
>  
>  	if (!status)
>  		existed = 0;
> 
> to automount and now it behaves (for NFS mounts) how I would like (though
> there is room for improvement).
> 
>>
>>>
>>> When a notification of access for an indirect mount point is received,
>>> it would firstly just create the directory - not triggering a mount.
>>> If another notification is then received (after the directory has been
>>> created), it then triggers the mount.
>>
>> Not sure, perhaps.
>>
>> But I don't think that will work, I've had many problems with this type
>> behavior due to bugs and I think the the same sort of problems would be
>> encountered.
>>
>> The indirect mount "browse" option which behaves very much like what your
>> suggesting is the internal program default, and has been since the initial
>> v5 release. Historically it is disabled on install to maintain backward
>> compatibility with the original Linux autofs (which is also the reason for
>> always triggering an automount on ->lookup()).
>>
>> Perhaps now is the time for that to change.
> 
> Enabling browse mode does resolve this problem when the map is
> enumerable (as you say, wildcards can't be supported).
> But browse mode isn't always wanted.  If you have a very large map, then
> caching all 10,000 entries in the kernel may be a pointless waste of
> time and space.

Indeed, that's the main use of nobrowse indirect maps.

In fact another recent change where I moved the last_used field so it
wouldn't be updated on every walk, to help with mounts never expiring,
also needs at different approach.

Doing that causes more aggressive expiring of automounts which increases
umount to mount churn and leads to increased server load at sites with a
very large number of clients.
 
> 
>>
>>>
>>> I suspect this might need changes to autofs to avoid races when two
>>> threads call lstat() on the same path at the same time.  We would need
>>> to ensure that automount didn't see this as two requests.... though
>>> maybe it already has enough locking.
>>>
>>> Does that seem reasonable?  Should we revert this patch (as a
>>> regression) and do something in automount instead?
>>
>> Can you check out the "browse" option behavior before we talk further
>> about this please.
> 
> Done that.
> 
>>
>> The problem in user space now is that there is no way to control
>> whether an indirect mount that uses the "nobrowse" option is triggered
>> or not (using fstatat() or statx()) regardless of the flags used and it's
>> essential the AT_NO_AUTOMOUNT flag work as expected to have user space
>> take more care in not triggering automounts.
> 
> I think that user-space has all the control that it needs.
> There are two cases:
> 1/ daemon receives "missing indirect" message and the directory
>    doesn't currently exist (mkdir() by the daemon succeeds).
>    This could be an AT_NO_AUTOMOUNT lookup, or it could be a full
>    open() or similar.  In either case the directory gets created if
>    the map suggests that the name should exist.  The daemon needs to
>    be careful not to block for long if it goes off-host to check for
>    validity of the name.

Aren't you assuming the the daemon only receives these lookups
on the last path component?

Intermediate path components that can trigger an automount must
be treated as not having AT_NO_AUTOMOUNT in order for the walk to
continue or fail at that point.

> 
> 2/ daemon received "missing indirect" message and the directory
>    *does* exist (mkdir() by daemon fails with -EEXIST).
>    This cannot be an AT_NO_AUTOMOUNT lookup, it must be a lookup
>    intended to trigger automounts.  In this case, we trigger the
>    mount.

An fstatat(<path>, AT_NO_AUTOMOUNT) on a browse indirect map entry
means the directory will exist but user space has asked not to trigger
the mount and return the stat info of the autofs dentry.

Please don't get me wrong, I think the suggestion is good, I just
don't think it's as simple to do as it appears.

> 
> This does lead to an odd behaviour with wildcards.
> If /misc is a map with a wildcard entry, then
>   ls -d /misc/IDoNotExist
> will create IDoNotExist and report its existence.
>   ls -l /misc/IDoNotExist
> will report that it doesn't exist, but it will still have
> been created.
> 
> Removing the directory after a mount-attempt fails is probably only
> a few more lines of code to my patch, and would fix the worst of this.
> Have a very short timeout on directories that don't immediately get
> mounted on (e.g. 2 seconds??) would further improve the situation.
> That timeout would have to be completely managed in the daemon
> as the kernel doesn't expire empty directories.
> 
> Thanks,
> NeilBrown
>
Ian Kent Nov. 23, 2017, 2:26 a.m. UTC | #2
On 23/11/17 09:43, Ian Kent wrote:
> On 23/11/17 08:47, NeilBrown wrote:
>> On Wed, Nov 22 2017, Ian Kent wrote:
>>
>>> On 21/11/17 09:53, NeilBrown wrote:
>>>> On Wed, May 10 2017, Ian Kent wrote:
>>>>
>>>>> The fstatat(2) and statx() calls can pass the flag AT_NO_AUTOMOUNT
>>>>> which is meant to clear the LOOKUP_AUTOMOUNT flag and prevent triggering
>>>>> of an automount by the call. But this flag is unconditionally cleared
>>>>> for all stat family system calls except statx().
>>>>>
>>>>> stat family system calls have always triggered mount requests for the
>>>>> negative dentry case in follow_automount() which is intended but prevents
>>>>> the fstatat(2) and statx() AT_NO_AUTOMOUNT case from being handled.
>>>>>
>>>>> In order to handle the AT_NO_AUTOMOUNT for both system calls the
>>>>> negative dentry case in follow_automount() needs to be changed to
>>>>> return ENOENT when the LOOKUP_AUTOMOUNT flag is clear (and the other
>>>>> required flags are clear).
>>>>>
>>>>> AFAICT this change doesn't have any noticable side effects and may,
>>>>> in some use cases (although I didn't see it in testing) prevent
>>>>> unnecessary callbacks to the automount daemon.
>>>>
>>>> Actually, this patch does have a noticeable side effect.
>>>
>>> That's right.
>>>
>>>>
>>>> Previously, if /home were an indirect mount point so that /home/neilb
>>>> would mount my home directory, "ls -l /home/neilb" would always work.
>>>> Now it doesn't.
>>>
>>> An this is correct too.
>>>
>>> The previous policy was that a ->lookup() would always cause a mount to
>>> occur. It's that policy that prevents the AT_NO_AUTOMOUNT flag from being
>>> able to work consistently.
>>
>> Just to be clear, the previous policy was:
>>  kernel - a lookup would cause a message to be sent to the automount daemon
>>  daemon - the receipt of a message would cause a mount to occur.
>>
>> Either part of this policy could be changed to allow AT_NO_AUTOMOUNT to
>> work reliably.  You chose to change the first.  At the time I thought
>> this was a good idea.  I no longer think so.
>>
>> Specifically, I think the second part of the policy should be revised a little.
>>
>>>
>>> If you set the indirect mount "browse" option that will cause the mount
>>> point directories for each of the map keys to be pre-created. So a
>>> directory listing will show the mount points and an attempt to access
>>> anything within the mount point directory will cause it to be mounted.
>>>
>>> There is still the problem of not knowing map keys when the wildcard
>>> map entry is used.
>>>
>>> Still, stat family systems calls have always had similar problems that
>>> prevent consistent behavior.
>>
>> Yes, I understand all this.  stat family sys-call have some odd
>> behaviours at times like "stat(); open(); fstat()" will result in
>> different sets of status info.  This is known.
>> The point is that these odd behaviours have changed in a noticeably way
>> (contrary to the change log comment) and it isn't clear these changes
>> are good.
>>
>>>
>>>>
>>>> This happens because "ls" calls 'lstat' on the path and when that fails
>>>> with "ENOENT", it doesn't bother trying to open.
>>>
>>> I know, I believe the ENOENT is appropriate because there is no mount
>>> and no directory at the time this happens.
>>
>> Two distinct statements here.  "no mount" and "no directory".
>> If AT_NO_AUTOMOUNT is given (or implicit) the "no mount" is significant
>> and shouldn't be changed.   It isn't clear that "no directory" is
>> significant.
>> If you think of the list of names stored in the autofs filesystem as a
>> cache of recently used names from the map, then the directory *does*
>> exist (in the map), it is just that the (in-kernel) cache hasn't been
>> populated yet.
>> Should "AT_NO_AUTOMOUNT" prevent the cache from being populated?
>>
>>>
>>>>
>>>> An alternate approach to the problem that this patch addresses is to
>>>> *not* change follow_automount() but instead change the automount daemon
>>>> and possibly autofs.
>>>
>>> Perhaps, but the daemon probably doesn't have enough information to
>>> make decisions about this so there would need to be something coming
>>> from the kernel too.
>>
>> I don't think so.
>> The daemon already has a promise that upcalls for a given name are
>> serialized, and it has the ability to test if a name is already in the
>> cache.  This is enough.
>> I applied the following patch:
>>
>> diff --git a/modules/mount_nfs.c b/modules/mount_nfs.c
>> index c558a7381054..7ddfe9c61038 100644
>> --- a/modules/mount_nfs.c
>> +++ b/modules/mount_nfs.c
>> @@ -269,6 +269,11 @@ dont_probe:
>>  		free_host_list(&hosts);
>>  		return 1;
>>  	}
>> +	if (!status) {
>> +		debug(ap->logopt, MODPREFIX "created dir, that'll do for now");
>> +		free_host_list(&hosts);
>> +		return 0;
>> +	}
>>  
>>  	if (!status)
>>  		existed = 0;
>>
>> to automount and now it behaves (for NFS mounts) how I would like (though
>> there is room for improvement).
>>
>>>
>>>>
>>>> When a notification of access for an indirect mount point is received,
>>>> it would firstly just create the directory - not triggering a mount.
>>>> If another notification is then received (after the directory has been
>>>> created), it then triggers the mount.
>>>
>>> Not sure, perhaps.
>>>
>>> But I don't think that will work, I've had many problems with this type
>>> behavior due to bugs and I think the the same sort of problems would be
>>> encountered.
>>>
>>> The indirect mount "browse" option which behaves very much like what your
>>> suggesting is the internal program default, and has been since the initial
>>> v5 release. Historically it is disabled on install to maintain backward
>>> compatibility with the original Linux autofs (which is also the reason for
>>> always triggering an automount on ->lookup()).
>>>
>>> Perhaps now is the time for that to change.
>>
>> Enabling browse mode does resolve this problem when the map is
>> enumerable (as you say, wildcards can't be supported).
>> But browse mode isn't always wanted.  If you have a very large map, then
>> caching all 10,000 entries in the kernel may be a pointless waste of
>> time and space.
> 
> Indeed, that's the main use of nobrowse indirect maps.
> 
> In fact another recent change where I moved the last_used field so it
> wouldn't be updated on every walk, to help with mounts never expiring,
> also needs at different approach.
> 
> Doing that causes more aggressive expiring of automounts which increases
> umount to mount churn and leads to increased server load at sites with a
> very large number of clients.
>  
>>
>>>
>>>>
>>>> I suspect this might need changes to autofs to avoid races when two
>>>> threads call lstat() on the same path at the same time.  We would need
>>>> to ensure that automount didn't see this as two requests.... though
>>>> maybe it already has enough locking.
>>>>
>>>> Does that seem reasonable?  Should we revert this patch (as a
>>>> regression) and do something in automount instead?
>>>
>>> Can you check out the "browse" option behavior before we talk further
>>> about this please.
>>
>> Done that.
>>
>>>
>>> The problem in user space now is that there is no way to control
>>> whether an indirect mount that uses the "nobrowse" option is triggered
>>> or not (using fstatat() or statx()) regardless of the flags used and it's
>>> essential the AT_NO_AUTOMOUNT flag work as expected to have user space
>>> take more care in not triggering automounts.
>>
>> I think that user-space has all the control that it needs.
>> There are two cases:
>> 1/ daemon receives "missing indirect" message and the directory
>>    doesn't currently exist (mkdir() by the daemon succeeds).
>>    This could be an AT_NO_AUTOMOUNT lookup, or it could be a full
>>    open() or similar.  In either case the directory gets created if
>>    the map suggests that the name should exist.  The daemon needs to
>>    be careful not to block for long if it goes off-host to check for
>>    validity of the name.
> 
> Aren't you assuming the the daemon only receives these lookups
> on the last path component?
> 
> Intermediate path components that can trigger an automount must
> be treated as not having AT_NO_AUTOMOUNT in order for the walk to
> continue or fail at that point.
> 
>>
>> 2/ daemon received "missing indirect" message and the directory
>>    *does* exist (mkdir() by daemon fails with -EEXIST).
>>    This cannot be an AT_NO_AUTOMOUNT lookup, it must be a lookup
>>    intended to trigger automounts.  In this case, we trigger the
>>    mount.
> 
> An fstatat(<path>, AT_NO_AUTOMOUNT) on a browse indirect map entry
> means the directory will exist but user space has asked not to trigger
> the mount and return the stat info of the autofs dentry.

This isn't right, this case won't call back to the daemon as things
are now (both before the recent change and after), it will just return
the stat info of the existing autofs dentry so if a callback reaches
the daemon it can't be an AT_NO_AUTOMOUNT lookup.

Sorry, my mistake.

> 
> Please don't get me wrong, I think the suggestion is good, I just
> don't think it's as simple to do as it appears.
> 
>>
>> This does lead to an odd behaviour with wildcards.
>> If /misc is a map with a wildcard entry, then
>>   ls -d /misc/IDoNotExist
>> will create IDoNotExist and report its existence.
>>   ls -l /misc/IDoNotExist
>> will report that it doesn't exist, but it will still have
>> been created.
>>
>> Removing the directory after a mount-attempt fails is probably only
>> a few more lines of code to my patch, and would fix the worst of this.
>> Have a very short timeout on directories that don't immediately get
>> mounted on (e.g. 2 seconds??) would further improve the situation.
>> That timeout would have to be completely managed in the daemon
>> as the kernel doesn't expire empty directories.
>>
>> Thanks,
>> NeilBrown
>>
> 
> --
> To unsubscribe from this list: send the line "unsubscribe autofs" in
>
NeilBrown Nov. 23, 2017, 3:04 a.m. UTC | #3
On Thu, Nov 23 2017, Ian Kent wrote:

> On 23/11/17 08:47, NeilBrown wrote:
>> On Wed, Nov 22 2017, Ian Kent wrote:
>> 
>>> On 21/11/17 09:53, NeilBrown wrote:
>>>> On Wed, May 10 2017, Ian Kent wrote:
>>>>
>>>>> The fstatat(2) and statx() calls can pass the flag AT_NO_AUTOMOUNT
>>>>> which is meant to clear the LOOKUP_AUTOMOUNT flag and prevent triggering
>>>>> of an automount by the call. But this flag is unconditionally cleared
>>>>> for all stat family system calls except statx().
>>>>>
>>>>> stat family system calls have always triggered mount requests for the
>>>>> negative dentry case in follow_automount() which is intended but prevents
>>>>> the fstatat(2) and statx() AT_NO_AUTOMOUNT case from being handled.
>>>>>
>>>>> In order to handle the AT_NO_AUTOMOUNT for both system calls the
>>>>> negative dentry case in follow_automount() needs to be changed to
>>>>> return ENOENT when the LOOKUP_AUTOMOUNT flag is clear (and the other
>>>>> required flags are clear).
>>>>>
>>>>> AFAICT this change doesn't have any noticable side effects and may,
>>>>> in some use cases (although I didn't see it in testing) prevent
>>>>> unnecessary callbacks to the automount daemon.
>>>>
>>>> Actually, this patch does have a noticeable side effect.
>>>
>>> That's right.
>>>
>>>>
>>>> Previously, if /home were an indirect mount point so that /home/neilb
>>>> would mount my home directory, "ls -l /home/neilb" would always work.
>>>> Now it doesn't.
>>>
>>> An this is correct too.
>>>
>>> The previous policy was that a ->lookup() would always cause a mount to
>>> occur. It's that policy that prevents the AT_NO_AUTOMOUNT flag from being
>>> able to work consistently.
>> 
>> Just to be clear, the previous policy was:
>>  kernel - a lookup would cause a message to be sent to the automount daemon
>>  daemon - the receipt of a message would cause a mount to occur.
>> 
>> Either part of this policy could be changed to allow AT_NO_AUTOMOUNT to
>> work reliably.  You chose to change the first.  At the time I thought
>> this was a good idea.  I no longer think so.
>> 
>> Specifically, I think the second part of the policy should be revised a little.
>> 
>>>
>>> If you set the indirect mount "browse" option that will cause the mount
>>> point directories for each of the map keys to be pre-created. So a
>>> directory listing will show the mount points and an attempt to access
>>> anything within the mount point directory will cause it to be mounted.
>>>
>>> There is still the problem of not knowing map keys when the wildcard
>>> map entry is used.
>>>
>>> Still, stat family systems calls have always had similar problems that
>>> prevent consistent behavior.
>> 
>> Yes, I understand all this.  stat family sys-call have some odd
>> behaviours at times like "stat(); open(); fstat()" will result in
>> different sets of status info.  This is known.
>> The point is that these odd behaviours have changed in a noticeably way
>> (contrary to the change log comment) and it isn't clear these changes
>> are good.
>> 
>>>
>>>>
>>>> This happens because "ls" calls 'lstat' on the path and when that fails
>>>> with "ENOENT", it doesn't bother trying to open.
>>>
>>> I know, I believe the ENOENT is appropriate because there is no mount
>>> and no directory at the time this happens.
>> 
>> Two distinct statements here.  "no mount" and "no directory".
>> If AT_NO_AUTOMOUNT is given (or implicit) the "no mount" is significant
>> and shouldn't be changed.   It isn't clear that "no directory" is
>> significant.
>> If you think of the list of names stored in the autofs filesystem as a
>> cache of recently used names from the map, then the directory *does*
>> exist (in the map), it is just that the (in-kernel) cache hasn't been
>> populated yet.
>> Should "AT_NO_AUTOMOUNT" prevent the cache from being populated?
>> 
>>>
>>>>
>>>> An alternate approach to the problem that this patch addresses is to
>>>> *not* change follow_automount() but instead change the automount daemon
>>>> and possibly autofs.
>>>
>>> Perhaps, but the daemon probably doesn't have enough information to
>>> make decisions about this so there would need to be something coming
>>> from the kernel too.
>> 
>> I don't think so.
>> The daemon already has a promise that upcalls for a given name are
>> serialized, and it has the ability to test if a name is already in the
>> cache.  This is enough.
>> I applied the following patch:
>> 
>> diff --git a/modules/mount_nfs.c b/modules/mount_nfs.c
>> index c558a7381054..7ddfe9c61038 100644
>> --- a/modules/mount_nfs.c
>> +++ b/modules/mount_nfs.c
>> @@ -269,6 +269,11 @@ dont_probe:
>>  		free_host_list(&hosts);
>>  		return 1;
>>  	}
>> +	if (!status) {
>> +		debug(ap->logopt, MODPREFIX "created dir, that'll do for now");
>> +		free_host_list(&hosts);
>> +		return 0;
>> +	}
>>  
>>  	if (!status)
>>  		existed = 0;
>> 
>> to automount and now it behaves (for NFS mounts) how I would like (though
>> there is room for improvement).
>> 
>>>
>>>>
>>>> When a notification of access for an indirect mount point is received,
>>>> it would firstly just create the directory - not triggering a mount.
>>>> If another notification is then received (after the directory has been
>>>> created), it then triggers the mount.
>>>
>>> Not sure, perhaps.
>>>
>>> But I don't think that will work, I've had many problems with this type
>>> behavior due to bugs and I think the the same sort of problems would be
>>> encountered.
>>>
>>> The indirect mount "browse" option which behaves very much like what your
>>> suggesting is the internal program default, and has been since the initial
>>> v5 release. Historically it is disabled on install to maintain backward
>>> compatibility with the original Linux autofs (which is also the reason for
>>> always triggering an automount on ->lookup()).
>>>
>>> Perhaps now is the time for that to change.
>> 
>> Enabling browse mode does resolve this problem when the map is
>> enumerable (as you say, wildcards can't be supported).
>> But browse mode isn't always wanted.  If you have a very large map, then
>> caching all 10,000 entries in the kernel may be a pointless waste of
>> time and space.
>
> Indeed, that's the main use of nobrowse indirect maps.
>
> In fact another recent change where I moved the last_used field so it
> wouldn't be updated on every walk, to help with mounts never expiring,
> also needs at different approach.
>
> Doing that causes more aggressive expiring of automounts which increases
> umount to mount churn and leads to increased server load at sites with a
> very large number of clients.
>  
>> 
>>>
>>>>
>>>> I suspect this might need changes to autofs to avoid races when two
>>>> threads call lstat() on the same path at the same time.  We would need
>>>> to ensure that automount didn't see this as two requests.... though
>>>> maybe it already has enough locking.
>>>>
>>>> Does that seem reasonable?  Should we revert this patch (as a
>>>> regression) and do something in automount instead?
>>>
>>> Can you check out the "browse" option behavior before we talk further
>>> about this please.
>> 
>> Done that.
>> 
>>>
>>> The problem in user space now is that there is no way to control
>>> whether an indirect mount that uses the "nobrowse" option is triggered
>>> or not (using fstatat() or statx()) regardless of the flags used and it's
>>> essential the AT_NO_AUTOMOUNT flag work as expected to have user space
>>> take more care in not triggering automounts.
>> 
>> I think that user-space has all the control that it needs.
>> There are two cases:
>> 1/ daemon receives "missing indirect" message and the directory
>>    doesn't currently exist (mkdir() by the daemon succeeds).
>>    This could be an AT_NO_AUTOMOUNT lookup, or it could be a full
>>    open() or similar.  In either case the directory gets created if
>>    the map suggests that the name should exist.  The daemon needs to
>>    be careful not to block for long if it goes off-host to check for
>>    validity of the name.
>
> Aren't you assuming the the daemon only receives these lookups
> on the last path component?

No.
follow_managed() calls follow_automount() in a loop until it fails
(including -EISDIR which isn't exactly failure), or until no automount
is needed.
So where-ever the DCACHE_NEED_AUTOMOUNT is in the path, d_automount()
will be called if the dentry is negative, and unless it is the last
component of a NO_AUTOMOUNT lookup, it will be called if the dentry
isn't mounted-on.  So it would now be called twice for indirect
mounts that aren't browseable - once to mkdir and once to mount.  Might
there be a problem there?

>
> Intermediate path components that can trigger an automount must
> be treated as not having AT_NO_AUTOMOUNT in order for the walk to
> continue or fail at that point.
>
>> 
>> 2/ daemon received "missing indirect" message and the directory
>>    *does* exist (mkdir() by daemon fails with -EEXIST).
>>    This cannot be an AT_NO_AUTOMOUNT lookup, it must be a lookup
>>    intended to trigger automounts.  In this case, we trigger the
>>    mount.
>
> An fstatat(<path>, AT_NO_AUTOMOUNT) on a browse indirect map entry
> means the directory will exist but user space has asked not to trigger
> the mount and return the stat info of the autofs dentry.
>
> Please don't get me wrong, I think the suggestion is good, I just
> don't think it's as simple to do as it appears.

Maybe I should write a more complete patch for you to experiment with.

Thanks,
NeilBrown
Ian Kent Nov. 23, 2017, 3:41 a.m. UTC | #4
On 23/11/17 11:04, NeilBrown wrote:
> On Thu, Nov 23 2017, Ian Kent wrote:
> 
>> On 23/11/17 08:47, NeilBrown wrote:
>>> On Wed, Nov 22 2017, Ian Kent wrote:
>>>
>>>> On 21/11/17 09:53, NeilBrown wrote:
>>>>> On Wed, May 10 2017, Ian Kent wrote:
>>>>>
>>>>>> The fstatat(2) and statx() calls can pass the flag AT_NO_AUTOMOUNT
>>>>>> which is meant to clear the LOOKUP_AUTOMOUNT flag and prevent triggering
>>>>>> of an automount by the call. But this flag is unconditionally cleared
>>>>>> for all stat family system calls except statx().
>>>>>>
>>>>>> stat family system calls have always triggered mount requests for the
>>>>>> negative dentry case in follow_automount() which is intended but prevents
>>>>>> the fstatat(2) and statx() AT_NO_AUTOMOUNT case from being handled.
>>>>>>
>>>>>> In order to handle the AT_NO_AUTOMOUNT for both system calls the
>>>>>> negative dentry case in follow_automount() needs to be changed to
>>>>>> return ENOENT when the LOOKUP_AUTOMOUNT flag is clear (and the other
>>>>>> required flags are clear).
>>>>>>
>>>>>> AFAICT this change doesn't have any noticable side effects and may,
>>>>>> in some use cases (although I didn't see it in testing) prevent
>>>>>> unnecessary callbacks to the automount daemon.
>>>>>
>>>>> Actually, this patch does have a noticeable side effect.
>>>>
>>>> That's right.
>>>>
>>>>>
>>>>> Previously, if /home were an indirect mount point so that /home/neilb
>>>>> would mount my home directory, "ls -l /home/neilb" would always work.
>>>>> Now it doesn't.
>>>>
>>>> An this is correct too.
>>>>
>>>> The previous policy was that a ->lookup() would always cause a mount to
>>>> occur. It's that policy that prevents the AT_NO_AUTOMOUNT flag from being
>>>> able to work consistently.
>>>
>>> Just to be clear, the previous policy was:
>>>  kernel - a lookup would cause a message to be sent to the automount daemon
>>>  daemon - the receipt of a message would cause a mount to occur.
>>>
>>> Either part of this policy could be changed to allow AT_NO_AUTOMOUNT to
>>> work reliably.  You chose to change the first.  At the time I thought
>>> this was a good idea.  I no longer think so.
>>>
>>> Specifically, I think the second part of the policy should be revised a little.
>>>
>>>>
>>>> If you set the indirect mount "browse" option that will cause the mount
>>>> point directories for each of the map keys to be pre-created. So a
>>>> directory listing will show the mount points and an attempt to access
>>>> anything within the mount point directory will cause it to be mounted.
>>>>
>>>> There is still the problem of not knowing map keys when the wildcard
>>>> map entry is used.
>>>>
>>>> Still, stat family systems calls have always had similar problems that
>>>> prevent consistent behavior.
>>>
>>> Yes, I understand all this.  stat family sys-call have some odd
>>> behaviours at times like "stat(); open(); fstat()" will result in
>>> different sets of status info.  This is known.
>>> The point is that these odd behaviours have changed in a noticeably way
>>> (contrary to the change log comment) and it isn't clear these changes
>>> are good.
>>>
>>>>
>>>>>
>>>>> This happens because "ls" calls 'lstat' on the path and when that fails
>>>>> with "ENOENT", it doesn't bother trying to open.
>>>>
>>>> I know, I believe the ENOENT is appropriate because there is no mount
>>>> and no directory at the time this happens.
>>>
>>> Two distinct statements here.  "no mount" and "no directory".
>>> If AT_NO_AUTOMOUNT is given (or implicit) the "no mount" is significant
>>> and shouldn't be changed.   It isn't clear that "no directory" is
>>> significant.
>>> If you think of the list of names stored in the autofs filesystem as a
>>> cache of recently used names from the map, then the directory *does*
>>> exist (in the map), it is just that the (in-kernel) cache hasn't been
>>> populated yet.
>>> Should "AT_NO_AUTOMOUNT" prevent the cache from being populated?
>>>
>>>>
>>>>>
>>>>> An alternate approach to the problem that this patch addresses is to
>>>>> *not* change follow_automount() but instead change the automount daemon
>>>>> and possibly autofs.
>>>>
>>>> Perhaps, but the daemon probably doesn't have enough information to
>>>> make decisions about this so there would need to be something coming
>>>> from the kernel too.
>>>
>>> I don't think so.
>>> The daemon already has a promise that upcalls for a given name are
>>> serialized, and it has the ability to test if a name is already in the
>>> cache.  This is enough.
>>> I applied the following patch:
>>>
>>> diff --git a/modules/mount_nfs.c b/modules/mount_nfs.c
>>> index c558a7381054..7ddfe9c61038 100644
>>> --- a/modules/mount_nfs.c
>>> +++ b/modules/mount_nfs.c
>>> @@ -269,6 +269,11 @@ dont_probe:
>>>  		free_host_list(&hosts);
>>>  		return 1;
>>>  	}
>>> +	if (!status) {
>>> +		debug(ap->logopt, MODPREFIX "created dir, that'll do for now");
>>> +		free_host_list(&hosts);
>>> +		return 0;
>>> +	}
>>>  
>>>  	if (!status)
>>>  		existed = 0;
>>>
>>> to automount and now it behaves (for NFS mounts) how I would like (though
>>> there is room for improvement).
>>>
>>>>
>>>>>
>>>>> When a notification of access for an indirect mount point is received,
>>>>> it would firstly just create the directory - not triggering a mount.
>>>>> If another notification is then received (after the directory has been
>>>>> created), it then triggers the mount.
>>>>
>>>> Not sure, perhaps.
>>>>
>>>> But I don't think that will work, I've had many problems with this type
>>>> behavior due to bugs and I think the the same sort of problems would be
>>>> encountered.
>>>>
>>>> The indirect mount "browse" option which behaves very much like what your
>>>> suggesting is the internal program default, and has been since the initial
>>>> v5 release. Historically it is disabled on install to maintain backward
>>>> compatibility with the original Linux autofs (which is also the reason for
>>>> always triggering an automount on ->lookup()).
>>>>
>>>> Perhaps now is the time for that to change.
>>>
>>> Enabling browse mode does resolve this problem when the map is
>>> enumerable (as you say, wildcards can't be supported).
>>> But browse mode isn't always wanted.  If you have a very large map, then
>>> caching all 10,000 entries in the kernel may be a pointless waste of
>>> time and space.
>>
>> Indeed, that's the main use of nobrowse indirect maps.
>>
>> In fact another recent change where I moved the last_used field so it
>> wouldn't be updated on every walk, to help with mounts never expiring,
>> also needs at different approach.
>>
>> Doing that causes more aggressive expiring of automounts which increases
>> umount to mount churn and leads to increased server load at sites with a
>> very large number of clients.
>>  
>>>
>>>>
>>>>>
>>>>> I suspect this might need changes to autofs to avoid races when two
>>>>> threads call lstat() on the same path at the same time.  We would need
>>>>> to ensure that automount didn't see this as two requests.... though
>>>>> maybe it already has enough locking.
>>>>>
>>>>> Does that seem reasonable?  Should we revert this patch (as a
>>>>> regression) and do something in automount instead?
>>>>
>>>> Can you check out the "browse" option behavior before we talk further
>>>> about this please.
>>>
>>> Done that.
>>>
>>>>
>>>> The problem in user space now is that there is no way to control
>>>> whether an indirect mount that uses the "nobrowse" option is triggered
>>>> or not (using fstatat() or statx()) regardless of the flags used and it's
>>>> essential the AT_NO_AUTOMOUNT flag work as expected to have user space
>>>> take more care in not triggering automounts.
>>>
>>> I think that user-space has all the control that it needs.
>>> There are two cases:
>>> 1/ daemon receives "missing indirect" message and the directory
>>>    doesn't currently exist (mkdir() by the daemon succeeds).
>>>    This could be an AT_NO_AUTOMOUNT lookup, or it could be a full
>>>    open() or similar.  In either case the directory gets created if
>>>    the map suggests that the name should exist.  The daemon needs to
>>>    be careful not to block for long if it goes off-host to check for
>>>    validity of the name.
>>
>> Aren't you assuming the the daemon only receives these lookups
>> on the last path component?
> 
> No.
> follow_managed() calls follow_automount() in a loop until it fails
> (including -EISDIR which isn't exactly failure), or until no automount
> is needed.
> So where-ever the DCACHE_NEED_AUTOMOUNT is in the path, d_automount()
> will be called if the dentry is negative, and unless it is the last
> component of a NO_AUTOMOUNT lookup, it will be called if the dentry
> isn't mounted-on.  So it would now be called twice for indirect
> mounts that aren't browseable - once to mkdir and once to mount.  Might
> there be a problem there?

Yes, your right, and this essentially implements the needed "does this
key exist" query to the daemon .... mmmm.

Although the directory create in the daemon could be avoided with a
callback that specifically asks "does this key exist" and a module
minor version update could be used to decide whether the new functionality
can be used or not, just a thought.

The network source key lookup might need attention, as you pointed out,
but the usual ->d_automount() suffers this potential problem already so
that might be a bonus.

> 
>>
>> Intermediate path components that can trigger an automount must
>> be treated as not having AT_NO_AUTOMOUNT in order for the walk to
>> continue or fail at that point.
>>
>>>
>>> 2/ daemon received "missing indirect" message and the directory
>>>    *does* exist (mkdir() by daemon fails with -EEXIST).
>>>    This cannot be an AT_NO_AUTOMOUNT lookup, it must be a lookup
>>>    intended to trigger automounts.  In this case, we trigger the
>>>    mount.
>>
>> An fstatat(<path>, AT_NO_AUTOMOUNT) on a browse indirect map entry
>> means the directory will exist but user space has asked not to trigger
>> the mount and return the stat info of the autofs dentry.
>>
>> Please don't get me wrong, I think the suggestion is good, I just
>> don't think it's as simple to do as it appears.
> 
> Maybe I should write a more complete patch for you to experiment with.
> 
> Thanks,
> NeilBrown
>
diff mbox

Patch

diff --git a/modules/mount_nfs.c b/modules/mount_nfs.c
index c558a7381054..7ddfe9c61038 100644
--- a/modules/mount_nfs.c
+++ b/modules/mount_nfs.c
@@ -269,6 +269,11 @@  dont_probe:
 		free_host_list(&hosts);
 		return 1;
 	}
+	if (!status) {
+		debug(ap->logopt, MODPREFIX "created dir, that'll do for now");
+		free_host_list(&hosts);
+		return 0;
+	}
 
 	if (!status)
 		existed = 0;