diff mbox

[2/2] mount: RPC_PROGNOTREGISTERED should not be a permanent error

Message ID 147157115640.26568.2934329194247787636.stgit@noble (mailing list archive)
State Deferred, archived
Headers show

Commit Message

NeilBrown Aug. 19, 2016, 1:45 a.m. UTC
Commit: bf66c9facb8e ("mounts.nfs: v2 and v3 background mounts should retry when server is down.")

changed the behaviour of "bg" mounts so that RPC_PROGNOTREGISTERED,
which maps to EOPNOTSUPP, is not a permanent error.
This useful because when an NFS server starts up there is a small window between
the moment that rpcbind (or portmap) starts responding to lookup requests,
and the moment when nfsd registers with rpcbind.  During that window
rpcbind will reply with RPC_PROGNOTREGISTERED, but mount should not give up.

This same reasoning applies to foreground mounts.  They don't wait for
as long, but could still hit the window and fail prematurely.

So revert the above patch and instead add EOPNOTSUPP to the list of
temporary errors known to nfs_is_permanent_error.

Signed-off-by: NeilBrown <neilb@suse.com>
---
 utils/mount/stropts.c |    7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)



--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

J. Bruce Fields Aug. 19, 2016, 8:59 p.m. UTC | #1
Just one more thought about this one....

If you mistakenly attempt to mount a server that's using rpc for some
service other than nfs, then this will result in a hang where we
previously got a useful error, right?

Maybe that's a rare enough case not to worry about these days.

--b.

On Fri, Aug 19, 2016 at 11:45:56AM +1000, NeilBrown wrote:
> Commit: bf66c9facb8e ("mounts.nfs: v2 and v3 background mounts should retry when server is down.")
> 
> changed the behaviour of "bg" mounts so that RPC_PROGNOTREGISTERED,
> which maps to EOPNOTSUPP, is not a permanent error.
> This useful because when an NFS server starts up there is a small window between
> the moment that rpcbind (or portmap) starts responding to lookup requests,
> and the moment when nfsd registers with rpcbind.  During that window
> rpcbind will reply with RPC_PROGNOTREGISTERED, but mount should not give up.
> 
> This same reasoning applies to foreground mounts.  They don't wait for
> as long, but could still hit the window and fail prematurely.
> 
> So revert the above patch and instead add EOPNOTSUPP to the list of
> temporary errors known to nfs_is_permanent_error.
> 
> Signed-off-by: NeilBrown <neilb@suse.com>
> ---
>  utils/mount/stropts.c |    7 +++----
>  1 file changed, 3 insertions(+), 4 deletions(-)
> 
> diff --git a/utils/mount/stropts.c b/utils/mount/stropts.c
> index 9de6794c6177..d5dfb5e4a669 100644
> --- a/utils/mount/stropts.c
> +++ b/utils/mount/stropts.c
> @@ -948,6 +948,7 @@ static int nfs_is_permanent_error(int error)
>  	case ETIMEDOUT:
>  	case ECONNREFUSED:
>  	case EHOSTUNREACH:
> +	case EOPNOTSUPP:	/* aka RPC_PROGNOTREGISTERED */
>  	case EAGAIN:
>  		return 0;	/* temporary */
>  	default:
> @@ -1019,8 +1020,7 @@ static int nfsmount_parent(struct nfsmount_info *mi)
>  	if (nfs_try_mount(mi))
>  		return EX_SUCCESS;
>  
> -	/* retry background mounts when the server is not up */
> -	if (nfs_is_permanent_error(errno) && errno != EOPNOTSUPP) {
> +	if (nfs_is_permanent_error(errno)) {
>  		mount_error(mi->spec, mi->node, errno);
>  		return EX_FAIL;
>  	}
> @@ -1055,8 +1055,7 @@ static int nfsmount_child(struct nfsmount_info *mi)
>  		if (nfs_try_mount(mi))
>  			return EX_SUCCESS;
>  
> -		/* retry background mounts when the server is not up */
> -		if (nfs_is_permanent_error(errno) && errno != EOPNOTSUPP)
> +		if (nfs_is_permanent_error(errno))
>  			break;
>  
>  		if (time(NULL) > timeout)
> 
--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
NeilBrown Aug. 19, 2016, 9:37 p.m. UTC | #2
On Sat, Aug 20 2016, J. Bruce Fields wrote:

> Just one more thought about this one....
>
> If you mistakenly attempt to mount a server that's using rpc for some
> service other than nfs, then this will result in a hang where we
> previously got a useful error, right?

That is the one problem case - yes.
I prefer to think that it will result in a timeout, rather than a
hang.... but I'm a half-full kind-a-guy.

>
> Maybe that's a rare enough case not to worry about these days.

Trying to NFS mount a filesystem from a machine that doesn't serve NFS
should always be a rare case.
The most likely credible situation that I can come up with is a typo in
the server name.

If a site uses NIS there could very likely be servers that run rpcbind
but not nfsd, so the typo scenario certainly could happen (maybe file
servers are named after composers, while NIS servers are named after dog
noises) .  I think (hope?) that an unexpected delay it nearly as good as
an error message to say "you did something wrong" and that should be
enough for the typo to be noticed.

Thanks,
NeilBrown

>
> --b.
>
> On Fri, Aug 19, 2016 at 11:45:56AM +1000, NeilBrown wrote:
>> Commit: bf66c9facb8e ("mounts.nfs: v2 and v3 background mounts should retry when server is down.")
>> 
>> changed the behaviour of "bg" mounts so that RPC_PROGNOTREGISTERED,
>> which maps to EOPNOTSUPP, is not a permanent error.
>> This useful because when an NFS server starts up there is a small window between
>> the moment that rpcbind (or portmap) starts responding to lookup requests,
>> and the moment when nfsd registers with rpcbind.  During that window
>> rpcbind will reply with RPC_PROGNOTREGISTERED, but mount should not give up.
>> 
>> This same reasoning applies to foreground mounts.  They don't wait for
>> as long, but could still hit the window and fail prematurely.
>> 
>> So revert the above patch and instead add EOPNOTSUPP to the list of
>> temporary errors known to nfs_is_permanent_error.
>> 
>> Signed-off-by: NeilBrown <neilb@suse.com>
>> ---
>>  utils/mount/stropts.c |    7 +++----
>>  1 file changed, 3 insertions(+), 4 deletions(-)
>> 
>> diff --git a/utils/mount/stropts.c b/utils/mount/stropts.c
>> index 9de6794c6177..d5dfb5e4a669 100644
>> --- a/utils/mount/stropts.c
>> +++ b/utils/mount/stropts.c
>> @@ -948,6 +948,7 @@ static int nfs_is_permanent_error(int error)
>>  	case ETIMEDOUT:
>>  	case ECONNREFUSED:
>>  	case EHOSTUNREACH:
>> +	case EOPNOTSUPP:	/* aka RPC_PROGNOTREGISTERED */
>>  	case EAGAIN:
>>  		return 0;	/* temporary */
>>  	default:
>> @@ -1019,8 +1020,7 @@ static int nfsmount_parent(struct nfsmount_info *mi)
>>  	if (nfs_try_mount(mi))
>>  		return EX_SUCCESS;
>>  
>> -	/* retry background mounts when the server is not up */
>> -	if (nfs_is_permanent_error(errno) && errno != EOPNOTSUPP) {
>> +	if (nfs_is_permanent_error(errno)) {
>>  		mount_error(mi->spec, mi->node, errno);
>>  		return EX_FAIL;
>>  	}
>> @@ -1055,8 +1055,7 @@ static int nfsmount_child(struct nfsmount_info *mi)
>>  		if (nfs_try_mount(mi))
>>  			return EX_SUCCESS;
>>  
>> -		/* retry background mounts when the server is not up */
>> -		if (nfs_is_permanent_error(errno) && errno != EOPNOTSUPP)
>> +		if (nfs_is_permanent_error(errno))
>>  			break;
>>  
>>  		if (time(NULL) > timeout)
>>
J. Bruce Fields Aug. 20, 2016, 1:25 a.m. UTC | #3
On Sat, Aug 20, 2016 at 07:37:14AM +1000, NeilBrown wrote:
> On Sat, Aug 20 2016, J. Bruce Fields wrote:
> 
> > Just one more thought about this one....
> >
> > If you mistakenly attempt to mount a server that's using rpc for some
> > service other than nfs, then this will result in a hang where we
> > previously got a useful error, right?
> 
> That is the one problem case - yes.
> I prefer to think that it will result in a timeout, rather than a
> hang.... but I'm a half-full kind-a-guy.
> 
> >
> > Maybe that's a rare enough case not to worry about these days.
> 
> Trying to NFS mount a filesystem from a machine that doesn't serve NFS
> should always be a rare case.
> The most likely credible situation that I can come up with is a typo in
> the server name.
> 
> If a site uses NIS there could very likely be servers that run rpcbind
> but not nfsd, so the typo scenario certainly could happen (maybe file
> servers are named after composers, while NIS servers are named after dog
> noises) .  I think (hope?) that an unexpected delay it nearly as good as
> an error message to say "you did something wrong" and that should be
> enough for the typo to be noticed.

OK.  Might just be worth thinking whether there's anything additional
here worth documenting (in code or just changelog) for future developers
in case this turns out to be a bigger problem than expected.

--b.

> 
> Thanks,
> NeilBrown
> 
> >
> > --b.
> >
> > On Fri, Aug 19, 2016 at 11:45:56AM +1000, NeilBrown wrote:
> >> Commit: bf66c9facb8e ("mounts.nfs: v2 and v3 background mounts should retry when server is down.")
> >> 
> >> changed the behaviour of "bg" mounts so that RPC_PROGNOTREGISTERED,
> >> which maps to EOPNOTSUPP, is not a permanent error.
> >> This useful because when an NFS server starts up there is a small window between
> >> the moment that rpcbind (or portmap) starts responding to lookup requests,
> >> and the moment when nfsd registers with rpcbind.  During that window
> >> rpcbind will reply with RPC_PROGNOTREGISTERED, but mount should not give up.
> >> 
> >> This same reasoning applies to foreground mounts.  They don't wait for
> >> as long, but could still hit the window and fail prematurely.
> >> 
> >> So revert the above patch and instead add EOPNOTSUPP to the list of
> >> temporary errors known to nfs_is_permanent_error.
> >> 
> >> Signed-off-by: NeilBrown <neilb@suse.com>
> >> ---
> >>  utils/mount/stropts.c |    7 +++----
> >>  1 file changed, 3 insertions(+), 4 deletions(-)
> >> 
> >> diff --git a/utils/mount/stropts.c b/utils/mount/stropts.c
> >> index 9de6794c6177..d5dfb5e4a669 100644
> >> --- a/utils/mount/stropts.c
> >> +++ b/utils/mount/stropts.c
> >> @@ -948,6 +948,7 @@ static int nfs_is_permanent_error(int error)
> >>  	case ETIMEDOUT:
> >>  	case ECONNREFUSED:
> >>  	case EHOSTUNREACH:
> >> +	case EOPNOTSUPP:	/* aka RPC_PROGNOTREGISTERED */
> >>  	case EAGAIN:
> >>  		return 0;	/* temporary */
> >>  	default:
> >> @@ -1019,8 +1020,7 @@ static int nfsmount_parent(struct nfsmount_info *mi)
> >>  	if (nfs_try_mount(mi))
> >>  		return EX_SUCCESS;
> >>  
> >> -	/* retry background mounts when the server is not up */
> >> -	if (nfs_is_permanent_error(errno) && errno != EOPNOTSUPP) {
> >> +	if (nfs_is_permanent_error(errno)) {
> >>  		mount_error(mi->spec, mi->node, errno);
> >>  		return EX_FAIL;
> >>  	}
> >> @@ -1055,8 +1055,7 @@ static int nfsmount_child(struct nfsmount_info *mi)
> >>  		if (nfs_try_mount(mi))
> >>  			return EX_SUCCESS;
> >>  
> >> -		/* retry background mounts when the server is not up */
> >> -		if (nfs_is_permanent_error(errno) && errno != EOPNOTSUPP)
> >> +		if (nfs_is_permanent_error(errno))
> >>  			break;
> >>  
> >>  		if (time(NULL) > timeout)
> >> 


--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Steve Dickson Nov. 22, 2016, 5:33 p.m. UTC | #4
On 08/18/2016 09:45 PM, NeilBrown wrote:
> Commit: bf66c9facb8e ("mounts.nfs: v2 and v3 background mounts should retry when server is down.")
> 
> changed the behaviour of "bg" mounts so that RPC_PROGNOTREGISTERED,
> which maps to EOPNOTSUPP, is not a permanent error.
> This useful because when an NFS server starts up there is a small window between
> the moment that rpcbind (or portmap) starts responding to lookup requests,
> and the moment when nfsd registers with rpcbind.  During that window
> rpcbind will reply with RPC_PROGNOTREGISTERED, but mount should not give up.
> 
> This same reasoning applies to foreground mounts.  They don't wait for
> as long, but could still hit the window and fail prematurely.
> 
> So revert the above patch and instead add EOPNOTSUPP to the list of
> temporary errors known to nfs_is_permanent_error.
> 
> Signed-off-by: NeilBrown <neilb@suse.com>
> ---
>  utils/mount/stropts.c |    7 +++----
>  1 file changed, 3 insertions(+), 4 deletions(-)
> 
> diff --git a/utils/mount/stropts.c b/utils/mount/stropts.c
> index 9de6794c6177..d5dfb5e4a669 100644
> --- a/utils/mount/stropts.c
> +++ b/utils/mount/stropts.c
> @@ -948,6 +948,7 @@ static int nfs_is_permanent_error(int error)
>  	case ETIMEDOUT:
>  	case ECONNREFUSED:
>  	case EHOSTUNREACH:
> +	case EOPNOTSUPP:	/* aka RPC_PROGNOTREGISTERED */
Neil,

I think this introduced a regression... When the server does not support
a protocol, say UDP, this patch cause the mount to hang forever,
which I don't think we want.

steved.
>  	case EAGAIN:
>  		return 0;	/* temporary */
>  	default:
> @@ -1019,8 +1020,7 @@ static int nfsmount_parent(struct nfsmount_info *mi)
>  	if (nfs_try_mount(mi))
>  		return EX_SUCCESS;
>  
> -	/* retry background mounts when the server is not up */
> -	if (nfs_is_permanent_error(errno) && errno != EOPNOTSUPP) {
> +	if (nfs_is_permanent_error(errno)) {
>  		mount_error(mi->spec, mi->node, errno);
>  		return EX_FAIL;
>  	}
> @@ -1055,8 +1055,7 @@ static int nfsmount_child(struct nfsmount_info *mi)
>  		if (nfs_try_mount(mi))
>  			return EX_SUCCESS;
>  
> -		/* retry background mounts when the server is not up */
> -		if (nfs_is_permanent_error(errno) && errno != EOPNOTSUPP)
> +		if (nfs_is_permanent_error(errno))
>  			break;
>  
>  		if (time(NULL) > timeout)
> 
> 
--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
NeilBrown Nov. 22, 2016, 10:43 p.m. UTC | #5
On Wed, Nov 23 2016, Steve Dickson wrote:

> [Resent due to mailman rejecting the HTML subpart]
(and the resend included HTML too ... how embarrassing :-)

>
> Hey Neil,
>
>
> On 08/18/2016 09:45 PM, NeilBrown wrote:
>> Commit: bf66c9facb8e ("mounts.nfs: v2 and v3 background mounts should retry when server is down.")
>>
>> changed the behaviour of "bg" mounts so that RPC_PROGNOTREGISTERED,
>> which maps to EOPNOTSUPP, is not a permanent error.
>> This useful because when an NFS server starts up there is a small window between
>> the moment that rpcbind (or portmap) starts responding to lookup requests,
>> and the moment when nfsd registers with rpcbind.  During that window
>> rpcbind will reply with RPC_PROGNOTREGISTERED, but mount should not give up.
>>
>> This same reasoning applies to foreground mounts.  They don't wait for
>> as long, but could still hit the window and fail prematurely.
>>
>> So revert the above patch and instead add EOPNOTSUPP to the list of
>> temporary errors known to nfs_is_permanent_error.
>>
>> Signed-off-by: NeilBrown <neilb@suse.com>
>> ---
>>  utils/mount/stropts.c |    7 +++----
>>  1 file changed, 3 insertions(+), 4 deletions(-)
>>
>> diff --git a/utils/mount/stropts.c b/utils/mount/stropts.c
>> index 9de6794c6177..d5dfb5e4a669 100644
>> --- a/utils/mount/stropts.c
>> +++ b/utils/mount/stropts.c
>> @@ -948,6 +948,7 @@ static int nfs_is_permanent_error(int error)
>>  	case ETIMEDOUT:
>>  	case ECONNREFUSED:
>>  	case EHOSTUNREACH:
>> +	case EOPNOTSUPP:	/* aka RPC_PROGNOTREGISTERED */
> I think this introduced a regression... When the server does not support
> a protocol, say UDP, this patch cause the mount to hang forever,
> which I don't think we want.


I think we do want it to wait a while so that the nfs server has a
chance to start up.  We have no guarantee that the NFS server will be
registered with rpcbind before rpcbind responds to requests.

I disagree with the "hang forever" description.  I just tested after
disabling UDP on an nfs server, and the delay was 2 minutes, 5 seconds
before a failure was reported.  It might be longer when trying TCP on a
server that only supports UDP.

So I think the current behavior is correct.  You might be able to argue
that certain error codes should trigger a shorter timeout, but it would
need a strong argument.

Or maybe you mean that a "bg" mount would "hang forever" in the
background?  I think that behavior is correct too.

Thanks,
NeilBrown
Steve Dickson Nov. 23, 2016, 6:21 p.m. UTC | #6
On 11/22/2016 05:43 PM, NeilBrown wrote:
> On Wed, Nov 23 2016, Steve Dickson wrote:
> 
>> [Resent due to mailman rejecting the HTML subpart]
> (and the resend included HTML too ... how embarrassing :-)
Yeah... :-) I guess an upgrade turned it on.. 

> 
>>
>> Hey Neil,
>>
>>
>> On 08/18/2016 09:45 PM, NeilBrown wrote:
>>> Commit: bf66c9facb8e ("mounts.nfs: v2 and v3 background mounts should retry when server is down.")
>>>
>>> changed the behaviour of "bg" mounts so that RPC_PROGNOTREGISTERED,
>>> which maps to EOPNOTSUPP, is not a permanent error.
>>> This useful because when an NFS server starts up there is a small window between
>>> the moment that rpcbind (or portmap) starts responding to lookup requests,
>>> and the moment when nfsd registers with rpcbind.  During that window
>>> rpcbind will reply with RPC_PROGNOTREGISTERED, but mount should not give up.
>>>
>>> This same reasoning applies to foreground mounts.  They don't wait for
>>> as long, but could still hit the window and fail prematurely.
>>>
>>> So revert the above patch and instead add EOPNOTSUPP to the list of
>>> temporary errors known to nfs_is_permanent_error.
>>>
>>> Signed-off-by: NeilBrown <neilb@suse.com>
>>> ---
>>>  utils/mount/stropts.c |    7 +++----
>>>  1 file changed, 3 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/utils/mount/stropts.c b/utils/mount/stropts.c
>>> index 9de6794c6177..d5dfb5e4a669 100644
>>> --- a/utils/mount/stropts.c
>>> +++ b/utils/mount/stropts.c
>>> @@ -948,6 +948,7 @@ static int nfs_is_permanent_error(int error)
>>>  	case ETIMEDOUT:
>>>  	case ECONNREFUSED:
>>>  	case EHOSTUNREACH:
>>> +	case EOPNOTSUPP:	/* aka RPC_PROGNOTREGISTERED */
>> I think this introduced a regression... When the server does not support
>> a protocol, say UDP, this patch cause the mount to hang forever,
>> which I don't think we want.
> 
> 
> I think we do want it to wait a while so that the nfs server has a
> chance to start up.  We have no guarantee that the NFS server will be
> registered with rpcbind before rpcbind responds to requests.
I do see this race but there it has to be a small window. With
Fedora its under seconds between the time rpcbind started
and the NFS server.

> 
> I disagree with the "hang forever" description.  I just tested after
> disabling UDP on an nfs server, and the delay was 2 minutes, 5 seconds
> before a failure was reported.  It might be longer when trying TCP on a
> server that only supports UDP.
Yeah I did not wait that long... You are much more of a patient man than I ;-) 
I do think this is a regression. Going an from an instant failure to one
that takes over 2min is not a good thing... IMHO.

> 
> So I think the current behavior is correct.  You might be able to argue
> that certain error codes should trigger a shorter timeout, but it would
> need a strong argument.
Going with the theory the window is very small, how about 
a retry with a timeout then a failure? 

> 
> Or maybe you mean that a "bg" mount would "hang forever" in the
> background?  I think that behavior is correct too.
I agreed... "bg" mounts should hang longer than fg mounts
but they shouldn't for something that will never happen
like the non-support of a protocol.

steved.
--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/utils/mount/stropts.c b/utils/mount/stropts.c
index 9de6794c6177..d5dfb5e4a669 100644
--- a/utils/mount/stropts.c
+++ b/utils/mount/stropts.c
@@ -948,6 +948,7 @@  static int nfs_is_permanent_error(int error)
 	case ETIMEDOUT:
 	case ECONNREFUSED:
 	case EHOSTUNREACH:
+	case EOPNOTSUPP:	/* aka RPC_PROGNOTREGISTERED */
 	case EAGAIN:
 		return 0;	/* temporary */
 	default:
@@ -1019,8 +1020,7 @@  static int nfsmount_parent(struct nfsmount_info *mi)
 	if (nfs_try_mount(mi))
 		return EX_SUCCESS;
 
-	/* retry background mounts when the server is not up */
-	if (nfs_is_permanent_error(errno) && errno != EOPNOTSUPP) {
+	if (nfs_is_permanent_error(errno)) {
 		mount_error(mi->spec, mi->node, errno);
 		return EX_FAIL;
 	}
@@ -1055,8 +1055,7 @@  static int nfsmount_child(struct nfsmount_info *mi)
 		if (nfs_try_mount(mi))
 			return EX_SUCCESS;
 
-		/* retry background mounts when the server is not up */
-		if (nfs_is_permanent_error(errno) && errno != EOPNOTSUPP)
+		if (nfs_is_permanent_error(errno))
 			break;
 
 		if (time(NULL) > timeout)