sunrpc/cache: handle missing listeners better.

Message ID	87pnqj64br.fsf@notabene.neil.brown.name (mailing list archive)
State	New, archived
Headers	show Return-Path: <linux-nfs-owner@kernel.org> From: NeilBrown <neilb@suse.com> To: "J. Bruce Fields" <bfields@fieldses.org> Date: Fri, 22 Mar 2019 13:16:56 +1100 Cc: Linux NFS Mailing List <linux-nfs@vger.kernel.org> Subject: [PATCH] sunrpc/cache: handle missing listeners better. Message-ID: <87pnqj64br.fsf@notabene.neil.brown.name> MIME-Version: 1.0 Content-Type: multipart/signed; boundary="=-=-="; micalg=pgp-sha256; protocol="application/pgp-signature" Sender: linux-nfs-owner@vger.kernel.org Precedence: bulk
Series	sunrpc/cache: handle missing listeners better. \| expand sunrpc/cache: handle missing listeners better.

Message ID

87pnqj64br.fsf@notabene.neil.brown.name (mailing list archive)

State

New, archived

Headers

From: NeilBrown <neilb@suse.com>
To: "J. Bruce Fields" <bfields@fieldses.org>
Date: Fri, 22 Mar 2019 13:16:56 +1100
Cc: Linux NFS Mailing List <linux-nfs@vger.kernel.org>
Subject: [PATCH] sunrpc/cache: handle missing listeners better.
Message-ID: <87pnqj64br.fsf@notabene.neil.brown.name>
MIME-Version: 1.0
Content-Type: multipart/signed; boundary="=-=-=";
        micalg=pgp-sha256; protocol="application/pgp-signature"
Sender: linux-nfs-owner@vger.kernel.org
Precedence: bulk

Series

sunrpc/cache: handle missing listeners better. | expand

Commit Message

NeilBrown March 22, 2019, 2:16 a.m. UTC

If no handler (such as rpc.mountd) has opened
a cache 'channel', the sunrpc cache responds to
all lookup requests with -ENOENT.  This is particularly
important for the auth.unix.gid cache which is
optional.

If the channel was open briefly and an upcall was written to it,
this upcall remains pending even when the handler closes the
channel.  When an upcall is pending, the code currently
doesn't check if there are still listeners, it only performs
that check before sending an upcall.

As the cache treads a recently closes channel (closed less than
30 seconds ago) as "potentially still open", there is a
reasonable sized window when a request can become pending
in a closed channel, and thereby block lookups indefinitely.

This can easily be demonstrated by running
  cat /proc/net/rpc/auth.unix.gid/channel

and then trying to mount an NFS filesystem from this host.  It
will block indefinitely (unless mountd is run with --manage-gids,
or krb5 is used).

When cache_check() finds that an upcall is pending, it should
perform the "cache_listeners_exist()" exist test.  If no
listeners do exist, the request should be negated.

With this change in place, there can still be a 30second wait on
mount, until the cache gives up waiting for a handler to come
back, but this is much better than an indefinite wait.

Signed-off-by: NeilBrown <neilb@suse.com>
---
 net/sunrpc/cache.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

Comments

J. Bruce Fields March 22, 2019, 4:53 p.m. UTC | #1

I've gotten complaints about the same thing and said "well, in
retrospect we shouldn't have designed the interface this way, but we
did, so just stop opening those files".

But maybe this is a reasonable compromise.

One advantage of waiting for mountd to come back is that you could
upgrade mountd in place.  That shouldn't take 30 seconds, though.  And I
haven't heard of anyone actually doing that.

It's too bad that not opening auth.unix.gid is the only way for mountd
to communicate that gids shouldn't be mapped.

--b.

On Fri, Mar 22, 2019 at 01:16:56PM +1100, NeilBrown wrote:
> If no handler (such as rpc.mountd) has opened
> a cache 'channel', the sunrpc cache responds to
> all lookup requests with -ENOENT.  This is particularly
> important for the auth.unix.gid cache which is
> optional.
> 
> If the channel was open briefly and an upcall was written to it,
> this upcall remains pending even when the handler closes the
> channel.  When an upcall is pending, the code currently
> doesn't check if there are still listeners, it only performs
> that check before sending an upcall.
> 
> As the cache treads a recently closes channel (closed less than
> 30 seconds ago) as "potentially still open", there is a
> reasonable sized window when a request can become pending
> in a closed channel, and thereby block lookups indefinitely.
> 
> This can easily be demonstrated by running
>   cat /proc/net/rpc/auth.unix.gid/channel
> 
> and then trying to mount an NFS filesystem from this host.  It
> will block indefinitely (unless mountd is run with --manage-gids,
> or krb5 is used).
> 
> When cache_check() finds that an upcall is pending, it should
> perform the "cache_listeners_exist()" exist test.  If no
> listeners do exist, the request should be negated.
> 
> With this change in place, there can still be a 30second wait on
> mount, until the cache gives up waiting for a handler to come
> back, but this is much better than an indefinite wait.
> 
> Signed-off-by: NeilBrown <neilb@suse.com>
> ---
>  net/sunrpc/cache.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/net/sunrpc/cache.c b/net/sunrpc/cache.c
> index 12bb23b8e0c5..be9e29385adc 100644
> --- a/net/sunrpc/cache.c
> +++ b/net/sunrpc/cache.c
> @@ -40,6 +40,7 @@
>  
>  static bool cache_defer_req(struct cache_req *req, struct cache_head *item);
>  static void cache_revisit_request(struct cache_head *item);
> +static bool cache_listeners_exist(struct cache_detail *detail);
>  
>  static void cache_init(struct cache_head *h, struct cache_detail *detail)
>  {
> @@ -303,7 +304,8 @@ int cache_check(struct cache_detail *detail,
>  				cache_fresh_unlocked(h, detail);
>  				break;
>  			}
> -		}
> +		} else if (!cache_listeners_exist(detail))
> +			rv = try_to_negate_entry(detail, h);
>  	}
>  
>  	if (rv == -EAGAIN) {
> -- 
> 2.14.0.rc0.dirty
>

NeilBrown March 22, 2019, 10:32 p.m. UTC | #2

On Fri, Mar 22 2019, J. Bruce Fields wrote:

> I've gotten complaints about the same thing and said "well, in
> retrospect we shouldn't have designed the interface this way, but we
> did, so just stop opening those files".

As the fool who actually "designed" this, I can say with some confidence
that the intention was always the requests would block for at most 30
seconds.
Unfortunately the implementation was sloppy and the testing was
haphazard. 

>
> But maybe this is a reasonable compromise.
>
> One advantage of waiting for mountd to come back is that you could
> upgrade mountd in place.  That shouldn't take 30 seconds, though.  And I
> haven't heard of anyone actually doing that.

Surely upgrading of mountd in-place happens whenever you install a new
version.
That was (iirc) the main reason for not treating a recent close as fatal.

>
> It's too bad that not opening auth.unix.gid is the only way for mountd
> to communicate that gids shouldn't be mapped.

I have a general preference for reusing existing functionality rather
than creating new special-purpose functionality.  I think this has
served me well more often than not.  Maybe this is one case of "not".

If you want to restart mountd without --managed-gids (where previously
it had that option), there is a chance that you will hit this problem.
That is a case where the answer "just stop opening those files" doesn't
really apply.

Thanks,
NeilBrown

>
> --b.
>
> On Fri, Mar 22, 2019 at 01:16:56PM +1100, NeilBrown wrote:
>> If no handler (such as rpc.mountd) has opened
>> a cache 'channel', the sunrpc cache responds to
>> all lookup requests with -ENOENT.  This is particularly
>> important for the auth.unix.gid cache which is
>> optional.
>> 
>> If the channel was open briefly and an upcall was written to it,
>> this upcall remains pending even when the handler closes the
>> channel.  When an upcall is pending, the code currently
>> doesn't check if there are still listeners, it only performs
>> that check before sending an upcall.
>> 
>> As the cache treads a recently closes channel (closed less than
>> 30 seconds ago) as "potentially still open", there is a
>> reasonable sized window when a request can become pending
>> in a closed channel, and thereby block lookups indefinitely.
>> 
>> This can easily be demonstrated by running
>>   cat /proc/net/rpc/auth.unix.gid/channel
>> 
>> and then trying to mount an NFS filesystem from this host.  It
>> will block indefinitely (unless mountd is run with --manage-gids,
>> or krb5 is used).
>> 
>> When cache_check() finds that an upcall is pending, it should
>> perform the "cache_listeners_exist()" exist test.  If no
>> listeners do exist, the request should be negated.
>> 
>> With this change in place, there can still be a 30second wait on
>> mount, until the cache gives up waiting for a handler to come
>> back, but this is much better than an indefinite wait.
>> 
>> Signed-off-by: NeilBrown <neilb@suse.com>
>> ---
>>  net/sunrpc/cache.c | 4 +++-
>>  1 file changed, 3 insertions(+), 1 deletion(-)
>> 
>> diff --git a/net/sunrpc/cache.c b/net/sunrpc/cache.c
>> index 12bb23b8e0c5..be9e29385adc 100644
>> --- a/net/sunrpc/cache.c
>> +++ b/net/sunrpc/cache.c
>> @@ -40,6 +40,7 @@
>>  
>>  static bool cache_defer_req(struct cache_req *req, struct cache_head *item);
>>  static void cache_revisit_request(struct cache_head *item);
>> +static bool cache_listeners_exist(struct cache_detail *detail);
>>  
>>  static void cache_init(struct cache_head *h, struct cache_detail *detail)
>>  {
>> @@ -303,7 +304,8 @@ int cache_check(struct cache_detail *detail,
>>  				cache_fresh_unlocked(h, detail);
>>  				break;
>>  			}
>> -		}
>> +		} else if (!cache_listeners_exist(detail))
>> +			rv = try_to_negate_entry(detail, h);
>>  	}
>>  
>>  	if (rv == -EAGAIN) {
>> -- 
>> 2.14.0.rc0.dirty
>>

J. Bruce Fields March 25, 2019, 10:21 p.m. UTC | #3

On Sat, Mar 23, 2019 at 09:32:08AM +1100, NeilBrown wrote:
> On Fri, Mar 22 2019, J. Bruce Fields wrote:
> > I've gotten complaints about the same thing and said "well, in
> > retrospect we shouldn't have designed the interface this way, but we
> > did, so just stop opening those files".
> 
> As the fool who actually "designed" this,

Pretty sure I was there too and had my chance to object.  In any case,
hope you didn't take that as a personal complaint about your work, it
(along with lots of maintenance since then) is much appreciated.

> I can say with some confidence that the intention was always the
> requests would block for at most 30 seconds.
...
> > One advantage of waiting for mountd to come back is that you could
> > upgrade mountd in place.  That shouldn't take 30 seconds, though.  And I
> > haven't heard of anyone actually doing that.
> 
> Surely upgrading of mountd in-place happens whenever you install a new
> version.

I didn't think distros restarted mountd on upgrade, but I haven't
actually checked that.  I agree that it's something we should allow,
anyway.

> > It's too bad that not opening auth.unix.gid is the only way for mountd
> > to communicate that gids shouldn't be mapped.
> 
> I have a general preference for reusing existing functionality rather
> than creating new special-purpose functionality.  I think this has
> served me well more often than not.  Maybe this is one case of "not".
> 
> If you want to restart mountd without --managed-gids (where previously
> it had that option), there is a chance that you will hit this problem.
> That is a case where the answer "just stop opening those files" doesn't
> really apply.

Yeah.  OK, applying.

(But I'm traveling and may not get this tested and pushed out till next
week.)

--b.

NeilBrown March 25, 2019, 11 p.m. UTC | #4

On Mon, Mar 25 2019, J. Bruce Fields wrote:

> On Sat, Mar 23, 2019 at 09:32:08AM +1100, NeilBrown wrote:
>> On Fri, Mar 22 2019, J. Bruce Fields wrote:
>> > I've gotten complaints about the same thing and said "well, in
>> > retrospect we shouldn't have designed the interface this way, but we
>> > did, so just stop opening those files".
>> 
>> As the fool who actually "designed" this,
>
> Pretty sure I was there too and had my chance to object.  In any case,
> hope you didn't take that as a personal complaint about your work, it
> (along with lots of maintenance since then) is much appreciated.

Not at all, I was just feeling in a self-deprecatory mood.

>
>> I can say with some confidence that the intention was always the
>> requests would block for at most 30 seconds.
> ...
>> > One advantage of waiting for mountd to come back is that you could
>> > upgrade mountd in place.  That shouldn't take 30 seconds, though.  And I
>> > haven't heard of anyone actually doing that.
>> 
>> Surely upgrading of mountd in-place happens whenever you install a new
>> version.
>
> I didn't think distros restarted mountd on upgrade, but I haven't
> actually checked that.  I agree that it's something we should allow,
> anyway.

I just had a look at the openSUSE nfs-kernel-server package.  It
declares

  %service_add_post nfs-mountd.service nfs-server.service

where %service_add_post is an rpm macro which does various things that I
struggle to understand, but it does reference a setting in
/etc/sysconfig/services, which contains

#
# Do you want to disable the automatic restart of services when
# a new version gets installed?
#
DISABLE_RESTART_ON_UPDATE="no"

So I'm going to guess that openSUSE does restart mountd on upgrade.

>
>> > It's too bad that not opening auth.unix.gid is the only way for mountd
>> > to communicate that gids shouldn't be mapped.
>> 
>> I have a general preference for reusing existing functionality rather
>> than creating new special-purpose functionality.  I think this has
>> served me well more often than not.  Maybe this is one case of "not".
>> 
>> If you want to restart mountd without --managed-gids (where previously
>> it had that option), there is a chance that you will hit this problem.
>> That is a case where the answer "just stop opening those files" doesn't
>> really apply.
>
> Yeah.  OK, applying.
>
> (But I'm traveling and may not get this tested and pushed out till next
> week.)

No rush.  Safe travels.

Thanks,
NeilBrown


>
> --b.

diff --git a/net/sunrpc/cache.c b/net/sunrpc/cache.c
index 12bb23b8e0c5..be9e29385adc 100644
--- a/net/sunrpc/cache.c
+++ b/net/sunrpc/cache.c
@@ -40,6 +40,7 @@ 
 
 static bool cache_defer_req(struct cache_req *req, struct cache_head *item);
 static void cache_revisit_request(struct cache_head *item);
+static bool cache_listeners_exist(struct cache_detail *detail);
 
 static void cache_init(struct cache_head *h, struct cache_detail *detail)
 {
@@ -303,7 +304,8 @@  int cache_check(struct cache_detail *detail,
 				cache_fresh_unlocked(h, detail);
 				break;
 			}
-		}
+		} else if (!cache_listeners_exist(detail))
+			rv = try_to_negate_entry(detail, h);
 	}
 
 	if (rv == -EAGAIN) {

sunrpc/cache: handle missing listeners better.

Commit Message

Comments

Patch