Message ID | 87pnqj64br.fsf@notabene.neil.brown.name (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | sunrpc/cache: handle missing listeners better. | expand |
I've gotten complaints about the same thing and said "well, in retrospect we shouldn't have designed the interface this way, but we did, so just stop opening those files". But maybe this is a reasonable compromise. One advantage of waiting for mountd to come back is that you could upgrade mountd in place. That shouldn't take 30 seconds, though. And I haven't heard of anyone actually doing that. It's too bad that not opening auth.unix.gid is the only way for mountd to communicate that gids shouldn't be mapped. --b. On Fri, Mar 22, 2019 at 01:16:56PM +1100, NeilBrown wrote: > If no handler (such as rpc.mountd) has opened > a cache 'channel', the sunrpc cache responds to > all lookup requests with -ENOENT. This is particularly > important for the auth.unix.gid cache which is > optional. > > If the channel was open briefly and an upcall was written to it, > this upcall remains pending even when the handler closes the > channel. When an upcall is pending, the code currently > doesn't check if there are still listeners, it only performs > that check before sending an upcall. > > As the cache treads a recently closes channel (closed less than > 30 seconds ago) as "potentially still open", there is a > reasonable sized window when a request can become pending > in a closed channel, and thereby block lookups indefinitely. > > This can easily be demonstrated by running > cat /proc/net/rpc/auth.unix.gid/channel > > and then trying to mount an NFS filesystem from this host. It > will block indefinitely (unless mountd is run with --manage-gids, > or krb5 is used). > > When cache_check() finds that an upcall is pending, it should > perform the "cache_listeners_exist()" exist test. If no > listeners do exist, the request should be negated. > > With this change in place, there can still be a 30second wait on > mount, until the cache gives up waiting for a handler to come > back, but this is much better than an indefinite wait. > > Signed-off-by: NeilBrown <neilb@suse.com> > --- > net/sunrpc/cache.c | 4 +++- > 1 file changed, 3 insertions(+), 1 deletion(-) > > diff --git a/net/sunrpc/cache.c b/net/sunrpc/cache.c > index 12bb23b8e0c5..be9e29385adc 100644 > --- a/net/sunrpc/cache.c > +++ b/net/sunrpc/cache.c > @@ -40,6 +40,7 @@ > > static bool cache_defer_req(struct cache_req *req, struct cache_head *item); > static void cache_revisit_request(struct cache_head *item); > +static bool cache_listeners_exist(struct cache_detail *detail); > > static void cache_init(struct cache_head *h, struct cache_detail *detail) > { > @@ -303,7 +304,8 @@ int cache_check(struct cache_detail *detail, > cache_fresh_unlocked(h, detail); > break; > } > - } > + } else if (!cache_listeners_exist(detail)) > + rv = try_to_negate_entry(detail, h); > } > > if (rv == -EAGAIN) { > -- > 2.14.0.rc0.dirty >
On Fri, Mar 22 2019, J. Bruce Fields wrote: > I've gotten complaints about the same thing and said "well, in > retrospect we shouldn't have designed the interface this way, but we > did, so just stop opening those files". As the fool who actually "designed" this, I can say with some confidence that the intention was always the requests would block for at most 30 seconds. Unfortunately the implementation was sloppy and the testing was haphazard. > > But maybe this is a reasonable compromise. > > One advantage of waiting for mountd to come back is that you could > upgrade mountd in place. That shouldn't take 30 seconds, though. And I > haven't heard of anyone actually doing that. Surely upgrading of mountd in-place happens whenever you install a new version. That was (iirc) the main reason for not treating a recent close as fatal. > > It's too bad that not opening auth.unix.gid is the only way for mountd > to communicate that gids shouldn't be mapped. I have a general preference for reusing existing functionality rather than creating new special-purpose functionality. I think this has served me well more often than not. Maybe this is one case of "not". If you want to restart mountd without --managed-gids (where previously it had that option), there is a chance that you will hit this problem. That is a case where the answer "just stop opening those files" doesn't really apply. Thanks, NeilBrown > > --b. > > On Fri, Mar 22, 2019 at 01:16:56PM +1100, NeilBrown wrote: >> If no handler (such as rpc.mountd) has opened >> a cache 'channel', the sunrpc cache responds to >> all lookup requests with -ENOENT. This is particularly >> important for the auth.unix.gid cache which is >> optional. >> >> If the channel was open briefly and an upcall was written to it, >> this upcall remains pending even when the handler closes the >> channel. When an upcall is pending, the code currently >> doesn't check if there are still listeners, it only performs >> that check before sending an upcall. >> >> As the cache treads a recently closes channel (closed less than >> 30 seconds ago) as "potentially still open", there is a >> reasonable sized window when a request can become pending >> in a closed channel, and thereby block lookups indefinitely. >> >> This can easily be demonstrated by running >> cat /proc/net/rpc/auth.unix.gid/channel >> >> and then trying to mount an NFS filesystem from this host. It >> will block indefinitely (unless mountd is run with --manage-gids, >> or krb5 is used). >> >> When cache_check() finds that an upcall is pending, it should >> perform the "cache_listeners_exist()" exist test. If no >> listeners do exist, the request should be negated. >> >> With this change in place, there can still be a 30second wait on >> mount, until the cache gives up waiting for a handler to come >> back, but this is much better than an indefinite wait. >> >> Signed-off-by: NeilBrown <neilb@suse.com> >> --- >> net/sunrpc/cache.c | 4 +++- >> 1 file changed, 3 insertions(+), 1 deletion(-) >> >> diff --git a/net/sunrpc/cache.c b/net/sunrpc/cache.c >> index 12bb23b8e0c5..be9e29385adc 100644 >> --- a/net/sunrpc/cache.c >> +++ b/net/sunrpc/cache.c >> @@ -40,6 +40,7 @@ >> >> static bool cache_defer_req(struct cache_req *req, struct cache_head *item); >> static void cache_revisit_request(struct cache_head *item); >> +static bool cache_listeners_exist(struct cache_detail *detail); >> >> static void cache_init(struct cache_head *h, struct cache_detail *detail) >> { >> @@ -303,7 +304,8 @@ int cache_check(struct cache_detail *detail, >> cache_fresh_unlocked(h, detail); >> break; >> } >> - } >> + } else if (!cache_listeners_exist(detail)) >> + rv = try_to_negate_entry(detail, h); >> } >> >> if (rv == -EAGAIN) { >> -- >> 2.14.0.rc0.dirty >>
On Sat, Mar 23, 2019 at 09:32:08AM +1100, NeilBrown wrote: > On Fri, Mar 22 2019, J. Bruce Fields wrote: > > I've gotten complaints about the same thing and said "well, in > > retrospect we shouldn't have designed the interface this way, but we > > did, so just stop opening those files". > > As the fool who actually "designed" this, Pretty sure I was there too and had my chance to object. In any case, hope you didn't take that as a personal complaint about your work, it (along with lots of maintenance since then) is much appreciated. > I can say with some confidence that the intention was always the > requests would block for at most 30 seconds. ... > > One advantage of waiting for mountd to come back is that you could > > upgrade mountd in place. That shouldn't take 30 seconds, though. And I > > haven't heard of anyone actually doing that. > > Surely upgrading of mountd in-place happens whenever you install a new > version. I didn't think distros restarted mountd on upgrade, but I haven't actually checked that. I agree that it's something we should allow, anyway. > > It's too bad that not opening auth.unix.gid is the only way for mountd > > to communicate that gids shouldn't be mapped. > > I have a general preference for reusing existing functionality rather > than creating new special-purpose functionality. I think this has > served me well more often than not. Maybe this is one case of "not". > > If you want to restart mountd without --managed-gids (where previously > it had that option), there is a chance that you will hit this problem. > That is a case where the answer "just stop opening those files" doesn't > really apply. Yeah. OK, applying. (But I'm traveling and may not get this tested and pushed out till next week.) --b.
On Mon, Mar 25 2019, J. Bruce Fields wrote: > On Sat, Mar 23, 2019 at 09:32:08AM +1100, NeilBrown wrote: >> On Fri, Mar 22 2019, J. Bruce Fields wrote: >> > I've gotten complaints about the same thing and said "well, in >> > retrospect we shouldn't have designed the interface this way, but we >> > did, so just stop opening those files". >> >> As the fool who actually "designed" this, > > Pretty sure I was there too and had my chance to object. In any case, > hope you didn't take that as a personal complaint about your work, it > (along with lots of maintenance since then) is much appreciated. Not at all, I was just feeling in a self-deprecatory mood. > >> I can say with some confidence that the intention was always the >> requests would block for at most 30 seconds. > ... >> > One advantage of waiting for mountd to come back is that you could >> > upgrade mountd in place. That shouldn't take 30 seconds, though. And I >> > haven't heard of anyone actually doing that. >> >> Surely upgrading of mountd in-place happens whenever you install a new >> version. > > I didn't think distros restarted mountd on upgrade, but I haven't > actually checked that. I agree that it's something we should allow, > anyway. I just had a look at the openSUSE nfs-kernel-server package. It declares %service_add_post nfs-mountd.service nfs-server.service where %service_add_post is an rpm macro which does various things that I struggle to understand, but it does reference a setting in /etc/sysconfig/services, which contains # # Do you want to disable the automatic restart of services when # a new version gets installed? # DISABLE_RESTART_ON_UPDATE="no" So I'm going to guess that openSUSE does restart mountd on upgrade. > >> > It's too bad that not opening auth.unix.gid is the only way for mountd >> > to communicate that gids shouldn't be mapped. >> >> I have a general preference for reusing existing functionality rather >> than creating new special-purpose functionality. I think this has >> served me well more often than not. Maybe this is one case of "not". >> >> If you want to restart mountd without --managed-gids (where previously >> it had that option), there is a chance that you will hit this problem. >> That is a case where the answer "just stop opening those files" doesn't >> really apply. > > Yeah. OK, applying. > > (But I'm traveling and may not get this tested and pushed out till next > week.) No rush. Safe travels. Thanks, NeilBrown > > --b.
diff --git a/net/sunrpc/cache.c b/net/sunrpc/cache.c index 12bb23b8e0c5..be9e29385adc 100644 --- a/net/sunrpc/cache.c +++ b/net/sunrpc/cache.c @@ -40,6 +40,7 @@ static bool cache_defer_req(struct cache_req *req, struct cache_head *item); static void cache_revisit_request(struct cache_head *item); +static bool cache_listeners_exist(struct cache_detail *detail); static void cache_init(struct cache_head *h, struct cache_detail *detail) { @@ -303,7 +304,8 @@ int cache_check(struct cache_detail *detail, cache_fresh_unlocked(h, detail); break; } - } + } else if (!cache_listeners_exist(detail)) + rv = try_to_negate_entry(detail, h); } if (rv == -EAGAIN) {
If no handler (such as rpc.mountd) has opened a cache 'channel', the sunrpc cache responds to all lookup requests with -ENOENT. This is particularly important for the auth.unix.gid cache which is optional. If the channel was open briefly and an upcall was written to it, this upcall remains pending even when the handler closes the channel. When an upcall is pending, the code currently doesn't check if there are still listeners, it only performs that check before sending an upcall. As the cache treads a recently closes channel (closed less than 30 seconds ago) as "potentially still open", there is a reasonable sized window when a request can become pending in a closed channel, and thereby block lookups indefinitely. This can easily be demonstrated by running cat /proc/net/rpc/auth.unix.gid/channel and then trying to mount an NFS filesystem from this host. It will block indefinitely (unless mountd is run with --manage-gids, or krb5 is used). When cache_check() finds that an upcall is pending, it should perform the "cache_listeners_exist()" exist test. If no listeners do exist, the request should be negated. With this change in place, there can still be a 30second wait on mount, until the cache gives up waiting for a handler to come back, but this is much better than an indefinite wait. Signed-off-by: NeilBrown <neilb@suse.com> --- net/sunrpc/cache.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)