From patchwork Tue Aug  6 02:21:35 2013
X-Patchwork-Submitter: Trond Myklebust
X-Patchwork-Id: 2839132
From: "Myklebust, Trond"
To: Jeff Layton
Cc: Nix, Toralf Förster, Oleg Nesterov, NFS list,
 Linux Kernel Mailing List, "dhowells@redhat.com"
Subject: Re: [3.10.4] NFS locking panic, plus persisting NFS shutdown panic
 from 3.9.*
Date: Tue, 6 Aug 2013 02:21:35 +0000
Message-ID: <1375755693.7337.42.camel@leira.trondhjem.org>
References: <8761vlv4z9.fsf@spindle.srvr.nix>
 <20130805084436.69ee4415@corrin.poochiereds.net>
 <87siyotcri.fsf@spindle.srvr.nix>
 <20130805110427.509db424@tlielax.poochiereds.net>
 <20130805111106.73d6ab90@tlielax.poochiereds.net>
 <87zjswrvaq.fsf@spindle.srvr.nix>
 <1375719301.7337.14.camel@leira.trondhjem.org>
 <20130805133739.21654ecb@tlielax.poochiereds.net>
 <1375726682.7337.29.camel@leira.trondhjem.org>
 <20130805143311.4ae59067@tlielax.poochiereds.net>
In-Reply-To: <20130805143311.4ae59067@tlielax.poochiereds.net>
X-Mailing-List: linux-nfs@vger.kernel.org

On Mon, 2013-08-05 at 14:33 -0400, Jeff Layton wrote:
> On Mon, 5 Aug 2013 18:18:03 +0000
> "Myklebust, Trond" wrote:
> 
> > On Mon, 2013-08-05 at 13:37 -0400, Jeff Layton wrote:
> > > On Mon, 5 Aug 2013 16:15:01 +0000
> > > "Myklebust, Trond" wrote:
> > > 
> > > > From 3c50ba80105464a28d456d9a1e0f1d81d4af92a8 Mon Sep 17 00:00:00 2001
> > > > From: Trond Myklebust
> > > > Date: Mon, 5 Aug 2013 12:06:12 -0400
> > > > Subject: [PATCH] LOCKD: Don't call utsname()->nodename from
> > > >  nlmclnt_setlockargs
> > > > 
> > > > Firstly, nlmclnt_setlockargs can be called from a reclaimer thread, in
> > > > which case we're in entirely the wrong namespace.
> > > > Secondly, commit 8aac62706adaaf0fab02c4327761561c8bda9448 (move
> > > > exit_task_namespaces() outside of exit_notify()) now means that
> > > > exit_task_work() is called after exit_task_namespaces(), which
> > > > triggers an Oops when we're freeing up the locks.
> > > > 
> > > > Signed-off-by: Trond Myklebust
> > > > Cc: Toralf Förster
> > > > Cc: Oleg Nesterov
> > > > Cc: Nix
> > > > Cc: Jeff Layton
> > > > ---
> > > >  fs/lockd/clntproc.c | 5 +++--
> > > >  1 file changed, 3 insertions(+), 2 deletions(-)
> > > > 
> > > > diff --git a/fs/lockd/clntproc.c b/fs/lockd/clntproc.c
> > > > index 9760ecb..acd3947 100644
> > > > --- a/fs/lockd/clntproc.c
> > > > +++ b/fs/lockd/clntproc.c
> > > > @@ -125,14 +125,15 @@ static void nlmclnt_setlockargs(struct nlm_rqst *req, struct file_lock *fl)
> > > >  {
> > > >  	struct nlm_args *argp = &req->a_args;
> > > >  	struct nlm_lock *lock = &argp->lock;
> > > > +	char *nodename = req->a_host->h_rpcclnt->cl_nodename;
> > > >  
> > > >  	nlmclnt_next_cookie(&argp->cookie);
> > > >  	memcpy(&lock->fh, NFS_FH(file_inode(fl->fl_file)), sizeof(struct nfs_fh));
> > > > -	lock->caller  = utsname()->nodename;
> > > > +	lock->caller  = nodename;
> > > >  	lock->oh.data = req->a_owner;
> > > >  	lock->oh.len  = snprintf(req->a_owner, sizeof(req->a_owner), "%u@%s",
> > > >  				(unsigned int)fl->fl_u.nfs_fl.owner->pid,
> > > > -				utsname()->nodename);
> > > > +				nodename);
> > > >  	lock->svid = fl->fl_u.nfs_fl.owner->pid;
> > > >  	lock->fl.fl_start = fl->fl_start;
> > > >  	lock->fl.fl_end = fl->fl_end;
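For context, the ordering that triggers the Oops described above: after
the cited commit, do_exit() drops the task's namespaces before running
the task works that release file locks. A much-simplified, hypothetical
sketch (the real code lives in kernel/exit.c; utsname() expands to
&current->nsproxy->uts_ns->name):

	/* Hypothetical sketch -- not the actual kernel/exit.c code. */
	void do_exit_tail(struct task_struct *tsk)
	{
		exit_task_namespaces(tsk);	/* tsk->nsproxy becomes NULL */
		exit_task_work(tsk);		/* runs the file-lock cleanup;
						 * calling utsname() from here
						 * dereferences the now-NULL
						 * nsproxy and Oopses */
	}

Hence the patch reads cl_nodename, which was captured when the RPC
client was created, instead of calling utsname() at unlock time.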
> > > Looks good to me...
> > > 
> > > Reviewed-by: Jeff Layton
> > > 
> > > Trond, any thoughts on the other oops that Nix posted? The issue there
> > > seems to be that we're trying to do the pathwalk to the rpcbind unix
> > > socket from exit_task_work(), but that's happening after we've already
> > > called exit_fs().
> > > 
> > > The trivial answer seems to be to simply call exit_task_work() before
> > > exit_fs() there, but it seems like we ought to be doing the upcall to
> > > rpcbind in a mount namespace from which we know we can reach the
> > > socket...
> > 
> > Isn't it enough to just do the same thing as we did for gss proxy,
> > i.e. set the RPC_CLNT_CREATE_NO_IDLE_TIMEOUT flag?
> > 
> > See attachment.
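The attachment itself isn't reproduced here. Purely as a hedged
illustration of that suggestion (hypothetical code, not the actual
attachment): RPC_CLNT_CREATE_NO_IDLE_TIMEOUT is passed at
client-creation time and keeps the transport from being autoclosed when
idle, so the AF_LOCAL socket would never need to be re-looked-up in a
possibly different mount namespace:

	/* Hypothetical sketch; address/program/version fields elided,
	 * they would stay as in the existing rpcb_create_local_unix(). */
	struct rpc_create_args args = {
		.net		= net,
		.protocol	= XPRT_TRANSPORT_LOCAL,
		.flags		= RPC_CLNT_CREATE_NO_IDLE_TIMEOUT,
	};
	struct rpc_clnt *clnt = rpc_create(&args);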
> Yeah, that looks like a reasonable thing to do...
> 
> OTOH, is there any other way for a unix socket to end up disconnected
> other than if we were to close it? Maybe if rpcbind stopped, the socket
> were unlinked and recreated, and rpcbind then started again?
> 
> If so, then you could still potentially end up in this situation even
> if you didn't autoclose it.

True. How about something like the following instead? Note the change
relative to the original patch...

-- 
Trond Myklebust
Linux NFS client maintainer

NetApp
Trond.Myklebust@netapp.com
www.netapp.com

Acked-by: Jeff Layton

From d0ce48dd442e153b7fa109f48e3a3e642ae1395f Mon Sep 17 00:00:00 2001
From: Trond Myklebust
Date: Mon, 5 Aug 2013 16:04:47 -0400
Subject: [PATCH v2 2/2] SUNRPC: If the rpcbind channel is disconnected, fail
 the call to unregister

If rpcbind causes our connection to the AF_LOCAL socket to close after
we've registered a service, then we want to be careful about
reconnecting, since the mount namespace may have changed.

By simply refusing to reconnect the AF_LOCAL socket in the case of
unregister, we avoid the need to somehow save the mount namespace. While
this may lead to some services not unregistering properly, it should be
safe.

Signed-off-by: Trond Myklebust
Cc: Nix
Cc: Jeff Layton
---
 include/linux/sunrpc/sched.h |  1 +
 net/sunrpc/clnt.c            |  4 ++++
 net/sunrpc/netns.h           |  1 +
 net/sunrpc/rpcb_clnt.c       | 40 +++++++++++++++++++++++++++-------------
 4 files changed, 33 insertions(+), 13 deletions(-)

diff --git a/include/linux/sunrpc/sched.h b/include/linux/sunrpc/sched.h
index 6d87035..1821445 100644
--- a/include/linux/sunrpc/sched.h
+++ b/include/linux/sunrpc/sched.h
@@ -121,6 +121,7 @@ struct rpc_task_setup {
 #define RPC_TASK_SOFTCONN	0x0400		/* Fail if can't connect */
 #define RPC_TASK_SENT		0x0800		/* message was sent */
 #define RPC_TASK_TIMEOUT	0x1000		/* fail with ETIMEDOUT on timeout */
+#define RPC_TASK_NOCONNECT	0x2000		/* return ENOTCONN if not connected */
 
 #define RPC_IS_ASYNC(t)		((t)->tk_flags & RPC_TASK_ASYNC)
 #define RPC_IS_SWAPPER(t)	((t)->tk_flags & RPC_TASK_SWAPPER)
diff --git a/net/sunrpc/clnt.c b/net/sunrpc/clnt.c
index 74f6a70..ecbc4e3 100644
--- a/net/sunrpc/clnt.c
+++ b/net/sunrpc/clnt.c
@@ -1660,6 +1660,10 @@ call_connect(struct rpc_task *task)
 		task->tk_action = call_connect_status;
 		if (task->tk_status < 0)
 			return;
+		if (task->tk_flags & RPC_TASK_NOCONNECT) {
+			rpc_exit(task, -ENOTCONN);
+			return;
+		}
 		xprt_connect(task);
 	}
 }
diff --git a/net/sunrpc/netns.h b/net/sunrpc/netns.h
index 74d948f..779742c 100644
--- a/net/sunrpc/netns.h
+++ b/net/sunrpc/netns.h
@@ -23,6 +23,7 @@ struct sunrpc_net {
 	struct rpc_clnt *rpcb_local_clnt4;
 	spinlock_t rpcb_clnt_lock;
 	unsigned int rpcb_users;
+	unsigned int rpcb_is_af_local : 1;
 
 	struct mutex gssp_lock;
 	wait_queue_head_t gssp_wq;
diff --git a/net/sunrpc/rpcb_clnt.c b/net/sunrpc/rpcb_clnt.c
index b0f7232..1891a10 100644
--- a/net/sunrpc/rpcb_clnt.c
+++ b/net/sunrpc/rpcb_clnt.c
@@ -204,13 +204,15 @@ void rpcb_put_local(struct net *net)
 }
 
 static void rpcb_set_local(struct net *net, struct rpc_clnt *clnt,
-			struct rpc_clnt *clnt4)
+			struct rpc_clnt *clnt4,
+			bool is_af_local)
 {
 	struct sunrpc_net *sn = net_generic(net, sunrpc_net_id);
 
 	/* Protected by rpcb_create_local_mutex */
 	sn->rpcb_local_clnt = clnt;
 	sn->rpcb_local_clnt4 = clnt4;
+	sn->rpcb_is_af_local = is_af_local ? 1 : 0;
 	smp_wmb();
 	sn->rpcb_users = 1;
 	dprintk("RPC:       created new rpcb local clients (rpcb_local_clnt: "
@@ -271,7 +273,7 @@ static int rpcb_create_local_unix(struct net *net)
 		clnt4 = NULL;
 	}
 
-	rpcb_set_local(net, clnt, clnt4);
+	rpcb_set_local(net, clnt, clnt4, true);
 
 out:
 	return result;
@@ -323,7 +325,7 @@ static int rpcb_create_local_net(struct net *net)
 		clnt4 = NULL;
 	}
 
-	rpcb_set_local(net, clnt, clnt4);
+	rpcb_set_local(net, clnt, clnt4, false);
 
 out:
 	return result;
@@ -384,13 +386,16 @@ static struct rpc_clnt *rpcb_create(struct net *net, const char *hostname,
 	return rpc_create(&args);
 }
 
-static int rpcb_register_call(struct rpc_clnt *clnt, struct rpc_message *msg)
+static int rpcb_register_call(struct sunrpc_net *sn, struct rpc_clnt *clnt, struct rpc_message *msg, bool is_set)
 {
-	int result, error = 0;
+	int flags = RPC_TASK_NOCONNECT;
+	int error, result = 0;
 
+	if (is_set || !sn->rpcb_is_af_local)
+		flags = RPC_TASK_SOFTCONN;
 	msg->rpc_resp = &result;
 
-	error = rpc_call_sync(clnt, msg, RPC_TASK_SOFTCONN);
+	error = rpc_call_sync(clnt, msg, flags);
 	if (error < 0) {
 		dprintk("RPC:       failed to contact local rpcbind "
 				"server (errno %d).\n", -error);
@@ -447,16 +452,19 @@ int rpcb_register(struct net *net, u32 prog, u32 vers, int prot, unsigned short
 		.rpc_argp	= &map,
 	};
 	struct sunrpc_net *sn = net_generic(net, sunrpc_net_id);
+	bool is_set = false;
 
 	dprintk("RPC:       %sregistering (%u, %u, %d, %u) with local "
 			"rpcbind\n", (port ? "" : "un"),
 			prog, vers, prot, port);
 
 	msg.rpc_proc = &rpcb_procedures2[RPCBPROC_UNSET];
-	if (port)
+	if (port != 0) {
 		msg.rpc_proc = &rpcb_procedures2[RPCBPROC_SET];
+		is_set = true;
+	}
 
-	return rpcb_register_call(sn->rpcb_local_clnt, &msg);
+	return rpcb_register_call(sn, sn->rpcb_local_clnt, &msg, is_set);
 }
 
 /*
@@ -469,6 +477,7 @@ static int rpcb_register_inet4(struct sunrpc_net *sn,
 	const struct sockaddr_in *sin = (const struct sockaddr_in *)sap;
 	struct rpcbind_args *map = msg->rpc_argp;
 	unsigned short port = ntohs(sin->sin_port);
+	bool is_set = false;
 	int result;
 
 	map->r_addr = rpc_sockaddr2uaddr(sap, GFP_KERNEL);
@@ -479,10 +488,12 @@ static int rpcb_register_inet4(struct sunrpc_net *sn,
 			map->r_addr, map->r_netid);
 
 	msg->rpc_proc = &rpcb_procedures4[RPCBPROC_UNSET];
-	if (port)
+	if (port != 0) {
 		msg->rpc_proc = &rpcb_procedures4[RPCBPROC_SET];
+		is_set = true;
+	}
 
-	result = rpcb_register_call(sn->rpcb_local_clnt4, msg);
+	result = rpcb_register_call(sn, sn->rpcb_local_clnt4, msg, is_set);
 	kfree(map->r_addr);
 	return result;
 }
@@ -497,6 +508,7 @@ static int rpcb_register_inet6(struct sunrpc_net *sn,
 	const struct sockaddr_in6 *sin6 = (const struct sockaddr_in6 *)sap;
 	struct rpcbind_args *map = msg->rpc_argp;
 	unsigned short port = ntohs(sin6->sin6_port);
+	bool is_set = false;
 	int result;
 
 	map->r_addr = rpc_sockaddr2uaddr(sap, GFP_KERNEL);
@@ -507,10 +519,12 @@ static int rpcb_register_inet6(struct sunrpc_net *sn,
 			map->r_addr, map->r_netid);
 
 	msg->rpc_proc = &rpcb_procedures4[RPCBPROC_UNSET];
-	if (port)
+	if (port != 0) {
 		msg->rpc_proc = &rpcb_procedures4[RPCBPROC_SET];
+		is_set = true;
+	}
 
-	result = rpcb_register_call(sn->rpcb_local_clnt4, msg);
+	result = rpcb_register_call(sn, sn->rpcb_local_clnt4, msg, is_set);
 	kfree(map->r_addr);
 	return result;
 }
@@ -527,7 +541,7 @@ static int rpcb_unregister_all_protofamilies(struct sunrpc_net *sn,
 	map->r_addr = "";
 	msg->rpc_proc = &rpcb_procedures4[RPCBPROC_UNSET];
 
-	return rpcb_register_call(sn->rpcb_local_clnt4, msg);
+	return rpcb_register_call(sn, sn->rpcb_local_clnt4, msg, false);
 }
 
 /**
-- 
1.8.3.1
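To restate the rule the v2 patch encodes in rpcb_register_call(): a SET
may connect (or reconnect) as needed, while an UNSET over the AF_LOCAL
transport may only use an already-connected socket and otherwise fails
with ENOTCONN. A small standalone illustration of that decision table
(hypothetical userspace code, not part of the patch):

	#include <stdbool.h>
	#include <stdio.h>

	#define RPC_TASK_SOFTCONN  0x0400  /* fail if we can't connect */
	#define RPC_TASK_NOCONNECT 0x2000  /* fail with ENOTCONN if not connected */

	/* Mirrors the flag selection added to rpcb_register_call(). */
	static int rpcb_call_flags(bool is_set, bool is_af_local)
	{
		if (is_set || !is_af_local)
			return RPC_TASK_SOFTCONN;
		return RPC_TASK_NOCONNECT;	/* UNSET over AF_LOCAL */
	}

	int main(void)
	{
		printf("SET,   AF_LOCAL: 0x%04x\n", rpcb_call_flags(true, true));
		printf("UNSET, AF_LOCAL: 0x%04x\n", rpcb_call_flags(false, true));
		printf("UNSET, AF_INET:  0x%04x\n", rpcb_call_flags(false, false));
		return 0;
	}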