From patchwork Wed Jun 22 21:51:47 2011
X-Patchwork-Submitter: Trond Myklebust
X-Patchwork-Id: 907512
Subject: Re: Issue with Race Condition on NFS4 with KRB
From: Trond Myklebust
To: Joshua Scoggins
Cc: linux-kernel@vger.kernel.org, linux-nfs@vger.kernel.org
Date: Wed, 22 Jun 2011 17:51:47 -0400
In-Reply-To: 
References: <1308767449.14997.10.camel@lade.trondhjem.org>
	 <1308769054.14997.18.camel@lade.trondhjem.org>
Organization: NetApp Inc
Message-ID: <1308779507.25875.7.camel@lade.trondhjem.org>
X-Mailing-List: linux-nfs@vger.kernel.org

On Wed, 2011-06-22 at 12:18 -0700, Joshua Scoggins wrote:
> According to the IT guys, they are running Solaris 10 as the server
> platform.

Ok. That should not be subject to the race I was thinking of...

> On Wed, Jun 22, 2011 at 11:57 AM, Trond Myklebust wrote:
> > On Wed, 2011-06-22 at 11:37 -0700, Joshua Scoggins wrote:
> >> Here are our mount options from auto.master:
> >>
> >> /user -fstype=nfs4,sec=krb5p,noresvport,noatime
> >> /group -fstype=nfs4,sec=krb5p,noresvport,noatime
> >>
> >> As for the server, we don't control it. It's actually run by the
> >> campus-wide IT department; we are just lab support for CS. I can
> >> potentially get the server information, but I need to know what you
> >> want specifically, as they're pretty paranoid about giving out
> >> information about their servers.
> >
> > I would just want to know _what_ server platform you are running
> > against. I know of at least one server bug that might explain what
> > you are seeing, and I'd like to eliminate that as a possibility.
> >
> > Trond
> >
> >> Joshua Scoggins
> >>
> >> On Wed, Jun 22, 2011 at 11:30 AM, Trond Myklebust wrote:
> >> > On Wed, 2011-06-22 at 11:21 -0700, Joshua Scoggins wrote:
> >> >> Hello,
> >> >>
> >> >> We are trying to update our Linux images in our CS lab and have hit
> >> >> a bit of an issue. We are using NFS to load user home folders.
> >> >> While testing the new image we found that the nfs4 module will
> >> >> crash when using Firefox 3.6.17 for an extended period of time.
> >> >> Some research via Google suggests it's a potential race condition
> >> >> specific to NFS with krb auth on newer kernels. Our old image
> >> >> doesn't have this issue, and it seems that's due to it running a
> >> >> far older kernel version.
> >> >>
> >> >> We have two images and both are having this problem. One is running
> >> >> 2.6.39 and the other is 2.6.38. Here is what dmesg spit out from
> >> >> the machine running 2.6.39 on one occasion:
> >> >>
> >> >> [ 678.632061] ------------[ cut here ]------------
> >> >> [ 678.632068] WARNING: at net/sunrpc/clnt.c:1567 call_decode+0xb2/0x69c()
> >> >> [ 678.632070] Hardware name: OptiPlex 755
> >> >> [ 678.632072] Modules linked in: nvidia(P) scsi_wait_scan
> >> >> [ 678.632078] Pid: 3882, comm: kworker/0:2 Tainted: P
> >> >> 2.6.39-gentoo-r1 #1
> >> >> [ 678.632080] Call Trace:
> >> >> [ 678.632086] [] warn_slowpath_common+0x80/0x98
> >> >> [ 678.632091] [] ? nfs4_xdr_dec_readdir+0xba/0xba
> >> >> [ 678.632094] [] warn_slowpath_null+0x15/0x17
> >> >> [ 678.632097] [] call_decode+0xb2/0x69c
> >> >> [ 678.632101] [] __rpc_execute+0x78/0x24b
> >> >> [ 678.632104] [] ? rpc_execute+0x41/0x41
> >> >> [ 678.632107] [] rpc_async_schedule+0x10/0x12
> >> >> [ 678.632111] [] process_one_work+0x1d9/0x2e7
> >> >> [ 678.632114] [] worker_thread+0x133/0x24f
> >> >> [ 678.632118] [] ? manage_workers+0x18d/0x18d
> >> >> [ 678.632121] [] kthread+0x7d/0x85
> >> >> [ 678.632125] [] kernel_thread_helper+0x4/0x10
> >> >> [ 678.632128] [] ? kthread_worker_fn+0x13a/0x13a
> >> >> [ 678.632131] [] ? gs_change+0xb/0xb
> >> >> [ 678.632133] ---[ end trace 6bfae002a63e020e ]---

Looking at the code, there is only one way I can see for that warning to
occur, and that is if we put the request back on the 'xprt->recv' list
after it has already received a reply from the server.

Can you reproduce the problem with the attached patch? (An illustrative
sketch of the race follows the patch below.)

Trond

From 7fffb6b479454560503ba3166151b501381f5a6d Mon Sep 17 00:00:00 2001
From: Trond Myklebust
Date: Wed, 22 Jun 2011 17:27:16 -0400
Subject: [PATCH] SUNRPC: Fix a potential race between xprt_complete_rqst and
 xprt_transmit

In xprt_transmit, if the test for list_empty(&req->rq_list) is to remain
lockless, we need to test whether or not req->rq_reply_bytes_recvd is set
(i.e. we already have a reply) after that test. The reason is that
xprt_complete_rqst orders the list deletion and the setting of
req->rq_reply_bytes_recvd. By doing the test of req->rq_reply_bytes_recvd
under the spinlock, we avoid an extra smp_rmb().

Also ensure that we turn off autodisconnect whether or not the RPC
request expects a reply.

Signed-off-by: Trond Myklebust
---
 net/sunrpc/xprt.c |   34 +++++++++++++++++++++-------------
 1 files changed, 21 insertions(+), 13 deletions(-)

diff --git a/net/sunrpc/xprt.c b/net/sunrpc/xprt.c
index ce5eb68..10e1f21 100644
--- a/net/sunrpc/xprt.c
+++ b/net/sunrpc/xprt.c
@@ -878,23 +878,31 @@ void xprt_transmit(struct rpc_task *task)
 
 	dprintk("RPC: %5u xprt_transmit(%u)\n", task->tk_pid, req->rq_slen);
 
-	if (!req->rq_reply_bytes_recvd) {
-		if (list_empty(&req->rq_list) && rpc_reply_expected(task)) {
-			/*
-			 * Add to the list only if we're expecting a reply
-			 */
+	if (list_empty(&req->rq_list)) {
+		/*
+		 * Add to the list only if we're expecting a reply
+		 */
+		if (rpc_reply_expected(task)) {
 			spin_lock_bh(&xprt->transport_lock);
-			/* Update the softirq receive buffer */
-			memcpy(&req->rq_private_buf, &req->rq_rcv_buf,
-					sizeof(req->rq_private_buf));
-			/* Add request to the receive list */
-			list_add_tail(&req->rq_list, &xprt->recv);
+			/* Don't put back on the list if we have a reply.
+			 * We do this test under the spin lock to avoid
+			 * an extra smp_rmb() between the tests of
+			 * req->rq_list and req->rq_reply_bytes_recvd
+			 */
+			if (req->rq_reply_bytes_recvd == 0) {
+				/* Update the softirq receive buffer */
+				memcpy(&req->rq_private_buf, &req->rq_rcv_buf,
+						sizeof(req->rq_private_buf));
+				/* Add request to the receive list */
+				list_add_tail(&req->rq_list, &xprt->recv);
+			}
 			spin_unlock_bh(&xprt->transport_lock);
 			xprt_reset_majortimeo(req);
-			/* Turn off autodisconnect */
-			del_singleshot_timer_sync(&xprt->timer);
 		}
-	} else if (!req->rq_bytes_sent)
+		/* Turn off autodisconnect */
+		del_singleshot_timer_sync(&xprt->timer);
+	}
+	if (req->rq_reply_bytes_recvd != 0 && req->rq_bytes_sent == 0)
 		return;
 
 	req->rq_connect_cookie = xprt->connect_cookie;
-- 
1.7.5.4
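
For illustration only, here is a minimal user-space sketch of the
interleaving the patch guards against. Everything in it is a hypothetical
stand-in, not the SUNRPC code: fake_rqst/fake_xprt replace rpc_rqst/rpc_xprt,
a pthread mutex replaces xprt->transport_lock, a single pointer replaces the
xprt->recv list, and the rpc_reply_expected() check is omitted. The point it
demonstrates is the one made in the patch description: repeat the "do we
already have a reply?" test under the same lock the reply path holds, so a
completed request is never put back on the receive list.

/*
 * Minimal sketch (NOT kernel code): hypothetical names throughout.
 * Compile with: gcc -pthread -o sketch sketch.c
 */
#include <pthread.h>
#include <stdio.h>

struct fake_rqst {
	int reply_bytes_recvd;	/* set once the reply has arrived */
	int on_recv_list;	/* nonzero when linked on the receive list,
				 * i.e. list_empty(&req->rq_list) would be false */
};

struct fake_xprt {
	pthread_mutex_t transport_lock;
	struct fake_rqst *recv;	/* single-slot stand-in for the receive list */
};

/* Reply path, analogous to xprt_complete_rqst(): under the transport
 * lock, remove the request from the receive list and mark the reply
 * as received. */
static void fake_complete_rqst(struct fake_xprt *xprt, struct fake_rqst *req)
{
	pthread_mutex_lock(&xprt->transport_lock);
	req->on_recv_list = 0;
	xprt->recv = NULL;
	req->reply_bytes_recvd = 1;
	pthread_mutex_unlock(&xprt->transport_lock);
}

/* (Re)transmit path, analogous to the patched xprt_transmit(): the
 * list-empty test stays lockless, but the "already have a reply?" test
 * is repeated under the lock before re-queuing. */
static void fake_transmit(struct fake_xprt *xprt, struct fake_rqst *req)
{
	if (!req->on_recv_list) {		/* lockless list_empty() test */
		pthread_mutex_lock(&xprt->transport_lock);
		if (req->reply_bytes_recvd == 0) {
			req->on_recv_list = 1;	/* list_add_tail(..., &xprt->recv) */
			xprt->recv = req;
		}
		pthread_mutex_unlock(&xprt->transport_lock);
	}
}

int main(void)
{
	struct fake_xprt xprt = { PTHREAD_MUTEX_INITIALIZER, NULL };
	struct fake_rqst req = { 0, 0 };

	/* Interleaving of interest: the reply completes the request just
	 * before a retransmission runs its lockless list-empty test. */
	fake_complete_rqst(&xprt, &req);
	fake_transmit(&xprt, &req);

	printf("re-queued after reply: %s\n",
	       xprt.recv == &req ? "yes (the bug)" : "no (the fix)");
	return 0;
}

Run as-is, the sketch prints "no (the fix)". Dropping the re-check under the
lock would put a request that already has its reply back on the receive
list, which is the scenario described above as the only apparent way to
trigger the call_decode warning.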