From patchwork Tue Jul 22 17:52:06 2014
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jeff Layton
X-Patchwork-Id: 4604371
From: Jeff Layton
Date: Tue, 22 Jul 2014 13:52:06 -0400
To: "J. Bruce Fields"
Cc: linux-nfs@vger.kernel.org
Subject: Re: [PATCH] nfsd: bump dp->dl_time when unhashing delegation
Message-ID: <20140722135206.7dbfbac5@tlielax.poochiereds.net>
In-Reply-To: <20140722174552.GA27277@fieldses.org>
References: <1406047291-1279-1-git-send-email-jlayton@primarydata.com>
 <20140722174552.GA27277@fieldses.org>
X-Mailing-List: linux-nfs@vger.kernel.org

On Tue, 22 Jul 2014 13:45:52 -0400
"J. Bruce Fields" wrote:

> On Tue, Jul 22, 2014 at 12:41:31PM -0400, Jeff Layton wrote:
> > There's a potential race between a lease break and DELEGRETURN call.
> >
> > Suppose a lease break comes in and queues the workqueue job for a
> > delegation, but it doesn't run just yet. Then, a DELEGRETURN comes in
> > finds the delegation and calls destroy_delegation on it to unhash it and
> > put its primary reference.
> >
> > Next, the workqueue job runs and queues the delegation back onto the
> > del_recall_lru list, issues the CB_RECALL and puts the final reference.
> > With that, the final reference to the delegation is put, but it's still
> > on the LRU list.
> >
> > When we go to unhash a delegation, it's because we intend to get rid of
> > it soon afterward, so we don't want lease breaks to mess with it once
> > that occurs. Fix this by bumping the dl_time whenever we unhash a
> > delegation, to ensure that lease breaks don't monkey with it.
>
> Makes sense, thanks.  Repeating from IRC: this fixes a regression from
> 02e1215f9f7 "nfsd: Avoid taking state_lock while holding inode lock in
> nfsd_break_one_deleg".  (In my tree only.)
>
> --b.
>
> >
> > Signed-off-by: Jeff Layton
> > ---
> >  fs/nfsd/nfs4state.c | 2 ++
> >  1 file changed, 2 insertions(+)
> >
> > diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
> > index 72da0d44e66b..a3a828d17563 100644
> > --- a/fs/nfsd/nfs4state.c
> > +++ b/fs/nfsd/nfs4state.c
> > @@ -660,6 +660,8 @@ unhash_delegation(struct nfs4_delegation *dp)
> >
> >  	spin_lock(&state_lock);
> >  	dp->dl_stid.sc_type = NFS4_CLOSED_DELEG_STID;
> > +	/* Ensure that deleg break won't try to requeue it */
> > +	++dp->dl_time;
> >  	spin_lock(&fp->fi_lock);
> >  	list_del_init(&dp->dl_perclnt);
> >  	list_del_init(&dp->dl_recall_lru);
> > --
> > 1.9.3
> >

Sorry, I think I sent you a version with an earlier description. Here's
the one I meant to send:

--------------------[snip]---------------------

[PATCH] nfsd: bump dl_time when unhashing delegation

There's a potential race between a lease break and a DELEGRETURN call.

Suppose a lease break comes in and queues the workqueue job for a
delegation, but it doesn't run just yet. Then, a DELEGRETURN comes in,
finds the delegation, and calls destroy_delegation on it to unhash it
and put its primary reference.

Next, the workqueue job runs and queues the delegation back onto the
del_recall_lru list, issues the CB_RECALL and puts the final reference.
With that, the final reference to the delegation is put, but it's still
on the LRU list.

When we go to unhash a delegation, it's because we intend to get rid of
it soon afterward, so we don't want lease breaks to mess with it once
that occurs. Fix this by bumping the dl_time whenever we unhash a
delegation, to ensure that lease breaks don't monkey with it.

I believe this is a regression due to commit 02e1215f9f7 (nfsd: Avoid
taking state_lock while holding inode lock in nfsd_break_one_deleg).
Prior to that, the state_lock was held in the lm_break callback itself,
and that would have prevented this race.

Cc: Trond Myklebust
Signed-off-by: Jeff Layton
---
 fs/nfsd/nfs4state.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 72da0d44e66b..a3a828d17563 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -660,6 +660,8 @@ unhash_delegation(struct nfs4_delegation *dp)
 
 	spin_lock(&state_lock);
 	dp->dl_stid.sc_type = NFS4_CLOSED_DELEG_STID;
+	/* Ensure that deleg break won't try to requeue it */
+	++dp->dl_time;
 	spin_lock(&fp->fi_lock);
 	list_del_init(&dp->dl_perclnt);
 	list_del_init(&dp->dl_recall_lru);
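
For anyone trying to follow the ordering, here is a rough userspace model
of the race described above. It is only a sketch, not kernel code: the
struct and every model_* function are made up for illustration, locking
is left out, and it assumes the deferred recall job only queues a
delegation onto the recall LRU when dl_time is still zero. That check is
not shown in the diff, so treat it as an assumption about the recall
path rather than a quote from it.

```c
#include <assert.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical stand-in for struct nfs4_delegation: only what the race touches. */
struct deleg_model {
	time_t dl_time;        /* 0 means "not yet queued for recall" */
	bool   hashed;         /* still findable by DELEGRETURN */
	bool   on_recall_lru;  /* models membership of del_recall_lru */
	int    refcount;
};

void model_init(struct deleg_model *dp)
{
	dp->dl_time = 0;
	dp->hashed = true;
	dp->on_recall_lru = false;
	dp->refcount = 1;      /* the primary reference */
}

void model_put(struct deleg_model *dp)
{
	dp->refcount--;
}

/* Lease break: take a reference for the workqueue job and defer the work. */
void model_break_lease(struct deleg_model *dp)
{
	dp->refcount++;
}

/* DELEGRETURN side: unhash, optionally bumping dl_time as the patch does. */
void model_unhash(struct deleg_model *dp, bool bump_dl_time)
{
	dp->hashed = false;
	if (bump_dl_time)
		++dp->dl_time;         /* marks it as "already dealt with" */
	dp->on_recall_lru = false; /* models list_del_init(&dp->dl_recall_lru) */
}

/* Deferred recall job (assumed behaviour): queue only if dl_time is 0. */
void model_recall_job(struct deleg_model *dp)
{
	if (dp->dl_time == 0) {
		dp->dl_time = time(NULL);
		dp->on_recall_lru = true;
	}
	model_put(dp);             /* drop the job's reference */
}

/* Replays the ordering from the description; true means the bug occurred. */
bool model_race_leaves_stale_entry(bool bump_dl_time)
{
	struct deleg_model d;

	model_init(&d);
	model_break_lease(&d);          /* job queued, not yet run */
	model_unhash(&d, bump_dl_time); /* DELEGRETURN unhashes ... */
	model_put(&d);                  /* ... and puts the primary reference */
	model_recall_job(&d);           /* job finally runs */

	/* Bug: no references left, yet still sitting on the LRU. */
	return d.refcount == 0 && d.on_recall_lru;
}
```

Calling model_race_leaves_stale_entry(false) replays the pre-patch
ordering, and passing true replays it with the dl_time bump. In this
model the bump is the only thing that keeps the late-running job from
putting the delegation back on the list.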