From patchwork Thu Jun 27 18:30:27 2019
X-Patchwork-Submitter: Huang Le
X-Patchwork-Id: 11020267
From: Huang Le
To: "bfields@fieldses.org", "jlayton@kernel.org", "linux-nfs@vger.kernel.org",
	"linux-kernel@vger.kernel.org"
Subject: [PATCH] nfsd4: fix a deadlock on state owner replay mutex
Date: Thu, 27 Jun 2019 18:30:27 +0000
Message-ID: <720b91b1204b4c73be1b6ec2ff44dbab@jd.com>
X-Mailing-List: linux-nfs@vger.kernel.org

From: Huang Le

In move_to_close_lru(), which is only called on the nfsd4 CLOSE path, the
code can wait for the stid refcount to drop to 2 while holding that state
owner's replay mutex.  However, the other stid reference holder (normally a
parallel CLOSE op) that move_to_close_lru() is waiting for may itself be
trying to acquire the same replay mutex, so neither side can make progress.

This patch fixes the deadlock by clearing the replay owner before waiting
and assigning it back afterwards.

Signed-off-by: Huang Le
---
I guess we should cc this patch to the stable trees, since a malicious
client could craft parallel CLOSE ops to put all nfsd tasks into D state
in short order.
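To illustrate the ordering the patch enforces, here is a rough user-space
sketch (hypothetical names; pthreads and a condition variable stand in for
so_replay.rp_mutex and the close_wq wait queue). It is not the kernel code,
only the shape of the fix:

/*
 * Sketch only: rp_mutex stands in for so_replay.rp_mutex, the condition
 * variable for the close_wq wait queue.  All names are hypothetical.
 */
#include <pthread.h>
#include <stdatomic.h>

struct demo_stid {
	atomic_int      sc_count;	/* like st_stid.sc_count */
	pthread_mutex_t rp_mutex;	/* like so_replay.rp_mutex */
	pthread_mutex_t wq_lock;	/* protects the wait below */
	pthread_cond_t  close_wq;	/* like the close_wq wait queue */
};

/*
 * Old ordering: wait for sc_count == 2 while rp_mutex is held.  A parallel
 * CLOSE holding a reference blocks on rp_mutex and never drops its
 * reference, so both tasks wait forever.
 *
 * New ordering, as in the patch: drop rp_mutex first, wait, re-take it.
 */
static void demo_move_to_close_lru(struct demo_stid *s)
{
	if (atomic_load(&s->sc_count) != 2) {
		pthread_mutex_unlock(&s->rp_mutex);	/* nfsd4_cstate_clear_replay() */

		pthread_mutex_lock(&s->wq_lock);
		while (atomic_load(&s->sc_count) != 2)	/* wait_event(close_wq, ...) */
			pthread_cond_wait(&s->close_wq, &s->wq_lock);
		pthread_mutex_unlock(&s->wq_lock);

		pthread_mutex_lock(&s->rp_mutex);	/* nfsd4_cstate_assign_replay() */
	}
	/* ... proceed with release_all_access(s), etc. ... */
}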
diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 618e660..5f6a48f 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -3829,12 +3829,12 @@ static void nfs4_free_openowner(struct nfs4_stateowner *so)
  * them before returning however.
  */
 static void
-move_to_close_lru(struct nfs4_ol_stateid *s, struct net *net)
+move_to_close_lru(struct nfsd4_compound_state *cstate, struct nfs4_ol_stateid *s,
+		  struct net *net)
 {
 	struct nfs4_ol_stateid *last;
 	struct nfs4_openowner *oo = openowner(s->st_stateowner);
-	struct nfsd_net *nn = net_generic(s->st_stid.sc_client->net,
-					  nfsd_net_id);
+	struct nfsd_net *nn = net_generic(net, nfsd_net_id);
 
 	dprintk("NFSD: move_to_close_lru nfs4_openowner %p\n", oo);
 
@@ -3846,8 +3846,19 @@ static void nfs4_free_openowner(struct nfs4_stateowner *so)
 	 * Wait for the refcount to drop to 2. Since it has been unhashed,
 	 * there should be no danger of the refcount going back up again at
 	 * this point.
+	 *
+	 * Before waiting, we clear cstate->replay_owner to release its
+	 * so_replay.rp_mutex, since other reference holders might be acquiring
+	 * the same mutex before they can drop their references. The replay_owner
+	 * can safely be assigned back after they are done.
 	 */
-	wait_event(close_wq, refcount_read(&s->st_stid.sc_count) == 2);
+	if (refcount_read(&s->st_stid.sc_count) != 2) {
+		struct nfs4_stateowner *so = cstate->replay_owner;
+
+		nfsd4_cstate_clear_replay(cstate);
+		wait_event(close_wq, refcount_read(&s->st_stid.sc_count) == 2);
+		nfsd4_cstate_assign_replay(cstate, so);
+	}
 
 	release_all_access(s);
 	if (s->st_stid.sc_file) {
@@ -5531,7 +5542,8 @@ static inline void nfs4_stateid_downgrade(struct nfs4_ol_stateid *stp, u32 to_ac
 	return status;
 }
 
-static void nfsd4_close_open_stateid(struct nfs4_ol_stateid *s)
+static void nfsd4_close_open_stateid(struct nfsd4_compound_state *cstate,
+				     struct nfs4_ol_stateid *s)
 {
 	struct nfs4_client *clp = s->st_stid.sc_client;
 	bool unhashed;
@@ -5549,7 +5561,7 @@ static void nfsd4_close_open_stateid(struct nfs4_ol_stateid *s)
 		spin_unlock(&clp->cl_lock);
 		free_ol_stateid_reaplist(&reaplist);
 		if (unhashed)
-			move_to_close_lru(s, clp->net);
+			move_to_close_lru(cstate, s, clp->net);
 	}
 }
 
@@ -5587,7 +5599,7 @@ static void nfsd4_close_open_stateid(struct nfs4_ol_stateid *s)
 	 */
 	nfs4_inc_and_copy_stateid(&close->cl_stateid, &stp->st_stid);
 
-	nfsd4_close_open_stateid(stp);
+	nfsd4_close_open_stateid(cstate, stp);
 	mutex_unlock(&stp->st_mutex);
 
 	/* v4.1+ suggests that we send a special stateid in here, since the