From patchwork Mon Oct 14 23:36:27 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: NeilBrown X-Patchwork-Id: 11189497 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id B1351139A for ; Mon, 14 Oct 2019 23:36:37 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 9C2CA21835 for ; Mon, 14 Oct 2019 23:36:37 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726440AbfJNXgh (ORCPT ); Mon, 14 Oct 2019 19:36:37 -0400 Received: from mx2.suse.de ([195.135.220.15]:60978 "EHLO mx1.suse.de" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1726497AbfJNXgg (ORCPT ); Mon, 14 Oct 2019 19:36:36 -0400 X-Virus-Scanned: by amavisd-new at test-mx.suse.de Received: from relay2.suse.de (unknown [195.135.220.254]) by mx1.suse.de (Postfix) with ESMTP id 040B1AE2A; Mon, 14 Oct 2019 23:36:34 +0000 (UTC) From: NeilBrown To: "J. Bruce Fields" , Trond Myklebust , Anna Schumaker Date: Tue, 15 Oct 2019 10:36:27 +1100 Cc: Linux NFS Mailing List Subject: [PATCH] SUNRPC: backchannel RPC request must reference XPRT In-Reply-To: <20191011165603.GD19318@fieldses.org> References: <87tv8iqz3b.fsf@notabene.neil.brown.name> <20191011165603.GD19318@fieldses.org> Message-ID: <87imoqrjb8.fsf@notabene.neil.brown.name> MIME-Version: 1.0 Sender: linux-nfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-nfs@vger.kernel.org The backchannel RPC requests - that are queued waiting for the reply to be sent by the "NFSv4 callback" thread - have a pointer to the xprt, but it is not reference counted. It is possible for the xprt to be freed while there are still queued requests. I think this has been a problem since Commit fb7a0b9addbd ("nfs41: New backchannel helper routines") when the code was introduced, but I suspect it became more of a problem after Commit 80b14d5e61ca ("SUNRPC: Add a structure to track multiple transports") (or there abouts). Before this second patch, the nfs client would hold a reference to the xprt to keep it alive. After multipath was introduced, a client holds a reference to a swtich, and the switch can have multiple xprts which can be added and removed. I'm not sure of all the causal issues, but this patch has fixed a customer problem were an NFSv4.1 client would run out of memory with tens of thousands of backchannel rpc requests queued for an xprt that had been freed. This was a 64K-page machine so each rpc_rqst consumed more than 128K of memory. Fixes: 80b14d5e61ca ("SUNRPC: Add a structure to track multiple transports") cc: stable@vger.kernel.org (v4.6) Signed-off-by: NeilBrown --- net/sunrpc/backchannel_rqst.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/net/sunrpc/backchannel_rqst.c b/net/sunrpc/backchannel_rqst.c index 339e8c077c2d..c95ca39688b6 100644 --- a/net/sunrpc/backchannel_rqst.c +++ b/net/sunrpc/backchannel_rqst.c @@ -61,6 +61,7 @@ static void xprt_free_allocation(struct rpc_rqst *req) free_page((unsigned long)xbufp->head[0].iov_base); xbufp = &req->rq_snd_buf; free_page((unsigned long)xbufp->head[0].iov_base); + xprt_put(req->rq_xprt); kfree(req); } @@ -85,7 +86,7 @@ struct rpc_rqst *xprt_alloc_bc_req(struct rpc_xprt *xprt, gfp_t gfp_flags) if (req == NULL) return NULL; - req->rq_xprt = xprt; + req->rq_xprt = xprt_get(xprt); INIT_LIST_HEAD(&req->rq_bc_list); /* Preallocate one XDR receive buffer */