From patchwork Thu May 22 00:56:25 2014
X-Patchwork-Submitter: Chuck Lever III
X-Patchwork-Id: 4219571
From: Chuck Lever
Subject: [PATCH v4 15/24] xprtrdma: Reduce the number of hardway buffer allocations
To: linux-nfs@vger.kernel.org, linux-rdma@vger.kernel.org
Date: Wed, 21 May 2014 20:56:25 -0400
Message-ID: <20140522005625.27190.64480.stgit@manet.1015granger.net>
In-Reply-To: <20140522004505.27190.58897.stgit@manet.1015granger.net>
References: <20140522004505.27190.58897.stgit@manet.1015granger.net>
User-Agent: StGIT/0.14.3
X-Mailing-List: linux-rdma@vger.kernel.org

While marshaling an RPC/RDMA request, the inline_{rsize,wsize} settings
determine whether an inline request is used, or whether read or write
chunk lists are built. The current default value of these settings is
1024. Any RPC request smaller than 1024 bytes is sent to the NFS server
completely inline.

rpcrdma_buffer_create() allocates and pre-registers a set of RPC buffers
for each transport instance, also based on the inline rsize and wsize
settings. RPC/RDMA requests and replies are built in these buffers.
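
As a rough illustration of that marshaling decision, here is a minimal
user-space sketch. It is not the xprtrdma marshaling code; INLINE_WSIZE
and fits_inline() are assumed names standing in for the transport's
inline_wsize setting and the threshold test it implies:

#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Assumed stand-in for the transport's inline write threshold. */
#define INLINE_WSIZE 1024

/*
 * Requests smaller than the inline threshold travel entirely within a
 * pre-registered send buffer; larger requests are described by read or
 * write chunk lists so the payload can be moved by RDMA.
 */
static bool fits_inline(size_t rpc_request_size)
{
        return rpc_request_size < INLINE_WSIZE;
}

int main(void)
{
        size_t sizes[] = { 512, 1024, 8192 };
        size_t i;

        for (i = 0; i < sizeof(sizes) / sizeof(sizes[0]); i++)
                printf("%zu-byte RPC -> %s\n", sizes[i],
                       fits_inline(sizes[i]) ? "send inline" : "build chunk list");
        return 0;
}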
However, if an RPC/RDMA request is expected to be larger than 1024
bytes, a buffer has to be allocated and registered for that RPC, and
deregistered and released when the RPC is complete. This is known as a
"hardway allocation."

Since the introduction of NFSv4, RPC requests have grown larger, and
hardway allocations are thus more frequent. Hardway allocations are
significant overhead, and they waste the existing RPC buffers
pre-allocated by rpcrdma_buffer_create().

We'd like fewer hardway allocations. Increasing the inline thresholds,
and with them the size of the pre-registered buffers, would be the most
direct way to do this. However, a blanket increase of the inline
thresholds has interoperability consequences.

On my 64-bit system, rpcrdma_buffer_create() requests roughly 7000
bytes for each RPC request buffer, using kmalloc(). Due to internal
fragmentation, kmalloc() already returns an 8192-byte piece of memory
for a 7000-byte allocation request, but the extra space goes unused:
nearly 1200 bytes are wasted per buffer.

So let's round the size of the pre-allocated buffers up to the next
power of two and make use of the otherwise unused space in the
kmalloc'd memory.

This change reduces the amount of memory allocated the hardway for an
NFSv4 general connectathon run from 1322092 bytes to 9472 bytes, a
reduction of over 99%.

Signed-off-by: Chuck Lever
Tested-by: Steve Wise
---
 net/sunrpc/xprtrdma/verbs.c |   25 +++++++++++++------------
 1 files changed, 13 insertions(+), 12 deletions(-)

diff --git a/net/sunrpc/xprtrdma/verbs.c b/net/sunrpc/xprtrdma/verbs.c
index 1d08366..c80995a 100644
--- a/net/sunrpc/xprtrdma/verbs.c
+++ b/net/sunrpc/xprtrdma/verbs.c
@@ -50,6 +50,7 @@
 #include <linux/interrupt.h>
 #include <linux/pci.h>	/* for Tavor hack below */
 #include <linux/slab.h>
+#include <asm/bitops.h>
 
 #include "xprt_rdma.h"
 
@@ -1005,7 +1006,7 @@ rpcrdma_buffer_create(struct rpcrdma_buffer *buf, struct rpcrdma_ep *ep,
 	struct rpcrdma_ia *ia, struct rpcrdma_create_data_internal *cdata)
 {
 	char *p;
-	size_t len;
+	size_t len, rlen, wlen;
 	int i, rc;
 	struct rpcrdma_mw *r;
 
@@ -1120,16 +1121,16 @@ rpcrdma_buffer_create(struct rpcrdma_buffer *buf, struct rpcrdma_ep *ep,
 	 * Allocate/init the request/reply buffers. Doing this
 	 * using kmalloc for now -- one for each buf.
 	 */
+	wlen = 1 << fls(cdata->inline_wsize + sizeof(struct rpcrdma_req));
+	rlen = 1 << fls(cdata->inline_rsize + sizeof(struct rpcrdma_rep));
+	dprintk("RPC:       %s: wlen = %zu, rlen = %zu\n",
+		__func__, wlen, rlen);
+
 	for (i = 0; i < buf->rb_max_requests; i++) {
 		struct rpcrdma_req *req;
 		struct rpcrdma_rep *rep;
 
-		len = cdata->inline_wsize + sizeof(struct rpcrdma_req);
-		/* RPC layer requests *double* size + 1K RPC_SLACK_SPACE! */
-		/* Typical ~2400b, so rounding up saves work later */
-		if (len < 4096)
-			len = 4096;
-		req = kmalloc(len, GFP_KERNEL);
+		req = kmalloc(wlen, GFP_KERNEL);
 		if (req == NULL) {
 			dprintk("RPC:       %s: request buffer %d alloc"
 				" failed\n", __func__, i);
@@ -1141,16 +1142,16 @@ rpcrdma_buffer_create(struct rpcrdma_buffer *buf, struct rpcrdma_ep *ep,
 		buf->rb_send_bufs[i]->rl_buffer = buf;
 
 		rc = rpcrdma_register_internal(ia, req->rl_base,
-				len - offsetof(struct rpcrdma_req, rl_base),
+				wlen - offsetof(struct rpcrdma_req, rl_base),
 				&buf->rb_send_bufs[i]->rl_handle,
 				&buf->rb_send_bufs[i]->rl_iov);
 		if (rc)
 			goto out;
 
-		buf->rb_send_bufs[i]->rl_size = len-sizeof(struct rpcrdma_req);
+		buf->rb_send_bufs[i]->rl_size = wlen -
+						sizeof(struct rpcrdma_req);
 
-		len = cdata->inline_rsize + sizeof(struct rpcrdma_rep);
-		rep = kmalloc(len, GFP_KERNEL);
+		rep = kmalloc(rlen, GFP_KERNEL);
 		if (rep == NULL) {
 			dprintk("RPC:       %s: reply buffer %d alloc failed\n",
 				__func__, i);
@@ -1162,7 +1163,7 @@ rpcrdma_buffer_create(struct rpcrdma_buffer *buf, struct rpcrdma_ep *ep,
 		buf->rb_recv_bufs[i]->rr_buffer = buf;
 
 		rc = rpcrdma_register_internal(ia, rep->rr_base,
-				len - offsetof(struct rpcrdma_rep, rr_base),
+				rlen - offsetof(struct rpcrdma_rep, rr_base),
 				&buf->rb_recv_bufs[i]->rr_handle,
 				&buf->rb_recv_bufs[i]->rr_iov);
 		if (rc)
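
For anyone who wants to see the arithmetic behind the rounding, here is
a small stand-alone user-space sketch, not kernel code, of what
1 << fls(size) does to the buffer sizes discussed above. fls_demo() is a
local stand-in for the kernel's fls(), and the sizes are illustrative
examples rather than the exact values computed in
rpcrdma_buffer_create():

#include <stdio.h>

/*
 * Local stand-in for the kernel's fls() ("find last set"): returns the
 * 1-based index of the most significant set bit, or 0 if no bit is set.
 */
static int fls_demo(unsigned int x)
{
        return x ? 32 - __builtin_clz(x) : 0;
}

int main(void)
{
        /* 7000 approximates the request-buffer size mentioned in the
         * patch description; the other values are arbitrary examples.
         */
        unsigned int sizes[] = { 7000, 1500, 3000 };
        unsigned int i;

        for (i = 0; i < sizeof(sizes) / sizeof(sizes[0]); i++)
                printf("%u bytes -> kmalloc(%u)\n",
                       sizes[i], 1u << fls_demo(sizes[i]));
        return 0;
}

With the slab allocators, a 7000-byte kmalloc() is already backed by an
8192-byte object, so asking for the rounded-up size costs no extra
memory while letting the transport use the whole buffer. Note that
1 << fls(x) doubles a value that is already an exact power of two; the
sizes here (inline threshold plus structure overhead) are unlikely to
land exactly on a power of two, so in practice the expression simply
rounds up.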