From patchwork Tue Sep 13 14:53:23 2016 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Chuck Lever X-Patchwork-Id: 9329275 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork.web.codeaurora.org (Postfix) with ESMTP id A1FD060231 for ; Tue, 13 Sep 2016 14:53:30 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 93EF729513 for ; Tue, 13 Sep 2016 14:53:30 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 88BE029516; Tue, 13 Sep 2016 14:53:30 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-4.6 required=2.0 tests=BAYES_00,DKIM_SIGNED, RCVD_IN_DNSWL_HI,RCVD_IN_SORBS_SPAM,T_DKIM_INVALID,URIBL_BLACK autolearn=unavailable version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 01B2829513 for ; Tue, 13 Sep 2016 14:53:28 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1756845AbcIMOx0 (ORCPT ); Tue, 13 Sep 2016 10:53:26 -0400 Received: from mail-it0-f68.google.com ([209.85.214.68]:34812 "EHLO mail-it0-f68.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752677AbcIMOxZ (ORCPT ); Tue, 13 Sep 2016 10:53:25 -0400 Received: by mail-it0-f68.google.com with SMTP id 186so2069633itf.1; Tue, 13 Sep 2016 07:53:25 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=sender:subject:from:to:cc:date:message-id:in-reply-to:references :user-agent:mime-version:content-transfer-encoding; bh=HMeeHgax7wPAk4gtkG2y91Rs062yQigI2lrSJ5s6VVg=; b=Sc9J2Q6qbbnnG3+X8uaRtyY9b8K4qdf77KE6+H8+jVwEh7EF7yxr+CdMRyIfDHuMvv a2QhGfMAk+WCgJ9IVvUEcbHRoKiq3ibXuPFoMJPi5e73jGCNHYLVqYibv/A+Lvu6jGaM bepXIuC8ioQYmcikAnXamwtpvykH+3wjz0hhssnGQyg09s0kO1ciRb6fpLIUle1IFt8q tLLS6Yw6mwyM+hxGtbs2hzbkWDFnnGuPTqnNlJTpQkrzPZhV8PJymo9IRZKjG1PbJwTY isJu2sPea6zKdS8HIFlqjVTTGITVWFsjWSETwBQhgUvZtH1xjWP5ukQmxNkSwytofQX9 r3xw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:sender:subject:from:to:cc:date:message-id :in-reply-to:references:user-agent:mime-version :content-transfer-encoding; bh=HMeeHgax7wPAk4gtkG2y91Rs062yQigI2lrSJ5s6VVg=; b=TMOuUBDu+rn+nBkqIHUvArXokpcLYf1icHqImgDFoY18J3G84I3G0bEFsAaQjULvVI jRhdYaQD/MAD/YhGqvcarhdF57duXpbN7Ikkz4Ir+spmeg7t8fsf3+dgZzOcbqid/mg7 qjep+yDlMwr32dinzFYFTGsNFbmDbfOYMtt37NbQWKofAJrKKkQmB/NsqXWxu8hyuwI+ ANxPReZyRTzikwJ2RYwgvgf/YTHXSB52MKa7EDM9t+5LDr5RE0opHNb/8tdhl9r/kFBD IJ1twOeY1n3U3L0oyrsMQ8jZaaKnGAf+dFbQ3Fcx7OHcA0TEF+Tpdr1Xc4Vi1+Ne/g5c YWLA== X-Gm-Message-State: AE9vXwNbWA8fJIjyeNIwwokuWBL6qgtkU/zbUsUy/AaCgdNlgnVLzgWFVhy8OSjonej5Sw== X-Received: by 10.107.192.193 with SMTP id q184mr2425918iof.170.1473778404605; Tue, 13 Sep 2016 07:53:24 -0700 (PDT) Received: from klimt.1015granger.net ([2604:8800:100:81fc:ec4:7aff:fe6c:6aa0]) by smtp.gmail.com with ESMTPSA id i131sm2092460ita.1.2016.09.13.07.53.23 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Tue, 13 Sep 2016 07:53:24 -0700 (PDT) Subject: [PATCH v3 6/6] svcrdma: support Remote Invalidation From: Chuck Lever To: bfields@fieldses.org Cc: linux-rdma@vger.kernel.org, linux-nfs@vger.kernel.org Date: Tue, 13 Sep 2016 10:53:23 -0400 Message-ID: <20160913145323.5733.71581.stgit@klimt.1015granger.net> In-Reply-To: <20160913144246.5733.66229.stgit@klimt.1015granger.net> References: <20160913144246.5733.66229.stgit@klimt.1015granger.net> User-Agent: StGit/0.17.1-dirty MIME-Version: 1.0 Sender: linux-nfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-nfs@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP Support Remote Invalidation. A private message is exchanged with the client upon RDMA transport connect that indicates whether Send With Invalidation may be used by the server to send RPC replies. The invalidate_rkey is arbitrarily chosen from among rkeys present in the RPC-over-RDMA header's chunk lists. Send With Invalidate improves performance only when clients can recognize, while processing an RPC reply, that an rkey has already been invalidated. That has been submitted as a separate change. In the future, the RPC-over-RDMA protocol might support Remote Invalidation properly. The protocol needs to enable signaling between peers to indicate when Remote Invalidation can be used for each individual RPC. Signed-off-by: Chuck Lever Reviewed-by: Sagi Grimberg --- include/linux/sunrpc/svc_rdma.h | 1 + net/sunrpc/xprtrdma/svc_rdma_sendto.c | 58 ++++++++++++++++++++++++++++-- net/sunrpc/xprtrdma/svc_rdma_transport.c | 12 +++++- 3 files changed, 65 insertions(+), 6 deletions(-) -- To unsubscribe from this list: send the line "unsubscribe linux-nfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html diff --git a/include/linux/sunrpc/svc_rdma.h b/include/linux/sunrpc/svc_rdma.h index 3584bc8..cc3ae16 100644 --- a/include/linux/sunrpc/svc_rdma.h +++ b/include/linux/sunrpc/svc_rdma.h @@ -137,6 +137,7 @@ struct svcxprt_rdma { int sc_ord; /* RDMA read limit */ int sc_max_sge; int sc_max_sge_rd; /* max sge for read target */ + bool sc_snd_w_inv; /* OK to use Send With Invalidate */ atomic_t sc_sq_count; /* Number of SQ WR on queue */ unsigned int sc_sq_depth; /* Depth of SQ */ diff --git a/net/sunrpc/xprtrdma/svc_rdma_sendto.c b/net/sunrpc/xprtrdma/svc_rdma_sendto.c index 3b95b19..f5a91ed 100644 --- a/net/sunrpc/xprtrdma/svc_rdma_sendto.c +++ b/net/sunrpc/xprtrdma/svc_rdma_sendto.c @@ -225,6 +225,48 @@ svc_rdma_get_reply_array(struct rpcrdma_msg *rmsgp, return rp_ary; } +/* RPC-over-RDMA Version One private extension: Remote Invalidation. + * Responder's choice: requester signals it can handle Send With + * Invalidate, and responder chooses one rkey to invalidate. + * + * Find a candidate rkey to invalidate when sending a reply. Picks the + * first rkey it finds in the chunks lists. + * + * Returns zero if RPC's chunk lists are empty. + */ +static u32 svc_rdma_get_inv_rkey(struct rpcrdma_msg *rdma_argp, + struct rpcrdma_write_array *wr_ary, + struct rpcrdma_write_array *rp_ary) +{ + struct rpcrdma_read_chunk *rd_ary; + struct rpcrdma_segment *arg_ch; + u32 inv_rkey; + + inv_rkey = 0; + + rd_ary = svc_rdma_get_read_chunk(rdma_argp); + if (rd_ary) { + inv_rkey = be32_to_cpu(rd_ary->rc_target.rs_handle); + goto out; + } + + if (wr_ary && be32_to_cpu(wr_ary->wc_nchunks)) { + arg_ch = &wr_ary->wc_array[0].wc_target; + inv_rkey = be32_to_cpu(arg_ch->rs_handle); + goto out; + } + + if (rp_ary && be32_to_cpu(rp_ary->wc_nchunks)) { + arg_ch = &rp_ary->wc_array[0].wc_target; + inv_rkey = be32_to_cpu(arg_ch->rs_handle); + goto out; + } + +out: + dprintk("svcrdma: Send With Invalidate rkey=%08x\n", inv_rkey); + return inv_rkey; +} + /* Assumptions: * - The specified write_len can be represented in sc_max_sge * PAGE_SIZE */ @@ -464,7 +506,8 @@ static int send_reply(struct svcxprt_rdma *rdma, struct page *page, struct rpcrdma_msg *rdma_resp, struct svc_rdma_req_map *vec, - int byte_count) + int byte_count, + u32 inv_rkey) { struct svc_rdma_op_ctxt *ctxt; struct ib_send_wr send_wr; @@ -535,7 +578,11 @@ static int send_reply(struct svcxprt_rdma *rdma, send_wr.wr_cqe = &ctxt->cqe; send_wr.sg_list = ctxt->sge; send_wr.num_sge = sge_no; - send_wr.opcode = IB_WR_SEND; + if (inv_rkey) { + send_wr.opcode = IB_WR_SEND_WITH_INV; + send_wr.ex.invalidate_rkey = inv_rkey; + } else + send_wr.opcode = IB_WR_SEND; send_wr.send_flags = IB_SEND_SIGNALED; ret = svc_rdma_send(rdma, &send_wr); @@ -567,6 +614,7 @@ int svc_rdma_sendto(struct svc_rqst *rqstp) int inline_bytes; struct page *res_page; struct svc_rdma_req_map *vec; + u32 inv_rkey; dprintk("svcrdma: sending response for rqstp=%p\n", rqstp); @@ -577,6 +625,10 @@ int svc_rdma_sendto(struct svc_rqst *rqstp) wr_ary = svc_rdma_get_write_array(rdma_argp); rp_ary = svc_rdma_get_reply_array(rdma_argp, wr_ary); + inv_rkey = 0; + if (rdma->sc_snd_w_inv) + inv_rkey = svc_rdma_get_inv_rkey(rdma_argp, wr_ary, rp_ary); + /* Build an req vec for the XDR */ vec = svc_rdma_get_req_map(rdma); ret = svc_rdma_map_xdr(rdma, &rqstp->rq_res, vec, wr_ary != NULL); @@ -619,7 +671,7 @@ int svc_rdma_sendto(struct svc_rqst *rqstp) goto err1; ret = send_reply(rdma, rqstp, res_page, rdma_resp, vec, - inline_bytes); + inline_bytes, inv_rkey); if (ret < 0) goto err0; diff --git a/net/sunrpc/xprtrdma/svc_rdma_transport.c b/net/sunrpc/xprtrdma/svc_rdma_transport.c index f51e98a..b2464fc 100644 --- a/net/sunrpc/xprtrdma/svc_rdma_transport.c +++ b/net/sunrpc/xprtrdma/svc_rdma_transport.c @@ -657,9 +657,14 @@ svc_rdma_parse_connect_private(struct svcxprt_rdma *newxprt, if (pmsg && pmsg->cp_magic == rpcrdma_cmp_magic && pmsg->cp_version == RPCRDMA_CMP_VERSION) { - dprintk("svcrdma: client send_size %u, recv_size %u\n", + newxprt->sc_snd_w_inv = pmsg->cp_flags & + RPCRDMA_CMP_F_SND_W_INV_OK; + + dprintk("svcrdma: client send_size %u, recv_size %u " + "remote inv %ssupported\n", rpcrdma_decode_buffer_size(pmsg->cp_send_size), - rpcrdma_decode_buffer_size(pmsg->cp_recv_size)); + rpcrdma_decode_buffer_size(pmsg->cp_recv_size), + newxprt->sc_snd_w_inv ? "" : "un"); } } @@ -1093,7 +1098,8 @@ static struct svc_xprt *svc_rdma_accept(struct svc_xprt *xprt) dev->attrs.max_fast_reg_page_list_len; newxprt->sc_dev_caps |= SVCRDMA_DEVCAP_FAST_REG; newxprt->sc_reader = rdma_read_chunk_frmr; - } + } else + newxprt->sc_snd_w_inv = false; /* * Determine if a DMA MR is required and if so, what privs are required