From patchwork Fri Feb 14 15:49:47 2020

Subject: [PATCH RFC 1/9] nfsd: Fix NFSv4 READ on RDMA when using readv
From: Chuck Lever
To: bfields@fieldses.org
Cc: linux-rdma@vger.kernel.org, linux-nfs@vger.kernel.org
Date: Fri, 14 Feb 2020 10:49:47 -0500
Message-ID: <20200214154947.3848.12451.stgit@klimt.1015granger.net>

svcrdma expects that the payload falls precisely into the xdr_buf page vector. This does not seem to be the case for nfsd4_encode_readv().
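As background for the expectation stated above, the kernel's xdr_buf splits a Reply into a head kvec, a page vector, and a tail kvec. The sketch below is a simplified model (not the kernel's definitions) of how a payload that lands exactly in the page vector yields the byte range the transport must match against a Write chunk; the struct and helper names are illustrative only.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for the kernel's struct kvec / struct xdr_buf. */
struct kvec { void *iov_base; size_t iov_len; };

struct xdr_buf_model {
	struct kvec head;	/* RPC and NFS reply headers */
	size_t page_len;	/* payload bytes held in the page vector */
	struct kvec tail;	/* XDR pad and trailing results */
};

/* When the READ payload fills the page vector exactly, it begins at
 * the byte offset just past the head... */
static size_t payload_offset(const struct xdr_buf_model *buf)
{
	return buf->head.iov_len;
}

/* ...and its length is simply page_len. svcrdma can then RDMA Write
 * the range [offset, offset + length) into the client's Write chunk. */
static size_t payload_length(const struct xdr_buf_model *buf)
{
	return buf->page_len;
}
```

When the encoder instead places payload bytes in the head (as nfsd4_encode_readv() can), page_len alone no longer describes the payload, which is why the transport needs to be told the range explicitly.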
This code is called only when fops->splice_read is missing or when RQ_SPLICE_OK is clear, so it's not a noticeable problem in many common cases.

Add a new transport method, ->xpo_read_payload, so that when a READ payload does not fit exactly in rq_res's page vector, the XDR encoder can inform the RPC transport exactly where that payload is, without the payload's XDR pad. That way, when a Write chunk is present, the transport knows what byte range in the Reply message is supposed to be matched with the chunk.

Note that the Linux NFS server implementation of NFS/RDMA can currently handle only one Write chunk per RPC-over-RDMA message. This simplifies the implementation of this fix.

Fixes: b04209806384 ("nfsd4: allow exotic read compounds")
Buglink: https://bugzilla.kernel.org/show_bug.cgi?id=198053
Signed-off-by: Chuck Lever
---
 fs/nfsd/nfs4xdr.c                        |   20 ++++++++-------
 include/linux/sunrpc/svc.h               |    3 ++
 include/linux/sunrpc/svc_rdma.h          |    8 +++++-
 include/linux/sunrpc/svc_xprt.h          |    2 ++
 net/sunrpc/svc.c                         |   16 ++++++++++++
 net/sunrpc/svcsock.c                     |    8 ++++++
 net/sunrpc/xprtrdma/svc_rdma_recvfrom.c  |    1 +
 net/sunrpc/xprtrdma/svc_rdma_rw.c        |   30 ++++++++++++++---------
 net/sunrpc/xprtrdma/svc_rdma_sendto.c    |   40 +++++++++++++++++++++++++++++-
 net/sunrpc/xprtrdma/svc_rdma_transport.c |    1 +
 10 files changed, 106 insertions(+), 23 deletions(-)

diff --git a/fs/nfsd/nfs4xdr.c b/fs/nfsd/nfs4xdr.c
index 9761512674a0..60be969d8be1 100644
--- a/fs/nfsd/nfs4xdr.c
+++ b/fs/nfsd/nfs4xdr.c
@@ -3594,17 +3594,17 @@ static __be32 nfsd4_encode_readv(struct nfsd4_compoundres *resp,
 	u32 zzz = 0;
 	int pad;
 
+	/*
+	 * svcrdma requires every READ payload to start somewhere
+	 * in xdr->pages.
+	 */
+	if (xdr->iov == xdr->buf->head) {
+		xdr->iov = NULL;
+		xdr->end = xdr->p;
+	}
+
 	len = maxcount;
 	v = 0;
-
-	thislen = min_t(long, len, ((void *)xdr->end - (void *)xdr->p));
-	p = xdr_reserve_space(xdr, (thislen+3)&~3);
-	WARN_ON_ONCE(!p);
-	resp->rqstp->rq_vec[v].iov_base = p;
-	resp->rqstp->rq_vec[v].iov_len = thislen;
-	v++;
-	len -= thislen;
-
 	while (len) {
 		thislen = min_t(long, len, PAGE_SIZE);
 		p = xdr_reserve_space(xdr, (thislen+3)&~3);
@@ -3623,6 +3623,8 @@ static __be32 nfsd4_encode_readv(struct nfsd4_compoundres *resp,
 	read->rd_length = maxcount;
 	if (nfserr)
 		return nfserr;
+	if (svc_encode_read_payload(resp->rqstp, starting_len + 8, maxcount))
+		return nfserr_io;
 	xdr_truncate_encode(xdr, starting_len + 8 + ((maxcount+3)&~3));
 
 	tmp = htonl(eof);

diff --git a/include/linux/sunrpc/svc.h b/include/linux/sunrpc/svc.h
index 1afe38eb33f7..82665ff360fd 100644
--- a/include/linux/sunrpc/svc.h
+++ b/include/linux/sunrpc/svc.h
@@ -517,6 +517,9 @@ int svc_register(const struct svc_serv *, struct net *, const int,
 void		   svc_reserve(struct svc_rqst *rqstp, int space);
 struct svc_pool *  svc_pool_for_cpu(struct svc_serv *serv, int cpu);
 char *		   svc_print_addr(struct svc_rqst *, char *, size_t);
+int		   svc_encode_read_payload(struct svc_rqst *rqstp,
					   unsigned int offset,
					   unsigned int length);
 unsigned int	   svc_fill_write_vector(struct svc_rqst *rqstp,
					 struct page **pages,
					 struct kvec *first, size_t total);

diff --git a/include/linux/sunrpc/svc_rdma.h b/include/linux/sunrpc/svc_rdma.h
index 40f65888dd38..04e4a34d1c6a 100644
--- a/include/linux/sunrpc/svc_rdma.h
+++ b/include/linux/sunrpc/svc_rdma.h
@@ -137,6 +137,8 @@ struct svc_rdma_recv_ctxt {
	unsigned int		rc_page_count;
	unsigned int		rc_hdr_count;
	u32			rc_inv_rkey;
+	unsigned int		rc_read_payload_offset;
+	unsigned int		rc_read_payload_length;
	struct page		*rc_pages[RPCSVC_MAXPAGES];
 };
 
@@ -170,7 +172,9 @@ extern int svc_rdma_recv_read_chunk(struct svcxprt_rdma *rdma,
				    struct svc_rqst *rqstp,
				    struct svc_rdma_recv_ctxt *head,
				    __be32 *p);
 extern int svc_rdma_send_write_chunk(struct svcxprt_rdma *rdma,
-				     __be32 *wr_ch, struct xdr_buf *xdr);
+				     __be32 *wr_ch, struct xdr_buf *xdr,
+				     unsigned int offset,
+				     unsigned long length);
 extern int svc_rdma_send_reply_chunk(struct svcxprt_rdma *rdma,
				     __be32 *rp_ch, bool writelist,
				     struct xdr_buf *xdr);
@@ -189,6 +193,8 @@ extern int svc_rdma_map_reply_msg(struct svcxprt_rdma *rdma,
				  struct svc_rdma_send_ctxt *ctxt,
				  struct xdr_buf *xdr, __be32 *wr_lst);
 extern int svc_rdma_sendto(struct svc_rqst *);
+extern int svc_rdma_read_payload(struct svc_rqst *rqstp, unsigned int offset,
+				 unsigned int length);
 
 /* svc_rdma_transport.c */
 extern int svc_rdma_create_listen(struct svc_serv *, int, struct sockaddr *);

diff --git a/include/linux/sunrpc/svc_xprt.h b/include/linux/sunrpc/svc_xprt.h
index ea6f46be9cb7..9e1e046de176 100644
--- a/include/linux/sunrpc/svc_xprt.h
+++ b/include/linux/sunrpc/svc_xprt.h
@@ -21,6 +21,8 @@ struct svc_xprt_ops {
	int		(*xpo_has_wspace)(struct svc_xprt *);
	int		(*xpo_recvfrom)(struct svc_rqst *);
	int		(*xpo_sendto)(struct svc_rqst *);
+	int		(*xpo_read_payload)(struct svc_rqst *, unsigned int,
+					    unsigned int);
	void		(*xpo_release_rqst)(struct svc_rqst *);
	void		(*xpo_detach)(struct svc_xprt *);
	void		(*xpo_free)(struct svc_xprt *);

diff --git a/net/sunrpc/svc.c b/net/sunrpc/svc.c
index 187dd4e73d64..18676d36f490 100644
--- a/net/sunrpc/svc.c
+++ b/net/sunrpc/svc.c
@@ -1637,6 +1637,22 @@ u32 svc_max_payload(const struct svc_rqst *rqstp)
 EXPORT_SYMBOL_GPL(svc_max_payload);
 
 /**
+ * svc_encode_read_payload - mark a range of bytes as a READ payload
+ * @rqstp: svc_rqst to operate on
+ * @offset: payload's byte offset in rqstp->rq_res
+ * @length: size of payload, in bytes
+ *
+ * Returns zero on success, or a negative errno if a permanent
+ * error occurred.
+ */
+int svc_encode_read_payload(struct svc_rqst *rqstp, unsigned int offset,
+			    unsigned int length)
+{
+	return rqstp->rq_xprt->xpt_ops->xpo_read_payload(rqstp, offset, length);
+}
+EXPORT_SYMBOL_GPL(svc_encode_read_payload);
+
+/**
  * svc_fill_write_vector - Construct data argument for VFS write call
  * @rqstp: svc_rqst to operate on
  * @pages: list of pages containing data payload

diff --git a/net/sunrpc/svcsock.c b/net/sunrpc/svcsock.c
index 2934dd711715..758ab10690de 100644
--- a/net/sunrpc/svcsock.c
+++ b/net/sunrpc/svcsock.c
@@ -279,6 +279,12 @@ static int svc_sendto(struct svc_rqst *rqstp, struct xdr_buf *xdr)
	return len;
 }
 
+static int svc_sock_read_payload(struct svc_rqst *rqstp, unsigned int offset,
+				 unsigned int length)
+{
+	return 0;
+}
+
 /*
  * Report socket names for nfsdfs
  */
@@ -653,6 +659,7 @@ static struct svc_xprt *svc_udp_create(struct svc_serv *serv,
	.xpo_create = svc_udp_create,
	.xpo_recvfrom = svc_udp_recvfrom,
	.xpo_sendto = svc_udp_sendto,
+	.xpo_read_payload = svc_sock_read_payload,
	.xpo_release_rqst = svc_release_udp_skb,
	.xpo_detach = svc_sock_detach,
	.xpo_free = svc_sock_free,
@@ -1171,6 +1178,7 @@ static struct svc_xprt *svc_tcp_create(struct svc_serv *serv,
	.xpo_create = svc_tcp_create,
	.xpo_recvfrom = svc_tcp_recvfrom,
	.xpo_sendto = svc_tcp_sendto,
+	.xpo_read_payload = svc_sock_read_payload,
	.xpo_release_rqst = svc_release_skb,
	.xpo_detach = svc_tcp_sock_detach,
	.xpo_free = svc_sock_free,

diff --git a/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c b/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c
index 96bccd398469..71127d898562 100644
--- a/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c
+++ b/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c
@@ -193,6 +193,7 @@ void svc_rdma_recv_ctxts_destroy(struct svcxprt_rdma *rdma)
 
 out:
	ctxt->rc_page_count = 0;
+	ctxt->rc_read_payload_length = 0;
	return ctxt;
 
 out_empty:

diff --git a/net/sunrpc/xprtrdma/svc_rdma_rw.c b/net/sunrpc/xprtrdma/svc_rdma_rw.c
index 48fe3b16b0d9..b0ac535c8728 100644
--- a/net/sunrpc/xprtrdma/svc_rdma_rw.c
+++ b/net/sunrpc/xprtrdma/svc_rdma_rw.c
@@ -482,18 +482,19 @@ static int svc_rdma_send_xdr_kvec(struct svc_rdma_write_info *info,
				   vec->iov_len);
 }
 
-/* Send an xdr_buf's page list by itself. A Write chunk is
- * just the page list. a Reply chunk is the head, page list,
- * and tail. This function is shared between the two types
- * of chunk.
+/* Send an xdr_buf's page list by itself. A Write chunk is just
+ * the page list. A Reply chunk is @xdr's head, page list, and
+ * tail. This function is shared between the two types of chunk.
  */
 static int svc_rdma_send_xdr_pagelist(struct svc_rdma_write_info *info,
-				      struct xdr_buf *xdr)
+				      struct xdr_buf *xdr,
+				      unsigned int offset,
+				      unsigned long length)
 {
	info->wi_xdr = xdr;
-	info->wi_next_off = 0;
+	info->wi_next_off = offset - xdr->head[0].iov_len;
	return svc_rdma_build_writes(info, svc_rdma_pagelist_to_sg,
-				     xdr->page_len);
+				     length);
 }
 
 /**
@@ -501,6 +502,8 @@ static int svc_rdma_send_xdr_pagelist(struct svc_rdma_write_info *info,
  * @rdma: controlling RDMA transport
  * @wr_ch: Write chunk provided by client
  * @xdr: xdr_buf containing the data payload
+ * @offset: payload's byte offset in @xdr
+ * @length: size of payload, in bytes
  *
  * Returns a non-negative number of bytes the chunk consumed, or
  *	%-E2BIG if the payload was larger than the Write chunk,
@@ -510,19 +513,20 @@ static int svc_rdma_send_xdr_pagelist(struct svc_rdma_write_info *info,
  *	%-EIO if rdma_rw initialization failed (DMA mapping, etc).
  */
 int svc_rdma_send_write_chunk(struct svcxprt_rdma *rdma, __be32 *wr_ch,
-			      struct xdr_buf *xdr)
+			      struct xdr_buf *xdr,
+			      unsigned int offset, unsigned long length)
 {
	struct svc_rdma_write_info *info;
	int ret;
 
-	if (!xdr->page_len)
+	if (!length)
		return 0;
 
	info = svc_rdma_write_info_alloc(rdma, wr_ch);
	if (!info)
		return -ENOMEM;
 
-	ret = svc_rdma_send_xdr_pagelist(info, xdr);
+	ret = svc_rdma_send_xdr_pagelist(info, xdr, offset, length);
	if (ret < 0)
		goto out_err;
 
@@ -531,7 +535,7 @@ int svc_rdma_send_write_chunk(struct svcxprt_rdma *rdma, __be32 *wr_ch,
		goto out_err;
 
	trace_svcrdma_encode_write(xdr->page_len);
-	return xdr->page_len;
+	return length;
 
 out_err:
	svc_rdma_write_info_free(info);
@@ -571,7 +575,9 @@ int svc_rdma_send_reply_chunk(struct svcxprt_rdma *rdma, __be32 *rp_ch,
	 * client did not provide Write chunks.
	 */
	if (!writelist && xdr->page_len) {
-		ret = svc_rdma_send_xdr_pagelist(info, xdr);
+		ret = svc_rdma_send_xdr_pagelist(info, xdr,
+						 xdr->head[0].iov_len,
+						 xdr->page_len);
		if (ret < 0)
			goto out_err;
		consumed += xdr->page_len;

diff --git a/net/sunrpc/xprtrdma/svc_rdma_sendto.c b/net/sunrpc/xprtrdma/svc_rdma_sendto.c
index f3f108090aa4..a11983c2056f 100644
--- a/net/sunrpc/xprtrdma/svc_rdma_sendto.c
+++ b/net/sunrpc/xprtrdma/svc_rdma_sendto.c
@@ -858,7 +858,18 @@ int svc_rdma_sendto(struct svc_rqst *rqstp)
 
	if (wr_lst) {
		/* XXX: Presume the client sent only one Write chunk */
-		ret = svc_rdma_send_write_chunk(rdma, wr_lst, xdr);
+		unsigned long offset;
+		unsigned int length;
+
+		if (rctxt->rc_read_payload_length) {
+			offset = rctxt->rc_read_payload_offset;
+			length = rctxt->rc_read_payload_length;
+		} else {
+			offset = xdr->head[0].iov_len;
+			length = xdr->page_len;
+		}
+		ret = svc_rdma_send_write_chunk(rdma, wr_lst, xdr, offset,
+						length);
		if (ret < 0)
			goto err2;
		svc_rdma_xdr_encode_write_list(rdma_resp, wr_lst, ret);
@@ -900,3 +911,30 @@ int svc_rdma_sendto(struct svc_rqst *rqstp)
	ret = -ENOTCONN;
	goto out;
 }
+
+/**
+ * svc_rdma_read_payload - special processing for a READ payload
+ * @rqstp: svc_rqst to operate on
+ * @offset: payload's byte offset in @xdr
+ * @length: size of payload, in bytes
+ *
+ * Returns zero on success.
+ *
+ * For the moment, just record the xdr_buf location of the READ
+ * payload. svc_rdma_sendto will use that location later when
+ * we actually send the payload.
+ */
+int svc_rdma_read_payload(struct svc_rqst *rqstp, unsigned int offset,
+			  unsigned int length)
+{
+	struct svc_rdma_recv_ctxt *rctxt = rqstp->rq_xprt_ctxt;
+
+	/* XXX: Just one READ payload slot for now, since our
+	 * transport implementation currently supports only one
+	 * Write chunk.
+	 */
+	rctxt->rc_read_payload_offset = offset;
+	rctxt->rc_read_payload_length = length;
+
+	return 0;
+}

diff --git a/net/sunrpc/xprtrdma/svc_rdma_transport.c b/net/sunrpc/xprtrdma/svc_rdma_transport.c
index 145a3615c319..f6aad2798063 100644
--- a/net/sunrpc/xprtrdma/svc_rdma_transport.c
+++ b/net/sunrpc/xprtrdma/svc_rdma_transport.c
@@ -82,6 +82,7 @@ static struct svc_xprt *svc_rdma_create(struct svc_serv *serv,
	.xpo_create = svc_rdma_create,
	.xpo_recvfrom = svc_rdma_recvfrom,
	.xpo_sendto = svc_rdma_sendto,
+	.xpo_read_payload = svc_rdma_read_payload,
	.xpo_release_rqst = svc_rdma_release_rqst,
	.xpo_detach = svc_rdma_detach,
	.xpo_free = svc_rdma_free,

From patchwork Fri Feb 14 15:49:52 2020

Subject: [PATCH RFC 2/9] NFSD: Clean up nfsd4_encode_readv
From: Chuck Lever
To: bfields@fieldses.org
Cc: linux-rdma@vger.kernel.org, linux-nfs@vger.kernel.org
Date: Fri, 14 Feb 2020 10:49:52 -0500
Message-ID: <20200214154952.3848.15021.stgit@klimt.1015granger.net>

Address some minor nits I noticed while working on this function.
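One of the nits below replaces the open-coded (thislen+3)&~3 rounding with xdr_align_size(). As a reminder of the arithmetic involved, XDR encodes opaque data in 4-byte units; the helpers below mirror the rounding and the pad computation used in this function. This is an illustrative sketch, not the kernel's code.

```c
#include <assert.h>

/* Round a length up to the next XDR quad (4-byte) boundary; this is
 * what the open-coded (len+3)&~3 and the kernel's xdr_align_size()
 * both compute. */
static unsigned int xdr_align_up(unsigned int len)
{
	return (len + 3) & ~3u;
}

/* Number of zero pad bytes needed after @len payload bytes, matching
 * the pad computation in nfsd4_encode_readv(). */
static unsigned int xdr_pad_bytes(unsigned int len)
{
	return (len & 3) ? 4 - (len & 3) : 0;
}
```

For any length, xdr_align_up(len) == len + xdr_pad_bytes(len), which is why truncating the encode to starting_len + 8 + xdr_align_size(maxcount) leaves exactly room for the payload plus its pad.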
Signed-off-by: Chuck Lever
---
 fs/nfsd/nfs4xdr.c |    9 ++++-----
 1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/fs/nfsd/nfs4xdr.c b/fs/nfsd/nfs4xdr.c
index 60be969d8be1..262f9fc76e4e 100644
--- a/fs/nfsd/nfs4xdr.c
+++ b/fs/nfsd/nfs4xdr.c
@@ -3591,7 +3591,6 @@ static __be32 nfsd4_encode_readv(struct nfsd4_compoundres *resp,
	__be32 nfserr;
	__be32 tmp;
	__be32 *p;
-	u32 zzz = 0;
	int pad;
 
	/*
@@ -3607,7 +3606,7 @@ static __be32 nfsd4_encode_readv(struct nfsd4_compoundres *resp,
	v = 0;
	while (len) {
		thislen = min_t(long, len, PAGE_SIZE);
-		p = xdr_reserve_space(xdr, (thislen+3)&~3);
+		p = xdr_reserve_space(xdr, thislen);
		WARN_ON_ONCE(!p);
		resp->rqstp->rq_vec[v].iov_base = p;
		resp->rqstp->rq_vec[v].iov_len = thislen;
@@ -3616,7 +3615,6 @@ static __be32 nfsd4_encode_readv(struct nfsd4_compoundres *resp,
	}
	read->rd_vlen = v;
 
-	len = maxcount;
	nfserr = nfsd_readv(resp->rqstp, read->rd_fhp, file, read->rd_offset,
			    resp->rqstp->rq_vec, read->rd_vlen, &maxcount,
			    &eof);
@@ -3625,16 +3623,17 @@ static __be32 nfsd4_encode_readv(struct nfsd4_compoundres *resp,
		return nfserr;
	if (svc_encode_read_payload(resp->rqstp, starting_len + 8, maxcount))
		return nfserr_io;
-	xdr_truncate_encode(xdr, starting_len + 8 + ((maxcount+3)&~3));
+	xdr_truncate_encode(xdr, starting_len + 8 + xdr_align_size(maxcount));
 
	tmp = htonl(eof);
	write_bytes_to_xdr_buf(xdr->buf, starting_len    , &tmp, 4);
	tmp = htonl(maxcount);
	write_bytes_to_xdr_buf(xdr->buf, starting_len + 4, &tmp, 4);
+	tmp = xdr_zero;
	pad = (maxcount&3) ? 4 - (maxcount&3) : 0;
	write_bytes_to_xdr_buf(xdr->buf, starting_len + 8 + maxcount,
-				&zzz, pad);
+				&tmp, pad);
	return 0;
 }

From patchwork Fri Feb 14 15:49:58 2020

Subject: [PATCH RFC 3/9] svcrdma: Avoid DMA mapping small RPC Replies
From: Chuck Lever
To: bfields@fieldses.org
Cc: linux-rdma@vger.kernel.org, linux-nfs@vger.kernel.org
Date: Fri, 14 Feb 2020 10:49:58 -0500
Message-ID: <20200214154958.3848.99445.stgit@klimt.1015granger.net>

On some platforms, DMA mapping part of a page is more costly than copying bytes.
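The threshold test the patch introduces is simple: pull the Reply up into a single pre-mapped Send buffer whenever the message is smaller than half the inline buffer size. A minimal sketch, assuming the 4096-byte value of RPCRDMA_V1_DEF_INLINE_SIZE (treat that constant as an assumption here):

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed value of RPCRDMA_V1_DEF_INLINE_SIZE for illustration. */
#define DEF_INLINE_SIZE 4096

/* Copying a small message into one already-mapped Send buffer avoids
 * DMA-mapping each xdr_buf fragment separately, so messages under
 * half the minimum Send buffer size are pulled up. */
static bool pull_up_needed(unsigned int msg_len)
{
	return msg_len < (DEF_INLINE_SIZE >> 1);
}
```

With these assumptions, any Reply under 2048 bytes takes the pull-up path; larger Replies still go through per-fragment DMA mapping (or through the element-count check that follows in the patch).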
Indeed, not involving the I/O MMU can help the RPC/RDMA transport scale better for tiny I/Os across more RDMA devices, because interaction with the I/O MMU is eliminated for each of these small I/Os. Without the explicit unmapping, the NIC no longer needs to do a costly internal TLB shoot-down for buffers that are just a handful of bytes.

The heuristic for now is to pull up when the size of the RPC message body is smaller than half the minimum Send buffer size.

Signed-off-by: Chuck Lever
---
 include/trace/events/rpcrdma.h        |   40 +++++++++++++++++++++++++++++++++
 net/sunrpc/xprtrdma/svc_rdma_sendto.c |   25 +++++++++++++++++----
 2 files changed, 61 insertions(+), 4 deletions(-)

diff --git a/include/trace/events/rpcrdma.h b/include/trace/events/rpcrdma.h
index c0e4c93324f5..6f0d3e8ce95c 100644
--- a/include/trace/events/rpcrdma.h
+++ b/include/trace/events/rpcrdma.h
@@ -336,6 +336,44 @@
	),					\
	TP_ARGS(rqst))
 
+DECLARE_EVENT_CLASS(xdr_buf_class,
+	TP_PROTO(
+		const struct xdr_buf *xdr
+	),
+
+	TP_ARGS(xdr),
+
+	TP_STRUCT__entry(
+		__field(const void *, head_base)
+		__field(size_t, head_len)
+		__field(const void *, tail_base)
+		__field(size_t, tail_len)
+		__field(unsigned int, page_len)
+		__field(unsigned int, msg_len)
+	),
+
+	TP_fast_assign(
+		__entry->head_base = xdr->head[0].iov_base;
+		__entry->head_len = xdr->head[0].iov_len;
+		__entry->tail_base = xdr->tail[0].iov_base;
+		__entry->tail_len = xdr->tail[0].iov_len;
+		__entry->page_len = xdr->page_len;
+		__entry->msg_len = xdr->len;
+	),
+
+	TP_printk("head=[%p,%zu] page=%u tail=[%p,%zu] len=%u",
+		__entry->head_base, __entry->head_len, __entry->page_len,
+		__entry->tail_base, __entry->tail_len, __entry->msg_len
+	)
+);
+
+#define DEFINE_XDRBUF_EVENT(name)				\
+	DEFINE_EVENT(xdr_buf_class, name,			\
+		TP_PROTO(					\
+			const struct xdr_buf *xdr		\
+		),						\
+		TP_ARGS(xdr))
+
 /**
  ** Connection events
  **/
@@ -1634,6 +1672,8 @@
	)
 );
 
+DEFINE_XDRBUF_EVENT(svcrdma_send_pullup);
+
 TRACE_EVENT(svcrdma_send_failed,
	TP_PROTO(
		const struct svc_rqst *rqst,

diff --git a/net/sunrpc/xprtrdma/svc_rdma_sendto.c b/net/sunrpc/xprtrdma/svc_rdma_sendto.c
index a11983c2056f..8ea21ca351e2 100644
--- a/net/sunrpc/xprtrdma/svc_rdma_sendto.c
+++ b/net/sunrpc/xprtrdma/svc_rdma_sendto.c
@@ -537,16 +537,32 @@ void svc_rdma_sync_reply_hdr(struct svcxprt_rdma *rdma,
				      DMA_TO_DEVICE);
 }
 
-/* If the xdr_buf has more elements than the device can
- * transmit in a single RDMA Send, then the reply will
- * have to be copied into a bounce buffer.
+/**
+ * svc_rdma_pull_up_needed - Determine whether to use pull-up
+ * @rdma: controlling transport
+ * @ctxt: I/O resources for an RDMA Send
+ * @xdr: xdr_buf containing RPC message to transmit
+ * @wr_lst: pointer to start of Write chunk list
+ *
+ * Returns:
+ *	%true if pull-up should be used
+ *	%false otherwise
  */
 static bool svc_rdma_pull_up_needed(struct svcxprt_rdma *rdma,
+				    struct svc_rdma_send_ctxt *ctxt,
				    struct xdr_buf *xdr,
				    __be32 *wr_lst)
 {
	int elements;
 
+	/* Avoid the overhead of DMA mapping for small messages.
+	 */
+	if (xdr->len < RPCRDMA_V1_DEF_INLINE_SIZE >> 1)
+		return true;
+
+	/* Check whether the xdr_buf has more elements than can
+	 * fit in a single RDMA Send.
+	 */
	/* xdr->head */
	elements = 1;
 
@@ -627,6 +643,7 @@ static int svc_rdma_pull_up_reply_msg(struct svcxprt_rdma *rdma,
				      ctxt->sc_sges[0].length, DMA_TO_DEVICE);
 
+	trace_svcrdma_send_pullup(xdr);
	return 0;
 }
 
@@ -652,7 +669,7 @@ int svc_rdma_map_reply_msg(struct svcxprt_rdma *rdma,
	u32 xdr_pad;
	int ret;
 
-	if (svc_rdma_pull_up_needed(rdma, xdr, wr_lst))
+	if (svc_rdma_pull_up_needed(rdma, ctxt, xdr, wr_lst))
		return svc_rdma_pull_up_reply_msg(rdma, ctxt, xdr, wr_lst);
 
	++ctxt->sc_cur_sge_no;

From patchwork Fri Feb 14 15:50:03 2020

Subject: [PATCH RFC 4/9] NFSD: Invoke svc_encode_read_payload in "read" NFSD encoders
From: Chuck Lever
To: bfields@fieldses.org
Cc: linux-rdma@vger.kernel.org, linux-nfs@vger.kernel.org
Date: Fri, 14 Feb 2020 10:50:03 -0500
Message-ID: <20200214155003.3848.37713.stgit@klimt.1015granger.net>

Have the NFSD encoders annotate the boundaries of every direct-data-placement eligible READ data payload. Then change svcrdma to use that annotation instead of the xdr->page_len when handling Write chunks.

For NFSv4 on RDMA, that enables the ability to recognize multiple READ payloads per compound. Next step is to support multiple Write chunks.
Signed-off-by: Chuck Lever
---
 fs/nfsd/nfs3xdr.c                     |  4 ++++
 fs/nfsd/nfs4xdr.c                     |  3 +++
 fs/nfsd/nfsxdr.c                      |  4 ++++
 net/sunrpc/xprtrdma/svc_rdma_sendto.c | 15 +++------------
 4 files changed, 14 insertions(+), 12 deletions(-)

diff --git a/fs/nfsd/nfs3xdr.c b/fs/nfsd/nfs3xdr.c
index aae514d40b64..8c272efbc94e 100644
--- a/fs/nfsd/nfs3xdr.c
+++ b/fs/nfsd/nfs3xdr.c
@@ -712,6 +712,8 @@ void fill_post_wcc(struct svc_fh *fhp)
 			*p = 0;
 			rqstp->rq_res.tail[0].iov_len = 4 - (resp->len&3);
 		}
+		svc_encode_read_payload(rqstp, rqstp->rq_res.head[0].iov_len,
+					resp->len);
 		return 1;
 	} else
 		return xdr_ressize_check(rqstp, p);
@@ -737,6 +739,8 @@ void fill_post_wcc(struct svc_fh *fhp)
 			*p = 0;
 			rqstp->rq_res.tail[0].iov_len = 4 - (resp->count & 3);
 		}
+		svc_encode_read_payload(rqstp, rqstp->rq_res.head[0].iov_len,
+					resp->count);
 		return 1;
 	} else
 		return xdr_ressize_check(rqstp, p);
diff --git a/fs/nfsd/nfs4xdr.c b/fs/nfsd/nfs4xdr.c
index 262f9fc76e4e..a8d3f8f035a0 100644
--- a/fs/nfsd/nfs4xdr.c
+++ b/fs/nfsd/nfs4xdr.c
@@ -3547,6 +3547,8 @@ static __be32 nfsd4_encode_splice_read(
 		buf->page_len = 0;
 		return nfserr;
 	}
+	svc_encode_read_payload(read->rd_rqstp, buf->head[0].iov_len,
+				maxcount);

 	*(p++) = htonl(eof);
 	*(p++) = htonl(maxcount);
@@ -3713,6 +3715,7 @@ static __be32 nfsd4_encode_readv(struct nfsd4_compoundres *resp,
 		xdr_truncate_encode(xdr, length_offset);
 		return nfserr;
 	}
+	svc_encode_read_payload(readlink->rl_rqstp, length_offset, maxcount);

 	wire_count = htonl(maxcount);
 	write_bytes_to_xdr_buf(xdr->buf, length_offset, &wire_count, 4);
diff --git a/fs/nfsd/nfsxdr.c b/fs/nfsd/nfsxdr.c
index b51fe515f06f..98ea417042a6 100644
--- a/fs/nfsd/nfsxdr.c
+++ b/fs/nfsd/nfsxdr.c
@@ -462,6 +462,8 @@ __be32 *nfs2svc_encode_fattr(struct svc_rqst *rqstp, __be32 *p, struct svc_fh *f
 			*p = 0;
 			rqstp->rq_res.tail[0].iov_len = 4 - (resp->len&3);
 		}
+		svc_encode_read_payload(rqstp, rqstp->rq_res.head[0].iov_len,
+					resp->len);
 		return 1;
 	}
@@ -482,6 +484,8 @@ __be32 *nfs2svc_encode_fattr(struct svc_rqst *rqstp, __be32 *p, struct svc_fh *f
 			*p = 0;
 			rqstp->rq_res.tail[0].iov_len = 4 - (resp->count&3);
 		}
+		svc_encode_read_payload(rqstp, rqstp->rq_res.head[0].iov_len,
+					resp->count);
 		return 1;
 	}
diff --git a/net/sunrpc/xprtrdma/svc_rdma_sendto.c b/net/sunrpc/xprtrdma/svc_rdma_sendto.c
index 8ea21ca351e2..40b4843be869 100644
--- a/net/sunrpc/xprtrdma/svc_rdma_sendto.c
+++ b/net/sunrpc/xprtrdma/svc_rdma_sendto.c
@@ -875,18 +875,9 @@ int svc_rdma_sendto(struct svc_rqst *rqstp)
 	if (wr_lst) {
 		/* XXX: Presume the client sent only one Write chunk */
-		unsigned long offset;
-		unsigned int length;
-
-		if (rctxt->rc_read_payload_length) {
-			offset = rctxt->rc_read_payload_offset;
-			length = rctxt->rc_read_payload_length;
-		} else {
-			offset = xdr->head[0].iov_len;
-			length = xdr->page_len;
-		}
-		ret = svc_rdma_send_write_chunk(rdma, wr_lst, xdr, offset,
-						length);
+		ret = svc_rdma_send_write_chunk(rdma, wr_lst, xdr,
+						rctxt->rc_read_payload_offset,
+						rctxt->rc_read_payload_length);
 		if (ret < 0)
 			goto err2;
 		svc_rdma_xdr_encode_write_list(rdma_resp, wr_lst, ret);

From patchwork Fri Feb 14 15:50:08 2020
Subject: [PATCH RFC 5/9] svcrdma: Add trace point to examine client-provided write segment
From: Chuck Lever
To: bfields@fieldses.org
Cc: linux-rdma@vger.kernel.org, linux-nfs@vger.kernel.org
Date: Fri, 14 Feb 2020 10:50:08 -0500
Message-ID: <20200214155008.3848.6982.stgit@klimt.1015granger.net>
In-Reply-To: <20200214151427.3848.49739.stgit@klimt.1015granger.net>
References: <20200214151427.3848.49739.stgit@klimt.1015granger.net>

Ensure clients send large enough Write chunks.
Signed-off-by: Chuck Lever
---
 include/trace/events/rpcrdma.h          |  7 ++++---
 net/sunrpc/xprtrdma/svc_rdma_recvfrom.c | 12 +++++++++---
 2 files changed, 13 insertions(+), 6 deletions(-)

diff --git a/include/trace/events/rpcrdma.h b/include/trace/events/rpcrdma.h
index 6f0d3e8ce95c..773f6d9fd800 100644
--- a/include/trace/events/rpcrdma.h
+++ b/include/trace/events/rpcrdma.h
@@ -1507,7 +1507,7 @@
 );

 #define DEFINE_SEGMENT_EVENT(name) \
-		DEFINE_EVENT(svcrdma_segment_event, svcrdma_encode_##name,\
+		DEFINE_EVENT(svcrdma_segment_event, svcrdma_##name,\
 				TP_PROTO( \
 					u32 handle, \
 					u32 length, \
@@ -1515,8 +1515,9 @@
 				), \
 				TP_ARGS(handle, length, offset))

-DEFINE_SEGMENT_EVENT(rseg);
-DEFINE_SEGMENT_EVENT(wseg);
+DEFINE_SEGMENT_EVENT(decode_wseg);
+DEFINE_SEGMENT_EVENT(encode_rseg);
+DEFINE_SEGMENT_EVENT(encode_wseg);

 DECLARE_EVENT_CLASS(svcrdma_chunk_event,
 	TP_PROTO(
diff --git a/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c b/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c
index 71127d898562..2f16c0625226 100644
--- a/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c
+++ b/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c
@@ -420,13 +420,19 @@ static __be32 *xdr_check_write_chunk(__be32 *p, const __be32 *end,
 	segcount = be32_to_cpup(p++);
 	for (i = 0; i < segcount; i++) {
-		p++; /* handle */
-		if (be32_to_cpup(p++) > maxlen)
+		u32 handle, length;
+		u64 offset;
+
+		handle = be32_to_cpup(p++);
+		length = be32_to_cpup(p++);
+		if (length > maxlen)
 			return NULL;
-		p += 2; /* offset */
+		p = xdr_decode_hyper(p, &offset);
 		if (p > end)
 			return NULL;
+
+		trace_svcrdma_decode_wseg(handle, length, offset);
 	}

 	return p;

From patchwork Fri Feb 14 15:50:14 2020
Subject: [PATCH RFC 6/9] svcrdma: De-duplicate code that locates Write and Reply chunks
From: Chuck Lever
To: bfields@fieldses.org
Cc: linux-rdma@vger.kernel.org, linux-nfs@vger.kernel.org
Date: Fri, 14 Feb 2020 10:50:14 -0500
Message-ID: <20200214155014.3848.84789.stgit@klimt.1015granger.net>
In-Reply-To: <20200214151427.3848.49739.stgit@klimt.1015granger.net>
References: <20200214151427.3848.49739.stgit@klimt.1015granger.net>

Cache the locations of the first Write chunk and the Reply chunk so
that the Send path doesn't need to parse the Call header again.
Signed-off-by: Chuck Lever
---
 include/linux/sunrpc/svc_rdma.h         |  2 ++
 net/sunrpc/xprtrdma/svc_rdma_recvfrom.c | 24 +++++++++++++-------
 net/sunrpc/xprtrdma/svc_rdma_sendto.c   | 38 +++----------------------------
 3 files changed, 22 insertions(+), 42 deletions(-)

diff --git a/include/linux/sunrpc/svc_rdma.h b/include/linux/sunrpc/svc_rdma.h
index 04e4a34d1c6a..07baeb5f93c1 100644
--- a/include/linux/sunrpc/svc_rdma.h
+++ b/include/linux/sunrpc/svc_rdma.h
@@ -137,6 +137,8 @@ struct svc_rdma_recv_ctxt {
 	unsigned int		rc_page_count;
 	unsigned int		rc_hdr_count;
 	u32			rc_inv_rkey;
+	__be32			*rc_write_list;
+	__be32			*rc_reply_chunk;
 	unsigned int		rc_read_payload_offset;
 	unsigned int		rc_read_payload_length;
 	struct page		*rc_pages[RPCSVC_MAXPAGES];
diff --git a/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c b/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c
index 2f16c0625226..91abe08f7d75 100644
--- a/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c
+++ b/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c
@@ -444,15 +444,17 @@ static __be32 *xdr_check_write_chunk(__be32 *p, const __be32 *end,
  * - This implementation supports only one Write chunk.
  *
  * Sanity checks:
- * - Write list does not overflow buffer.
+ * - Write list does not overflow Receive buffer.
  * - Segment size limited by largest NFS data payload.
  *
  * Returns pointer to the following Reply chunk.
  */
-static __be32 *xdr_check_write_list(__be32 *p, const __be32 *end)
+static __be32 *xdr_check_write_list(__be32 *p, const __be32 *end,
+				    struct svc_rdma_recv_ctxt *ctxt)
 {
 	u32 chcount;

+	ctxt->rc_write_list = p;
 	chcount = 0;
 	while (*p++ != xdr_zero) {
 		p = xdr_check_write_chunk(p, end, MAX_BYTES_WRITE_SEG);
@@ -461,6 +463,8 @@ static __be32 *xdr_check_write_list(__be32 *p, const __be32 *end)
 		if (chcount++ > 1)
 			return NULL;
 	}
+	if (!chcount)
+		ctxt->rc_write_list = NULL;
 	return p;
 }

@@ -472,13 +476,16 @@ static __be32 *xdr_check_write_list(__be32 *p, const __be32 *end)
 *
 * Returns pointer to the following RPC header.
 */
-static __be32 *xdr_check_reply_chunk(__be32 *p, const __be32 *end)
+static __be32 *xdr_check_reply_chunk(__be32 *p, const __be32 *end,
+				     struct svc_rdma_recv_ctxt *ctxt)
 {
+	ctxt->rc_reply_chunk = p;
 	if (*p++ != xdr_zero) {
 		p = xdr_check_write_chunk(p, end, MAX_BYTES_SPECIAL_SEG);
 		if (!p)
 			return NULL;
-	}
+	} else
+		ctxt->rc_reply_chunk = NULL;
 	return p;
 }

@@ -554,7 +561,8 @@ static void svc_rdma_get_inv_rkey(struct svcxprt_rdma *rdma,
 * Assumptions:
 * - The transport header is entirely contained in the head iovec.
 */
-static int svc_rdma_xdr_decode_req(struct xdr_buf *rq_arg)
+static int svc_rdma_xdr_decode_req(struct xdr_buf *rq_arg,
+				   struct svc_rdma_recv_ctxt *ctxt)
 {
 	__be32 *p, *end, *rdma_argp;
 	unsigned int hdr_len;
@@ -587,10 +595,10 @@ static int svc_rdma_xdr_decode_req(struct xdr_buf *rq_arg)
 	p = xdr_check_read_list(rdma_argp + 4, end);
 	if (!p)
 		goto out_inval;
-	p = xdr_check_write_list(p, end);
+	p = xdr_check_write_list(p, end, ctxt);
 	if (!p)
 		goto out_inval;
-	p = xdr_check_reply_chunk(p, end);
+	p = xdr_check_reply_chunk(p, end, ctxt);
 	if (!p)
 		goto out_inval;
 	if (p > end)
@@ -792,7 +800,7 @@ int svc_rdma_recvfrom(struct svc_rqst *rqstp)
 	rqstp->rq_next_page = rqstp->rq_respages;

 	p = (__be32 *)rqstp->rq_arg.head[0].iov_base;
-	ret = svc_rdma_xdr_decode_req(&rqstp->rq_arg);
+	ret = svc_rdma_xdr_decode_req(&rqstp->rq_arg, ctxt);
 	if (ret < 0)
 		goto out_err;
 	if (ret == 0)
diff --git a/net/sunrpc/xprtrdma/svc_rdma_sendto.c b/net/sunrpc/xprtrdma/svc_rdma_sendto.c
index 40b4843be869..3c0e41d378bc 100644
--- a/net/sunrpc/xprtrdma/svc_rdma_sendto.c
+++ b/net/sunrpc/xprtrdma/svc_rdma_sendto.c
@@ -454,36 +454,6 @@ static void svc_rdma_xdr_encode_reply_chunk(__be32 *rdma_resp, __be32 *rp_ch,
 	xdr_encode_write_chunk(p, rp_ch, consumed);
 }

-/* Parse the RPC Call's transport header.
- */
-static void svc_rdma_get_write_arrays(__be32 *rdma_argp,
-				      __be32 **write, __be32 **reply)
-{
-	__be32 *p;
-
-	p = rdma_argp + rpcrdma_fixed_maxsz;
-
-	/* Read list */
-	while (*p++ != xdr_zero)
-		p += 5;
-
-	/* Write list */
-	if (*p != xdr_zero) {
-		*write = p;
-		while (*p++ != xdr_zero)
-			p += 1 + be32_to_cpu(*p) * 4;
-	} else {
-		*write = NULL;
-		p++;
-	}
-
-	/* Reply chunk */
-	if (*p != xdr_zero)
-		*reply = p;
-	else
-		*reply = NULL;
-}
-
 static int svc_rdma_dma_map_page(struct svcxprt_rdma *rdma,
 				 struct svc_rdma_send_ctxt *ctxt,
 				 struct page *page,
@@ -842,14 +812,14 @@ int svc_rdma_sendto(struct svc_rqst *rqstp)
 	struct svcxprt_rdma *rdma =
 		container_of(xprt, struct svcxprt_rdma, sc_xprt);
 	struct svc_rdma_recv_ctxt *rctxt = rqstp->rq_xprt_ctxt;
-	__be32 *p, *rdma_argp, *rdma_resp, *wr_lst, *rp_ch;
+	__be32 *rdma_argp = rctxt->rc_recv_buf;
+	__be32 *wr_lst = rctxt->rc_write_list;
+	__be32 *rp_ch = rctxt->rc_reply_chunk;
 	struct xdr_buf *xdr = &rqstp->rq_res;
 	struct svc_rdma_send_ctxt *sctxt;
+	__be32 *p, *rdma_resp;
 	int ret;

-	rdma_argp = rctxt->rc_recv_buf;
-	svc_rdma_get_write_arrays(rdma_argp, &wr_lst, &rp_ch);
-
 	/* Create the RDMA response header. xprt->xpt_mutex,
 	 * acquired in svc_send(), serializes RPC replies. The
 	 * code path below that inserts the credit grant value

From patchwork Fri Feb 14 15:50:19 2020
Subject: [PATCH RFC 7/9] svcrdma: Post RDMA Writes while XDR encoding replies
From: Chuck Lever
To: bfields@fieldses.org
Cc: linux-rdma@vger.kernel.org, linux-nfs@vger.kernel.org
Date: Fri, 14 Feb 2020 10:50:19 -0500
Message-ID: <20200214155019.3848.58561.stgit@klimt.1015granger.net>
In-Reply-To: <20200214151427.3848.49739.stgit@klimt.1015granger.net>
References: <20200214151427.3848.49739.stgit@klimt.1015granger.net>

The only RPC/RDMA ordering requirement between RDMA Writes and RDMA
Sends is that Writes have to be
posted before the Send that sends
the RPC Reply for that Write payload.

The Linux NFS server implementation now has a transport method that
can post READ Payload Writes earlier than svc_rdma_sendto:
->xpo_read_payload.

Goals:
- Get RDMA Writes going earlier so they are more likely to be complete
  at the remote end before the Send completes.
- Allow more parallelism when dispatching RDMA operations by posting
  RDMA Writes before taking xpt_mutex.

Signed-off-by: Chuck Lever
---
 net/sunrpc/xprtrdma/svc_rdma_sendto.c | 26 +++++++++++---------------
 1 file changed, 11 insertions(+), 15 deletions(-)

diff --git a/net/sunrpc/xprtrdma/svc_rdma_sendto.c b/net/sunrpc/xprtrdma/svc_rdma_sendto.c
index 3c0e41d378bc..273453a336b0 100644
--- a/net/sunrpc/xprtrdma/svc_rdma_sendto.c
+++ b/net/sunrpc/xprtrdma/svc_rdma_sendto.c
@@ -843,15 +843,9 @@ int svc_rdma_sendto(struct svc_rqst *rqstp)
 	*p++ = xdr_zero;
 	*p = xdr_zero;

-	if (wr_lst) {
-		/* XXX: Presume the client sent only one Write chunk */
-		ret = svc_rdma_send_write_chunk(rdma, wr_lst, xdr,
-						rctxt->rc_read_payload_offset,
-						rctxt->rc_read_payload_length);
-		if (ret < 0)
-			goto err2;
-		svc_rdma_xdr_encode_write_list(rdma_resp, wr_lst, ret);
-	}
+	if (wr_lst)
+		svc_rdma_xdr_encode_write_list(rdma_resp, wr_lst,
+					       rctxt->rc_read_payload_length);
 	if (rp_ch) {
 		ret = svc_rdma_send_reply_chunk(rdma, rp_ch, wr_lst, xdr);
 		if (ret < 0)
@@ -896,16 +890,16 @@ int svc_rdma_sendto(struct svc_rqst *rqstp)
 * @offset: payload's byte offset in @xdr
 * @length: size of payload, in bytes
 *
- * Returns zero on success.
- *
- * For the moment, just record the xdr_buf location of the READ
- * payload. svc_rdma_sendto will use that location later when
- * we actually send the payload.
+ * Returns zero on success, or a negative errno.
 */
 int svc_rdma_read_payload(struct svc_rqst *rqstp, unsigned int offset,
			   unsigned int length)
 {
 	struct svc_rdma_recv_ctxt *rctxt = rqstp->rq_xprt_ctxt;
+	struct svcxprt_rdma *rdma;
+
+	if (!rctxt->rc_write_list)
+		return 0;

 	/* XXX: Just one READ payload slot for now, since our
 	 * transport implementation currently supports only one
@@ -914,5 +908,7 @@ int svc_rdma_read_payload(struct svc_rqst *rqstp, unsigned int offset,
 	rctxt->rc_read_payload_offset = offset;
 	rctxt->rc_read_payload_length = length;

-	return 0;
+	rdma = container_of(rqstp->rq_xprt, struct svcxprt_rdma, sc_xprt);
+	return svc_rdma_send_write_chunk(rdma, rctxt->rc_write_list,
+					 &rqstp->rq_res, offset, length);
 }

From patchwork Fri Feb 14 15:50:24 2020
Subject: [PATCH RFC 8/9] svcrdma: Refactor svc_rdma_sendto()
From: Chuck Lever
To: bfields@fieldses.org
Cc: linux-rdma@vger.kernel.org, linux-nfs@vger.kernel.org
Date: Fri, 14 Feb 2020 10:50:24 -0500
Message-ID: <20200214155024.3848.32817.stgit@klimt.1015granger.net>
In-Reply-To: <20200214151427.3848.49739.stgit@klimt.1015granger.net>
References: <20200214151427.3848.49739.stgit@klimt.1015granger.net>

No behavior change expected, just preparing for subsequent patches.

Pass the RPC request's svc_rdma_recv_ctxt deeper into the sendto()
path. This will enable us to subsequently pass more information about
the Reply into those lower-level functions.

Since we're touching the synopses of these functions, let's also change
the header encoding to work like other areas: instead of walking over
the beginning of the header when encoding each chunk list, use the
"p = xdr_encode_blob(p);" style that is consistent with most other
XDR-related code.
Signed-off-by: Chuck Lever
---
 include/linux/sunrpc/svc_rdma.h       |  2 -
 net/sunrpc/xprtrdma/svc_rdma_rw.c     | 12 ++--
 net/sunrpc/xprtrdma/svc_rdma_sendto.c | 98 ++++++++++++++-------------------
 3 files changed, 50 insertions(+), 62 deletions(-)

diff --git a/include/linux/sunrpc/svc_rdma.h b/include/linux/sunrpc/svc_rdma.h
index 07baeb5f93c1..c1c4563066d9 100644
--- a/include/linux/sunrpc/svc_rdma.h
+++ b/include/linux/sunrpc/svc_rdma.h
@@ -178,7 +178,7 @@ extern int svc_rdma_send_write_chunk(struct svcxprt_rdma *rdma,
 				     unsigned int offset,
 				     unsigned long length);
 extern int svc_rdma_send_reply_chunk(struct svcxprt_rdma *rdma,
-				     __be32 *rp_ch, bool writelist,
+				     const struct svc_rdma_recv_ctxt *rctxt,
 				     struct xdr_buf *xdr);

 /* svc_rdma_sendto.c */
diff --git a/net/sunrpc/xprtrdma/svc_rdma_rw.c b/net/sunrpc/xprtrdma/svc_rdma_rw.c
index b0ac535c8728..ca9d414bef9d 100644
--- a/net/sunrpc/xprtrdma/svc_rdma_rw.c
+++ b/net/sunrpc/xprtrdma/svc_rdma_rw.c
@@ -545,8 +545,7 @@ int svc_rdma_send_write_chunk(struct svcxprt_rdma *rdma, __be32 *wr_ch,
 /**
 * svc_rdma_send_reply_chunk - Write all segments in the Reply chunk
 * @rdma: controlling RDMA transport
- * @rp_ch: Reply chunk provided by client
- * @writelist: true if client provided a Write list
+ * @rctxt: chunk list information
 * @xdr: xdr_buf containing an RPC Reply
 *
 * Returns a non-negative number of bytes the chunk consumed, or
@@ -556,13 +555,14 @@ int svc_rdma_send_write_chunk(struct svcxprt_rdma *rdma, __be32 *wr_ch,
 * %-ENOTCONN if posting failed (connection is lost),
 * %-EIO if rdma_rw initialization failed (DMA mapping, etc).
 */
-int svc_rdma_send_reply_chunk(struct svcxprt_rdma *rdma, __be32 *rp_ch,
-			      bool writelist, struct xdr_buf *xdr)
+int svc_rdma_send_reply_chunk(struct svcxprt_rdma *rdma,
+			      const struct svc_rdma_recv_ctxt *rctxt,
+			      struct xdr_buf *xdr)
 {
 	struct svc_rdma_write_info *info;
 	int consumed, ret;

-	info = svc_rdma_write_info_alloc(rdma, rp_ch);
+	info = svc_rdma_write_info_alloc(rdma, rctxt->rc_reply_chunk);
 	if (!info)
 		return -ENOMEM;
@@ -574,7 +574,7 @@ int svc_rdma_send_reply_chunk(struct svcxprt_rdma *rdma, __be32 *rp_ch,
 	/* Send the page list in the Reply chunk only if the
 	 * client did not provide Write chunks.
 	 */
-	if (!writelist && xdr->page_len) {
+	if (!rctxt->rc_write_list && xdr->page_len) {
 		ret = svc_rdma_send_xdr_pagelist(info, xdr,
 						 xdr->head[0].iov_len,
 						 xdr->page_len);
diff --git a/net/sunrpc/xprtrdma/svc_rdma_sendto.c b/net/sunrpc/xprtrdma/svc_rdma_sendto.c
index 273453a336b0..7349a3f9aa5d 100644
--- a/net/sunrpc/xprtrdma/svc_rdma_sendto.c
+++ b/net/sunrpc/xprtrdma/svc_rdma_sendto.c
@@ -354,6 +354,14 @@ static unsigned int svc_rdma_reply_hdr_len(__be32 *rdma_resp)
 	return (unsigned long)p - (unsigned long)rdma_resp;
 }

+/* RPC-over-RDMA V1 replies never have a Read list.
+ */
+static __be32 *xdr_encode_read_list(__be32 *p)
+{
+	*p++ = xdr_zero;
+	return p;
+}
+
 /* One Write chunk is copied from Call transport header to Reply
 * transport header. Each segment's length field is updated to
 * reflect number of bytes consumed in the segment.
@@ -406,16 +414,17 @@ static unsigned int xdr_encode_write_chunk(__be32 *dst, __be32 *src,
 * Assumptions:
 * - Client has provided only one Write chunk
 */
-static void svc_rdma_xdr_encode_write_list(__be32 *rdma_resp, __be32 *wr_ch,
-					   unsigned int consumed)
+static __be32 *xdr_encode_write_list(__be32 *p,
+				     const struct svc_rdma_recv_ctxt *rctxt)
 {
-	unsigned int nsegs;
-	__be32 *p, *q;
+	unsigned int consumed, nsegs;
+	__be32 *q;

-	/* RPC-over-RDMA V1 replies never have a Read list.
-	 */
-	p = rdma_resp + rpcrdma_fixed_maxsz + 1;
+	q = rctxt->rc_write_list;
+	if (!q)
+		goto out;

-	q = wr_ch;
+	consumed = rctxt->rc_read_payload_length;
 	while (*q != xdr_zero) {
 		nsegs = xdr_encode_write_chunk(p, q, consumed);
 		q += 2 + nsegs * rpcrdma_segment_maxsz;
@@ -424,10 +433,9 @@ static void svc_rdma_xdr_encode_write_list(__be32 *rdma_resp, __be32 *wr_ch,
 	}

 	/* Terminate Write list */
+out:
 	*p++ = xdr_zero;
-
-	/* Reply chunk discriminator; may be replaced later */
-	*p = xdr_zero;
+	return p;
 }

 /* The client provided a Reply chunk in the Call message. Fill in
@@ -435,23 +443,13 @@ static void svc_rdma_xdr_encode_write_list(__be32 *rdma_resp, __be32 *wr_ch,
 * number of bytes consumed in each segment.
 *
 * Assumptions:
- * - Reply can always fit in the provided Reply chunk
+ * - Reply can always fit in the client-provided Reply chunk
 */
-static void svc_rdma_xdr_encode_reply_chunk(__be32 *rdma_resp, __be32 *rp_ch,
-					    unsigned int consumed)
+static void xdr_encode_reply_chunk(__be32 *p,
+				   const struct svc_rdma_recv_ctxt *rctxt,
+				   unsigned int length)
 {
-	__be32 *p;
-
-	/* Find the Reply chunk in the Reply's xprt header.
-	 * RPC-over-RDMA V1 replies never have a Read list.
-	 */
-	p = rdma_resp + rpcrdma_fixed_maxsz + 1;
-
-	/* Skip past Write list */
-	while (*p++ != xdr_zero)
-		p += 1 + be32_to_cpup(p) * rpcrdma_segment_maxsz;
-
-	xdr_encode_write_chunk(p, rp_ch, consumed);
+	xdr_encode_write_chunk(p, rctxt->rc_reply_chunk, length);
 }

 static int svc_rdma_dma_map_page(struct svcxprt_rdma *rdma,
@@ -735,15 +733,15 @@ static void svc_rdma_save_io_pages(struct svc_rqst *rqstp,
 */
 static int svc_rdma_send_reply_msg(struct svcxprt_rdma *rdma,
 				   struct svc_rdma_send_ctxt *sctxt,
-				   struct svc_rdma_recv_ctxt *rctxt,
-				   struct svc_rqst *rqstp,
-				   __be32 *wr_lst, __be32 *rp_ch)
+				   const struct svc_rdma_recv_ctxt *rctxt,
+				   struct svc_rqst *rqstp)
 {
 	int ret;

-	if (!rp_ch) {
+	if (!rctxt->rc_reply_chunk) {
 		ret = svc_rdma_map_reply_msg(rdma, sctxt,
-					     &rqstp->rq_res, wr_lst);
+					     &rqstp->rq_res,
+					     rctxt->rc_write_list);
 		if (ret < 0)
 			return ret;
 	}
@@ -808,16 +806,12 @@ static int svc_rdma_send_error_msg(struct svcxprt_rdma *rdma,
 */
 int svc_rdma_sendto(struct svc_rqst *rqstp)
 {
-	struct svc_xprt *xprt = rqstp->rq_xprt;
 	struct svcxprt_rdma *rdma =
-		container_of(xprt, struct svcxprt_rdma, sc_xprt);
+		container_of(rqstp->rq_xprt, struct svcxprt_rdma, sc_xprt);
 	struct svc_rdma_recv_ctxt *rctxt = rqstp->rq_xprt_ctxt;
 	__be32 *rdma_argp = rctxt->rc_recv_buf;
-	__be32 *wr_lst = rctxt->rc_write_list;
-	__be32 *rp_ch = rctxt->rc_reply_chunk;
-	struct xdr_buf *xdr = &rqstp->rq_res;
 	struct svc_rdma_send_ctxt *sctxt;
-	__be32 *p, *rdma_resp;
+	__be32 *p;
 	int ret;

 	/* Create the RDMA response header. xprt->xpt_mutex,
@@ -830,32 +824,26 @@ int svc_rdma_sendto(struct svc_rqst *rqstp)
 	sctxt = svc_rdma_send_ctxt_get(rdma);
 	if (!sctxt)
 		goto err0;
-	rdma_resp = sctxt->sc_xprt_buf;
-	p = rdma_resp;
+	p = sctxt->sc_xprt_buf;
 	*p++ = *rdma_argp;
 	*p++ = *(rdma_argp + 1);
 	*p++ = rdma->sc_fc_credits;
-	*p++ = rp_ch ? rdma_nomsg : rdma_msg;
+	*p++ = rctxt->rc_reply_chunk ?
rdma_nomsg : rdma_msg; - /* Start with empty chunks */ - *p++ = xdr_zero; - *p++ = xdr_zero; - *p = xdr_zero; - - if (wr_lst) - svc_rdma_xdr_encode_write_list(rdma_resp, wr_lst, - rctxt->rc_read_payload_length); - if (rp_ch) { - ret = svc_rdma_send_reply_chunk(rdma, rp_ch, wr_lst, xdr); + p = xdr_encode_read_list(p); + p = xdr_encode_write_list(p, rctxt); + if (rctxt->rc_reply_chunk) { + ret = svc_rdma_send_reply_chunk(rdma, rctxt, &rqstp->rq_res); if (ret < 0) goto err2; - svc_rdma_xdr_encode_reply_chunk(rdma_resp, rp_ch, ret); - } + xdr_encode_reply_chunk(p, rctxt, ret); + } else + *p = xdr_zero; - svc_rdma_sync_reply_hdr(rdma, sctxt, svc_rdma_reply_hdr_len(rdma_resp)); - ret = svc_rdma_send_reply_msg(rdma, sctxt, rctxt, rqstp, - wr_lst, rp_ch); + svc_rdma_sync_reply_hdr(rdma, sctxt, + svc_rdma_reply_hdr_len(sctxt->sc_xprt_buf)); + ret = svc_rdma_send_reply_msg(rdma, sctxt, rctxt, rqstp); if (ret < 0) goto err1; ret = 0; @@ -879,7 +867,7 @@ int svc_rdma_sendto(struct svc_rqst *rqstp) svc_rdma_send_ctxt_put(rdma, sctxt); err0: trace_svcrdma_send_failed(rqstp, ret); - set_bit(XPT_CLOSE, &xprt->xpt_flags); + set_bit(XPT_CLOSE, &rqstp->rq_xprt->xpt_flags); ret = -ENOTCONN; goto out; }
From patchwork Fri Feb 14 15:50:29 2020 X-Patchwork-Submitter: Chuck Lever III X-Patchwork-Id: 11382587 Subject: [PATCH RFC 9/9] svcrdma: Add data structure to track READ payloads From: Chuck Lever To: bfields@fieldses.org Cc: linux-rdma@vger.kernel.org, linux-nfs@vger.kernel.org Date: Fri, 14 Feb 2020 10:50:29 -0500 Message-ID: <20200214155029.3848.86626.stgit@klimt.1015granger.net> In-Reply-To: <20200214151427.3848.49739.stgit@klimt.1015granger.net> References: <20200214151427.3848.49739.stgit@klimt.1015granger.net> User-Agent: StGit/0.17.1-dirty The Linux NFS/RDMA server implementation currently supports only a single Write chunk per RPC/RDMA request. Requests with more than one are so rare that there has never been a strong need to support more. However, we are aware of at least one existing NFS client implementation that can generate such requests, so let's dig in. Allocate a data structure at Receive time to keep track of the set of READ payloads and the Write chunks.
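[Editorial aside, not part of the patch: the tracking the description refers to boils down to an array of {offset, length} records, one slot per Write chunk the client advertised, filled in as the server encodes each READ payload. A minimal user-space sketch of that idea, with all names invented for illustration:]

```c
#include <stdlib.h>
#include <assert.h>

/* One record per Write chunk the client provided (illustrative
 * stand-in for the patch's struct svc_rdma_payload). */
struct payload_rec {
	unsigned int offset;	/* offset of the READ payload in the reply */
	unsigned int length;	/* number of payload bytes */
};

struct payload_tracker {
	struct payload_rec *recs;
	unsigned int num_chunks;	/* Write chunks advertised by client */
	unsigned int cur;		/* payloads recorded so far */
};

/* Allocate tracking state at Receive time: one slot per Write chunk.
 * Returns 0 on success, -1 on allocation failure. */
static int tracker_init(struct payload_tracker *t, unsigned int num_chunks)
{
	t->recs = calloc(num_chunks, sizeof(*t->recs));
	if (!t->recs)
		return -1;
	t->num_chunks = num_chunks;
	t->cur = 0;
	return 0;
}

/* Record one READ payload; fails once every chunk slot is in use,
 * mirroring the bounds check in svc_rdma_read_payload(). */
static int tracker_record(struct payload_tracker *t,
			  unsigned int offset, unsigned int length)
{
	if (t->cur >= t->num_chunks)
		return -1;
	t->recs[t->cur].offset = offset;
	t->recs[t->cur].length = length;
	t->cur++;
	return 0;
}
```

Note the `>=` guard before the post-incremented index: with a plain `>` the last recorded payload would land one slot past the end of the array.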
Signed-off-by: Chuck Lever --- include/linux/sunrpc/svc_rdma.h | 15 +++- net/sunrpc/xprtrdma/svc_rdma_backchannel.c | 2 - net/sunrpc/xprtrdma/svc_rdma_recvfrom.c | 31 +++++++-- net/sunrpc/xprtrdma/svc_rdma_rw.c | 2 - net/sunrpc/xprtrdma/svc_rdma_sendto.c | 94 +++++++++++++--------------- 5 files changed, 80 insertions(+), 64 deletions(-) diff --git a/include/linux/sunrpc/svc_rdma.h b/include/linux/sunrpc/svc_rdma.h index c1c4563066d9..85e6b281a39b 100644 --- a/include/linux/sunrpc/svc_rdma.h +++ b/include/linux/sunrpc/svc_rdma.h @@ -124,6 +124,12 @@ enum { #define RPCSVC_MAXPAYLOAD_RDMA RPCSVC_MAXPAYLOAD +struct svc_rdma_payload { + __be32 *ra_chunk; + unsigned int ra_offset; + unsigned int ra_length; +}; + struct svc_rdma_recv_ctxt { struct llist_node rc_node; struct list_head rc_list; @@ -137,10 +143,10 @@ struct svc_rdma_recv_ctxt { unsigned int rc_page_count; unsigned int rc_hdr_count; u32 rc_inv_rkey; - __be32 *rc_write_list; + struct svc_rdma_payload *rc_read_payloads; __be32 *rc_reply_chunk; - unsigned int rc_read_payload_offset; - unsigned int rc_read_payload_length; + unsigned int rc_num_write_chunks; + unsigned int rc_cur_payload; struct page *rc_pages[RPCSVC_MAXPAGES]; }; @@ -193,7 +199,8 @@ extern void svc_rdma_sync_reply_hdr(struct svcxprt_rdma *rdma, unsigned int len); extern int svc_rdma_map_reply_msg(struct svcxprt_rdma *rdma, struct svc_rdma_send_ctxt *ctxt, - struct xdr_buf *xdr, __be32 *wr_lst); + struct xdr_buf *xdr, + unsigned int num_read_payloads); extern int svc_rdma_sendto(struct svc_rqst *); extern int svc_rdma_read_payload(struct svc_rqst *rqstp, unsigned int offset, unsigned int length); diff --git a/net/sunrpc/xprtrdma/svc_rdma_backchannel.c b/net/sunrpc/xprtrdma/svc_rdma_backchannel.c index 908e78bb87c6..3b1baf15a1b7 100644 --- a/net/sunrpc/xprtrdma/svc_rdma_backchannel.c +++ b/net/sunrpc/xprtrdma/svc_rdma_backchannel.c @@ -117,7 +117,7 @@ static int svc_rdma_bc_sendto(struct svcxprt_rdma *rdma, { int ret; - ret = 
svc_rdma_map_reply_msg(rdma, ctxt, &rqst->rq_snd_buf, NULL); + ret = svc_rdma_map_reply_msg(rdma, ctxt, &rqst->rq_snd_buf, 0); if (ret < 0) return -EIO; diff --git a/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c b/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c index 91abe08f7d75..85b8dd8ae772 100644 --- a/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c +++ b/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c @@ -193,7 +193,9 @@ void svc_rdma_recv_ctxts_destroy(struct svcxprt_rdma *rdma) out: ctxt->rc_page_count = 0; - ctxt->rc_read_payload_length = 0; + ctxt->rc_num_write_chunks = 0; + ctxt->rc_cur_payload = 0; + ctxt->rc_read_payloads = NULL; return ctxt; out_empty: @@ -216,7 +218,8 @@ void svc_rdma_recv_ctxt_put(struct svcxprt_rdma *rdma, for (i = 0; i < ctxt->rc_page_count; i++) put_page(ctxt->rc_pages[i]); - + kfree(ctxt->rc_read_payloads); + ctxt->rc_read_payloads = NULL; if (!ctxt->rc_temp) llist_add(&ctxt->rc_node, &rdma->sc_recv_ctxts); else @@ -452,9 +455,10 @@ static __be32 *xdr_check_write_chunk(__be32 *p, const __be32 *end, static __be32 *xdr_check_write_list(__be32 *p, const __be32 *end, struct svc_rdma_recv_ctxt *ctxt) { - u32 chcount; + u32 chcount, segcount; + __be32 *saved = p; + int i; - ctxt->rc_write_list = p; chcount = 0; while (*p++ != xdr_zero) { p = xdr_check_write_chunk(p, end, MAX_BYTES_WRITE_SEG); @@ -463,8 +467,22 @@ static __be32 *xdr_check_write_list(__be32 *p, const __be32 *end, if (chcount++ > 1) return NULL; } + ctxt->rc_num_write_chunks = chcount; if (!chcount) - ctxt->rc_write_list = NULL; + return p; + + ctxt->rc_read_payloads = kcalloc(chcount, + sizeof(struct svc_rdma_payload), GFP_KERNEL); + if (!ctxt->rc_read_payloads) + return NULL; + + i = 0; + p = saved; + while (*p++ != xdr_zero) { + ctxt->rc_read_payloads[i++].ra_chunk = p - 1; + segcount = be32_to_cpup(p++); + p += segcount * rpcrdma_segment_maxsz; + } return p; } @@ -484,8 +502,9 @@ static __be32 *xdr_check_reply_chunk(__be32 *p, const __be32 *end, p = xdr_check_write_chunk(p, end,
MAX_BYTES_SPECIAL_SEG); if (!p) return NULL; - } else + } else { ctxt->rc_reply_chunk = NULL; + } return p; } diff --git a/net/sunrpc/xprtrdma/svc_rdma_rw.c b/net/sunrpc/xprtrdma/svc_rdma_rw.c index ca9d414bef9d..740ea4ee251d 100644 --- a/net/sunrpc/xprtrdma/svc_rdma_rw.c +++ b/net/sunrpc/xprtrdma/svc_rdma_rw.c @@ -574,7 +574,7 @@ int svc_rdma_send_reply_chunk(struct svcxprt_rdma *rdma, /* Send the page list in the Reply chunk only if the * client did not provide Write chunks. */ - if (!rctxt->rc_write_list && xdr->page_len) { + if (!rctxt->rc_num_write_chunks && xdr->page_len) { ret = svc_rdma_send_xdr_pagelist(info, xdr, xdr->head[0].iov_len, xdr->page_len); diff --git a/net/sunrpc/xprtrdma/svc_rdma_sendto.c b/net/sunrpc/xprtrdma/svc_rdma_sendto.c index 7349a3f9aa5d..378a24b666bb 100644 --- a/net/sunrpc/xprtrdma/svc_rdma_sendto.c +++ b/net/sunrpc/xprtrdma/svc_rdma_sendto.c @@ -366,10 +366,10 @@ static __be32 *xdr_encode_read_list(__be32 *p) * transport header. Each segment's length field is updated to * reflect number of bytes consumed in the segment. * - * Returns number of segments in this chunk. + * Returns a pointer to the position to encode the next chunk. 
*/ -static unsigned int xdr_encode_write_chunk(__be32 *dst, __be32 *src, - unsigned int remaining) +static __be32 *xdr_encode_write_chunk(__be32 *dst, __be32 *src, + unsigned int length) { unsigned int i, nsegs; u32 seg_len; @@ -386,15 +386,15 @@ static unsigned int xdr_encode_write_chunk(__be32 *dst, __be32 *src, *dst++ = *src++; /* bytes returned in this segment */ - seg_len = be32_to_cpu(*src); - if (remaining >= seg_len) { + seg_len = be32_to_cpup(src); + if (length >= seg_len) { /* entire segment was consumed */ *dst = *src; - remaining -= seg_len; + length -= seg_len; } else { /* segment only partly filled */ - *dst = cpu_to_be32(remaining); - remaining = 0; + *dst = cpu_to_be32(length); + length = 0; } dst++; src++; @@ -403,38 +403,25 @@ static unsigned int xdr_encode_write_chunk(__be32 *dst, __be32 *src, *dst++ = *src++; } - return nsegs; + return dst; } -/* The client provided a Write list in the Call message. Fill in - * the segments in the first Write chunk in the Reply's transport - * header with the number of bytes consumed in each segment. - * Remaining chunks are returned unused. - * - * Assumptions: - * - Client has provided only one Write chunk +/* The client provided a Write list in the Call message. For each + * READ payload, fill in the segments in the Write chunks in the + * Reply's transport header with the number of bytes consumed + * in each segment. Any remaining Write chunks are returned to + * the client unused. 
*/ static __be32 *xdr_encode_write_list(__be32 *p, const struct svc_rdma_recv_ctxt *rctxt) { - unsigned int consumed, nsegs; - __be32 *q; - - q = rctxt->rc_write_list; - if (!q) - goto out; - - consumed = rctxt->rc_read_payload_length; - while (*q != xdr_zero) { - nsegs = xdr_encode_write_chunk(p, q, consumed); - q += 2 + nsegs * rpcrdma_segment_maxsz; - p += 2 + nsegs * rpcrdma_segment_maxsz; - consumed = 0; - } + unsigned int i; - /* Terminate Write list */ -out: - *p++ = xdr_zero; + for (i = 0; i < rctxt->rc_num_write_chunks; i++) + p = xdr_encode_write_chunk(p, + rctxt->rc_read_payloads[i].ra_chunk, + rctxt->rc_read_payloads[i].ra_length); + *p++ = xdr_zero; /* Terminate Write list */ return p; } @@ -519,7 +506,7 @@ void svc_rdma_sync_reply_hdr(struct svcxprt_rdma *rdma, static bool svc_rdma_pull_up_needed(struct svcxprt_rdma *rdma, struct svc_rdma_send_ctxt *ctxt, struct xdr_buf *xdr, - __be32 *wr_lst) + unsigned int num_write_chunks) { int elements; @@ -535,7 +522,7 @@ static bool svc_rdma_pull_up_needed(struct svcxprt_rdma *rdma, elements = 1; /* xdr->pages */ - if (!wr_lst) { + if (!num_write_chunks) { unsigned int remaining; unsigned long pageoff; @@ -563,7 +550,8 @@ static bool svc_rdma_pull_up_needed(struct svcxprt_rdma *rdma, */ static int svc_rdma_pull_up_reply_msg(struct svcxprt_rdma *rdma, struct svc_rdma_send_ctxt *ctxt, - struct xdr_buf *xdr, __be32 *wr_lst) + struct xdr_buf *xdr, + unsigned int num_write_chunks) { unsigned char *dst, *tailbase; unsigned int taillen; @@ -576,7 +564,7 @@ static int svc_rdma_pull_up_reply_msg(struct svcxprt_rdma *rdma, tailbase = xdr->tail[0].iov_base; taillen = xdr->tail[0].iov_len; - if (wr_lst) { + if (num_write_chunks) { u32 xdrpad; xdrpad = xdr_padsize(xdr->page_len); @@ -619,7 +607,7 @@ static int svc_rdma_pull_up_reply_msg(struct svcxprt_rdma *rdma, * @rdma: controlling transport * @ctxt: send_ctxt for the Send WR * @xdr: prepared xdr_buf containing RPC message - * @wr_lst: pointer to Call header's Write list, 
or NULL + * @num_read_payloads: count of separate READ payloads to send * * Load the xdr_buf into the ctxt's sge array, and DMA map each * element as it is added. @@ -628,7 +616,7 @@ static int svc_rdma_pull_up_reply_msg(struct svcxprt_rdma *rdma, */ int svc_rdma_map_reply_msg(struct svcxprt_rdma *rdma, struct svc_rdma_send_ctxt *ctxt, - struct xdr_buf *xdr, __be32 *wr_lst) + struct xdr_buf *xdr, unsigned int num_read_payloads) { unsigned int len, remaining; unsigned long page_off; @@ -637,8 +625,8 @@ int svc_rdma_map_reply_msg(struct svcxprt_rdma *rdma, u32 xdr_pad; int ret; - if (svc_rdma_pull_up_needed(rdma, ctxt, xdr, wr_lst)) - return svc_rdma_pull_up_reply_msg(rdma, ctxt, xdr, wr_lst); + if (svc_rdma_pull_up_needed(rdma, ctxt, xdr, num_read_payloads)) + return svc_rdma_pull_up_reply_msg(rdma, ctxt, xdr, num_read_payloads); ++ctxt->sc_cur_sge_no; ret = svc_rdma_dma_map_buf(rdma, ctxt, @@ -647,12 +635,12 @@ int svc_rdma_map_reply_msg(struct svcxprt_rdma *rdma, if (ret < 0) return ret; - /* If a Write chunk is present, the xdr_buf's page list + /* If Write chunks are present, the xdr_buf's page list * is not included inline. However the Upper Layer may * have added XDR padding in the tail buffer, and that * should not be included inline. 
*/ - if (wr_lst) { + if (num_read_payloads) { base = xdr->tail[0].iov_base; len = xdr->tail[0].iov_len; xdr_pad = xdr_padsize(xdr->page_len); @@ -741,7 +729,7 @@ static int svc_rdma_send_reply_msg(struct svcxprt_rdma *rdma, if (!rctxt->rc_reply_chunk) { ret = svc_rdma_map_reply_msg(rdma, sctxt, &rqstp->rq_res, - rctxt->rc_write_list); + rctxt->rc_cur_payload); if (ret < 0) return ret; } @@ -885,18 +873,20 @@ int svc_rdma_read_payload(struct svc_rqst *rqstp, unsigned int offset, { struct svc_rdma_recv_ctxt *rctxt = rqstp->rq_xprt_ctxt; struct svcxprt_rdma *rdma; + unsigned int i; - if (!rctxt->rc_write_list) + if (!rctxt->rc_num_write_chunks) return 0; - /* XXX: Just one READ payload slot for now, since our - * transport implementation currently supports only one - * Write chunk. - */ - rctxt->rc_read_payload_offset = offset; - rctxt->rc_read_payload_length = length; + if (rctxt->rc_cur_payload >= rctxt->rc_num_write_chunks) + return -ENOENT; + i = rctxt->rc_cur_payload++; + + rctxt->rc_read_payloads[i].ra_offset = offset; + rctxt->rc_read_payloads[i].ra_length = length; rdma = container_of(rqstp->rq_xprt, struct svcxprt_rdma, sc_xprt); - return svc_rdma_send_write_chunk(rdma, rctxt->rc_write_list, + return svc_rdma_send_write_chunk(rdma, + rctxt->rc_read_payloads[i].ra_chunk, &rqstp->rq_res, offset, length); }
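[Editorial aside, not part of the patch: the segment-update rule that xdr_encode_write_chunk() implements — walk the chunk's segments, marking each one fully consumed until the remaining payload length runs out, recording a partial final segment and zero for the rest — can be sketched in plain C without any XDR encoding. All names below are invented for illustration:]

```c
#include <assert.h>

/* Given the client-provided segment sizes of one Write chunk, compute
 * how many bytes of the reply each segment carried: leading segments
 * are consumed whole, the next segment is partly filled, and any
 * remaining segments report zero. Returns the bytes left over (non-zero
 * means the payload did not fit in the chunk). */
static unsigned int fill_chunk_segments(const unsigned int *seg_len,
					unsigned int *consumed,
					unsigned int nsegs,
					unsigned int length)
{
	unsigned int i;

	for (i = 0; i < nsegs; i++) {
		if (length >= seg_len[i]) {
			/* entire segment was consumed */
			consumed[i] = seg_len[i];
			length -= seg_len[i];
		} else {
			/* segment only partly filled */
			consumed[i] = length;
			length = 0;
		}
	}
	return length;
}
```

For example, a 150-byte payload written into a chunk of three 100-byte segments yields per-segment consumed lengths of 100, 50, and 0, matching the "entire segment" / "partly filled" branches in the patched kernel code.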